|
|
Chromium Code Reviews|
Created:
4 years, 9 months ago by aluebs-webrtc Modified:
4 years, 9 months ago CC:
webrtc-reviews_webrtc.org, Andrew MacDonald, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, the sun, aluebs-webrtc, bjornv1 Base URL:
https://chromium.googlesource.com/external/webrtc.git@master Target Ref:
refs/pending/heads/master Project:
webrtc Visibility:
Public. |
DescriptionRemove usage of fmaf in IntelligibilityEnhancer
This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine.
This performance issue was put in evidence by this CL: https://codereview.webrtc.org/1693823004/
BUG=590998
Committed: https://crrev.com/a2abdf2fbea1320ac8ad94a81b40dfe6f36100d3
Cr-Commit-Position: refs/heads/master@{#11851}
Patch Set 1 #
Messages
Total messages: 16 (6 generated)
Description was changed from ========== Remove usage of fmaf in IntelligibilityEnhancer BUG=590998 ========== to ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. BUG=590998 ==========
aluebs@webrtc.org changed reviewers: + henrik.lundin@webrtc.org, minyue@webrtc.org, peah@webrtc.org, turaj@webrtc.org
Description was changed from ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. BUG=590998 ========== to ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. This performance issue was put in evidence by this CL: https://codereview.webrtc.org/1693823004/ BUG=590998 ==========
lgtm
lgtm, but... Since I had to look up fmaf to see what it did (http://en.cppreference.com/w/c/numeric/math/fma) I found that on some hardware platforms, the fma operations can be expected to be much faster. On these platforms, FP_FAST_FMAF should be defined. If you want to, you could define different variants of the code depending on FP_FAST_FMAF. It is up to you.
On 2016/03/02 07:02:42, hlundin-webrtc wrote: > lgtm, but... > > Since I had to look up fmaf to see what it did > (http://en.cppreference.com/w/c/numeric/math/fma) I found that on some hardware > platforms, the fma operations can be expected to be much faster. On these > platforms, FP_FAST_FMAF should be defined. If you want to, you could define > different variants of the code depending on FP_FAST_FMAF. It is up to you. Alex, could you try fmaf with FP_FAST_FMAF and see if there is still a benefit.
On 2016/03/02 07:02:42, hlundin-webrtc wrote: > lgtm, but... > > Since I had to look up fmaf to see what it did > (http://en.cppreference.com/w/c/numeric/math/fma) I found that on some hardware > platforms, the fma operations can be expected to be much faster. On these > platforms, FP_FAST_FMAF should be defined. If you want to, you could define > different variants of the code depending on FP_FAST_FMAF. It is up to you. At a wild guess, on such platforms a competent compiler would fuse the operations anyway. The page you linked to says "As with all floating-point expressions, the expression (x*y) + z may be compiled as a fused mutiply-add unless the #pragma STDC FP_CONTRACT is off." and the docs for #pragma STDC FP_CONTRACT (http://en.cppreference.com/w/cpp/preprocessor/impl) say "Allows contracting of floating-point expressions, that is optimizations that omit rounding errors and floating-point exceptions that would be observed if the expression was evaluated exactly as written. For example, allows the implementation of (x*y) + z with a single fused multiply-add CPU instruction. The default value is implementation-defined, usually ON." If you wanted to, you could define your own inline functions that did fma or not depending on the value of FP_FAST_FMA*, but I'm guessing it's not worth the decreased readability of the code.
On 2016/03/02 09:20:24, kwiberg-webrtc wrote: > On 2016/03/02 07:02:42, hlundin-webrtc wrote: > > lgtm, but... > > > > Since I had to look up fmaf to see what it did > > (http://en.cppreference.com/w/c/numeric/math/fma) I found that on some > hardware > > platforms, the fma operations can be expected to be much faster. On these > > platforms, FP_FAST_FMAF should be defined. If you want to, you could define > > different variants of the code depending on FP_FAST_FMAF. It is up to you. > > At a wild guess, on such platforms a competent compiler would fuse the > operations anyway. The page you linked to says > > "As with all floating-point expressions, the expression (x*y) + z may be > compiled as a fused mutiply-add unless the #pragma STDC FP_CONTRACT is off." > > and the docs for #pragma STDC FP_CONTRACT > (http://en.cppreference.com/w/cpp/preprocessor/impl) say > > "Allows contracting of floating-point expressions, that is optimizations that > omit rounding errors and floating-point exceptions that would be observed if the > expression was evaluated exactly as written. For example, allows the > implementation of (x*y) + z with a single fused multiply-add CPU instruction. > The default value is implementation-defined, usually ON." > > If you wanted to, you could define your own inline functions that did fma or not > depending on the value of FP_FAST_FMA*, but I'm guessing it's not worth the > decreased readability of the code. lgtm
From kwiberg's comment I am going to land this as is. Plus knowing that "fmaf weird performance" is the 3rd search result for "fmaf c++" and this could increase complexity by 30x and the fact that it would be hard to test this on all different platforms, I think it is not worth the risk.
The CQ bit was checked by aluebs@webrtc.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1755943002/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1755943002/1
Message was sent while issue was closed.
Description was changed from ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. This performance issue was put in evidence by this CL: https://codereview.webrtc.org/1693823004/ BUG=590998 ========== to ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. This performance issue was put in evidence by this CL: https://codereview.webrtc.org/1693823004/ BUG=590998 ==========
Message was sent while issue was closed.
Committed patchset #1 (id:1)
Message was sent while issue was closed.
Description was changed from ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. This performance issue was put in evidence by this CL: https://codereview.webrtc.org/1693823004/ BUG=590998 ========== to ========== Remove usage of fmaf in IntelligibilityEnhancer This produces bit-exact output and doesn't have the performance sensitivity to vectorisation, giving a complexity decrease of the IntelligibilityEnhancer of about 30x in my local machine. This performance issue was put in evidence by this CL: https://codereview.webrtc.org/1693823004/ BUG=590998 Committed: https://crrev.com/a2abdf2fbea1320ac8ad94a81b40dfe6f36100d3 Cr-Commit-Position: refs/heads/master@{#11851} ==========
Message was sent while issue was closed.
Patchset 1 (id:??) landed as https://crrev.com/a2abdf2fbea1320ac8ad94a81b40dfe6f36100d3 Cr-Commit-Position: refs/heads/master@{#11851} |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
