Created: 3 years, 8 months ago by peah-webrtc
Modified: 3 years, 7 months ago
Reviewers: ivoc
CC: webrtc-reviews_webrtc.org, AleBzk, peah-webrtc, Andrew MacDonald, aleloi, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, minyue-webrtc, the sun, aluebs-webrtc, bjornv1
Target Ref: refs/heads/master
Project: webrtc
Visibility: Public.
Description
Added ARM Neon optimizations for AEC3
This CL adds Neon SIMD optimizations for AEC3 on ARM, resulting
in an 8 times complexity reduction. The optimizations are basically
identical to what was already in place for SSE2.
BUG=chromium:14993, webrtc:6018
Review-Url: https://codereview.webrtc.org/2834073005
Cr-Commit-Position: refs/heads/master@{#17993}
Committed: https://chromium.googlesource.com/external/webrtc/+/f246b91eba0e8d95bd3fee4634887fb6d3017811
Patch Set 1 : Changed the unittest
Total comments: 6
Patch Set 2 : Changes in response to reviewer comments
Messages
Total messages: 73 (65 generated)
The CQ bit was checked by peah@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: presubmit on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/presubmit/builds/16345)
The CQ bit was checked by peah@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: presubmit on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/presubmit/builds/16352)
Patchset #11 (id:190001) has been deleted
Patchset #12 (id:230001) has been deleted
Patchset #11 (id:210001) has been deleted
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Patchset #1 (id:40001) has been deleted
Patchset #1 (id:60001) has been deleted
Patchset #1 (id:20002) has been deleted
Patchset #1 (id:90001) has been deleted
Patchset #1 (id:110001) has been deleted
Patchset #1 (id:130001) has been deleted
Patchset #3 (id:250001) has been deleted
Patchset #1 (id:150001) has been deleted
Description was changed from ========== Neon optimizations Neon optimizations BUG= ========== to ========== Added ARM Neon optimizations for AEC3 This CL adds Neon SIMD optimizations for AEC3 on ARM, resulting in an 8 times complexity reduction. The optimizations are basically identical to what was already in place for SSE2. BUG=chromium:14993, webrtc:6018 ==========
Patchset #14 (id:510001) has been deleted
Patchset #12 (id:470001) has been deleted
Patchset #11 (id:450001) has been deleted
Patchset #10 (id:430001) has been deleted
Patchset #9 (id:410001) has been deleted
Patchset #8 (id:390001) has been deleted
Patchset #8 (id:490001) has been deleted
The CQ bit was checked by peah@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Patchset #1 (id:170001) has been deleted
Patchset #1 (id:270001) has been deleted
Patchset #1 (id:290001) has been deleted
Patchset #1 (id:310001) has been deleted
Patchset #1 (id:330001) has been deleted
Patchset #1 (id:350001) has been deleted
Patchset #1 (id:370001) has been deleted
Patchset #1 (id:530001) has been deleted
Patchset #1 (id:550001) has been deleted
Patchset #1 (id:570001) has been deleted
Patchset #1 (id:590001) has been deleted
Patchset #1 (id:610001) has been deleted
Patchset #1 (id:630001) has been deleted
Patchset #1 (id:650001) has been deleted
Patchset #1 (id:670001) has been deleted
Patchset #1 (id:690001) has been deleted
Patchset #1 (id:710001) has been deleted
Patchset #1 (id:730001) has been deleted
Patchset #1 (id:750001) has been deleted
Patchset #1 (id:770001) has been deleted
Patchset #1 (id:790001) has been deleted
Patchset #1 (id:810001) has been deleted
Patchset #1 (id:830001) has been deleted
Patchset #1 (id:850001) has been deleted
Patchset #1 (id:870001) has been deleted
Patchset #1 (id:890001) has been deleted
Patchset #1 (id:910001) has been deleted
Patchset #1 (id:930001) has been deleted
peah@webrtc.org changed reviewers: + ivoc@webrtc.org
Hi, Here is a CL with Neon optimizations for AEC3. Could you please take a look?
Nice! I like the multiply-add/multiply-subtract operations in NEON, too bad we can't use them in SSE (too recently added). See a few minor comments below.

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/aec3/adaptive_fir_filter.cc (right):

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/aec3/adaptive_fir_filter.cc:190: const float32x4_t X_re = vld1q_f32(&X->re[k]);
This seems cache unfriendly, since data is not read sequentially from memory (the loop over k is the outer loop). The iteration order in the regular non-NEON version looks more cache friendly to me, would it be possible to use the same iteration order here?

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/aec3/adaptive_fir_filter_unittest.cc (right):

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/aec3/adaptive_fir_filter_unittest.cc:46: // Verifies that the optimized methods for filter adaptation are bitexact to
Since the comparisons are made with EXPECT_NEAR, this shouldn't be called bitexact, right?

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/aec3/matched_filter_unittest.cc (right):

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/aec3/matched_filter_unittest.cc:47: // Verifies that the optimized methods for NEON are bitexact to their reference
Since this uses EXPECT_NEAR, it is also not bitexact, right?
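The loop-order point in the comment above can be illustrated with a minimal sketch. This is not the code from the CL: `Spectrum`, `kBins`, and `ApplyFilterSketch` are hypothetical names, and the operation shown (a complex multiply-accumulate over filter partitions) is only assumed to resemble what the filter does. The partition loop is the outer loop and the bin loop over k is the inner loop, so each partition's arrays are read sequentially. The NEON path uses `vmlaq_f32` (multiply-add) and `vmlsq_f32` (multiply-subtract) and is compiled only when NEON is available; otherwise the scalar loop handles every bin.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAS_NEON 1
#endif

// Hypothetical stand-in for a frequency-domain block; not a WebRTC type.
constexpr std::size_t kBins = 64;
static_assert(kBins % 4 == 0, "the 4-wide loop assumes a multiple of 4");

struct Spectrum {
  std::array<float, kBins> re{};
  std::array<float, kBins> im{};
};

// Accumulates S += X[p] * H[p] (complex product per bin) over all partitions.
// Outer loop: partitions. Inner loop: bins, read sequentially from memory.
void ApplyFilterSketch(const std::vector<Spectrum>& X,
                       const std::vector<Spectrum>& H,
                       Spectrum* S) {
  assert(S != nullptr);
  const std::size_t num_partitions = X.size() < H.size() ? X.size() : H.size();
  for (std::size_t p = 0; p < num_partitions; ++p) {
    std::size_t k = 0;
#if defined(SKETCH_HAS_NEON)
    for (; k + 4 <= kBins; k += 4) {
      const float32x4_t X_re = vld1q_f32(&X[p].re[k]);
      const float32x4_t X_im = vld1q_f32(&X[p].im[k]);
      const float32x4_t H_re = vld1q_f32(&H[p].re[k]);
      const float32x4_t H_im = vld1q_f32(&H[p].im[k]);
      float32x4_t S_re = vld1q_f32(&S->re[k]);
      float32x4_t S_im = vld1q_f32(&S->im[k]);
      S_re = vmlaq_f32(S_re, X_re, H_re);  // S_re += X_re * H_re
      S_re = vmlsq_f32(S_re, X_im, H_im);  // S_re -= X_im * H_im
      S_im = vmlaq_f32(S_im, X_re, H_im);  // S_im += X_re * H_im
      S_im = vmlaq_f32(S_im, X_im, H_re);  // S_im += X_im * H_re
      vst1q_f32(&S->re[k], S_re);
      vst1q_f32(&S->im[k], S_im);
    }
#endif
    // Scalar reference path; also handles any bins the vector loop left over.
    for (; k < kBins; ++k) {
      S->re[k] += X[p].re[k] * H[p].re[k] - X[p].im[k] * H[p].im[k];
      S->im[k] += X[p].re[k] * H[p].im[k] + X[p].im[k] * H[p].re[k];
    }
  }
}
```

Swapping the two loops would touch one 4-bin slice of every partition before moving on to the next slice, which is the strided access pattern the comment describes as cache unfriendly.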
Thanks for the comments! I have uploaded a new patch with changes according to the comments. PTAL

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/aec3/adaptive_fir_filter.cc (right):

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/aec3/adaptive_fir_filter.cc:190: const float32x4_t X_re = vld1q_f32(&X->re[k]);
On 2017/05/03 11:40:07, ivoc wrote:
> This seems cache unfriendly, since data is not read sequentially from memory
> (the loop over k is the outer loop). The iteration order in the regular non-NEON
> version looks more cache friendly to me, would it be possible to use the same
> iteration order here?

I agree. The idea was to be able to keep G_re and G_im in memory. But I think I'm now convinced that it would make sense to do it the other way.

Done.

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/aec3/adaptive_fir_filter_unittest.cc (right):

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/aec3/adaptive_fir_filter_unittest.cc:46: // Verifies that the optimized methods for filter adaptation are bitexact to
On 2017/05/03 11:40:07, ivoc wrote:
> Since the comparisons are made with EXPECT_NEAR, this shouldn't be called
> bitexact, right?

Good point!

Done.

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/aec3/matched_filter_unittest.cc (right):

https://codereview.webrtc.org/2834073005/diff/950001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/aec3/matched_filter_unittest.cc:47: // Verifies that the optimized methods for NEON are bitexact to their reference
On 2017/05/03 11:40:07, ivoc wrote:
> Since this uses EXPECT_NEAR, it is also not bitexact, right?

Good point!

Done.
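The bitexact versus EXPECT_NEAR distinction discussed above can be sketched in plain C++ without gtest. All names here (`AllNear`, `AllBitExact`, `SumSequential`, `SumFourLanes`) are hypothetical helpers written for illustration and do not come from the CL or its unit tests. The assumption being illustrated is that a SIMD implementation may add the same values in a different order than the scalar reference, and since floating-point addition is not associative the two results can differ slightly, so a tolerance check is the appropriate comparison.

```cpp
#include <cmath>
#include <cstddef>
#include <cstring>
#include <vector>

// True if every pair of elements differs by at most tol. This mirrors what a
// per-element EXPECT_NEAR check would accept.
bool AllNear(const std::vector<float>& a, const std::vector<float>& b,
             float tol) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (!(std::fabs(a[i] - b[i]) <= tol)) return false;
  }
  return true;
}

// True only if both vectors hold identical bit patterns.
bool AllBitExact(const std::vector<float>& a, const std::vector<float>& b) {
  if (a.size() != b.size()) return false;
  return a.empty() ||
         std::memcmp(a.data(), b.data(), a.size() * sizeof(float)) == 0;
}

// Adds the elements one after another, as a scalar reference loop would.
float SumSequential(const std::vector<float>& v) {
  float sum = 0.f;
  for (float x : v) sum += x;
  return sum;
}

// Adds the elements into four interleaved partial sums and then combines
// them, mimicking the order of operations of a 4-wide SIMD reduction.
float SumFourLanes(const std::vector<float>& v) {
  float lane[4] = {0.f, 0.f, 0.f, 0.f};
  for (std::size_t i = 0; i < v.size(); ++i) lane[i % 4] += v[i];
  return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

A test comparing `SumSequential` against `SumFourLanes` would use `AllNear` with a tolerance chosen for the data; `AllBitExact` is the stricter check that the word "bitexact" in a test comment would imply.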
LGTM!
Thanks for the quick review!!!
The CQ bit was checked by peah@webrtc.org
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
CQ is committing da patch. Bot data: {"patchset_id": 970001, "attempt_start_ts": 1493814472105320, "parent_rev": "fafd6d850d0710e7279901dda94b67cba5996aba", "commit_rev": "f246b91eba0e8d95bd3fee4634887fb6d3017811"}
Message was sent while issue was closed.
Description was changed from ========== Added ARM Neon optimizations for AEC3 This CL adds Neon SIMD optimizations for AEC3 on ARM, resulting in an 8 times complexity reduction. The optimizations are basically identical to what was already in place for SSE2. BUG=chromium:14993, webrtc:6018 ========== to ========== Added ARM Neon optimizations for AEC3 This CL adds Neon SIMD optimizations for AEC3 on ARM, resulting in an 8 times complexity reduction. The optimizations are basically identical to what was already in place for SSE2. BUG=chromium:14993, webrtc:6018 Review-Url: https://codereview.webrtc.org/2834073005 Cr-Commit-Position: refs/heads/master@{#17993} Committed: https://chromium.googlesource.com/external/webrtc/+/f246b91eba0e8d95bd3fee463... ==========
Message was sent while issue was closed.
Committed patchset #2 (id:970001) as https://chromium.googlesource.com/external/webrtc/+/f246b91eba0e8d95bd3fee463...
Message was sent while issue was closed.
A revert of this CL (patchset #2 id:970001) has been created in https://codereview.webrtc.org/2856113003/ by peah@webrtc.org. The reason for reverting is: The bug number for the chromium bug was wrong.