|
|
Created:
3 years, 8 months ago by peah-webrtc Modified:
3 years, 8 months ago CC:
webrtc-reviews_webrtc.org, AleBzk, peah-webrtc, Andrew MacDonald, aleloi, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, minyue-webrtc, the sun, aluebs-webrtc, bjornv1 Target Ref:
refs/heads/master Project:
webrtc Visibility:
Public. |
DescriptionFinalized the SSE2 optimizations for the matched filter in AEC3
The SSE2 optimizations of the filter core in the matched
filter was only half-done. This CL finalizes those.
In particular:
-It adds finalization of updating of the filter.
-It removes the manual loop unrolling in order to reduce and
simplify the code.
Note that the changes pass the bitexactness tests in an
external AEC3 test suite, and the test
MatchedFilter.TestOptimizations succeed.
BUG=webrtc:6018
Review-Url: https://codereview.webrtc.org/2813563003
Cr-Commit-Position: refs/heads/master@{#17655}
Committed: https://chromium.googlesource.com/external/webrtc/+/b213a16b2801a0ebdf0e5497d96b1feaff9fe9e6
Patch Set 1 #
Total comments: 4
Patch Set 2 : Changes in response to reviewer comments #
Total comments: 18
Patch Set 3 : Changes in response to reviewer comments #
Messages
Total messages: 16 (9 generated)
Patchset #1 (id:1) has been deleted
Description was changed from ========== Finalized the SSE2 optimizations for the matched filter in AEC3 BUG= ========== to ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. BUG=webrtc:6018 ==========
Description was changed from ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. BUG=webrtc:6018 ========== to ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. Note that the changes pass the bitexactness tests in an external AEC3 test suite, and the test MatchedFilter.TestOptimizations pass. BUG=webrtc:6018 ==========
Description was changed from ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. Note that the changes pass the bitexactness tests in an external AEC3 test suite, and the test MatchedFilter.TestOptimizations pass. BUG=webrtc:6018 ========== to ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. Note that the changes pass the bitexactness tests in an external AEC3 test suite, and the test MatchedFilter.TestOptimizations succeed. BUG=webrtc:6018 ==========
peah@webrtc.org changed reviewers: + aleloi@webrtc.org, ivoc@webrtc.org
Hi, This is a CL that finalizes the SSE2 optimization of the filter core in the matched filter in AEC3. PTAL.
LGTM with 2 remarks. https://codereview.webrtc.org/2813563003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/aec3/matched_filter.cc (right): https://codereview.webrtc.org/2813563003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:44: const float* x_p = &x[x_start_index]; A DCHECK to verify that x_start_index < x.size() would be good. https://codereview.webrtc.org/2813563003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:50: const int chunk1 = I think the code in this function is good, but some more comments would be nice (since SSE code is a bit hard to read).
Thanks for the comments! I've uploaded a new patch with the DCHECK and more detailed documented SSE2 code. https://codereview.webrtc.org/2813563003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/aec3/matched_filter.cc (right): https://codereview.webrtc.org/2813563003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:44: const float* x_p = &x[x_start_index]; On 2017/04/10 13:29:06, ivoc wrote: > A DCHECK to verify that x_start_index < x.size() would be good. Good point! Done. https://codereview.webrtc.org/2813563003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:50: const int chunk1 = On 2017/04/10 13:29:06, ivoc wrote: > I think the code in this function is good, but some more comments would be nice > (since SSE code is a bit hard to read). Done.
LGTM, except for minor comments. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/aec3/matched_filter.cc (left): https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:115: // Apply the matched filter as filter * x. and compute x * x. Typo: replace period + extra space with comma. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/aec3/matched_filter.cc (right): https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:40: RTC_DCHECK_EQ(0, h.size() % 4); The DCHECK should be moved outside the loop, and h.size() be replaced with h_size. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:42: RTC_DCHECK_GT(x.size(), x_start_index); x.size() --> x_size https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:60: for (int n = 0; n < 2; ++n) { for (int limit : {chunk1, chunk2}) removes the need to update limit and introduce variable 'n'. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:64: // Load the data into 128 bit vectors.. .. --> . https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:75: for (int j = limit - limit_by_4 * 4; j > 0; --j, ++h_p, ++x_p) { The loop variable is 'j', but the samples are called x_k. Suggest rename j --> k or change x_k to x_j. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:107: for (int n = 0; n < 2; ++n) { Same as above, for (auto limit: {chunk1, chunk2}) https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:111: // Load the data into 128 bit vectors.. .. --> . https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:124: for (int j = limit - limit_by_4 * 4; j > 0; --j, ++h_p, ++x_p) { If j is renamed -->k in the similar loop above, this loop should have consistent naming.
Thanks for the comments! I've done changes accordingly and uploaded a new patch. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/aec3/matched_filter.cc (left): https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:115: // Apply the matched filter as filter * x. and compute x * x. On 2017/04/11 10:57:36, aleloi wrote: > Typo: replace period + extra space with comma. Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/aec3/matched_filter.cc (right): https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:40: RTC_DCHECK_EQ(0, h.size() % 4); On 2017/04/11 10:57:36, aleloi wrote: > The DCHECK should be moved outside the loop, and h.size() be replaced with > h_size. Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:42: RTC_DCHECK_GT(x.size(), x_start_index); On 2017/04/11 10:57:36, aleloi wrote: > x.size() --> x_size Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:60: for (int n = 0; n < 2; ++n) { On 2017/04/11 10:57:36, aleloi wrote: > for (int limit : {chunk1, chunk2}) removes the need to update limit and > introduce variable 'n'. Supernice! Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:64: // Load the data into 128 bit vectors.. On 2017/04/11 10:57:36, aleloi wrote: > .. --> . Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:75: for (int j = limit - limit_by_4 * 4; j > 0; --j, ++h_p, ++x_p) { On 2017/04/11 10:57:36, aleloi wrote: > The loop variable is 'j', but the samples are called x_k. Suggest rename j --> k > or change x_k to x_j. Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:107: for (int n = 0; n < 2; ++n) { On 2017/04/11 10:57:36, aleloi wrote: > Same as above, for (auto limit: {chunk1, chunk2}) Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:111: // Load the data into 128 bit vectors.. On 2017/04/11 10:57:36, aleloi wrote: > .. --> . Done. https://codereview.webrtc.org/2813563003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/aec3/matched_filter.cc:124: for (int j = limit - limit_by_4 * 4; j > 0; --j, ++h_p, ++x_p) { On 2017/04/11 10:57:36, aleloi wrote: > If j is renamed -->k in the similar loop above, this loop should have consistent > naming. Done.
The CQ bit was checked by peah@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from ivoc@webrtc.org, aleloi@webrtc.org Link to the patchset: https://codereview.webrtc.org/2813563003/#ps60001 (title: "Changes in response to reviewer comments")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
CQ is committing da patch. Bot data: {"patchset_id": 60001, "attempt_start_ts": 1491917159193500, "parent_rev": "c0d74d9684189a7a225d080198b7b93ed550b587", "commit_rev": "b213a16b2801a0ebdf0e5497d96b1feaff9fe9e6"}
Message was sent while issue was closed.
Description was changed from ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. Note that the changes pass the bitexactness tests in an external AEC3 test suite, and the test MatchedFilter.TestOptimizations succeed. BUG=webrtc:6018 ========== to ========== Finalized the SSE2 optimizations for the matched filter in AEC3 The SSE2 optimizations of the filter core in the matched filter was only half-done. This CL finalizes those. In particular: -It adds finalization of updating of the filter. -It removes the manual loop unrolling in order to reduce and simplify the code. Note that the changes pass the bitexactness tests in an external AEC3 test suite, and the test MatchedFilter.TestOptimizations succeed. BUG=webrtc:6018 Review-Url: https://codereview.webrtc.org/2813563003 Cr-Commit-Position: refs/heads/master@{#17655} Committed: https://chromium.googlesource.com/external/webrtc/+/b213a16b2801a0ebdf0e5497d... ==========
Message was sent while issue was closed.
Committed patchset #3 (id:60001) as https://chromium.googlesource.com/external/webrtc/+/b213a16b2801a0ebdf0e5497d... |