|
|
Created:
5 years, 2 months ago by aluebs-webrtc Modified:
5 years, 2 months ago Reviewers:
Andrew MacDonald CC:
webrtc-reviews_webrtc.org, Andrew MacDonald, tterriberry_mozilla.com, hlundin-webrtc, kwiberg-webrtc, the sun, bjornv1 Base URL:
https://chromium.googlesource.com/external/webrtc.git@master Target Ref:
refs/pending/heads/master Project:
webrtc Visibility:
Public. |
DescriptionImplement new version of the NonlinearBeamformer
Sounds better according to a MUSHRA listening test.
The computational complexity is unaffected.
An empirically estimated gain was added to compensate for the attenuation introduced by the algorithm.
There are some TODOs, which I will address in follow up CLs.
It was tested in Hangouts without headphones and highest volume, to make sure it doesn't affect the AEC.
Committed: https://crrev.com/45daf7b26f49793c30e395f7ba7be30aa51936bb
Cr-Commit-Position: refs/heads/master@{#10308}
Patch Set 1 #
Total comments: 23
Patch Set 2 : Use std::norm instead of twice std::abs #Patch Set 3 : Widen beam and compensate attenuation #
Total comments: 6
Patch Set 4 : Calculate norm where it is used #Patch Set 5 : Fix float constant #
Dependent Patchsets: Messages
Total messages: 25 (10 generated)
aluebs@webrtc.org changed reviewers: + ajm@chromium.org
On 2015/10/02 23:27:00, aluebs-webrtc wrote: Drive by review: Exciting! Some questions: -It would be motivated to release this in an A/B manner in order to verify the results of the Mushra test on the larger, non-expert, user-base, have you considered that? -What kind of regression tests have been done in order to ensure that the new version does not cause any new problems for the AEC?
On 2015/10/03 07:08:41, peah-webrtc wrote: > On 2015/10/02 23:27:00, aluebs-webrtc wrote: > > Drive by review: > Exciting! Some questions: > -It would be motivated to release this in an A/B manner in order to verify the > results of the Mushra test on the larger, non-expert, user-base, have you > considered that? > -What kind of regression tests have been done in order to ensure that the new > version does not cause any new problems for the AEC? Thank you Per for your input. As discussed today in the meeting I don't think that having both beamformers for an A/B test is a good idea, since that would take a significant effort plus a potentially inconsistent experience for the user. Anyway, they are quite similar and one is just a second version of the same idea of the other. On the other hand, I have not yet tested how this new beamformer affects AEC, but you are right, I definitively will before landing this change.
aluebs@webrtc.org changed reviewers: + andrew@webrtc.org - ajm@chromium.org
https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:28: float norm(const ComplexMatrix<float>& x) { Norm Add a comment clarifying what it computes. This is only used at create-time, right? I ask because std::abs can be slow on complex values, see cr/101731720. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:30: size_t length = x.num_columns(); const https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:34: result += std::abs(elems[i]) * std::abs(elems[i]); Don't compute std::abs twice. Looks like what you want is std::norm. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc:168: EXPECT_NEAR(actual_els[0][0].real(), 0.5f, kTolerance); Looks like these are scaled by 1/2 and the below by 1/3. Is that expected? Is it due to the number of mics? https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (left): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:417: if (denominator > mask_threshold) { Why don't we need the mask thresholds any longer? https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:61: // To handle the scenario mismatch. Can you expand this comment? Not sure what this means. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:67: const float kMaskTargetThreshold = 0.01f; We should probably have a way to tune this automatically. We could hand annotate a few files with ground truth target and interference, and then search for the optimal mask threshold. This will be a problem whenever we make significant changes here, and I fear forgetting to update it. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:239: rpsiws_[i].push_back(Norm(*interf_cov_mats_[i][j], delay_sum_masks_[i])); Is this Norm different from the one you've added in this CL? Can they be consolidated? https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:252: interf_angles_radians_.push_back(kTargetAngleRadians + kAway); This is all known at compile time, but I suppose your thought is that it won't be once the target angle is settable, right? https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:308: // Average matrices. Perhaps say "Weighted average of matrices." https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:74: // Calculates postfilter masks that minimize the mean-square error of our nit: mean squared error
Patchset #2 (id:20001) has been deleted
https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:28: float norm(const ComplexMatrix<float>& x) { On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Norm > > Add a comment clarifying what it computes. > > This is only used at create-time, right? I ask because std::abs can be slow on > complex values, see cr/101731720. Changed naming and added documentation. Yes, it only is used at create-time, but will be used every time we steer the beam on the future. But we want to minimize this anyway, and it only is called 2 times (for each scenario) and then one time for each mic. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:30: size_t length = x.num_columns(); On 2015/10/06 23:54:31, Andrew MacDonald wrote: > const Done. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:34: result += std::abs(elems[i]) * std::abs(elems[i]); On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Don't compute std::abs twice. Looks like what you want is std::norm. Good point. Done. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc:168: EXPECT_NEAR(actual_els[0][0].real(), 0.5f, kTolerance); On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Looks like these are scaled by 1/2 and the below by 1/3. Is that expected? Is it > due to the number of mics? Exactly. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (left): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:417: if (denominator > mask_threshold) { On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Why don't we need the mask thresholds any longer? I am not sure what you mean. The expression of the postfilter is completely different and so are the heuristics around it. If your fear is to have a small denominator, the minimum is going to be (1 - kCutOffConstant) == 0.0001. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:61: // To handle the scenario mismatch. On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Can you expand this comment? Not sure what this means. Done. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:67: const float kMaskTargetThreshold = 0.01f; On 2015/10/06 23:54:31, Andrew MacDonald wrote: > We should probably have a way to tune this automatically. We could hand annotate > a few files with ground truth target and interference, and then search for the > optimal mask threshold. This will be a problem whenever we make significant > changes here, and I fear forgetting to update it. I added a comment that as to be updated every time the postfilter calculation is changed significantly. And also added a TODO to write a tool to tune the target threshold automatically based on files annotated with target and interference ground truth. But leaving that for another CL. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:239: rpsiws_[i].push_back(Norm(*interf_cov_mats_[i][j], delay_sum_masks_[i])); On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Is this Norm different from the one you've added in this CL? Can they be > consolidated? Yes, it is the Norm defined in the paper and does conjugate(|norm_mat|) * |mat| * transpose(|norm_mat|). https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:252: interf_angles_radians_.push_back(kTargetAngleRadians + kAway); On 2015/10/06 23:54:31, Andrew MacDonald wrote: > This is all known at compile time, but I suppose your thought is that it won't > be once the target angle is settable, right? Exactly. In a followup CL I will make the target scenario settable, and then this will depend on that. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:308: // Average matrices. On 2015/10/06 23:54:31, Andrew MacDonald wrote: > Perhaps say "Weighted average of matrices." Done. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:74: // Calculates postfilter masks that minimize the mean-square error of our On 2015/10/06 23:54:32, Andrew MacDonald wrote: > nit: mean squared error Done.
I widened the beam and compensated the gain for the attenuation introduced to a low-noise and low-reverberation signal from broadside. PTAL.
lgtm % minor changes Can you update the CL description with the beam widening and gain compensation changes? Also note the AEC testing we did to address Per's question. https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:67: const float kMaskTargetThreshold = 0.01f; On 2015/10/07 22:08:05, aluebs-webrtc wrote: > On 2015/10/06 23:54:31, Andrew MacDonald wrote: > > We should probably have a way to tune this automatically. We could hand > annotate > > a few files with ground truth target and interference, and then search for the > > optimal mask threshold. This will be a problem whenever we make significant > > changes here, and I fear forgetting to update it. > > I added a comment that as to be updated every time the postfilter calculation is > changed significantly. And also added a TODO to write a tool to tune the target > threshold automatically based on files annotated with target and interference > ground truth. But leaving that for another CL. Yep, sg. https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc (right): https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:83: complex<float> norm_factor = Norm(interf_cov_vector); This is only used on the below line. Compute directly there. https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:80: // recording from broadside, since if both channels are exactly the same no Perhaps drop the "since if both channels..." part, or elaborate further. I understand what you mean of course, but I don't think a new reader would. https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:258: const float kAway = 0.5; Does this reduce interferer suppression as well, or just have the intended effect of widening the beam? Also, this is in radians, right? Can you add the unit to the name?
I didn't add the widening to the description, since this CL maintains the same width as the original nlbf0. It only is wider than the original nlbf3 Matlab version. https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc (right): https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:83: complex<float> norm_factor = Norm(interf_cov_vector); On 2015/10/13 21:55:16, Andrew MacDonald wrote: > This is only used on the below line. Compute directly there. Done. https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:80: // recording from broadside, since if both channels are exactly the same no On 2015/10/13 21:55:16, Andrew MacDonald wrote: > Perhaps drop the "since if both channels..." part, or elaborate further. I > understand what you mean of course, but I don't think a new reader would. I dropped it. https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:258: const float kAway = 0.5; On 2015/10/13 21:55:16, Andrew MacDonald wrote: > Does this reduce interferer suppression as well, or just have the intended > effect of widening the beam? > > Also, this is in radians, right? Can you add the unit to the name? The suppression gradual moving away from the center of the beam. So widening lets more interference through, but I think it is a nice balance. And yes, it is in radians. Added units.
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from andrew@webrtc.org Link to the patchset: https://codereview.webrtc.org/1378973003/#ps80001 (title: "Calculate norm where it is used")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1378973003/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1378973003/80001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: win_x64_gn_dbg on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/win_x64_gn_dbg/builds/4581)
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from andrew@webrtc.org Link to the patchset: https://codereview.webrtc.org/1378973003/#ps100001 (title: "Fix float constant")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1378973003/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1378973003/100001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: android_arm64_rel on tryserver.webrtc (JOB_TIMED_OUT, no build URL) android_dbg on tryserver.webrtc (JOB_TIMED_OUT, no build URL) android_rel on tryserver.webrtc (JOB_TIMED_OUT, no build URL)
The CQ bit was checked by aluebs@webrtc.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1378973003/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1378973003/100001
Message was sent while issue was closed.
Committed patchset #5 (id:100001)
Message was sent while issue was closed.
Patchset 5 (id:??) landed as https://crrev.com/45daf7b26f49793c30e395f7ba7be30aa51936bb Cr-Commit-Position: refs/heads/master@{#10308} |