Issue 1378973003: Implement new version of the NonlinearBeamformer (Closed)

Created:
5 years, 2 months ago by aluebs-webrtc

Modified:
5 years, 2 months ago

Reviewers:
Andrew MacDonald

CC:
webrtc-reviews_webrtc.org, Andrew MacDonald, tterriberry_mozilla.com, hlundin-webrtc, kwiberg-webrtc, the sun, bjornv1

Base URL:
https://chromium.googlesource.com/external/webrtc.git@master

Target Ref:
refs/pending/heads/master

Project:
webrtc

Visibility:
Public.

Description

Implement new version of the NonlinearBeamformer

Sounds better according to a MUSHRA listening test. The computational complexity is unaffected. An empirically estimated gain was added to compensate for the attenuation introduced by the algorithm. There are some TODOs, which I will address in follow-up CLs. It was tested in Hangouts without headphones and at the highest volume, to make sure it doesn't affect the AEC.

Committed: https://crrev.com/45daf7b26f49793c30e395f7ba7be30aa51936bb
Cr-Commit-Position: refs/heads/master@{#10308}

Patch Set 1 #

Total comments: 23

Patch Set 2 : Use std::norm instead of twice std::abs #

Patch Set 3 : Widen beam and compensate attenuation #

Total comments: 6

Patch Set 4 : Calculate norm where it is used #

Patch Set 5 : Fix float constant #

Created: 5 years, 2 months ago

Stats: +122 lines, -106 lines
M	webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc	3 chunks	+14 lines, -2 lines	0 comments
M	webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc	2 chunks	+21 lines, -21 lines	0 comments
M	webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h	5 chunks	+14 lines, -17 lines	0 comments
M	webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc	8 chunks	+73 lines, -66 lines	0 comments

Dependent Patchsets:

Issue 1388033002 Patch 80001

Issue 1388033002 Patch 100001

Issue 1388033002 Patch 120001

Issue 1388033002 Patch 140001

Issue 1388033002 Patch 160001

Messages

Total messages: 25 (10 generated)

peah-webrtc

On 2015/10/02 23:27:00, aluebs-webrtc wrote: Drive by review: Exciting! Some questions: -It would be motivated ...

5 years, 2 months ago (2015-10-03 07:08:41 UTC) #3

aluebs-webrtc

On 2015/10/03 07:08:41, peah-webrtc wrote: > On 2015/10/02 23:27:00, aluebs-webrtc wrote: > > Drive by ...

5 years, 2 months ago (2015-10-05 21:31:56 UTC) #4

Andrew MacDonald

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc (right): https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc#newcode28 webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:28: float norm(const ComplexMatrix<float>& x) { Norm Add a comment ...

5 years, 2 months ago (2015-10-06 23:54:32 UTC) #6

aluebs-webrtc

5 years, 2 months ago (2015-10-07 22:08:05 UTC) #8

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc
(right):

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:28:
float norm(const ComplexMatrix<float>& x) {
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Norm
> 
> Add a comment clarifying what it computes.
> 
> This is only used at create-time, right? I ask because std::abs can be slow on
> complex values, see cr/101731720.

Changed naming and added documentation.
Yes, it is only used at create-time, but it will be used every time we steer the
beam in the future. We want to minimize that anyway, and it is only called 2
times (for each scenario) and then once for each mic.

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:30:
size_t length = x.num_columns();
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> const

Done.

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:34:
result += std::abs(elems[i]) * std::abs(elems[i]);
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Don't compute std::abs twice. Looks like what you want is std::norm.

Good point. Done.
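
A minimal sketch of the std::norm suggestion, assuming a plain std::vector of complex values rather than the ComplexMatrix type used in the CL. The function name is hypothetical and is not the one adopted in the patch:

```cpp
#include <complex>
#include <vector>

// Hypothetical helper (illustrative name, not taken from the CL):
// returns the sum of |x_i|^2 over all elements.
float SumSquaredMagnitudes(const std::vector<std::complex<float>>& elems) {
  float result = 0.f;
  for (const std::complex<float>& e : elems) {
    // std::norm(e) is the squared magnitude of e, so a single call replaces
    // the product std::abs(e) * std::abs(e).
    result += std::norm(e);
  }
  return result;
}
```

For the elements 3+4i and i this gives 25 + 1 = 26.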

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
File
webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc
(right):

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/covariance_matrix_generator_unittest.cc:168:
EXPECT_NEAR(actual_els[0][0].real(), 0.5f, kTolerance);
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Looks like these are scaled by 1/2 and the below by 1/3. Is that expected? Is it
> due to the number of mics?

Exactly.
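
One way a 1/N scaling can arise, sketched under the assumption that the covariance matrix is the outer product of a vector of unit-magnitude phase terms scaled to unit Euclidean norm. The names and types below are hypothetical and are not the CL's code:

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using ComplexF = std::complex<float>;
using Matrix = std::vector<std::vector<ComplexF>>;

// Hypothetical sketch: builds v * v^H for a vector v of unit-magnitude phase
// terms after scaling v to unit Euclidean norm.
Matrix NormalizedOuterProduct(const std::vector<float>& phases_radians) {
  const std::size_t num_mics = phases_radians.size();
  assert(num_mics > 0);
  std::vector<ComplexF> v(num_mics);
  for (std::size_t i = 0; i < num_mics; ++i) {
    v[i] = std::polar(1.f, phases_radians[i]);  // |v[i]| == 1
  }
  // Every element has magnitude 1, so the Euclidean norm is sqrt(num_mics).
  const float scale = 1.f / std::sqrt(static_cast<float>(num_mics));
  Matrix out(num_mics, std::vector<ComplexF>(num_mics));
  for (std::size_t i = 0; i < num_mics; ++i) {
    for (std::size_t j = 0; j < num_mics; ++j) {
      // Each entry has magnitude scale * scale == 1 / num_mics.
      out[i][j] = (scale * v[i]) * std::conj(scale * v[j]);
    }
  }
  return out;
}
```

Under that assumption every entry has magnitude 1/2 for two mics and 1/3 for three mics, which matches the scaling discussed above.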

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (left):

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:417: if
(denominator > mask_threshold) {
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Why don't we need the mask thresholds any longer?

I am not sure what you mean. The expression of the postfilter is completely
different and so are the heuristics around it. If your concern is a small
denominator, its minimum is going to be (1 - kCutOffConstant) == 0.0001.
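
A sketch of how such a lower bound can be enforced. The clamped form below is an assumption, since the actual postfilter expression is not shown in this thread. Only the value of kCutOffConstant follows from the reply above:

```cpp
#include <algorithm>
#include <cassert>

// Value implied by the reply above: 1 - kCutOffConstant == 0.0001.
const float kCutOffConstant = 0.9999f;

// Hypothetical clamped denominator: whatever the (non-negative) ratio is, the
// result never drops below 1 - kCutOffConstant, so dividing by it is safe.
float ClampedDenominator(float ratio) {
  assert(ratio >= 0.f);
  return 1.f - std::min(kCutOffConstant, ratio);
}
```

With this form, a ratio of 0.25 gives 0.75, and any ratio at or above kCutOffConstant gives roughly 0.0001.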

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:61: // To
handle the scenario mismatch.
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Can you expand this comment? Not sure what this means.

Done.

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:67: const
float kMaskTargetThreshold = 0.01f;
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> We should probably have a way to tune this automatically. We could hand annotate
> a few files with ground truth target and interference, and then search for the
> optimal mask threshold. This will be a problem whenever we make significant
> changes here, and I fear forgetting to update it.

I added a comment that it has to be updated every time the postfilter calculation is
changed significantly. I also added a TODO to write a tool to tune the target
threshold automatically based on files annotated with target and interference
ground truth, but I am leaving that for another CL.
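
A rough sketch of what such a tuning tool could do, assuming each annotated frame carries one mask statistic and a hand-labelled flag, and that a frame is classified as target when the statistic exceeds the threshold. The struct, the function names and the decision rule are all hypothetical. The tool itself is left for another CL:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical annotated frame: a mask statistic plus the ground truth label.
struct AnnotatedFrame {
  float mask_value;
  bool target_present;
};

// Counts the frames where "mask_value > threshold" agrees with the annotation.
std::size_t CountCorrect(const std::vector<AnnotatedFrame>& frames,
                         float threshold) {
  std::size_t correct = 0;
  for (const AnnotatedFrame& frame : frames) {
    const bool predicted_target = frame.mask_value > threshold;
    if (predicted_target == frame.target_present) ++correct;
  }
  return correct;
}

// Grid search: returns the candidate threshold with the most agreements.
// The earliest candidate wins ties.
float BestThreshold(const std::vector<AnnotatedFrame>& frames,
                    const std::vector<float>& candidates) {
  assert(!candidates.empty());
  float best = candidates[0];
  std::size_t best_correct = CountCorrect(frames, best);
  for (std::size_t i = 1; i < candidates.size(); ++i) {
    const std::size_t correct = CountCorrect(frames, candidates[i]);
    if (correct > best_correct) {
      best = candidates[i];
      best_correct = correct;
    }
  }
  return best;
}
```

A real tool would probably weigh missed targets and leaked interference differently instead of counting plain agreements.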

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:239:
rpsiws_[i].push_back(Norm(*interf_cov_mats_[i][j], delay_sum_masks_[i]));
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Is this Norm different from the one you've added in this CL? Can they be
> consolidated?

Yes, it is different: it is the Norm defined in the paper and computes
conjugate(|norm_mat|) * |mat| * transpose(|norm_mat|).
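
The expression above can be sketched as a quadratic form, assuming norm_mat is a row vector w and mat is a square matrix M, so the result is conj(w) * M * transpose(w). The types and the function name are hypothetical. This is not the Norm implementation from the CL:

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using ComplexF = std::complex<float>;
using Matrix = std::vector<std::vector<ComplexF>>;

// Hypothetical sketch of conj(w) * M * transpose(w) for a row vector w.
ComplexF QuadraticForm(const Matrix& mat, const std::vector<ComplexF>& w) {
  const std::size_t n = w.size();
  assert(mat.size() == n);
  ComplexF result(0.f, 0.f);
  for (std::size_t col = 0; col < n; ++col) {
    // Element `col` of the row vector conj(w) * M.
    ComplexF first_product(0.f, 0.f);
    for (std::size_t row = 0; row < n; ++row) {
      assert(mat[row].size() == n);
      first_product += std::conj(w[row]) * mat[row][col];
    }
    // Multiply by transpose(w), i.e. accumulate against w[col].
    result += first_product * w[col];
  }
  return result;
}
```

With M equal to the identity, the result reduces to the sum of |w_i|^2, which shows how this differs from a plain sum of squares over the matrix elements.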

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:252:
interf_angles_radians_.push_back(kTargetAngleRadians + kAway);
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> This is all known at compile time, but I suppose your thought is that it won't
> be once the target angle is settable, right?

Exactly. In a followup CL I will make the target scenario settable, and then
this will depend on that.
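
Once the target angle is settable, the interferer angles could be derived at run time along these lines. Only the `target + away` term appears in the quoted line, so the symmetric `target - away` term, the function name and the parameter names are assumptions:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: places two interferers symmetrically around the target.
std::vector<float> InterferenceAngles(float target_angle_radians,
                                      float away_radians) {
  assert(away_radians > 0.f);
  return {target_angle_radians - away_radians,
          target_angle_radians + away_radians};
}
```

Carrying the unit in the parameter names keeps it clear that both values are in radians.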

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:308: //
Average matrices.
On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> Perhaps say "Weighted average of matrices."

Done.

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:74: //
Calculates postfilter masks that minimize the mean-square error of our
On 2015/10/06 23:54:32, Andrew MacDonald wrote:
> nit: mean squared error

Done.

aluebs-webrtc

I widened the beam and compensated the gain for the attenuation introduced to a low-noise ...

5 years, 2 months ago (2015-10-08 22:12:38 UTC) #9

Andrew MacDonald

5 years, 2 months ago (2015-10-13 21:55:16 UTC) #10

lgtm % minor changes

Can you update the CL description with the beam widening and gain compensation
changes? Also note the AEC testing we did to address Per's question.

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1378973003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:67: const
float kMaskTargetThreshold = 0.01f;
On 2015/10/07 22:08:05, aluebs-webrtc wrote:
> On 2015/10/06 23:54:31, Andrew MacDonald wrote:
> > We should probably have a way to tune this automatically. We could hand annotate
> > a few files with ground truth target and interference, and then search for the
> > optimal mask threshold. This will be a problem whenever we make significant
> > changes here, and I fear forgetting to update it.
> 
> I added a comment that it has to be updated every time the postfilter calculation is
> changed significantly. And also added a TODO to write a tool to tune the target
> threshold automatically based on files annotated with target and interference
> ground truth. But leaving that for another CL.

Yep, sg.

https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc
(right):

https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/covariance_matrix_generator.cc:83:
complex<float> norm_factor = Norm(interf_cov_vector);
This is only used on the below line. Compute directly there.

https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:80: //
recording from broadside, since if both channels are exactly the same no
Perhaps drop the "since if both channels..." part, or elaborate further. I
understand what you mean of course, but I don't think a new reader would.

https://codereview.webrtc.org/1378973003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:258: const
float kAway = 0.5;
Does this reduce interferer suppression as well, or just have the intended
effect of widening the beam?

Also, this is in radians, right? Can you add the unit to the name?

aluebs-webrtc

I didn't add the widening to the description, since this CL maintains the same width ...

5 years, 2 months ago (2015-10-16 16:41:14 UTC) #11