Created: 4 years, 7 months ago by aluebs-webrtc
Modified: 4 years, 2 months ago
CC: webrtc-reviews_webrtc.org, Andrew MacDonald, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, minyue-webrtc, the sun, bjornv1
Base URL: https://chromium.googlesource.com/external/webrtc.git@master
Target Ref: refs/pending/heads/master
Project: webrtc
Visibility: Public
Description:
Pull out the PostFilter to its own NonlinearBeamformer API
This is done to avoid having a nonlinear component in the AEC path.
Now the linear delay-and-sum is run before the AEC and the postfilter after it.
R=henrik.lundin@webrtc.org, peah@webrtc.org
Committed: https://crrev.com/b983112bc7686ed4276def4c9215c498444c6bdd
Cr-Commit-Position: refs/heads/master@{#13314}
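For orientation, here is a minimal sketch of the call pattern the description implies. The beamformer and capture-audio names are taken from the diff excerpts quoted in the messages below; the surrounding APM logic is assumed and not part of this CL.

    // Sketch (assumed context): linear pass before the AEC, postfilter after.
    // ProcessChunk() runs the linear step (initially the delay-and-sum; later
    // patch sets make it analysis-only).
    private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(),
                                                  ca->split_data_f());
    // Discard all channels but the leftmost one.
    ca->set_num_channels(1);
    // ... the AEC and other linear processing run here ...
    // PostFilter() applies the nonlinear mask after the AEC.
    private_submodules_->beamformer->PostFilter(ca->split_data_f());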
Patch Set 1 #
Total comments: 47
Patch Set 2 : Drop delay and sum #
Total comments: 20
Patch Set 3 : Add dummy out #
Total comments: 6
Patch Set 4 : Change Beamformer::ProcessChunk() to have one parameter #
Total comments: 20
Patch Set 5 : Drop Beamformer Interface #
Total comments: 20
Patch Set 6 : Rebasing #
Patch Set 7 : Add multichannel postfilter support #
Patch Set 8 : Rebasing #
Patch Set 9 : Remove ChannelBuffer fix #
Total comments: 6
Patch Set 10 : Rebasing #
Patch Set 11 : Fix comments #
Patch Set 12 : Make NonlinearBeamformer interface virtual #

Messages
Total messages: 43 (12 generated)
aluebs@webrtc.org changed reviewers: + peah@webrtc.org
Friendly ping. peah, could you PTAL.
On 2016/05/18 15:09:00, aluebs-webrtc wrote:
> Friendly ping. peah, could you PTAL.

peah, could you please take a look?
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
Why not change this to have one input/output parameter, as that is how it is used? That would mean less code and would also save one memcopy.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
Is there a reason why this is not implemented in the .cc file?

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:183: // static
Please correct this comment as well while you are at it.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:186: class PostFilterTransform : public LappedTransform::Callback {
This class is a way to be able to use the callback functionality of the LappedTransform for two callbacks in the same class, right?

It makes the code more complicated than it needs to be, I think. Why not skip the friend class approach, and instead create a new class that implements the Callback interface? Then you could call that to apply the postfilter effect in such a way that the kCompensationGain and final_mask_ variables are passed as input to that call. That would achieve a separation of concerns, data and functionality encapsulation, and would avoid the need for the friend class approach.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:192: // Process one frequency-domain block of audio. This is where the fun
Please describe this more thoroughly. I'm not sure the comment is needed, but if there is to be a comment it should be more descriptive, e.g., what kind of processing is done, how does the callback work, etc.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:248: float freq_hz = (static_cast<float>(i) / kFftSize) * sample_rate_hz_;
Please change to use a precomputed 1/kFftSize * sample_rate_hz_. This approach results in kNumFreqBins unnecessary divisions.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:249: wave_numbers_[i] = 2 * M_PI * freq_hz / kSpeedOfSoundMeterSeconds;
Please change to use a precomputed 2*pi/kSpeedOfSoundMeterSeconds. This approach results in kNumFreqBins unnecessary divisions.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:403: // Copy over only the first channel of each band.
Have you checked the impact on the signal when this is only done in the lowest band? In theory the perfect reconstruction property of the signal could be destroyed through this, causing aliasing distortions.
Furthermore, this could in theory cause a sharp increase in the signal power at the first band-split frequency.

Why don't you perform the linear beamforming in the time-domain instead, as you anyway do not pre-steer the signals before applying that step (right?)? That would have the benefits:
1) Be cheaper, as you don't need to do the IFFT to go back to the time-domain.
2) Have lower delay, as the filter-bank delay is avoided.
3) Have zero impact on the perfect-reconstruction property of the splitting filter.
4) Have no edge effect at the first band-split frequency.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:407: memcpy(output->channels(i)[0],
What happens if the output is dual channel? Since the ProcessChunk method locks the content of the output to be single-channel, it should either assert that the number of output channels already is one, or explicitly set the number of channels to 1 in the output.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:419: // Ramp up/down for smoothing. 1 mask per 10ms results in audible
I guess what you mean is that smoothing is needed in order to avoid discontinuities in the transitions between 10 ms frames, right? I.e., the problem is not 1 mask per 10 ms, but rather that the transition between two masks cannot be done instantaneously.

If the above holds, this comment is a bit misleading.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:59: // Applies the postfilter mask to one chunk of audio. The audio is expected to
I think that this description is longer than required. The description of the band splitting should not be needed here. It is rather a description of ChannelBuffer, and if it needs to be clarified further it would be better to do it in a unittest which illustrates the usage of the method. I think the requirements on the frame size and the number of channels are better illustrated by a DCHECK inside the code.
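The two precomputation comments above amount to hoisting the constant factors out of the initialization loop. A minimal sketch, assuming the member names from the quoted lines (the local constant names are hypothetical):

    // Both factors are loop-invariant, so compute them once instead of
    // performing kNumFreqBins divisions.
    const float hz_per_bin = sample_rate_hz_ / static_cast<float>(kFftSize);
    const float two_pi_over_c =
        2.f * static_cast<float>(M_PI) / kSpeedOfSoundMeterSeconds;
    for (size_t i = 0; i < kNumFreqBins; ++i) {
      const float freq_hz = i * hz_per_bin;
      wave_numbers_[i] = freq_hz * two_pi_over_c;  // k = 2*pi*f / c.
    }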
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Why not change this to have one input/output parameter, as that is how it is used? That would mean less code and would also save one memcopy.

To be consistent with the ProcessChunk interface and give the user the freedom to *not* overwrite the input. But I see your point, so I changed it.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Is there a reason why this is not implemented in the .cc file?

Beamformer is an interface; it has no .cc file.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:183: // static
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Please correct this comment as well while you are at it.

What is wrong with it?

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:186: class PostFilterTransform : public LappedTransform::Callback {
On 2016/05/22 21:06:48, peah-webrtc wrote:
> This class is a way to be able to use the callback functionality of the LappedTransform for two callbacks in the same class, right?
>
> It makes the code more complicated than it needs to be, I think. Why not skip the friend class approach, and instead create a new class that implements the Callback interface? Then you could call that to apply the postfilter effect in such a way that the kCompensationGain and final_mask_ variables are passed as input to that call. That would achieve a separation of concerns, data and functionality encapsulation, and would avoid the need for the friend class approach.

I don't see how this simplifies the code, but I agree that it avoids the friend class, which is not desired. Anyway, I implemented it.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:192: // Process one frequency-domain block of audio. This is where the fun
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Please describe this more thoroughly. I'm not sure the comment is needed, but if there is to be a comment it should be more descriptive, e.g., what kind of processing is done, how does the callback work, etc.

Removed comment. It was just to be consistent with the other callback.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:248: float freq_hz = (static_cast<float>(i) / kFftSize) * sample_rate_hz_;
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Please change to use a precomputed 1/kFftSize * sample_rate_hz_. This approach results in kNumFreqBins unnecessary divisions.

This adds unrelated changes to the CL, but if you think that is ok: Done.
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:249: wave_numbers_[i] = 2 * M_PI * freq_hz / kSpeedOfSoundMeterSeconds;
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Please change to use a precomputed 2*pi/kSpeedOfSoundMeterSeconds. This approach results in kNumFreqBins unnecessary divisions.

This adds unrelated changes to the CL, but if you think that is ok: Done.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:403: // Copy over only the first channel of each band.
On 2016/05/22 21:06:48, peah-webrtc wrote:
> Have you checked the impact on the signal when this is only done in the lowest band? In theory the perfect reconstruction property of the signal could be destroyed through this, causing aliasing distortions. Furthermore, this could in theory cause a sharp increase in the signal power at the first band-split frequency.
>
> Why don't you perform the linear beamforming in the time-domain instead, as you anyway do not pre-steer the signals before applying that step (right?)? That would have the benefits:
> 1) Be cheaper, as you don't need to do the IFFT to go back to the time-domain.
> 2) Have lower delay, as the filter-bank delay is avoided.
> 3) Have zero impact on the perfect-reconstruction property of the splitting filter.
> 4) Have no edge effect at the first band-split frequency.

This is no longer relevant, since we decided offline to not do any linear beamforming because it is negligible compared to the postfilter. But a few points just to be clear:
* The perfect reconstruction of the filter bank is lost with any processing done.
* The worst case 3dB increase of power in the higher bands is probably not perceivable. And I am not sure what the problem with a "sharp" edge here is.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:407: memcpy(output->channels(i)[0],
On 2016/05/22 21:06:48, peah-webrtc wrote:
> What happens if the output is dual channel? Since the ProcessChunk method locks the content of the output to be single-channel, it should either assert that the number of output channels already is one, or explicitly set the number of channels to 1 in the output.

This can't be done as is, since the input and output can be the same ChannelBuffer and the ChannelBuffer has a fixed number of channels.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:419: // Ramp up/down for smoothing. 1 mask per 10ms results in audible
On 2016/05/22 21:06:48, peah-webrtc wrote:
> I guess what you mean is that smoothing is needed in order to avoid discontinuities in the transitions between 10 ms frames, right? I.e., the problem is not 1 mask per 10 ms, but rather that the transition between two masks cannot be done instantaneously.
>
> If the above holds, this comment is a bit misleading.

This adds unrelated changes to the CL, but if you think that is ok. I find the comment easy to understand, having "1 [constant] mask per 10ms results in audible discontinuities [in the transitions]". But if you find it confusing, I am happy to change it. WDYT of the current one?

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:59: // Applies the postfilter mask to one chunk of audio. The audio is expected to
On 2016/05/22 21:06:48, peah-webrtc wrote:
> I think that this description is longer than required.
>
> The description of the band splitting should not be needed here. It is rather a description of ChannelBuffer, and if it needs to be clarified further it would be better to do it in a unittest which illustrates the usage of the method.
>
> I think the requirements on the frame size and the number of channels are better illustrated by a DCHECK inside the code.

I personally think adding more documentation than required is not a problem. It is also consistent (band splitting and requirements on input) with the ProcessChunk documentation. There is a DCHECK in place as well.
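To summarize where the interface discussion stands at this point, here is a hedged sketch of the Beamformer interface after these changes. The single in/out PostFilter parameter and the default implementations are confirmed by later excerpts in this thread; the exact surrounding declarations are assumptions.

    // Sketch, not the verbatim header.
    class Beamformer {
     public:
      virtual ~Beamformer() {}
      // May modify the signal (e.g. a linear beamformer would).
      virtual void ProcessChunk(const ChannelBuffer<float>& input,
                                ChannelBuffer<float>* output) = 0;
      // Default no-op, since not every beamformer has a postfilter.
      virtual void PostFilter(ChannelBuffer<float>* data) {}
      // Also ships with a default implementation, as noted below.
      virtual bool IsInBeam(const SphericalPointf& spherical_point) {
        return true;
      }
    };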
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > Is there a reason why this is not implemented in the .cc file?
>
> Beamformer is an interface; it has no .cc file.

My skills on this are limited, but to me, an interface is something that the user of the interface needs to implement. If it has a default implementation, it is not a pure interface. Is there a reason why this method cannot be declared pure virtual, so that this is a proper interface that forces the implementer of the interface to supply an implementation?

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:183: // static
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > Please correct this comment as well while you are at it.
>
> What is wrong with it?

It is not a proper sentence, and not terminated by a period. But maybe this is a standard way of commenting static definitions?

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:186: class PostFilterTransform : public LappedTransform::Callback {
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > This class is a way to be able to use the callback functionality of the LappedTransform for two callbacks in the same class, right?
> >
> > It makes the code more complicated than it needs to be, I think. Why not skip the friend class approach, and instead create a new class that implements the Callback interface? Then you could call that to apply the postfilter effect in such a way that the kCompensationGain and final_mask_ variables are passed as input to that call. That would achieve a separation of concerns, data and functionality encapsulation, and would avoid the need for the friend class approach.
>
> I don't see how this simplifies the code, but I agree that it avoids the friend class, which is not desired. Anyway, I implemented it.

The simplification is separation of concerns and data and functionality encapsulation, as well as avoiding the use of an interface.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:403: // Copy over only the first channel of each band.
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > Have you checked the impact on the signal when this is only done in the lowest band? In theory the perfect reconstruction property of the signal could be destroyed through this, causing aliasing distortions. Furthermore, this could in theory cause a sharp increase in the signal power at the first band-split frequency.
> >
> > Why don't you perform the linear beamforming in the time-domain instead, as you anyway do not pre-steer the signals before applying that step (right?)? That would have the benefits:
> > 1) Be cheaper, as you don't need to do the IFFT to go back to the time-domain.
> > 2) Have lower delay, as the filter-bank delay is avoided.
> > 3) Have zero impact on the perfect-reconstruction property of the splitting filter.
> > 4) Have no edge effect at the first band-split frequency.
>
> This is no longer relevant, since we decided offline to not do any linear beamforming because it is negligible compared to the postfilter. But a few points just to be clear:
> * The perfect reconstruction of the filter bank is lost with any processing done.
> * The worst case 3dB increase of power in the higher bands is probably not perceivable. And I am not sure what the problem with a "sharp" edge here is.

I think the perfect reconstruction may actually be preserved if you average two PR filterbanks in the time-domain (which the linear beamforming is identical to if no pre-steering is performed).

A 3 dB "edge" is perceivable, but I also doubt it will matter in general for most noise types (at least the ones with tilted spectral shapes). The sharp edge is dependent on the sharpness of the splitting filter, as the subsequent merge will merge two bands where the noise levels differ by 3 dB after the beamforming.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:59: // Applies the postfilter mask to one chunk of audio. The audio is expected to
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > I think that this description is longer than required.
> >
> > The description of the band splitting should not be needed here. It is rather a description of ChannelBuffer, and if it needs to be clarified further it would be better to do it in a unittest which illustrates the usage of the method.
> >
> > I think the requirements on the frame size and the number of channels are better illustrated by a DCHECK inside the code.
>
> I personally think adding more documentation than required is not a problem. It is also consistent (band splitting and requirements on input) with the ProcessChunk documentation. There is a DCHECK in place as well.

Great!

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(),
If the ProcessChunk is made input-only, you should be able to pass ca->split_bands_const_f here instead, right?

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:685: ca->set_num_channels(1);
Please add a comment that this operation discards all channels but the leftmost one.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/beamformer.h:36: virtual void PostFilter(ChannelBuffer<float>* data) {};
I'm not that familiar with interfaces, but why can't you make this method pure virtual, similar to the other interface methods?

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:383: ChannelBuffer<float>* output) {
Since the linear beamforming is no longer done, output could be removed, right? Alternatively, if you want to maintain this method for when the linear beamforming is to be used, I think it would make sense to add another method which is input-only.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:392: for (size_t i = 0; i < input.num_bands(); ++i) {
If the output is removed from ProcessChunk, these memcopies can be removed.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:484: memcpy(output[0], input[0], kNumFreqBins * sizeof(input[0][0]));
This memcopy is not necessary; why not instead discard the output parameter from this method? The memcopy done here is actually reversed in the calling method by another num_bands memcopies that copy the output back to the input, so discarding the output parameter would save (1+num_bands) memcopies. Furthermore, it would avoid the delay of the filterbank transform that is imposed by these two memcopies (and that delay is > 7ms, right)?
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
On 2016/05/26 08:48:52, peah-webrtc wrote:
> On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > Is there a reason why this is not implemented in the .cc file?
> >
> > Beamformer is an interface; it has no .cc file.
>
> My skills on this are limited, but to me, an interface is something that the user of the interface needs to implement. If it has a default implementation, it is not a pure interface. Is there a reason why this method cannot be declared pure virtual, so that this is a proper interface that forces the implementer of the interface to supply an implementation?

It already is not a pure interface, because IsInBeam() has a default implementation. This has a default empty implementation because not all Beamformers have postfilters.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:183: // static
On 2016/05/26 08:48:52, peah-webrtc wrote:
> On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > Please correct this comment as well while you are at it.
> >
> > What is wrong with it?
>
> It is not a proper sentence, and not terminated by a period. But maybe this is a standard way of commenting static definitions?

I think the static definition is clearer like this than in a sentence. Also, I haven't touched this in this CL, so it would add one more unrelated change. So I will leave it as is.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:186: class PostFilterTransform : public LappedTransform::Callback {
On 2016/05/26 08:48:52, peah-webrtc wrote:
> On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > This class is a way to be able to use the callback functionality of the LappedTransform for two callbacks in the same class, right?
> > >
> > > It makes the code more complicated than it needs to be, I think. Why not skip the friend class approach, and instead create a new class that implements the Callback interface? Then you could call that to apply the postfilter effect in such a way that the kCompensationGain and final_mask_ variables are passed as input to that call. That would achieve a separation of concerns, data and functionality encapsulation, and would avoid the need for the friend class approach.
> >
> > I don't see how this simplifies the code, but I agree that it avoids the friend class, which is not desired. Anyway, I implemented it.
>
> The simplification is separation of concerns and data and functionality encapsulation, as well as avoiding the use of an interface.

I think the separation/encapsulation is almost the same, but the call order is more complicated since you have to pass in the needed arguments.
But I see that having friend classes is not desirable, so I implemented it already.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:403: // Copy over only the first channel of each band.
On 2016/05/26 08:48:52, peah-webrtc wrote:
> On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > Have you checked the impact on the signal when this is only done in the lowest band? In theory the perfect reconstruction property of the signal could be destroyed through this, causing aliasing distortions. Furthermore, this could in theory cause a sharp increase in the signal power at the first band-split frequency.
> > >
> > > Why don't you perform the linear beamforming in the time-domain instead, as you anyway do not pre-steer the signals before applying that step (right?)? That would have the benefits:
> > > 1) Be cheaper, as you don't need to do the IFFT to go back to the time-domain.
> > > 2) Have lower delay, as the filter-bank delay is avoided.
> > > 3) Have zero impact on the perfect-reconstruction property of the splitting filter.
> > > 4) Have no edge effect at the first band-split frequency.
> >
> > This is no longer relevant, since we decided offline to not do any linear beamforming because it is negligible compared to the postfilter. But a few points just to be clear:
> > * The perfect reconstruction of the filter bank is lost with any processing done.
> > * The worst case 3dB increase of power in the higher bands is probably not perceivable. And I am not sure what the problem with a "sharp" edge here is.
>
> I think the perfect reconstruction may actually be preserved if you average two PR filterbanks in the time-domain (which the linear beamforming is identical to if no pre-steering is performed).
>
> A 3 dB "edge" is perceivable, but I also doubt it will matter in general for most noise types (at least the ones with tilted spectral shapes). The sharp edge is dependent on the sharpness of the splitting filter, as the subsequent merge will merge two bands where the noise levels differ by 3 dB after the beamforming.

I meant that other components are already non-linear, losing the perfect reconstruction. I don't think the 3dB (in the worst case) would be perceivable in the higher bands at all. In any case it would sound slightly highpass, but the sharpness is not something that would matter too much. Anyways, now I am only taking one channel, so this is no longer relevant.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(),
On 2016/05/26 08:48:53, peah-webrtc wrote:
> If the ProcessChunk is made input-only, you should be able to pass ca->split_bands_const_f here instead, right?

Because other Beamformers might do some processing (even all of it) in ProcessChunk(), I think we should leave it as is.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:685: ca->set_num_channels(1);
On 2016/05/26 08:48:53, peah-webrtc wrote:
> Please add a comment that this operation discards all channels but the leftmost one.

Done.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/beamformer.h:36: virtual void PostFilter(ChannelBuffer<float>* data) {};
On 2016/05/26 08:48:53, peah-webrtc wrote:
> I'm not that familiar with interfaces, but why can't you make this method pure virtual, similar to the other interface methods?

Addressed in the other comment.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:383: ChannelBuffer<float>* output) {
On 2016/05/26 08:48:53, peah-webrtc wrote:
> Since the linear beamforming is no longer done, output could be removed, right? Alternatively, if you want to maintain this method for when the linear beamforming is to be used, I think it would make sense to add another method which is input-only.

Adding yet another interface increases code complexity. I think we want to leave it as is: one processing method and a non-linear postfilter method.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:392: for (size_t i = 0; i < input.num_bands(); ++i) {
On 2016/05/26 08:48:53, peah-webrtc wrote:
> If the output is removed from ProcessChunk, these memcopies can be removed.

See above for why I think the output should not be removed.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:484: memcpy(output[0], input[0], kNumFreqBins * sizeof(input[0][0]));
On 2016/05/26 08:48:53, peah-webrtc wrote:
> This memcopy is not necessary; why not instead discard the output parameter from this method? The memcopy done here is actually reversed in the calling method by another num_bands memcopies that copy the output back to the input, so discarding the output parameter would save (1+num_bands) memcopies. Furthermore, it would avoid the delay of the filterbank transform that is imposed by these two memcopies (and that delay is > 7ms, right)?

See above for why I don't think we should drop the output parameter. It was necessary when input and output were the same ChannelBuffer (like in APM), because otherwise this would have put zeros in the lower band. But now I added a dummy buffer to avoid this memcpy.
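A minimal sketch of the resulting ProcessChunk with the dummy buffer described above. The dummy_out name matches the patch-set-3 excerpt quoted later in the thread; the member spelling and the exact copy sizes are assumptions.

    void NonlinearBeamformer::ProcessChunk(const ChannelBuffer<float>& input,
                                           ChannelBuffer<float>* output) {
      // Run the mask-estimation transform on the lowest band; the synthesis
      // result goes to a scratch buffer so the lower band of |output| is not
      // zeroed when input and output alias the same ChannelBuffer.
      process_transform_->ProcessChunk(input.channels(0),
                                       dummy_out_->channels(0));
      // Copy over only the first channel of each band.
      for (size_t i = 0; i < input.num_bands(); ++i) {
        memcpy(output->channels(i)[0], input.channels(i)[0],
               input.num_frames_per_band() * sizeof(input.channels(i)[0][0]));
      }
    }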
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> On 2016/05/26 08:48:52, peah-webrtc wrote:
> > On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > > Is there a reason why this is not implemented in the .cc file?
> > >
> > > Beamformer is an interface; it has no .cc file.
> >
> > My skills on this are limited, but to me, an interface is something that the user of the interface needs to implement. If it has a default implementation, it is not a pure interface. Is there a reason why this method cannot be declared pure virtual, so that this is a proper interface that forces the implementer of the interface to supply an implementation?
>
> It already is not a pure interface, because IsInBeam() has a default implementation. This has a default empty implementation because not all Beamformers have postfilters.

Do you mean that there are more beamformers that use this interface?

Would it not be better, then, to let the actual beamformer implementing the interface explicitly provide an empty implementation, rather than have it pre-provided?

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:183: // static
On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> On 2016/05/26 08:48:52, peah-webrtc wrote:
> > On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > > Please correct this comment as well while you are at it.
> > >
> > > What is wrong with it?
> >
> > It is not a proper sentence, and not terminated by a period. But maybe this is a standard way of commenting static definitions?
>
> I think the static definition is clearer like this than in a sentence. Also, I haven't touched this in this CL, so it would add one more unrelated change. So I will leave it as is.

I think the guidelines should be applied regardless of that. But I won't insist any further on this.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:186: class PostFilterTransform : public LappedTransform::Callback {
On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> On 2016/05/26 08:48:52, peah-webrtc wrote:
> > On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > > This class is a way to be able to use the callback functionality of the LappedTransform for two callbacks in the same class, right?
> > > >
> > > > It makes the code more complicated than it needs to be, I think. Why not skip the friend class approach, and instead create a new class that implements the Callback interface? Then you could call that to apply the postfilter effect in such a way that the kCompensationGain and final_mask_ variables are passed as input to that call.
> > > > That would achieve a separation of concerns, data and functionality encapsulation, and would avoid the need for the friend class approach.
> > >
> > > I don't see how this simplifies the code, but I agree that it avoids the friend class, which is not desired. Anyway, I implemented it.
> >
> > The simplification is separation of concerns and data and functionality encapsulation, as well as avoiding the use of an interface.
>
> I think the separation/encapsulation is almost the same, but the call order is more complicated since you have to pass in the needed arguments. But I see that having friend classes is not desirable, so I implemented it already.

Acknowledged.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:192: // Process one frequency-domain block of audio. This is where the fun
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > Please describe this more thoroughly. I'm not sure the comment is needed, but if there is to be a comment it should be more descriptive, e.g., what kind of processing is done, how does the callback work, etc.
>
> Removed comment. It was just to be consistent with the other callback.

Acknowledged.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:248: float freq_hz = (static_cast<float>(i) / kFftSize) * sample_rate_hz_;
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > Please change to use a precomputed 1/kFftSize * sample_rate_hz_. This approach results in kNumFreqBins unnecessary divisions.
>
> This adds unrelated changes to the CL, but if you think that is ok: Done.

Acknowledged.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:249: wave_numbers_[i] = 2 * M_PI * freq_hz / kSpeedOfSoundMeterSeconds;
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > Please change to use a precomputed 2*pi/kSpeedOfSoundMeterSeconds. This approach results in kNumFreqBins unnecessary divisions.
>
> This adds unrelated changes to the CL, but if you think that is ok: Done.

Acknowledged.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:403: // Copy over only the first channel of each band.
On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> On 2016/05/26 08:48:52, peah-webrtc wrote:
> > On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > > Have you checked the impact on the signal when this is only done in the lowest band? In theory the perfect reconstruction property of the signal could be destroyed through this, causing aliasing distortions. Furthermore, this could in theory cause a sharp increase in the signal power at the first band-split frequency.
> > > >
> > > > Why don't you perform the linear beamforming in the time-domain instead, as you anyway do not pre-steer the signals before applying that step (right?)? That would have the benefits:
> > > > 1) Be cheaper, as you don't need to do the IFFT to go back to the time-domain.
> > > > 2) Have lower delay, as the filter-bank delay is avoided.
> > > > 3) Have zero impact on the perfect-reconstruction property of the splitting filter.
> > > > 4) Have no edge effect at the first band-split frequency.
> > >
> > > This is no longer relevant, since we decided offline to not do any linear beamforming because it is negligible compared to the postfilter. But a few points just to be clear:
> > > * The perfect reconstruction of the filter bank is lost with any processing done.
> > > * The worst case 3dB increase of power in the higher bands is probably not perceivable. And I am not sure what the problem with a "sharp" edge here is.
> >
> > I think the perfect reconstruction may actually be preserved if you average two PR filterbanks in the time-domain (which the linear beamforming is identical to if no pre-steering is performed).
> >
> > A 3 dB "edge" is perceivable, but I also doubt it will matter in general for most noise types (at least the ones with tilted spectral shapes). The sharp edge is dependent on the sharpness of the splitting filter, as the subsequent merge will merge two bands where the noise levels differ by 3 dB after the beamforming.
>
> I meant that other components are already non-linear, losing the perfect reconstruction. I don't think the 3dB (in the worst case) would be perceivable in the higher bands at all. In any case it would sound slightly highpass, but the sharpness is not something that would matter too much. Anyways, now I am only taking one channel, so this is no longer relevant.

Acknowledged.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:407: memcpy(output->channels(i)[0],
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > What happens if the output is dual channel? Since the ProcessChunk method locks the content of the output to be single-channel, it should either assert that the number of output channels already is one, or explicitly set the number of channels to 1 in the output.
>
> This can't be done as is, since the input and output can be the same ChannelBuffer and the ChannelBuffer has a fixed number of channels.

So this means that the input could have 2 channels, and then the output gets only 1 channel, but the meta-information (about the number of channels) in the output still does not reflect that, which means that the caller needs to know this and explicitly set the number of channels in the meta-information to 1! That sounds error-prone and more complicated than necessary.

It would be better, then, to replace this ProcessChunk call by two other calls: one similar to this one, but that does not allow in-place calls, and another that is explicitly in-place and takes only one ChannelBuffer pointer. That would allow the ProcessChunk method to change the number of channels.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:419: // Ramp up/down for smoothing. 1 mask per 10ms results in audible
On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> On 2016/05/22 21:06:48, peah-webrtc wrote:
> > I guess what you mean is that smoothing is needed in order to avoid discontinuities in the transitions between 10 ms frames, right? I.e., the problem is not 1 mask per 10 ms, but rather that the transition between two masks cannot be done instantaneously.
> >
> > If the above holds, this comment is a bit misleading.
>
> This adds unrelated changes to the CL, but if you think that is ok. I find the comment easy to understand, having "1 [constant] mask per 10ms results in audible discontinuities [in the transitions]". But if you find it confusing, I am happy to change it. WDYT of the current one?

Sounds awesome!

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(),
On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> On 2016/05/26 08:48:53, peah-webrtc wrote:
> > If the ProcessChunk is made input-only, you should be able to pass ca->split_bands_const_f here instead, right?
>
> Because other Beamformers might do some processing (even all of it) in ProcessChunk(), I think we should leave it as is.

Are there other beamformer implementations, apart from the one reviewed in this CL, that could be called here?

I don't think it is a good idea to keep it like this, because it gives the impression that the beamformer is actually changing the signal, which it is not. Furthermore, it unnecessarily makes the APM sensitive to any bugs in the beamformer implementation that accidentally change the signal.

If there are other beamformer implementations that could actually change the signal, they should not be allowed to be used together with the AEC, so in that case it would make sense to place a DCHECK that the AEC is not activated if they are active.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:383: ChannelBuffer<float>* output) {
On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> On 2016/05/26 08:48:53, peah-webrtc wrote:
> > Since the linear beamforming is no longer done, output could be removed, right? Alternatively, if you want to maintain this method for when the linear beamforming is to be used, I think it would make sense to add another method which is input-only.
>
> Adding yet another interface increases code complexity. I think we want to leave it as is: one processing method and a non-linear postfilter method.

That depends, as there is currently no beamformer (?) that uses the output functionality of this method, right? And if there is, if another method is added to the interface, the memcopy done below could be removed, so I think the amount of code would possibly decrease with another interface.

Furthermore, there are clear benefits with another interface that could be argued to outweigh the potential drawback of code complexity for the beamformer:
1) it increases code readability for APM, as it is clear that this is an input-only functionality.
2) it increases code security for the APM, as adaptive processing of the signal prior to the AEC is explicitly prevented.
3) it reduces computational complexity, as the below memcopy is not needed and furthermore the APM can assume that the signal is not modified by the call to the beamformer.

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:389: process_transform_->ProcessChunk(input.channels(0), dummy_out->channels(0));
I think changing the interface would be much preferable to avoiding this by using a dummy output. Apart from avoiding changing/extending the interface, I cannot see the point of this.

And changing the interface to reflect how the beamformer actually performs seems like the natural way to address this issue, rather than to mock a beamformer behavior that is not there.

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:439: size_t num_output_channels,
With this implementation, the last two parameters are unused. Is there no other way to achieve this? Can the LappedTransform only be used in an input-output manner? With this solution, this is more complex than needed, since it actually does the synthesis unnecessarily on dummy data.

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:443: RTC_CHECK_EQ(1u, num_output_channels);
I guess this CHECK can be removed, right?
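For contrast, a sketch of the input-only split peah is arguing for here. The method name AnalyzeChunk is hypothetical; the CL keeps the single ProcessChunk instead.

    // Analysis only: estimates the postfilter mask and never writes output,
    // so the signal in the AEC path is untouched by construction.
    virtual void AnalyzeChunk(const ChannelBuffer<float>& data) = 0;
    // Applies the estimated mask in place, after the AEC has run.
    virtual void PostFilter(ChannelBuffer<float>* data) = 0;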
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input,
On 2016/05/30 11:49:25, peah-webrtc wrote:
> On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> > On 2016/05/26 08:48:52, peah-webrtc wrote:
> > > On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > > > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > > > Is there a reason why this is not implemented in the .cc file?
> > > >
> > > > Beamformer is an interface; it has no .cc file.
> > >
> > > My skills on this are limited, but to me, an interface is something that the user of the interface needs to implement. If it has a default implementation, it is not a pure interface. Is there a reason why this method cannot be declared pure virtual, so that this is a proper interface that forces the implementer of the interface to supply an implementation?
> >
> > It already is not a pure interface, because IsInBeam() has a default implementation. This has a default empty implementation because not all Beamformers have postfilters.
>
> Do you mean that there are more beamformers that use this interface?
>
> Would it not be better, then, to let the actual beamformer implementing the interface explicitly provide an empty implementation, rather than have it pre-provided?

At this point there is only the non-linear Beamformer in WebRTC, but there were plans of porting other types if we think they provide a benefit. I think having a default implementation is more intuitive and adds less code complexity. First, because it doesn't force all Beamformers, even the ones that have nothing to do with postfilters, to implement it. But also because, without it, every Beamformer would need its own empty implementation.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:183: // static
On 2016/05/30 11:49:25, peah-webrtc wrote:
> On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> > On 2016/05/26 08:48:52, peah-webrtc wrote:
> > > On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > > > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > > > Please correct this comment as well while you are at it.
> > > >
> > > > What is wrong with it?
> > >
> > > It is not a proper sentence, and not terminated by a period. But maybe this is a standard way of commenting static definitions?
> >
> > I think the static definition is clearer like this than in a sentence. Also, I haven't touched this in this CL, so it would add one more unrelated change. So I will leave it as is.
>
> I think the guidelines should be applied regardless of that. But I won't insist any further on this.

Acknowledged.

https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:407: memcpy(output->channels(i)[0],
On 2016/05/30 11:49:25, peah-webrtc wrote:
> On 2016/05/26 01:04:45, aluebs-webrtc wrote:
> > On 2016/05/22 21:06:48, peah-webrtc wrote:
> > > What happens if the output is dual channel?
> > > Since the ProcessChunk method locks the content of the output to be single-channel, it should either assert that the number of output channels already is one, or explicitly set the number of channels to 1 in the output.
> >
> > This can't be done as is, since the input and output can be the same ChannelBuffer and the ChannelBuffer has a fixed number of channels.
>
> So this means that the input could have 2 channels, and then the output gets only 1 channel, but the meta-information (about the number of channels) in the output still does not reflect that, which means that the caller needs to know this and explicitly set the number of channels in the meta-information to 1! That sounds error-prone and more complicated than necessary.
>
> It would be better, then, to replace this ProcessChunk call by two other calls: one similar to this one, but that does not allow in-place calls, and another that is explicitly in-place and takes only one ChannelBuffer pointer. That would allow the ProcessChunk method to change the number of channels.

I don't think that adding an additional interface is a good idea. It only increases code complexity. But by modifying the ChannelBuffer behavior, we can set the number of output channels to 1.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(),
On 2016/05/30 11:49:25, peah-webrtc wrote:
> On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> > On 2016/05/26 08:48:53, peah-webrtc wrote:
> > > If the ProcessChunk is made input-only, you should be able to pass ca->split_bands_const_f here instead, right?
> >
> > Because other Beamformers might do some processing (even all of it) in ProcessChunk(), I think we should leave it as is.
>
> Are there other beamformer implementations, apart from the one reviewed in this CL, that could be called here?
>
> I don't think it is a good idea to keep it like this, because it gives the impression that the beamformer is actually changing the signal, which it is not. Furthermore, it unnecessarily makes the APM sensitive to any bugs in the beamformer implementation that accidentally change the signal.
>
> If there are other beamformer implementations that could actually change the signal, they should not be allowed to be used together with the AEC, so in that case it would make sense to place a DCHECK that the AEC is not activated if they are active.

There are no other beamformers in WebRTC at the moment, but there were plans to support multiple different ones if there is a use case for them. If going from stereo to mono is to modify the signal, it actually does modify it. And also, other linear beamformers that actually change the signal here should totally be allowed to be used with the AEC.

https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:383: ChannelBuffer<float>* output) {
On 2016/05/30 11:49:25, peah-webrtc wrote:
> On 2016/05/28 03:00:00, aluebs-webrtc wrote:
> > On 2016/05/26 08:48:53, peah-webrtc wrote:
> > > Since the linear beamforming is no longer done, output could be removed, right? Alternatively, if you want to maintain this method for when the linear beamforming is to be used, I think it would make sense to add another method which is input-only.
> >
> > Adding yet another interface increases code complexity. I think we want to leave it as is: one processing method and a non-linear postfilter method.
>
> That depends, as there is currently no beamformer (?) that uses the output functionality of this method, right? And if there is, if another method is added to the interface, the memcopy done below could be removed, so I think the amount of code would possibly decrease with another interface.
>
> Furthermore, there are clear benefits with another interface that could be argued to outweigh the potential drawback of code complexity for the beamformer:
> 1) it increases code readability for APM, as it is clear that this is an input-only functionality.
> 2) it increases code security for the APM, as adaptive processing of the signal prior to the AEC is explicitly prevented.
> 3) it reduces computational complexity, as the below memcopy is not needed and furthermore the APM can assume that the signal is not modified by the call to the beamformer.

I think a good balance is to have only one input-output parameter, which lets us get rid of the memcpy but lets it be modified by other types of beamformers.

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right):

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:389: process_transform_->ProcessChunk(input.channels(0), dummy_out->channels(0));
On 2016/05/30 11:49:25, peah-webrtc wrote:
> I think changing the interface would be much preferable to avoiding this by using a dummy output. Apart from avoiding changing/extending the interface, I cannot see the point of this.
>
> And changing the interface to reflect how the beamformer actually performs seems like the natural way to address this issue, rather than to mock a beamformer behavior that is not there.

The necessity of having a dummy_out has nothing to do with the Beamformer interface, but with the LappedTransform. But changing that class (which I agree we probably should) should probably be in its own CL.

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:439: size_t num_output_channels,
On 2016/05/30 11:49:25, peah-webrtc wrote:
> With this implementation, the last two parameters are unused. Is there no other way to achieve this? Can the LappedTransform only be used in an input-output manner? With this solution, this is more complex than needed, since it actually does the synthesis unnecessarily on dummy data.

Good point! Using 0 output channels achieves this.

https://codereview.webrtc.org/1982183002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:443: RTC_CHECK_EQ(1u, num_output_channels); On 2016/05/30 11:49:25, peah-webrtc wrote: > I guess this CHECK can be removed, right? I changed it to 0 to make sure it only does the analysis.
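For illustration, here is a minimal sketch of the zero-output-channels idea discussed above. The class and method names are hypothetical stand-ins in the style of a LappedTransform callback, not the actual WebRTC API; the point is only that with num_output_channels == 0 the synthesis step has nothing to produce, so the callback degenerates into pure analysis:

#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical block callback; mask update rule is a placeholder.
class AnalysisOnlyCallback {
 public:
  explicit AnalysisOnlyCallback(size_t num_freq_bins)
      : mask_(num_freq_bins, 1.f) {}

  // Called once per windowed FFT block. num_output_channels is expected to
  // be 0, so out_block is never written and no synthesis is performed.
  void ProcessAudioBlock(const std::complex<float>* const* in_block,
                         size_t num_input_channels,
                         size_t num_freq_bins,
                         size_t /*num_output_channels*/,
                         std::complex<float>* const* /*out_block*/) {
    for (size_t bin = 0; bin < num_freq_bins; ++bin) {
      float power = 0.f;
      for (size_t ch = 0; ch < num_input_channels; ++ch)
        power += std::norm(in_block[ch][bin]);
      // Placeholder per-bin mask update; the real estimator is more involved.
      mask_[bin] = 1.f / (1.f + power);
    }
  }

  const std::vector<float>& mask() const { return mask_; }

 private:
  std::vector<float> mask_;  // Per-bin postfilter mask estimate.
};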
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/beamformer.h (right): https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input, On 2016/06/01 00:16:34, aluebs-webrtc wrote: > On 2016/05/30 11:49:25, peah-webrtc wrote: > > On 2016/05/28 03:00:00, aluebs-webrtc wrote: > > > On 2016/05/26 08:48:52, peah-webrtc wrote: > > > > On 2016/05/26 01:04:45, aluebs-webrtc wrote: > > > > > On 2016/05/22 21:06:48, peah-webrtc wrote: > > > > > > Is there a reason why this is not implemented in the .cc file? > > > > > > > > > > Beamformer is an interface, it has no .cc file. > > > > > > > > My skills on this are limited, but to me, an interface is something that > the > > > > user of the interface needs to implement. If it has a default > > implementation, > > > it > > > > is not a pure interface. Is there a reason for why this method cannot be > > > > declared pure virtual so that this is a proper interface that enforces the > > > > implementer of the interface to supply a interface implementation? > > > > > > It already is not a pure interface, because IsInBeam() has a default > > > implementation. This has a default empty implementation because not all > > > Beamformers have postfilters. > > > > Do you mean that there are more beamformers that use this interface? > > > > Would it not be better then to let the actual beamformer implementing the > > interface to explicitly provide an empty implementation, rather to have it > > pre-provided? > > > > At this point there is only the non-linear Beamformer in WebRTC, but there were > plans of porting other types if we think they provide a benefit. I think having > a default implementation is more intuitive and adds less code-complexity. First, > because it doesn't force all Beamformers, even the ones that have nothing to do > with postfilters, implement it. But also, because this means that every > Beamformer needs it's own empty implementation. If there are no other beamformer solutions that should implement this interface at the moment, I really don't think we should impose restrictions on the interface or add any code at all based on the possible behavior of future beamformer solutions. As it is now, the default implementation only adds code-complexity (as it is not needed). It may possibly reduce code complexity in a future implementation, but we won't know that until we have seen that implementation. https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
> > > > > > This can't be done as is, since the input and output can be the same > > > ChannelBuffer and the ChannelBuffer has a fixed number of channels. > > > > So this means that the input could have 2 channels, and then the output gets > > only 1 channel, but the meta-information (about the number of channels) in the > > output does still not reflect that, which means that the caller needs to know > > this and explicitly set the number of channels in the meta-information to 1! > > That sounds error prone and more complicated than necessary. > > > > It would be better then to replace this ProcessChunk call by two other calls, > > One similar to this one, but that does not allow in-place calls, and another > > that is explicitly in-place and takes only one ChannelBuffer pointer. > > That would allow the ProcessChunk method to change the number of channels. > > I don't think that adding an additional interface is a good idea. It only > increases code complexity. But, modifying the ChannelBuffer behavior, we can set > the number of output channels to 1. It actually does not add code complexity, as it seems that only one method is needed by the current beamformer. And I don't think we should keep any code only to support (possible) future beamformers. That code could instead be added when those are introduced. https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(), On 2016/06/01 00:16:34, aluebs-webrtc wrote: > On 2016/05/30 11:49:25, peah-webrtc wrote: > > On 2016/05/28 03:00:00, aluebs-webrtc wrote: > > > On 2016/05/26 08:48:53, peah-webrtc wrote: > > > > If the ProcessChunk is made input-only, you should be able to pass > > > > ca->split_bands_const_f here instead, right? > > > > > > Because other Beamformers might do some processing (even all of it) in > > > ProcessChunk(), I think we should leave it as is. > > > > Are there other beamformer implementations, apart from the one reviewed in > this > > CL, that could be called here? > > > > I don't think it is a good idea to keep it like this, because it gives the > > impression that the beamformer is actually changing the signal, which it is > not. > > Furthermore, it unnecessarily makes the APM sensitive to any bugs in the > > beamformer implementation that accidentally changes the signal. > > > > If there are other beamformer implementations that could actually change the > > signal, they should not be allowed to be used together with the AEC, so in > that > > case it would make sense to place a DCHECK that the AEC is not activated if > they > > are active. > > There are no other beamformers in WebRTC at the moment, but there were plans to > support multiple different ones if there is a use case for them. If going from > stereo to mono is to modify the signal, it actually does modify it. And also, > other linear beamformers that actually change the signal here should totally be > allowed to be used with the AEC. I don't think we should write/keep code to support possibly upcoming beamformer solutions. That code can instead be (re-)added at that point. You are correct that going from stereo to mono is a signal change, but I would prefer to keep that outside of the beamformer as it is more explicit.
To me, downmixing from stereo to mono is not something that is the task of a beamformer, in particular when this is the only signal modification done by a beamformer. Another drawback with that solution is that the downmixing unnecessarily forces the rest of the audio pipeline to be mono. This is not at all needed, and if this restriction is imposed by the beamformer, the beamformer will need to be turned off whenever a stereo signal is to be processed by the rest of the components. I'm not sure what you mean by the linear beamformer and the AEC? Do you mean a fixed linear beamformer? That I definitely agree should be fine, but until such a beamformer is developed, I don't think it makes sense to keep/add code that supports that. Adding that code is very easily done when that beamformer solution is developed. https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:383: ChannelBuffer<float>* output) { On 2016/06/01 00:16:34, aluebs-webrtc wrote: > On 2016/05/30 11:49:25, peah-webrtc wrote: > > On 2016/05/28 03:00:00, aluebs-webrtc wrote: > > > On 2016/05/26 08:48:53, peah-webrtc wrote: > > > > Since the linear beamforming is no longer done, output could be removed, > > > right? > > > > Alternative, if you want to maintain this method for when the linear > > > beamforming > > > > is to be used, I think it would make sense to add another method which is > > > > input-only. > > > > > > Adding yet another interface, increases code complexity. I think we want to > > > leave it as is: one processing method and a non-linear postfilter method. > > > > That depends, as there is currently no beamformer (?) that uses the output > > functionality of this method, right? And if there is, if another method is > added > > to the interface, the memcopy done below could be removed so I think the > amount > > of code would possibly decrease with another interface. > > > > > > Furthermore, there are clear benefits with another interface that could be > > argued outweights the potential drawback of code complexity for the > beamformer: > > 1) increases code readability for APM as it is clear that this is an > input-only > > functionality. > > 2) increases code security for the APM, as adaptive processing of the signal > > prior to the AEC is explicitly prevented. > > 3) reduces computational complexity as the below memcopy is not needed and > > furthermore the APM can assume that the signal is not modified by the call to > > the beamformer. > > > > I think a good balance is to have only one input-output parameters, which let > us > > get rid of the memcpy, but let it be modified for other types of beamformers. Since ProcessChunk does not alter the signal, I see no reason for not having it input-only. The exception is supporting upcoming beamformer solutions, but then the code for that can be altered at that point. The likelihood is very high that any code that is now customized to match upcoming beamformer solutions will need to be changed anyway according to new requirements in them. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... File webrtc/common_audio/channel_buffer.h (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann...
webrtc/common_audio/channel_buffer.h:92: // 0 <= channel < |num_official_channels_| This seems very confusing to me. What is the difference between num_official_channels and num_channels? Could you please explain further why this change in ChannelBuffer is needed? https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:187: data_->set_num_channels(num_proc_channels_); Could you please explain why this change is needed? https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:353: if (split_data_.get()) { Could you please explain why this change is needed? https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/splitting_filter.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); Could you please explain why this change is needed? https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:76: for (size_t i = 0; i < data->num_channels(); ++i) { Could you please explain why this change is needed? https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:98: RTC_DCHECK_LE(data->num_channels(), three_band_filter_banks_.size()); Could you please explain why this change is needed?
https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/beamformer.h (right): https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/beamformer/beamformer.h:37: virtual void PostFilter(const ChannelBuffer<float>& input, On 2016/06/01 14:51:01, peah-webrtc wrote: > On 2016/06/01 00:16:34, aluebs-webrtc wrote: > > On 2016/05/30 11:49:25, peah-webrtc wrote: > > > On 2016/05/28 03:00:00, aluebs-webrtc wrote: > > > > On 2016/05/26 08:48:52, peah-webrtc wrote: > > > > > On 2016/05/26 01:04:45, aluebs-webrtc wrote: > > > > > > On 2016/05/22 21:06:48, peah-webrtc wrote: > > > > > > > Is there a reason why this is not implemented in the .cc file? > > > > > > > > > > > > Beamformer is an interface, it has no .cc file. > > > > > > > > > > My skills on this are limited, but to me, an interface is something that > > the > > > > > user of the interface needs to implement. If it has a default > > > implementation, > > > > it > > > > > is not a pure interface. Is there a reason for why this method cannot be > > > > > declared pure virtual so that this is a proper interface that enforces > the > > > > > implementer of the interface to supply a interface implementation? > > > > > > > > It already is not a pure interface, because IsInBeam() has a default > > > > implementation. This has a default empty implementation because not all > > > > Beamformers have postfilters. > > > > > > Do you mean that there are more beamformers that use this interface? > > > > > > Would it not be better then to let the actual beamformer implementing the > > > interface to explicitly provide an empty implementation, rather to have it > > > pre-provided? > > > > > > > At this point there is only the non-linear Beamformer in WebRTC, but there > were > > plans of porting other types if we think they provide a benefit. I think > having > > a default implementation is more intuitive and adds less code-complexity. > First, > > because it doesn't force all Beamformers, even the ones that have nothing to > do > > with postfilters, implement it. But also, because this means that every > > Beamformer needs it's own empty implementation. > > If there are no other beamformer solutions that should implement this interface > at the moment, I really don't think we should imposing restrictions on the > interface or add any code at all at the moment based on the possible behavior of > future beamformer solutions. > > As it is now, the default implementation only adds code-complexity (as it is not > needed). It may possibly reduce code complexity in a future implementation, but > that we don't know until we have seen that implementation. I think we should keep the code scalable, especially if the "code-complexity addition" is literally "{}". But if you insist, I dropped the whole interface, since it makes no sense to have one for only one Beamformer implementation in WebRTC. https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1982183002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:407: memcpy(output->channels(i)[0], On 2016/06/01 14:51:01, peah-webrtc wrote: > On 2016/06/01 00:16:34, aluebs-webrtc wrote: > > On 2016/05/30 11:49:25, peah-webrtc wrote: > > > On 2016/05/26 01:04:45, aluebs-webrtc wrote: > > > > On 2016/05/22 21:06:48, peah-webrtc wrote: > > > > > what happens if the output is dual channel? Since the ProcessChunk > method > > > > locks > > > > > the content of the output to be single-channel, it should either assert > > that > > > > the > > > > > number of output channels allready is one, or explicitly set the number > of > > > > > channels to 1 in the output. > > > > > > > > This can't be done as is, since the input and output can be the same > > > > ChannelBuffer and the ChannelBuffer has a fixed number of channels. > > > > > > So this means that the input could have 2 channels, and then the output gets > > > only 1 channel, but the meta-information (about the number of channels) in > the > > > output does still not reflect that, which means that the caller needs to > know > > > this and explicitly set the number of channels in the meta-information to 1! > > > That sounds error prone and more complicated than necessary. > > > > > > It would be better then to replace this ProcessChunk call by two other > calls, > > > One similar to this one, but that does not allow in-place calls, and another > > > that is explicitly in-place and takes only one ChannelBuffer pointer. > > > That would allow the ProcessChunk method to change the number of channels. > > > > I don't think that adding an additional interface is a good idea. It only > > increases code complexity. But, modifying the ChannelBuffer behavior, we can > set > > the number of output channels to 1. > > It actually does not add code complexity, as it seems that only one method is > needed by the current beamformer. And I don't think we should keep any code only > to support (possible) future beamformers. That code could instead then be added > when those are introduced. Interface removed and the method changed to be input-only. https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->ProcessChunk(*ca->split_data_f(), On 2016/06/01 14:51:01, peah-webrtc wrote: > On 2016/06/01 00:16:34, aluebs-webrtc wrote: > > On 2016/05/30 11:49:25, peah-webrtc wrote: > > > On 2016/05/28 03:00:00, aluebs-webrtc wrote: > > > > On 2016/05/26 08:48:53, peah-webrtc wrote: > > > > > If the ProcessChunk is made input-only, you should be able to pass > > > > > ca->split_bands_const_f here instead, right? > > > > > > > > Because other Beamformers might do some processing (even all of it) in > > > > ProcessChunk(), I think we should leave it as is. > > > > > > Are there other beamformer implementations, apart from the one reviewed in > > this > > > CL, that could be called here? > > > > > > I don't think it is a good idea to keep it like this, because it gives the > > > impression that the beamformer is actually changing the signal, which it is > > not. > > > Furthermore, it unnecessarily makes the APM sensitive to any bugs in the > > > beamformer implementation that accidentally changes the signal.
> > > > > > If there are other beamformer implementations that could actually change the > > > signal, they should not be allowed to be used together with the AEC, so in > > that > > > case it would make sense to place a DCHECK that the AEC is not activated if > > they > > > are active. > > > > There are no other beamformers in WebRTC at the moment, but there were plans > to > > support multiple different ones if there is a use case for them. If going from > > stereo to mono is to modify the signal, it actually does modify it. And also, > > other linear beamformers that actually change the signal here should totally > be > > allowed to be used with the AEC. > > I don't think we should write/keep code to support possibly upcoming beamformer > solutions. That code can instead be (re-)added at that point. > > You are correct that going from mono to stereo is a signal change, but that I > would then prefer to keep outside of the beamformer as it is more explicit. > To me, downmixing from stereo to mono is not something that is that is the task > of a beamformer, in particular when this is the only signal modification done by > a beamformer. > Another drawback with that solution is that the downmixing unnecessarily forces > the rest of the audio pipeline to be mono. This is not at all needed and if this > restriction is imposed by the beamformer, the beamformer will need to be turned > off whenever a stereo signal is to be processed by the rest of the components. > > I'm not sure what you mean with the linear beamformer and the AEC? Do you mean a > fixed linear beamformer? That I definitely agree should be fine, but until such > a beamformer is developed, I don't think it makes sense to keep/add code that > supports that. Adding that code is very easily done when that beamformer > solution is developed. This whole CL only exists to support a "possibly upcoming AEC", since the current setup seems to be working perfectly fine with today's overly aggressive AEC. I also find it hard to follow your comments, which are contradictory: should I or should I not force the output ChannelBuffer to be mono? https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1982183002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:383: ChannelBuffer<float>* output) { On 2016/06/01 14:51:01, peah-webrtc wrote: > On 2016/06/01 00:16:34, aluebs-webrtc wrote: > > On 2016/05/30 11:49:25, peah-webrtc wrote: > > > On 2016/05/28 03:00:00, aluebs-webrtc wrote: > > > > On 2016/05/26 08:48:53, peah-webrtc wrote: > > > > > Since the linear beamforming is no longer done, output could be removed, > > > > right? > > > > > Alternative, if you want to maintain this method for when the linear > > > > beamforming > > > > > is to be used, I think it would make sense to add another method which > is > > > > > input-only. > > > > > > > > Adding yet another interface, increases code complexity. I think we want > to > > > > leave it as is: one processing method and a non-linear postfilter method. > > > > > > That depends, as there is currently no beamformer (?) that uses the output > > > functionality of this method, right? And if there is, if another method is > > added > > > to the interface, the memcopy done below could be removed so I think the > > amount > > > of code would possibly decrease with another interface.
> > > > > > > > Furthermore, there are clear benefits with another interface that could be > > > argued outweights the potential drawback of code complexity for the > > beamformer: > > > 1) increases code readability for APM as it is clear that this is an > > input-only > > > functionality. > > > 2) increases code security for the APM, as adaptive processing of the signal > > > prior to the AEC is explicitly prevented. > > > 3) reduces computational complexity as the below memcopy is not needed and > > > furthermore the APM can assume that the signal is not modified by the call > to > > > the beamformer. > > > > > > > I think a good balance is to have only one input-output parameters, which let > us > > get rid of the memcpy, but let it be modified for other types of beamformers. > > Since ProcessChunk does not alter the signal, I see no reason for not having it > input-only. The exception is supporting upcoming beamformer solutions but then > the code for that can be altered at that point. The likelihood is anyway very > high that any code that is now customized to match upcoming beamformer solutions > will anyway need to be changed according to new requirements in these. If we had used my initial suggestion, upcoming Beamformers would not have needed to change the "customizations", since they would not have needed to implement the PostFilter method (default empty) and could have implemented ProcessChunk directly with an input and an output like any normal Beamformer. But with your suggestion of specializing the whole interface to this specific implementation, a new Beamformer would be a completely different component. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... File webrtc/common_audio/channel_buffer.h (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... webrtc/common_audio/channel_buffer.h:92: // 0 <= channel < |num_official_channels_| On 2016/06/01 14:51:01, peah-webrtc wrote: > This seems very confusing to me. What is the difference between > num_official_channels and num_channels? > > Could you please explain further why this change in ChannelBuffer is needed? It is needed to change the number of channels of a ChannelBuffer dynamically. num_channels is the actual space allocated in memory, and num_official_channels is the count surfaced to the users of the ChannelBuffer. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:187: data_->set_num_channels(num_proc_channels_); On 2016/06/01 14:51:01, peah-webrtc wrote: > Could you please explain why this change is needed? To re-init the ChannelBuffers to their original number of channels. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:353: if (split_data_.get()) { On 2016/06/01 14:51:01, peah-webrtc wrote: > Could you please explain why this change is needed? To change the ChannelBuffers to the new number of channels when the AudioBuffer number of channels is changed. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/splitting_filter.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); On 2016/06/01 14:51:01, peah-webrtc wrote: > Could you please explain why this change is needed? When the number of channels changes between splitting and merging, the number of filter states and the number of channels in the ChannelBuffer are not the same; the latter only needs to be less than or equal to the former. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:76: for (size_t i = 0; i < data->num_channels(); ++i) { On 2016/06/01 14:51:01, peah-webrtc wrote: > Could you please explain why this change is needed? Please see above. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:98: RTC_DCHECK_LE(data->num_channels(), three_band_filter_banks_.size()); On 2016/06/01 14:51:01, peah-webrtc wrote: > Could you please explain why this change is needed? Please see above.
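To make the num_official_channels idea above concrete, here is a hedged sketch of a channel buffer whose advertised channel count can shrink and later be restored while the underlying allocation stays fixed. This illustrates only the concept; the real webrtc::ChannelBuffer differs in layout and API, and all names here are illustrative:

#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class ResizableChannelBuffer {
 public:
  ResizableChannelBuffer(size_t num_frames, size_t max_channels)
      : num_frames_(num_frames),
        max_channels_(max_channels),
        num_channels_(max_channels),          // The "official" channel count.
        data_(num_frames * max_channels) {}   // Storage never shrinks.

  // Changes only the advertised channel count; the allocation stays intact,
  // so the count can later be restored (e.g. on re-initialization).
  void set_num_channels(size_t num_channels) {
    assert(num_channels <= max_channels_);
    num_channels_ = num_channels;
  }

  size_t num_channels() const { return num_channels_; }
  size_t num_frames() const { return num_frames_; }

  T* channel(size_t i) {
    assert(i < num_channels_);  // 0 <= i < official channel count.
    return &data_[i * num_frames_];
  }

 private:
  const size_t num_frames_;
  const size_t max_channels_;
  size_t num_channels_;
  std::vector<T> data_;
};

A consumer such as the splitting filter would then check that num_channels() is less than or equal to the number of filter states it owns, mirroring the RTC_DCHECK_LE change discussed above.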
Friendly ping
peah, could you please have a look at this?
https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... File webrtc/common_audio/channel_buffer.h (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... webrtc/common_audio/channel_buffer.h:92: // 0 <= channel < |num_official_channels_| On 2016/06/01 22:13:20, aluebs-webrtc wrote: > On 2016/06/01 14:51:01, peah-webrtc wrote: > > This seems very confusing to me. What is the difference between > > num_official_channels and num_channels? > > > > Could you please explain further why this change in ChannelBuffer is needed? > > It is needed to change the number of channels of a ChannelBuffer dynamically. > The num_channels is the actual space in memory and the num_official_channels is > the one surfaced to the users of ChannelBuffer. I still don't see why this change is here. But I'll post the question in the latest patch instead. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:187: data_->set_num_channels(num_proc_channels_); On 2016/06/01 22:13:20, aluebs-webrtc wrote: > On 2016/06/01 14:51:01, peah-webrtc wrote: > > Could you please explain why this change is needed? > > To re-init the ChannelBuffers to their original number of channels. I still don't understand why this change is needed in this CL. But I'll post the question in the latest patch instead. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/splitting_filter.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); On 2016/06/01 22:13:20, aluebs-webrtc wrote: > On 2016/06/01 14:51:01, peah-webrtc wrote: > > Could you please explain why this change is needed? > > When there is a change in number of channels between splitting and merging, the > size and channels in the ChannelBuffer are not the same, but it only needs to be > lower or equal than the former. Do you mean that before this CL, the size in the channels was changed by the beamformer, but that the number of channels needed to be changed inside APM by setting it to 1? And with the changes in this CL, which no longer applies downmixing inside the beamformer, the downmixing instead needs to be done as part of the synthesis? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/common_audio/chann... File webrtc/common_audio/channel_buffer.h (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/common_audio/chann... webrtc/common_audio/channel_buffer.h:51: num_channels_(num_channels), I see that the changes in audio_buffer are needed for the changes in lapped_transform. What I cannot see though, is why these changes are needed for this CL? As I see it, the difference with this in this CL is that the downmixing has been moved from within the beamformer to being done inside APM (but outside the beamformer). I thought before this CL this downmixing was partly done by the call ca->set_num_channels(1); done inside APM and partly by the linear beamforming done inside the beamformer, right? 
And the way I interpret it, these changes that are now done in the audio_buffer, channel_buffer and splitting_filter are needed to perform all of the downmixing at the call to ca->set_num_channels(1); Is that correct? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f()); What has changed here is that now all the downmixing is done by the set_num_channels call, in contrast to before, when the downmixing was done inside the beamformer and the set_num_channels call was needed to correct for the new number of channels? Is that correct? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:191: RTC_CHECK_EQ(1u, num_input_channels); As it is now, the beamformer will crash the application if the number of input and output channels is other than 1. Is it necessary to have this restriction here? There is not really anything that prevents final_mask from being applied to multichannel input/output, right? If needed, it would be nice to place it higher up in the beamformer code, and also to DCHECK instead of CHECK. If possible, it would be nice to remove the constraint of the postfilter as the other components do not have the single-channel restriction and there is no theoretical constraint that limits the number of channels to 1. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:233: dummy_out_.reset(new ChannelBuffer<float>(chunk_length_, 0u)); Why do we need to allocate a dummy output of a full buffer length? Is it not possible to allocate it with a zero length, and zero number of channels, as the zero number of channels should cause no output to be produced, and therefore a zero output buffer length would be feasible? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:388: process_transform_->ProcessChunk(data.channels(0), dummy_out_->channels(0)); Do you really need to pass a dummy output buffer here? Would it not be sufficient to pass a null pointer, or a pointer to a local scalar? (What I'm looking at is whether it would be possible to move away from having the dummy_out member variable which is only used to make the input/output functionality input-only in the lapped transform.) https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:405: data->channels(i)[0][j] *= smoothed_mask; This code assumes single-channel processing, right? But the CHECKS for that are only in ProcessAudioBlock. Since mono-channel is assumed here, it would be nice to have the CHECKS here as well. Or even better, adjust the postfilter to support multichannel data. Afaics that should be quite easily done. WDYT? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:154: std::unique_ptr<ChannelBuffer<float>> dummy_out_; If possible, it would be nice to avoid this one. Please see my other comments about that. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_test.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_test.cc:74: bf.PostFilter(&buf); Does this really work? The AnalyzeChunk does not downmix the signal, right? But afaics the PostFilter assumes a mono signal and CHECKS otherwise. Or am I missing something? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_unittest.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_unittest.cc:66: capture_audio_buffer->set_num_channels(1); With the changes in this CL, the set_num_channels(1) call performs a downmix, right?
Patchset #6 (id:100001) has been deleted
https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... File webrtc/common_audio/channel_buffer.h (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/common_audio/chann... webrtc/common_audio/channel_buffer.h:92: // 0 <= channel < |num_official_channels_| On 2016/06/08 12:04:55, peah-webrtc wrote: > On 2016/06/01 22:13:20, aluebs-webrtc wrote: > > On 2016/06/01 14:51:01, peah-webrtc wrote: > > > This seems very confusing to me. What is the difference between > > > num_official_channels and num_channels? > > > > > > Could you please explain further why this change in ChannelBuffer is needed? > > > > It is needed to change the number of channels of a ChannelBuffer dynamically. > > The num_channels is the actual space in memory and the num_official_channels > is > > the one surfaced to the users of ChannelBuffer. > > I still don't see why this change is here. But I'll post the question in the > latest patch instead. Acknowledged. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:187: data_->set_num_channels(num_proc_channels_); On 2016/06/08 12:04:55, peah-webrtc wrote: > On 2016/06/01 22:13:20, aluebs-webrtc wrote: > > On 2016/06/01 14:51:01, peah-webrtc wrote: > > > Could you please explain why this change is needed? > > > > To re-init the ChannelBuffers to their original number of channels. > > I still don't understand why this change is needed in this CL. But I'll post the > question in the latest patch instead. Acknowledged. https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/splitting_filter.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); On 2016/06/08 12:04:55, peah-webrtc wrote: > On 2016/06/01 22:13:20, aluebs-webrtc wrote: > > On 2016/06/01 14:51:01, peah-webrtc wrote: > > > Could you please explain why this change is needed? > > > > When there is a change in number of channels between splitting and merging, > the > > size and channels in the ChannelBuffer are not the same, but it only needs to > be > > lower or equal than the former. > > Do you mean that before this CL, the size in the channels was changed by the > beamformer, but that the number of channels needed to be changed inside APM by > setting it to 1? > > And with the changes in this CL, which no longer applies downmixing inside the > beamformer, the downmixing instead needs to be done as part of the synthesis? > No. This was a bug. Before this CL the number of channels of the AudioBuffer was changed, but this was not reflected in the number of channels of the ChannelBuffers, so the synthesis filter was merging the original number of channels, which were later discarded in CopyTo (or InterleaveTo). So the bug had no effects except adding complexity. Now that the number of channels of AudioBuffer is also reflected in the ChannelBuffer, it triggered this overly strict CHECK. So I fixed it. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/common_audio/chann... File webrtc/common_audio/channel_buffer.h (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/common_audio/chann... 
webrtc/common_audio/channel_buffer.h:51: num_channels_(num_channels), On 2016/06/08 12:04:55, peah-webrtc wrote: > I see that the changes in audio_buffer are needed for the changes in > lapped_transform. > What I cannot see though, is why these changes are needed for this CL? > > As I see it, the difference with this in this CL is that the downmixing has been > moved from within the beamformer to being done inside APM (but outside the > beamformer). > > I thought before this CL this downmixing was partly done by the call > ca->set_num_channels(1); done inside APM and partly by the linear beamforming > done inside the beamformer, right? > > And the way I interpret it, these changes that are now done in the audio_buffer, > channel_buffer and splitting_filter are needed to perform all of the downmixing > at the call to ca->set_num_channels(1); > Is that correct? > Please see my other comment about the bug this solves. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f()); On 2016/06/08 12:04:55, peah-webrtc wrote: > What has changed here, is that now all the downmixing is done by the > set_num_channels call, in contrast to that before the downmixing was done inside > the beamformer, but the set_num_channels call was needed to correct for the new > number of channels? Is that correct? What do you mean by downmixing? Before, there was a frequency-domain fixed delay-and-sum beamformer, and then the number of channels was corrected in the AudioBuffer. Now all channels except the first one are discarded. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:191: RTC_CHECK_EQ(1u, num_input_channels); On 2016/06/08 12:04:55, peah-webrtc wrote: > As it is now, the beamformer will crash the application if the number of input > and output channels is other than 1. Is it necessary to have this restriction > here? > There is not really anything that prevents final_mask to be applied on > multichannel input/output, right? If needed, it would be nice to place it higher > up in the beamformer code, and also to DCHECK instead of CHECK. > > If possible, it would be nice to remove the constraint of the postfilter as the > other components do not have the single-channel restriction and there is no > theoretical constraint that limits the number of channels to 1. Constraint removed. Changed to DCHECK. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:233: dummy_out_.reset(new ChannelBuffer<float>(chunk_length_, 0u)); On 2016/06/08 12:04:55, peah-webrtc wrote: > Why do we need to allocate a dummy output of a full buffer length? Is it not > possible to allocate it with a zero length, and zero number of channels, as the > zero number of channels should cause no output to be produced, and therefore a > zero output buffer length would be feasible? It doesn't allocate anything, since with 0 channels the length doesn't matter.
https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:388: process_transform_->ProcessChunk(data.channels(0), dummy_out_->channels(0)); On 2016/06/08 12:04:55, peah-webrtc wrote: > Do you really need to pass a dummy output buffer here? Would it not be > sufficient with a null pointer, or a pointer to a local scalar? (What I'm > looking at is whether it would be possible to move away from having the > dummy_out member variable which is only used to make the input/output > functionality input-only in the lapped transform.) Good point, since I modified lapped_transform to accept 0 as a number of channels I can pass in a nullptr safely. Done. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.cc:405: data->channels(i)[0][j] *= smoothed_mask; On 2016/06/08 12:04:55, peah-webrtc wrote: > This code assumes single-channel processing, right? But the CHECKS for that is > only in the ProcessAudioBlock. Since mono-channel is assumed hereing, it would > be nice to have the CHECKS here as well. > > Or even better, adjust the postfilter to support multichannel data. Afaics that > should be quite easily done. WDYT? Adjusted the postfilter to support multichannel data. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:154: std::unique_ptr<ChannelBuffer<float>> dummy_out_; On 2016/06/08 12:04:55, peah-webrtc wrote: > If possible, it would be nice to avoid this one. Please see my other comments > about that. Removed. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_test.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_test.cc:74: bf.PostFilter(&buf); On 2016/06/08 12:04:55, peah-webrtc wrote: > Does this really work? The AnalyzeChunk does not downmix the signal, right? But > afaics the PostFilter assumes a mono signal and CHECKS otherwise. Or am I > missing something? It did, because it doesn't use the number of channels of the ChannelBuffer at all. Instead it uses the number of channels set in the constructor. But added a DCHECK to make it more restrictive. https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_unittest.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/beamformer/nonlinear_beamformer_unittest.cc:66: capture_audio_buffer->set_num_channels(1); On 2016/06/08 12:04:55, peah-webrtc wrote: > With the changes in this CL, the set_num_channels(1) calls performs a downmix, > right? If by downmix you mean it just discards all channels except the first one, then yes.
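As a rough sketch of what the multichannel postfilter support amounts to, the same per-bin mask can simply be applied to every channel in the frequency domain. The function and parameter names below are illustrative, not the actual implementation:

#include <complex>
#include <cstddef>

// Applies a smoothed per-bin gain to all channels in place. The one shared
// mask is reused across channels, so supporting multichannel data costs only
// the extra multiplies.
void ApplyPostFilterMask(std::complex<float>* const* blocks,
                         size_t num_channels,
                         size_t num_freq_bins,
                         const float* mask,
                         float compensation_gain) {
  for (size_t ch = 0; ch < num_channels; ++ch) {
    for (size_t bin = 0; bin < num_freq_bins; ++bin) {
      blocks[ch][bin] *= mask[bin] * compensation_gain;
    }
  }
}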
https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/splitting_filter.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); On 2016/06/09 02:11:46, aluebs-webrtc wrote: > On 2016/06/08 12:04:55, peah-webrtc wrote: > > On 2016/06/01 22:13:20, aluebs-webrtc wrote: > > > On 2016/06/01 14:51:01, peah-webrtc wrote: > > > > Could you please explain why this change is needed? > > > > > > When there is a change in number of channels between splitting and merging, > > the > > > size and channels in the ChannelBuffer are not the same, but it only needs > to > > be > > > lower or equal than the former. > > > > Do you mean that before this CL, the size in the channels was changed by the > > beamformer, but that the number of channels needed to be changed inside APM by > > setting it to 1? > > > > And with the changes in this CL, which no longer applies downmixing inside the > > beamformer, the downmixing instead needs to be done as part of the synthesis? > > > > No. This was a bug. Before this CL the number of channels of the AudioBuffer was > changed, but this was not reflected in the number of channels of the > ChannelBuffers, so the synthesis filter was merging the original number of > channels, which were later discarded in CopyTo (or InterleaveTo). So the bug had > no effects except adding complexity. Now that the number of channels of > AudioBuffer is also reflected in the ChannelBuffer, it triggered this overly > strict CHECK. So I fixed it. Great find!! It would, however, be good to have that bugfix as a separate CL as it 1) Is not at all related to the beamformer. 2) Will also affect all calls where the beamformer is not active. 3) Will change the complexity (reduce). Would it be possible to extract that part into another CL, and land it before this CL is landed so that the changes in the first CL can be rebased into the basis for this CL? https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f()); On 2016/06/09 02:11:46, aluebs-webrtc wrote: > On 2016/06/08 12:04:55, peah-webrtc wrote: > > What has changed here, is that now all the downmixing is done by the > > set_num_channels call, in contrast to that before the downmixing was done > inside > > the beamformer, but the set_num_channels call was needed to correct for the > new > > number of channels? Is that correct? > > What do you mean with downmixing? Before there was a frequency domain fixed > delay and sum beamformer and then the number of channels was corrected in the > AudioBuffer. Now all channels except the first one are discarded. By downmixing I mean going from 2 to 1 channel, so discarding the leftmost channel definitely counts as that.
https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/splitting_filter.cc (right): https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); On 2016/06/09 09:36:30, peah-webrtc wrote: > On 2016/06/09 02:11:46, aluebs-webrtc wrote: > > On 2016/06/08 12:04:55, peah-webrtc wrote: > > > On 2016/06/01 22:13:20, aluebs-webrtc wrote: > > > > On 2016/06/01 14:51:01, peah-webrtc wrote: > > > > > Could you please explain why this change is needed? > > > > > > > > When there is a change in number of channels between splitting and > merging, > > > the > > > > size and channels in the ChannelBuffer are not the same, but it only needs > > to > > > be > > > > lower or equal than the former. > > > > > > Do you mean that before this CL, the size in the channels was changed by the > > > beamformer, but that the number of channels needed to be changed inside APM > by > > > setting it to 1? > > > > > > And with the changes in this CL, which no longer applies downmixing inside > the > > > beamformer, the downmixing instead needs to be done as part of the > synthesis? > > > > > > > No. This was a bug. Before this CL the number of channels of the AudioBuffer > was > > changed, but this was not reflected in the number of channels of the > > ChannelBuffers, so the synthesis filter was merging the original number of > > channels, which were later discarded in CopyTo (or InterleaveTo). So the bug > had > > no effects except adding complexity. Now that the number of channels of > > AudioBuffer is also reflected in the ChannelBuffer, it triggered this overly > > strict CHECK. So I fixed it. > > Great find!! It would, however, be good to have that bugfix as a separate CL as > it > 1) Is not at all related to the beamformer. > 2) Will also affect all calls where the beamformer is not active. > 3) Will change the complexity (reduce). > > Would it be possible to extract that part into another CL, and land it before > this CL is landed so that the changes in the first CL can be rebased into the > basis for this CL? 1) It is related to the Beamformer, since... 2) It is only triggered when there is a num_channels change, which only happens when Beamformer is enabled. But I agree it is better to break it out in a different CL. Done here: https://codereview.webrtc.org/2053773002/ https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f()); On 2016/06/09 09:36:30, peah-webrtc wrote: > On 2016/06/09 02:11:46, aluebs-webrtc wrote: > > On 2016/06/08 12:04:55, peah-webrtc wrote: > > > What has changed here, is that now all the downmixing is done by the > > > set_num_channels call, in contrast to that before the downmixing was done > > inside > > > the beamformer, but the set_num_channels call was needed to correct for the > > new > > > number of channels? Is that correct? > > > > What do you mean with downmixing? Before there was a frequency domain fixed > > delay and sum beamformer and then the number of channels was corrected in the > > AudioBuffer. Now all channels except the first one are discarded. 
> > By downmixing I mean going from 2 to 1 channel, so discarding the leftmost > channel definitely counts as that. > Personally I find "downmixing" to have a blending implication. But acknowledged.
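To make the terminology point concrete, here is a hypothetical side-by-side of the two notions of going from N channels to 1: a blending downmix versus keeping only the first channel, which is what set_num_channels(1) effectively does here. Both functions are illustrative sketches, not code from the CL:

#include <cstddef>

// Blending downmix: every input channel contributes to the output.
void DownmixByAveraging(const float* const* in, size_t num_channels,
                        size_t num_frames, float* out) {
  for (size_t i = 0; i < num_frames; ++i) {
    float sum = 0.f;
    for (size_t ch = 0; ch < num_channels; ++ch) sum += in[ch][i];
    out[i] = sum / static_cast<float>(num_channels);
  }
}

// Channel selection: all channels except the first are discarded.
void KeepFirstChannel(const float* const* in, size_t num_frames, float* out) {
  for (size_t i = 0; i < num_frames; ++i) out[i] = in[0][i];
}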
On 2016/06/09 19:21:42, aluebs-webrtc wrote: > https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/splitting_filter.cc (right): > > https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/splitting_filter.cc:75: > RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size()); > On 2016/06/09 09:36:30, peah-webrtc wrote: > > On 2016/06/09 02:11:46, aluebs-webrtc wrote: > > > On 2016/06/08 12:04:55, peah-webrtc wrote: > > > > On 2016/06/01 22:13:20, aluebs-webrtc wrote: > > > > > On 2016/06/01 14:51:01, peah-webrtc wrote: > > > > > > Could you please explain why this change is needed? > > > > > > > > > > When there is a change in number of channels between splitting and > > merging, > > > > the > > > > > size and channels in the ChannelBuffer are not the same, but it only > needs > > > to > > > > be > > > > > lower or equal than the former. > > > > > > > > Do you mean that before this CL, the size in the channels was changed by > the > > > > beamformer, but that the number of channels needed to be changed inside > APM > > by > > > > setting it to 1? > > > > > > > > And with the changes in this CL, which no longer applies downmixing inside > > the > > > > beamformer, the downmixing instead needs to be done as part of the > > synthesis? > > > > > > > > > > No. This was a bug. Before this CL the number of channels of the AudioBuffer > > was > > > changed, but this was not reflected in the number of channels of the > > > ChannelBuffers, so the synthesis filter was merging the original number of > > > channels, which were later discarded in CopyTo (or InterleaveTo). So the bug > > had > > > no effects except adding complexity. Now that the number of channels of > > > AudioBuffer is also reflected in the ChannelBuffer, it triggered this overly > > > strict CHECK. So I fixed it. > > > > Great find!! It would, however, be good to have that bugfix as a separate CL > as > > it > > 1) Is not at all related to the beamformer. > > 2) Will also affect all calls where the beamformer is not active. > > 3) Will change the complexity (reduce). > > > > Would it be possible to extract that part into another CL, and land it before > > this CL is landed so that the changes in the first CL can be rebased into the > > basis for this CL? > > 1) It is related to the Beamformer, since... > 2) It is only triggered when there is a num_channels change, which only happens > when Beamformer is enabled. > > But I agree it is better to break it out in a different CL. Done here: > https://codereview.webrtc.org/2053773002/ > > https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/audio_processing_impl.cc (right): > > https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/audio_processing_impl.cc:683: > private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f()); > On 2016/06/09 09:36:30, peah-webrtc wrote: > > On 2016/06/09 02:11:46, aluebs-webrtc wrote: > > > On 2016/06/08 12:04:55, peah-webrtc wrote: > > > > What has changed here, is that now all the downmixing is done by the > > > > set_num_channels call, in contrast to that before the downmixing was done > > > inside > > > > the beamformer, but the set_num_channels call was needed to correct for > the > > > new > > > > number of channels? Is that correct? > > > > > > What do you mean with downmixing? 
Before there was a frequency domain fixed > > > delay and sum beamformer and then the number of channels was corrected in > the > > > AudioBuffer. Now all channels except the first one are discarded. > > > > By downmixing I mean going from 2 to 1 channel, so discarding the leftmost > > channel definitely counts as that. > > > > Personally I find "downmixing" to have a blending implication. But acknowledged. What is the status of this CL? Will it be updated once the preceding CL is landed? Or will there be another CL that is linked to the newly created CL, and thereby excludes the code changes in that CL?
Patchset #8 (id:160001) has been deleted
On 2016/06/10 11:42:44, peah-webrtc wrote:
> On 2016/06/09 19:21:42, aluebs-webrtc wrote:
> > https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc...
> > File webrtc/modules/audio_processing/splitting_filter.cc (right):
> >
> > https://codereview.webrtc.org/1982183002/diff/60001/webrtc/modules/audio_proc...
> > webrtc/modules/audio_processing/splitting_filter.cc:75: RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size());
> > On 2016/06/09 09:36:30, peah-webrtc wrote:
> > > On 2016/06/09 02:11:46, aluebs-webrtc wrote:
> > > > On 2016/06/08 12:04:55, peah-webrtc wrote:
> > > > > On 2016/06/01 22:13:20, aluebs-webrtc wrote:
> > > > > > On 2016/06/01 14:51:01, peah-webrtc wrote:
> > > > > > > Could you please explain why this change is needed?
> > > > > >
> > > > > > When there is a change in number of channels between splitting and merging, the size and channels in the ChannelBuffer are not the same, but it only needs to be lower or equal than the former.
> > > > >
> > > > > Do you mean that before this CL, the size in the channels was changed by the beamformer, but that the number of channels needed to be changed inside APM by setting it to 1?
> > > > >
> > > > > And with the changes in this CL, which no longer applies downmixing inside the beamformer, the downmixing instead needs to be done as part of the synthesis?
> > > >
> > > > No. This was a bug. Before this CL the number of channels of the AudioBuffer was changed, but this was not reflected in the number of channels of the ChannelBuffers, so the synthesis filter was merging the original number of channels, which were later discarded in CopyTo (or InterleaveTo). So the bug had no effects except adding complexity. Now that the number of channels of AudioBuffer is also reflected in the ChannelBuffer, it triggered this overly strict CHECK. So I fixed it.
> > >
> > > Great find!! It would, however, be good to have that bugfix as a separate CL as it
> > > 1) Is not at all related to the beamformer.
> > > 2) Will also affect all calls where the beamformer is not active.
> > > 3) Will change the complexity (reduce).
> > >
> > > Would it be possible to extract that part into another CL, and land it before this CL is landed so that the changes in the first CL can be rebased into the basis for this CL?
> >
> > 1) It is related to the Beamformer, since...
> > 2) It is only triggered when there is a num_channels change, which only happens when Beamformer is enabled.
> >
> > But I agree it is better to break it out in a different CL. Done here: https://codereview.webrtc.org/2053773002/
> >
> > https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc...
> > File webrtc/modules/audio_processing/audio_processing_impl.cc (right):
> >
> > https://codereview.webrtc.org/1982183002/diff/80001/webrtc/modules/audio_proc...
> > webrtc/modules/audio_processing/audio_processing_impl.cc:683: private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f());
> > On 2016/06/09 09:36:30, peah-webrtc wrote:
> > > On 2016/06/09 02:11:46, aluebs-webrtc wrote:
> > > > On 2016/06/08 12:04:55, peah-webrtc wrote:
> > > > > What has changed here, is that now all the downmixing is done by the set_num_channels call, in contrast to that before the downmixing was done inside the beamformer, but the set_num_channels call was needed to correct for the new number of channels? Is that correct?
> > > >
> > > > What do you mean with downmixing? Before there was a frequency domain fixed delay and sum beamformer and then the number of channels was corrected in the AudioBuffer. Now all channels except the first one are discarded.
> > >
> > > By downmixing I mean going from 2 to 1 channel, so discarding the leftmost channel definitely counts as that.
> >
> > Personally I find "downmixing" to have a blending implication. But acknowledged.
>
> What is the status of this CL? Will it be updated once the preceding CL is landed? Or will there be another CL that is linked to the newly created CL, and thereby excludes the code changes in that CL?

I updated this CL to exclude the changes of the other one. It can land before or after the other CL. So please let me know if you have any additional comments on this change.
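As background for the bugfix that was split out above, a hedged illustration of the relaxed precondition. Only the check itself (RTC_DCHECK_LE on num_channels versus two_bands_states_.size()) is taken from the diff quoted in this thread; the surrounding types and function are simplified stand-ins, not the real splitting_filter.cc code, and assert is used in place of RTC_DCHECK_LE to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for this sketch only; not the real WebRTC types.
struct TwoBandsStates {};
struct AudioBufferLike { size_t num_channels; };

// Simplified merge step. Checking for equality would be too strict, because a
// submodule such as the beamformer may shrink the buffer's channel count
// between the analysis and synthesis passes. The merge therefore only
// requires that the current channel count not exceed the number of
// preallocated per-channel filter states.
void TwoBandsSynthesis(const AudioBufferLike& data,
                       const std::vector<TwoBandsStates>& two_bands_states) {
  assert(data.num_channels <= two_bands_states.size());
  // ... merge only the first data.num_channels channels ...
}
```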
peah, do you have any additional comments on this?
https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_processing_impl.cc:691: ca->set_num_channels(1);
The handling of the number of channels is a bit messy.

As it is now, the whole pipeline supports multichannel signals (maybe not in practice, but at least in theory).

Would it not make sense to set the number of channels here to the desired number of output channels, i.e. formats_.api_format.output_stream().num_channels()?

As it is now, that format is overridden when the beamformer is activated, which is not that nice. It may have been necessary to do it like that before this CL, but now that the pipeline supports multichannel processing, there is a case for changing this.

WDYT?

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:75: // Applies the postfilter mask to one chunk of audio. The audio is expected to
Suggestion: since the comments are quite long, readability would benefit from a blank line between AnalyzeChunk and the following comment.

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:78: // channels is expected to be 1, since that is the output number of channels
Please rewrite this comment: the constructor takes num_postfilter_channels as an input, so the number of channels can be more than 1. Furthermore, the ProcessChunk method is now called AnalyzeChunk and has no outputs.
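For concreteness, a hedged one-statement sketch of the suggestion above, not a tested change. The accessor chain is copied verbatim from the comment; everything else (including the idea that this line simply replaces the current one) is an assumption.

```cpp
// Today the capture path hard-codes mono after the beamformer analysis:
//   ca->set_num_channels(1);
// The proposal is to follow the requested output format instead (accessor
// chain copied verbatim from the comment above; untested):
ca->set_num_channels(formats_.api_format.output_stream().num_channels());
```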
Patchset #10 (id:220001) has been deleted
https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_processing_impl.cc:691: ca->set_num_channels(1);
On 2016/06/23 22:02:40, peah-webrtc wrote:
> The handling of the number of channels is a bit messy.
>
> As it is now, all the pipeline supports multichannel signals (maybe not in practice, but at least in theory).
>
> Would it not make sense to here set the number of channels equal to the desired number of output channels? I.e.: formats_.api_format.output_stream().num_channels()?
>
> As it is now, that format is overridden when the beamformer is activated which is not that nice. It may have been necessary to do it like that before this CL, but now the pipeline supports multichannel processing which would motivate changing this.
>
> WDYT?

That makes sense, but it should be a different CL. It has nothing to do with pulling out the PostFilter from the Beamformer.

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:75: // Applies the postfilter mask to one chunk of audio. The audio is expected to
On 2016/06/23 22:02:40, peah-webrtc wrote:
> Suggestion: Since the comments are quite long, I think the code would benefit in readability if there is a space between AnalyzeChunk and the following comment.

Done.

https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:78: // channels is expected to be 1, since that is the output number of channels
On 2016/06/23 22:02:40, peah-webrtc wrote:
> Please rewrite this comment as the constructor takes num_postfilter_channels as an input which means that the number of channels could be more than 1. Furthermore, the ProcessChunk method is now called AnalyzeChunk and has no outputs.

Done.
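To summarize the shape the capture path takes after this CL, a hedged sketch of the call ordering around the AEC. The AnalyzeChunk and set_num_channels calls are quoted from this thread; the PostFilter call site, its single in/out-parameter form (per the earlier patch set title), and everything between the calls are illustrative assumptions, not the actual audio_processing_impl.cc wiring.

```cpp
// Before the echo canceller: analysis only, so no nonlinear component enters
// the AEC path (call quoted from the review).
private_submodules_->beamformer->AnalyzeChunk(*ca->split_data_f());
// Discard all capture channels except the first one (call quoted from the
// review; see the downmixing discussion above).
ca->set_num_channels(1);

// ... the AEC and the other capture-side submodules process the chunk here ...

// After the echo canceller: apply the nonlinear postfilter mask in place
// (illustrative call site, assuming the one-parameter PostFilter signature).
private_submodules_->beamformer->PostFilter(ca->split_data_f());
```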
On 2016/06/24 03:00:54, aluebs-webrtc wrote:
> https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
> File webrtc/modules/audio_processing/audio_processing_impl.cc (right):
>
> https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
> webrtc/modules/audio_processing/audio_processing_impl.cc:691: ca->set_num_channels(1);
> On 2016/06/23 22:02:40, peah-webrtc wrote:
> > The handling of the number of channels is a bit messy.
> >
> > As it is now, all the pipeline supports multichannel signals (maybe not in practice, but at least in theory).
> >
> > Would it not make sense to here set the number of channels equal to the desired number of output channels? I.e.: formats_.api_format.output_stream().num_channels()?
> >
> > As it is now, that format is overridden when the beamformer is activated which is not that nice. It may have been necessary to do it like that before this CL, but now the pipeline supports multichannel processing which would motivate changing this.
> >
> > WDYT?
>
> That makes sense, but it should be a different CL. It has nothing to do with pulling out the PostFilter from the Beamformer.
>
> https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
> File webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h (right):
>
> https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
> webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:75: // Applies the postfilter mask to one chunk of audio. The audio is expected to
> On 2016/06/23 22:02:40, peah-webrtc wrote:
> > Suggestion: Since the comments are quite long, I think the code would benefit in readability if there is a space between AnalyzeChunk and the following comment.
>
> Done.
>
> https://codereview.webrtc.org/1982183002/diff/200001/webrtc/modules/audio_pro...
> webrtc/modules/audio_processing/beamformer/nonlinear_beamformer.h:78: // channels is expected to be 1, since that is the output number of channels
> On 2016/06/23 22:02:40, peah-webrtc wrote:
> > Please rewrite this comment as the constructor takes num_postfilter_channels as an input which means that the number of channels could be more than 1. Furthermore, the ProcessChunk method is now called AnalyzeChunk and has no outputs.
>
> Done.

lgtm
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from peah@webrtc.org
Link to the patchset: https://codereview.webrtc.org/1982183002/#ps280001 (title: "Make NonlinearBeamforer interface virtual")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
aluebs@webrtc.org changed reviewers: + henrik.lundin@webrtc.org
henrik.lundin, could you please have a look at webrtc/common_audio/lapped_transform.cc. Thank you very much!
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: android_dbg on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/android_dbg/builds/14522)
lgtm
The CQ bit was checked by aluebs@webrtc.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: android_dbg on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/android_dbg/builds/14565)
Description was changed from
==========
Pull out the PostFilter to its own NonlinearBeamformer API

This is done to avoid having a nonlinear component in the AEC path.
Now the linear delay and sum is run before the AEC and the postfilter after it.
==========
to
==========
Pull out the PostFilter to its own NonlinearBeamformer API

This is done to avoid having a nonlinear component in the AEC path.
Now the linear delay and sum is run before the AEC and the postfilter after it.

R=henrik.lundin@webrtc.org, peah@webrtc.org

Committed: https://chromium.googlesource.com/external/webrtc/+/b983112bc7686ed4276def4c9...
==========
Description was changed from
==========
Pull out the PostFilter to its own NonlinearBeamformer API

This is done to avoid having a nonlinear component in the AEC path.
Now the linear delay and sum is run before the AEC and the postfilter after it.

R=henrik.lundin@webrtc.org, peah@webrtc.org

Committed: https://chromium.googlesource.com/external/webrtc/+/b983112bc7686ed4276def4c9...
==========
to
==========
Pull out the PostFilter to its own NonlinearBeamformer API

This is done to avoid having a nonlinear component in the AEC path.
Now the linear delay and sum is run before the AEC and the postfilter after it.

R=henrik.lundin@webrtc.org, peah@webrtc.org

Committed: https://crrev.com/b983112bc7686ed4276def4c9215c498444c6bdd
Cr-Commit-Position: refs/heads/master@{#13314}
==========
Committed patchset #12 (id:280001) manually as b983112bc7686ed4276def4c9215c498444c6bdd (presubmit successful).
Patchset 12 (id:??) landed as https://crrev.com/b983112bc7686ed4276def4c9215c498444c6bdd
Cr-Commit-Position: refs/heads/master@{#13314}