Created: 5 years, 5 months ago by ekm
Modified: 5 years, 2 months ago
Reviewers: henrika_webrtc, Andrew MacDonald, hlundin-webrtc, pthatcher1, turaj, pbos-webrtc, aluebs-webrtc
CC: webrtc-reviews_webrtc.org, interface-changes_webrtc.org, rwolff_gocast.it, yujie_mao (webrtc), tterriberry_mozilla.com, qiang.lu, niklas.enbom, kwiberg-webrtc, aluebs-webrtc, bjornv1
Base URL: https://chromium.googlesource.com/external/webrtc.git@master
Target Ref: refs/pending/heads/master
Project: webrtc
Visibility: Public.
Description: Integrate Intelligibility with APM
- Integrates intelligibility into audio_processing.
- Allows modification of reverse stream if intelligibility enabled.
- Makes intelligibility available in audioproc_float test.
- Adds reverse stream processing to audioproc_float.
- (removed) Makes intelligibility toggleable in real time in voe_cmd_test.
- Cleans up intelligibility construction, parameters, constants and dead code.
TBR=pbos@webrtc.org
Committed: https://crrev.com/60d9b332a5391045439bfb6a3a5447973e3d5603
Cr-Commit-Position: refs/heads/master@{#9713}
Patch Set 1 #
Total comments: 48
Patch Set 2 : Updated interface, how VAD is used, other issues #
Total comments: 36
Patch Set 3 : Addr. comments from aluebs (incl. made ProcessReverseStream nicer) #
Total comments: 13
Patch Set 4 : Addr. comments from aluebs (incl. support diff num fwd/rev channels) #
Total comments: 2
Patch Set 5 : Addressed comments #
Total comments: 2
Patch Set 6 : Fixed memcpy #
Total comments: 25
Patch Set 7 : Added resampling support to InterleaveTo; removed VAD logic #
Total comments: 36
Patch Set 8 : merge #
Patch Set 9 : git cl format #
Patch Set 10 : Addressed comments from Patch Set 7 #
Total comments: 6
Patch Set 11 : Added support for distinct reverse input/output streams. #
Total comments: 13
Patch Set 12 : Reverted ProcessingConfig defaults #
Total comments: 7
Patch Set 13 : Added rev conversion for unprocessed case #
Total comments: 4
Patch Set 14 : Copy on no conversion or processing #
Total comments: 8
Patch Set 15 : Added render stream conversion unittest #
Total comments: 23
Patch Set 16 : Updated render stream conversion unittest #
Total comments: 6
Patch Set 17 : Addressed comments; format #
Total comments: 1
Patch Set 18 : Include <initializer_list> for Mac error #
Patch Set 19 : Fix Mac error (2) #
Patch Set 20 : Fix Mac Error (3) #
Messages
Total messages: 80 (17 generated)
ekmeyerson@webrtc.org changed reviewers: + aluebs@webrtc.org, andrew@webrtc.org, henrik.lundin@webrtc.org, turaj@webrtc.org
I don't know any background here and don't know if I am the best person to review. Any documentation or link to bug/issue?
voice_engine parts: LGTM
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/audio_processing_impl.cc:614: agc_manager_.get() ? agc_manager_->voice_probability() : 0.f;
Shouldn't we turn the activity detector part of AGC on when intelligibility is enabled? Otherwise we will process all frames as inactive.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (intelligibility_enabled_) {
I suppose we want to do all the modifications to the playout signal before echo cancellation/echo control; otherwise they have to compensate for the changes the enhancer makes to the playout signal.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:157: void IntelligibilityEnhancer::ProcessRenderAudio(float* const* audio,
I did not comprehend the logic here. |voice_probability| specifies whether the current chunk is active (see below for more explanation of voice probability). Therefore, it has to be used as an indicator of whether the current chunk should be considered in the variance computation or not. If we have |voice_probability|, we should not need the WebRTC VAD decision.

Furthermore, why is the call to |render_mangler_| based on |voice_probability|? I don't think we should stop applying enhancement gains as soon as a chunk has a low activity probability. I think we should keep on applying the enhancer, but discard low-activity chunks from the variance computation.

We need yet another threshold which specifies whether enhancement should be applied at all. So I think the flow of the program is:
1) Update the variance of the to-be-rendered signal on active chunks.
2) Update the variance of the captured signal on inactive chunks.
3) Decide, based on the current state of the enhancer (on or off) and the current signal-to-noise ratio, whether the enhancer should be on or off.

Obviously steps 1) and 2) are done through different API calls.

I like the idea of having two ProcessRenderAudio() overloads, but I thought the caller is supposed to call ProcessRenderAudio(float* const* audio) if they do not have any information regarding the activity of the given audio chunk; then we call the WebRTC VAD to get an activity flag. On the other hand, if the caller has info regarding the activity, they call ProcessRenderAudio(float* const* audio, float voice_probability), and the given voice_probability is compared with a threshold to decide whether the given frame should be included in the variance computation.

Same thing for ProcessCaptureAudio().

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:160: vad_tmp_buffer_[i] = (int16_t)audio[0][i];
You had better check with the APM guys, but I suppose you get audio in the range -1 to 1; therefore, a cast to int16_t is not a good idea.
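A minimal sketch of the three-step flow proposed above: active render chunks feed the speech estimate, inactive capture chunks feed the noise estimate, and a separate SNR test gates whether the gains apply at all. `EnhancerState`, the thresholds, and the smoothing constants are all illustrative stand-ins, not the real IntelligibilityEnhancer API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative control flow for VAD-gated variance tracking.
// Names and constants are hypothetical.
struct EnhancerState {
  double speech_var = 0.0;
  double noise_var = 1e-6;  // Avoid division by zero before any update.
  bool enabled = false;

  static constexpr float kVadThresh = 0.5f;
  static constexpr double kSnrThresh = 2.0;

  static double ChunkPower(const std::vector<float>& chunk) {
    double power = 0.0;
    for (float s : chunk) power += static_cast<double>(s) * s;
    return chunk.empty() ? 0.0 : power / chunk.size();
  }

  // Step 1: active render (to-be-played) chunks feed the speech estimate.
  void AnalyzeRender(const std::vector<float>& chunk, float voice_prob) {
    if (voice_prob >= kVadThresh)
      speech_var = 0.9 * speech_var + 0.1 * ChunkPower(chunk);
    UpdateDecision();
  }

  // Step 2: inactive capture chunks feed the noise estimate.
  void AnalyzeCapture(const std::vector<float>& chunk, float voice_prob) {
    if (voice_prob < kVadThresh)
      noise_var = 0.9 * noise_var + 0.1 * ChunkPower(chunk);
    UpdateDecision();
  }

  // Step 3: enable enhancement only when the render speech is weak
  // relative to the capture noise (low SNR at the listener's ear).
  void UpdateDecision() { enabled = speech_var / noise_var < kSnrThresh; }
};
```

Real code tracks per-frequency variances rather than a scalar chunk power, but the gating logic is the same.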
This will probably break when rolled into Chrome because the const-ness needs to be fixed in media_stream_audio_processor. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:614: agc_manager_.get() ? agc_manager_->voice_probability() : 0.f; On 2015/07/14 18:28:51, turaj wrote: > Shouldn't we turn activity detector part of AGC on when intelligibility is > enabled? Otherwise we will process all frames as inactive. Also, to avoid being out of sync, this should probably run after agc_manager, right? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:766: if (intelligibility_enabled_) { You don't need this if statement, because InterleaveTo gets intelligibility_enabled_ and only copies when it is true. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); Maybe not for this CL, but at some point this needs to do something with the higher bands, probably something simpler in the time domain. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1038: if (!intelligibility_enhancer_) { We probably want to reset the intelligibility_enhancer_ every time, since the split_rate_ or fwd_proc_format_.num_channels() could have changed, right? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); Shouldn't this be fwd_proc_format_? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:160: vad_tmp_buffer_[i] = (int16_t)audio[0][i];
On 2015/07/14 18:28:51, turaj wrote:
> You better check with APM guys, but I suppose you get audio in range -1 to 1,
> therefore, a cast to int16_t is not a good idea.
No, you get audio in the int16_t range. But for this there are beautiful tools under audio_util. Also, if you need the audio in both int16_t and float, you can receive an IFChannelBuffer and let it take care of this? Not sure if it makes it easier or not.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability);
Does it actually process the capture audio or does it only analyze it? Maybe we need a naming change.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability);
In all of these methods you assume the sample rate and number of channels did not change since the constructor. Maybe you want to have them as parameters and assert on that?

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:73: void ProcessCaptureAudio(float* const* audio); // Assumes noise.
Do we want to surface both interfaces to the user?

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:78: void ProcessRenderAudio(float* const* audio); // Assumes speech. Do we want to surface both interfaces to the user? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc:95: config_.sample_rate_hz = kSampleRate; Shouldn't the step_type be set here as well? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:1: /* How about changing the audioproc (process_test) as well? Or are we deprecating that one, now that support for the reverse stream is added here? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:47: DEFINE_bool(ie, false, "Enable intelligibility."); "Enable intelligibility enhancer."? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:111: bool process_reverse = false; bool process_reverse = !FLAGS_i_rev.empty(); https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:425: printf("%i. Toggle intelligibility\n", option_index++); Toggle intelligibility enhancer? https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... 
webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:514: base1->audio_processing()->SetExtraOptions(config);
Now that there are two calls to SetExtraOptions, you need to set both configs each time so as not to overwrite the other feature with the default value.
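The hazard described in this comment can be shown with a toy stand-in for the config type (the struct and field names below are hypothetical, not the real webrtc::Config): because SetExtraOptions replaces the whole option set, a call site that passes only its own flag silently resets the other feature to its default.

```cpp
#include <cassert>

// Simplified stand-in for an "extra options" bundle. Field names are
// illustrative; the real webrtc::Config stores typed options generically.
struct Config {
  bool delay_agnostic_aec = false;  // default
  bool intelligibility = false;     // default
};

struct Apm {
  Config extra;
  // Replaces the entire option set, exactly like the hazard above:
  // any flag the caller omits falls back to its default.
  void SetExtraOptions(const Config& c) { extra = c; }
};
```

The fix the review asks for is that each call site re-states every experimental option, not just the one it is toggling.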
Patchset #2 (id:20001) has been deleted
Changes of note: - Added ProcessReverseStream to APM instead of changing interface of AnalyzeReverseStream. This should simplify integration with Chrome. - Finished adding internal VAD capabilities to enhancer. - Updated AudioBuffer.InterleaveTo to allow for case when src AudioBuffer has single channel and target AudioFrame has multiple. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:614: agc_manager_.get() ? agc_manager_->voice_probability() : 0.f; On 2015/07/15 01:02:04, aluebs-webrtc wrote: > On 2015/07/14 18:28:51, turaj wrote: > > Shouldn't we turn activity detector part of AGC on when intelligibility is > > enabled? Otherwise we will process all frames as inactive. > > Also, to avoid being out of sync, this should probably run after agc_manager, > right? To avoid these problems, and simplify things, switched to enhancer using its own capture VAD. Maybe optimize later. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:766: if (intelligibility_enabled_) { On 2015/07/15 01:02:04, aluebs-webrtc wrote: > You don't need this if statement, because InterleaveTo gets > intelligibility_enabled_ and only copies when it is true. Done. As a result, I updated audio_buffer.InterleaveTo to allow for case of single source channel and >1 dest channels. This should make InterleaveTo make more sense, and keeps it possible to test intelligibility in voe_cmd_test. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (intelligibility_enabled_) { On 2015/07/14 18:28:51, turaj wrote: > I suppose we want to do all the modifications to playout signal before echo > cancellation/echo control, otherwise they have to compensate for the changes > that enhancer makes to playout signal. Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Maybe not for this CL, but at some point this needs to do something with the > higher bands, probably something simpler in the time domain. Acknowledged. I think we'll save this for a later CL. Since the enhancer will only be on in cases of very high noise and it's targeting speech, 0-8kHz should be ok, but you're right we might get some strange behavior if we do nothing to the higher bands. Do you have an idea of what might go wrong here? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1038: if (!intelligibility_enhancer_) { On 2015/07/15 01:02:04, aluebs-webrtc wrote: > We probably want to reset the intelligibility_enhancer_ every time, since the > split_rate_ or fwd_proc_format_.num_channels() could have changed, right? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Shouldn't this be fwd_proc_format_? Just set it to single channel for now, since that's the primary use case, and we haven't yet decided on how to handle multiple channels. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:157: void IntelligibilityEnhancer::ProcessRenderAudio(float* const* audio, On 2015/07/14 18:28:51, turaj wrote: > I did not comprehend the logic here. > > |voice_probability| specifies if the current chunk is active (see below for more > explanation on voice probability). Therefore, it has to be used as an indicator > whether the current chunk should be considered in variance computation or not. > If we have |voice_probability| we should not need WebRTC VAD decision. > > Furthermore, why a call to |render_mangler_| is based on |voice_probability|? I > don't think we should stop applying enhancement gains as soon as a chunk has low > activity probability. I think we should keep on applying enhancer, but low > activity chunks are discarded from variance computation. > > We need yet another threshold which specifies if enhancement should be applied > at all. So I think the flow of the program is; > 1) Update variance of to-be-rendered signal on active chunks. > 2) Update variance of captured signal on in-active chunks. > 3) Make a decision based on the current state of enhancer (on or off) and the > current signal-to-noise ratio whether enhancer should be on or off. > > Obviously steps 1) and 2) are done through different API calls. > > I like the idea of having two ProcessRenderAudio(), but I thought the caller is > supposed to call ProcessRenderAudio(float* const* audio) if they do not have any > information regarding the activity the given audio chunk, then we call WebRTC > VAD to get activity flag. 
On the other hand, if the caller has info regarding > the activity, they call ProcessRenderAudio(float* const* audio, float > voice_probability), then the given voice_probability is compared with a > threshold to decide if the given frame should be included in variance > computation. > > Same thing for ProcessCaptureAudio() Done. You're right, the logic was off and not fully implemented. I've updated everything to meet this spec and use the new VAD. It also uses smoothing when deactivating. There may be a more intuitive logic than what I have now. See UpdateActivity(). https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:160: vad_tmp_buffer_[i] = (int16_t)audio[0][i]; On 2015/07/14 18:28:51, turaj wrote: > You better check with APM guys, but I suppose you get audio in range -1 to 1, > therefore, a cast to int16_t is not a good idea. Done. audio_util is great! https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability); On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Does it actually processes the capture audio or does it only analyze it? Maybe > we need a naming change. Agreed. Similarly, in APM reverted AnalyzeReverseStream to const and added ProcessReverseStream. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > In all of these methods you assume the sample rate and number of channels did > not change since the constructor. Maybe you want to have them as parameters and > assert on that? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:73: void ProcessCaptureAudio(float* const* audio); // Assumes noise. On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Do we want to surface both interfaces to the user? I think it's nice to give the user the option of using their own voice probabilities if they want (especially if they have really good ones). May be useful if there is some very expensive VAD in the future that we only want to run once for all APM. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:78: void ProcessRenderAudio(float* const* audio); // Assumes speech. On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Do we want to surface both interfaces to the user? See above. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc:95: config_.sample_rate_hz = kSampleRate; On 2015/07/15 01:02:05, aluebs-webrtc wrote: > Shouldn't the step_type be set here as well? Done. Nice catch. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:1: /* On 2015/07/15 01:02:05, aluebs-webrtc wrote: > How about changing the audioproc (process_test) as well? Or are we deprecating > that one, now that support for the reverse stream is added here? Not sure. I'd be happy to add it if it'd be useful. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:47: DEFINE_bool(ie, false, "Enable intelligibility."); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > "Enable intelligibility enhancer."? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:111: bool process_reverse = false; On 2015/07/15 01:02:05, aluebs-webrtc wrote: > bool process_reverse = !FLAGS_i_rev.empty(); Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:425: printf("%i. Toggle intelligibility\n", option_index++); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > Toggle intelligibility enhancer? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:514: base1->audio_processing()->SetExtraOptions(config); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > Now that there is 2 calls to SetExtraOptions, you need to set both configs each > time to not overwrite the other feature with the default value. Done.
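The cast issue resolved in this message ("audio_util is great!") boils down to rounding and clamping: per the thread, the float buffers already carry samples in the int16_t range, so the conversion needs no rescaling, only a round and a clamp, which a plain cast does not do. The helper below is an illustrative re-implementation of that behavior, not the actual audio_util function:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Convert a float sample already scaled to the int16_t range:
// clamp out-of-range values, then round to nearest instead of truncating.
inline int16_t FloatS16ToInt16(float v) {
  v = std::max(v, static_cast<float>(std::numeric_limits<int16_t>::min()));
  v = std::min(v, static_cast<float>(std::numeric_limits<int16_t>::max()));
  return static_cast<int16_t>(std::lround(v));
}
```

A raw `(int16_t)` cast instead truncates toward zero and has undefined behavior for values outside the int16_t range.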
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:766: if (intelligibility_enabled_) { On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > You don't need this if statement, because InterleaveTo gets > > intelligibility_enabled_ and only copies when it is true. > > Done. As a result, I updated audio_buffer.InterleaveTo to allow for case of > single source channel and >1 dest channels. This should make InterleaveTo make > more sense, and keeps it possible to test intelligibility in voe_cmd_test. Ack https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > Maybe not for this CL, but at some point this needs to do something with the > > higher bands, probably something simpler in the time domain. > > Acknowledged. I think we'll save this for a later CL. Since the enhancer will > only be on in cases of very high noise and it's targeting speech, 0-8kHz should > be ok, but you're right we might get some strange behavior if we do nothing to > the higher bands. Do you have an idea of what might go wrong here? Agreed on leaving for another CL. I don't think anything will go explicitly wrong, but if you are not processing the higher bands, you lose the ability to shape the spectrum above 8kHz, which probably improves your enhancement. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > Shouldn't this be fwd_proc_format_? > > Just set it to single channel for now, since that's the primary use case, and we > haven't yet decided on how to handle multiple channels. Then I think it is better to set the correct value here and handle the multiple channel cases in the enhancer, even if it is a CHECK for now. WDYT? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:73: void ProcessCaptureAudio(float* const* audio); // Assumes noise. On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > Do we want to surface both interfaces to the user? > > I think it's nice to give the user the option of using their own voice > probabilities if they want (especially if they have really good ones). May be > useful if there is some very expensive VAD in the future that we only want to > run once for all APM. Agreed. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:1: /* On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:05, aluebs-webrtc wrote: > > How about changing the audioproc (process_test) as well? Or are we deprecating > > that one, now that support for the reverse stream is added here? > > Not sure. I'd be happy to add it if it'd be useful. 
I would like to have Andrew's opinion here :)

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_buffer.cc (right):

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:453: int16_t* const* channel_ptr_copies[frame->num_channels_];
int16_t* channel_ptr_copies[frame->num_channels_];

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:454: for (int i = 0; i < frame->num_channels_; i++) {
++i

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:455: channel_ptr_copies[i] = data_->ibuf()->channels();
data->ibuf()->channels()[0];

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:457: Interleave(data_->ibuf()->channels(), proc_num_frames_, num_channels_,
channel_ptr_copies

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:686: CHECK(!intelligibility_enabled_);
I don't like that AnalyzeReverseStream explodes if you are running intelligibility. I would rather just do what it was doing before, just don't copy to the output.

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/include/mock_audio_processing.h (right):

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/include/mock_audio_processing.h:230: int samples_per_channel,
Maybe change both to num_frames, since it is the naming convention.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/mock_audio_processing.h:232: ChannelLayout layout)); input_layout, to be consistent. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:162: CHECK_EQ(num_channels, channels_); Use num_channels_ for the member variable as well? It is clearer. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) { I am not sure why the voice probability affects if the chunk gets processed. Maybe I am just missing something. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) { I am not sure why the voice probability affects if the chunk gets processed. Maybe I am just missing something. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:230: if (render_voice_probability_ >= render_vad_thresh_) { This changed from has_voice_low to checking if the render_voice_probability is above the threshold. Isn't that just the opposite? https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... 
webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:333: for (int i = 0; i < freqs_; i++) { ++i Also, maybe use num_freqs_? It is clearer. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:334: target_[i] = 1.0f; 1.f https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:340: for (int i = 0; i < freqs_; i++) { ++i https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:341: if (current_[i] != 1.0f || target_[i] != 1.0f) { 1.f both times.
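The current_/target_ arrays under discussion implement the "smoothing when deactivating" behavior mentioned earlier in the thread: instead of dropping the enhancement gains to flat (1.f) at once, each band's gain ramps toward its target over several chunks. The sketch below is illustrative only; the class name, ramp constant, and tolerance are assumptions, not the intelligibility_utils code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-band gain ramp: gains ease toward the target,
// and a flat target of 1.f corresponds to "enhancer off".
class GainRamp {
 public:
  explicit GainRamp(size_t num_freqs)
      : current_(num_freqs, 1.f), target_(num_freqs, 1.f) {}

  void SetTargetFlat() { std::fill(target_.begin(), target_.end(), 1.f); }
  void SetTarget(const std::vector<float>& t) { target_ = t; }

  // Move each gain a step toward its target; returns true while any band
  // is still away from a flat response (i.e. smoothing is in progress).
  bool Update() {
    bool active = false;
    for (size_t i = 0; i < current_.size(); ++i) {
      current_[i] += 0.25f * (target_[i] - current_[i]);
      if (std::fabs(current_[i] - 1.f) > 1e-3f ||
          std::fabs(target_[i] - 1.f) > 1e-3f)
        active = true;
    }
    return active;
  }

  const std::vector<float>& current() const { return current_; }

 private:
  std::vector<float> current_;
  std::vector<float> target_;
};
```

The `current_[i] != 1.f || target_[i] != 1.f` check in the reviewed code plays the same role as the `active` flag here: it detects whether any band still deviates from flat.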
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:453: int16_t* const* channel_ptr_copies[frame->num_channels_]; On 2015/07/20 19:33:43, aluebs-webrtc wrote: > int16_t* channel_ptr_copies[frame->num_channels_]; Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:454: for (int i = 0; i < frame->num_channels_; i++) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > ++i Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:455: channel_ptr_copies[i] = data_->ibuf()->channels(); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > data->ibuf()->channels()[0]; Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:457: Interleave(data_->ibuf()->channels(), proc_num_frames_, num_channels_, On 2015/07/20 19:33:43, aluebs-webrtc wrote: > channel_ptr_copies Done. Phew, nice catch. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:686: CHECK(!intelligibility_enabled_); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > I don't like that AnalyzeReverseStream explodes if you are running > intelligibility. I would rather just do wah it was doing before, just don't copy > to the output. Done. Thanks, this is MUCH better. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... 
File webrtc/modules/audio_processing/include/mock_audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/mock_audio_processing.h:230: int samples_per_channel, On 2015/07/20 19:33:43, aluebs-webrtc wrote: > Maybe change both to num_frames, since it is the naming convention. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/mock_audio_processing.h:232: ChannelLayout layout)); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > input_layout, to be consistent. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:162: CHECK_EQ(num_channels, channels_); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > Use num_channels_ for the member variable as well? It is clearer. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > I am not sure why the voice probability affects if the chunk gets processed. > Maybe I am just missing something. If we're pretty sure the far-end chunk contains speech, we'd like to use it in our estimation of the speech signal. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > I am not sure why the voice probability affects if the chunk gets processed. 
> Maybe I am just missing something. If we're pretty sure the near-end chunk contains noise, we'd like to use it in our estimation of the noise signal. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:230: if (render_voice_probability_ >= render_vad_thresh_) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > This changed from has_voice_low to checking if the render_voice_probability is > above the threshold. Isn't that just the opposite? The has_voice_low_ was referring to detection of voice with 'low' aggressiveness with the common_audio vad. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:333: for (int i = 0; i < freqs_; i++) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > ++i > Also, maybe use num_freqs_? It is clearer. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:334: target_[i] = 1.0f; On 2015/07/20 19:33:43, aluebs-webrtc wrote: > 1.f Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:340: for (int i = 0; i < freqs_; i++) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > ++i Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:341: if (current_[i] != 1.0f || target_[i] != 1.0f) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > 1.f both times. Done.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) {
On 2015/07/21 01:02:44, ekm wrote:
> On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > I am not sure why the voice probability affects if the chunk gets processed.
> > Maybe I am just missing something.
>
> If we're pretty sure the far-end chunk contains speech, we'd like to use it in
> our estimation of the speech signal.
Oh! ProcessChunk doesn't really process the chunk; it just uses it for estimation, right? But then why are all chunks used for estimation when active? I feel I am still missing something.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) {
On 2015/07/21 01:02:44, ekm wrote:
> On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > I am not sure why the voice probability affects if the chunk gets processed.
> > Maybe I am just missing something.
>
> If we're pretty sure the near-end chunk contains noise, we'd like to use it in
> our estimation of the noise signal.
That makes sense. I was not aware that ProcessChunk was only estimating.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:230: if (render_voice_probability_ >= render_vad_thresh_) {
On 2015/07/21 01:02:44, ekm wrote:
> On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > This changed from has_voice_low to checking if the render_voice_probability is
> > above the threshold. Isn't that just the opposite?
> > The has_voice_low_ was referring to detection of voice with 'low' aggressiveness > with the common_audio vad. Oh, I see. Then it makes sense. https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:10: #include <iostream> Left from debugging, right?
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); On 2015/07/20 19:33:42, aluebs-webrtc wrote: > On 2015/07/17 19:59:38, ekm wrote: > > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > > Maybe not for this CL, but at some point this needs to do something with the > > > higher bands, probably something simpler in the time domain. > > > > Acknowledged. I think we'll save this for a later CL. Since the enhancer will > > only be on in cases of very high noise and it's targeting speech, 0-8kHz > should > > be ok, but you're right we might get some strange behavior if we do nothing to > > the higher bands. Do you have an idea of what might go wrong here? > > Agreed on leaving for another CL. I don't think anything will go explicitly > wrong, but if you are not processing the higher bands, you lose the ability to > shape the spectrum above 8kHz, which probably improves your enhancement. Acknowledged. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/20 19:33:42, aluebs-webrtc wrote: > On 2015/07/17 19:59:38, ekm wrote: > > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > > Shouldn't this be fwd_proc_format_? > > > > Just set it to single channel for now, since that's the primary use case, and > we > > haven't yet decided on how to handle multiple channels. > > Then I think it is better to set the correct value here and handle the multiple > channel cases in the enhancer, even if it is a CHECK for now. WDYT? I agree. 
The problem was that the enhancer only had one num_channels member, so it didn't support the case where render and capture have different numbers of channels. I've now added those members, so further changes to the multi-channel case can be done in the enhancer. The enhancer actually can handle these cases; it's just not particularly smart about it yet.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) {
On 2015/07/21 01:50:55, aluebs-webrtc wrote:
> On 2015/07/21 01:02:44, ekm wrote:
> > On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > > I am not sure why the voice probability affects if the chunk gets processed.
> > > Maybe I am just missing something.
> >
> > If we're pretty sure the far-end chunk contains speech, we'd like to use it in
> > our estimation of the speech signal.
>
> Oh! ProcessChunk doesn't really processes the chunk, it just uses it for
> estimation, right? But then why are all chunks used for estimation when active?
> I feel I am still missing something.
ProcessChunk transforms audio to the frequency domain, which we need for both estimation and processing. We overload the logic a bit here so we only have to do the transform once. In this case, render_mangler_->ProcessChunk does the transform and then goes to ProcessClearBlock, where it decides what to do with the freq data: either estimate, process, or both.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) { On 2015/07/21 01:50:55, aluebs-webrtc wrote: > On 2015/07/21 01:02:44, ekm wrote: > > On 2015/07/20 19:33:43, aluebs-webrtc wrote: > > > I am not sure why the voice probability affects if the chunk gets processed. > > > Maybe I am just missing something. > > > > If we're pretty sure the near-end chunk contains noise, we'd like to use it in > > our estimation of the noise signal. > > That makes sense. I was not aware that ProcessChunk was only estimating. Yeah, in the capture case, the freq data is only used for estimation. https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:10: #include <iostream> On 2015/07/21 01:50:55, aluebs-webrtc wrote: > Left from debugging, right? Yep. Done.
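The per-block decision ekm describes above (one transform, then estimate, process, or both, gated by voice probability) can be sketched as follows. This is only an illustrative simplification with hypothetical names; the real logic lives in IntelligibilityEnhancer::ProcessClearBlock and differs in detail.

```cpp
#include <cassert>

// What a block may be used for after the frequency-domain transform.
enum class BlockAction { kSkip, kEstimateOnly, kProcess };

// Render side: once the enhancer is active, every block is processed;
// before that, blocks that likely contain speech feed the clear-signal
// estimate and everything else is skipped.
BlockAction DecideRenderAction(float voice_probability,
                               float vad_threshold,
                               bool active) {
  if (active) {
    return BlockAction::kProcess;
  }
  return voice_probability >= vad_threshold ? BlockAction::kEstimateOnly
                                            : BlockAction::kSkip;
}

// Capture side: blocks that likely contain only noise feed the noise
// estimate; capture audio is never modified.
BlockAction DecideCaptureAction(float voice_probability,
                                float vad_threshold) {
  return voice_probability <= vad_threshold ? BlockAction::kEstimateOnly
                                            : BlockAction::kSkip;
}
```

This matches the `render_voice_probability_ >= render_vad_thresh_ || active_` and `capture_voice_probability_ <= capture_vad_thresh_` checks discussed above, at the level of intent rather than exact code.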
Some high-level comments. In the future, please break up these kinds of changes more. Ideally, this CL would only include changes necessary for APM integration. You could have at least two other CLs for the Config and tuning changes. Since this has been reviewed for a while already, don't break them up now though...
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:435: if (intelligibility_enabled_ != config.Get<Intelligibility>().enabled) {
So, remove this.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() {
So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it will also process the stream, but just not cause it to be copied back to the output audio.
I don't think there's much benefit in maintaining separate Analyze and Process methods. Can you instead work towards replacing Analyze with Process? You'll have to keep the former around until you can update clients.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/include/audio_processing.h:126: // in the constructor or using AudioProcessing::SetExtraOptions().
Please don't allow it to be set through SetExtraOptions. We want to remove this method. I realize this won't allow you to enable it through voice engine for now, but I think that's an acceptable compromise.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu...
File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); Remove this if we decide to drop voice engine support.
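Andrew's suggestion above (replace Analyze with Process, but keep the former until clients migrate) is a standard deprecation pattern. A sketch of what the transitional interface might look like, with abbreviated stand-in signatures rather than the real audio_processing.h declarations:

```cpp
#include <cassert>

// Sketch of a transitional reverse-stream interface (hypothetical
// class name and signatures; not the actual WebRTC header).
class ReverseStreamApi {
 public:
  virtual ~ReverseStreamApi() {}

  // New entry point: may modify the reverse stream in place, e.g.
  // when the intelligibility enhancer is enabled.
  virtual int ProcessReverseStream(float* const* data) = 0;

  // DEPRECATED: analysis-only variant kept for existing callers.
  // TODO: Remove once all clients call ProcessReverseStream().
  virtual int AnalyzeReverseStream(const float* const* data) = 0;
};

// Trivial stand-in implementation, only so the sketch can be exercised.
class NoopApm : public ReverseStreamApi {
 public:
  int ProcessReverseStream(float* const* /*data*/) override { return 0; }
  int AnalyzeReverseStream(const float* const* /*data*/) override { return 0; }
};
```

Keeping both entry points lets callers move over one at a time, after which the deprecated method can be deleted in a follow-up CL.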
https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:119: malloc(sizeof(*temp_render_out_buffer_) * num_render_channels_ + There is probably a smarter way of allocating this: vector, scoped_ptr<[]> or something similar.
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/21 19:22:13, ekm wrote: > On 2015/07/20 19:33:42, aluebs-webrtc wrote: > > On 2015/07/17 19:59:38, ekm wrote: > > > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > > > Shouldn't this be fwd_proc_format_? > > > > > > Just set it to single channel for now, since that's the primary use case, > and > > we > > > haven't yet decided on how to handle multiple channels. > > > > Then I think it is better to set the correct value here and handle the > multiple > > channel cases in the enhancer, even if it is a CHECK for now. WDYT? > > I agree. The problem was that the enhancer only had one num_channels member, so > didn't support case where render and capture have different number. I've now > added those members, so further changes to multi-channel case can be done in > enhancer. The enhancer actually can handle these cases, it's just not > particularly smart about it yet. This is better. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) { On 2015/07/21 19:22:13, ekm wrote: > On 2015/07/21 01:50:55, aluebs-webrtc wrote: > > On 2015/07/21 01:02:44, ekm wrote: > > > On 2015/07/20 19:33:43, aluebs-webrtc wrote: > > > > I am not sure why the voice probability affects if the chunk gets > processed. > > > > Maybe I am just missing something. 
> > > > > > If we're pretty sure the far-end chunk contains speech, we'd like to use it > in > > > our estimation of the speech signal. > > > > Oh! ProcessChunk doesn't really processes the chunk, it just uses it for > > estimation, right? But then why are all chunks used for estimation when > active? > > I feel I am still missing something. > > ProcessChunk transforms audio to frequency domain, which we need for both > estimation and processing. We overload the logic a bit here so we only have to > do the transform once. In this case, render_mangler_->ProcessChunk does the > transform then goes to ProcessClearBlock, where it decides what to do with the > freq data, either estimate, process or both. Oh, I see. Thanks for clarifying.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/audio_processing.h:126: // in the constructor or using AudioProcessing::SetExtraOptions(). On 2015/07/21 19:29:22, andrew wrote: > Please don't allow it to be set through SetExtraOptions. We want to remove this > method. I realize this won't allow you to enable it through voice engine for > now, but I think that's an acceptable compromise. Ah, my mistake. You can still enable it in voice engine, but it makes it harder through the crappy voe_cmd_test. Put up a CL here which shows how you could do it through agc_harness at least: https://codereview.webrtc.org/1247033006/
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:435: if (intelligibility_enabled_ != config.Get<Intelligibility>().enabled) { On 2015/07/21 19:29:21, andrew wrote: > So, remove this. Done. https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() { On 2015/07/21 19:29:22, andrew wrote: > So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it will > also process the stream, but just not cause it to be copied back to the output > audio. > > I don't think there's much benefit in maintaining separate Analyze and Process > methods. Can you instead work towards replacing Analyze with Process? You'll > have to keep the former around until you can update clients. Yep, sounds good. I've re-renamed AnalyzeReverseStreamLocked to ProcessReverseStreamLocked since it processes. Is there anything else I can do now to ease the transition to ProcessReverseStream? https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/audio_processing.h:126: // in the constructor or using AudioProcessing::SetExtraOptions(). On 2015/07/22 02:09:48, andrew wrote: > On 2015/07/21 19:29:22, andrew wrote: > > Please don't allow it to be set through SetExtraOptions. We want to remove > this > > method. I realize this won't allow you to enable it through voice engine for > > now, but I think that's an acceptable compromise. > > Ah, my mistake. 
You can still enable it in voice engine, but it makes it harder > through the crappy voe_cmd_test. Put up a CL here which shows how you could do > it through agc_harness at least: > https://codereview.webrtc.org/1247033006/ Got it. Thanks for the pointer. Can also at least enable it in voe_cmd_test at compile time. On a similar note, what's the status of the Enable method that several of the apm components provide? https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); On 2015/07/21 19:29:22, andrew wrote: > Remove this if we decide to drop voice engine support. Acknowledged. https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:119: malloc(sizeof(*temp_render_out_buffer_) * num_render_channels_ + On 2015/07/21 21:27:00, aluebs-webrtc wrote: > There is probably a smarter way of allocating this: vector, scoped_ptr<[]> or > something similar. Definitely. Changed to ChannelBuffer.
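The allocation aluebs flagged above (a single malloc holding both the channel-pointer table and the sample data) is what ChannelBuffer encapsulates. A minimal RAII analogue, purely for illustration (WebRTC's real ChannelBuffer<T> in common_audio has a richer API):

```cpp
#include <cassert>
#include <vector>

// Minimal sketch of a deinterleaved channel buffer: one contiguous,
// automatically freed slab of samples, plus per-channel pointer views.
template <typename T>
class SimpleChannelBuffer {
 public:
  SimpleChannelBuffer(int num_frames, int num_channels)
      : data_(num_frames * num_channels), channels_(num_channels) {
    for (int i = 0; i < num_channels; ++i) {
      channels_[i] = &data_[i * num_frames];  // Views into the slab.
    }
  }
  T* const* channels() { return channels_.data(); }
  int num_channels() const { return static_cast<int>(channels_.size()); }

 private:
  std::vector<T> data_;       // Owned storage; freed automatically.
  std::vector<T*> channels_;  // Per-channel pointers into |data_|.
};

// Self-check helper: writes one sample and reads it back.
inline float WriteReadSample(float v) {
  SimpleChannelBuffer<float> buf(160, 2);
  buf.channels()[1][3] = v;
  return buf.channels()[1][3];
}
```

Compared to the raw malloc, nothing can leak on an early return, and the pointer table can never get out of sync with the data it points into.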
lgtm % one small comment https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:166: chunk_length_ * sizeof(float)); sizeof(**audio)
https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:166: chunk_length_ * sizeof(float)); On 2015/07/23 00:35:26, aluebs-webrtc wrote: > sizeof(**audio) Done.
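The `sizeof(**audio)` fix above is a small but real robustness improvement: the element size is derived from the buffer's own type, so a later change of sample type cannot silently leave a stale `sizeof(float)` behind. A sketch of the pattern (hypothetical function names):

```cpp
#include <cassert>
#include <cstring>

// Copies each channel of a deinterleaved chunk. Note the element size
// comes from sizeof(**audio), not a hard-coded sizeof(float).
void CopyChunk(float* const* audio, float* const* out,
               int num_channels, int chunk_length) {
  for (int i = 0; i < num_channels; ++i) {
    // sizeof(**audio) == sizeof(float) here, but it tracks the type
    // of |audio| automatically if that type ever changes.
    std::memcpy(out[i], audio[i], chunk_length * sizeof(**audio));
  }
}

// Self-check helper: copies a two-sample mono chunk and returns the
// second copied value.
float CopyOneValue(float v) {
  float src[2] = {v, v};
  float dst[2] = {0.f, 0.f};
  float* src_ch[1] = {src};
  float* dst_ch[1] = {dst};
  CopyChunk(src_ch, dst_ch, 1, 2);
  return dst[1];
}
```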
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() { On 2015/07/23 00:26:28, ekm wrote: > On 2015/07/21 19:29:22, andrew wrote: > > So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it > will > > also process the stream, but just not cause it to be copied back to the output > > audio. > > > > I don't think there's much benefit in maintaining separate Analyze and Process > > methods. Can you instead work towards replacing Analyze with Process? You'll > > have to keep the former around until you can update clients. > > Yep, sounds good. I've re-renamed AnalyzeReverseStreamLocked to > ProcessReverseStreamLocked since it processes. Is there anything else I can do > now to ease the transition to ProcessReverseStream? Yes, mark AnalyzeReverseStream as deprecated in audio_processing.h with a TODO to remove when all users are updated. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:452: // Copy single AudioBuffer channel into all AudioFrame channels Period at the end, and can you note that this is specifically for intelligibility enhancement? This is a problem (amongst other things) with introducing this render-side processing. APM wasn't designed to handle it and so you're kind of shoe-horning it in. I don't have a better suggestion without large refactoring though. Ideally you would do your processing on however many channels you received before downmixing to mono. 
But that would add a bunch of complexity to a system that's already straining under the load. I think this upmixing is OK, but you should note in the interface that if intelligibility enhancement is enabled the reverse stream will become an upmixed mono signal.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_buffer.cc:454: new int16_t*[frame->num_channels_]);
Arg, why did you switch to using dynamic allocation?
But anyway, perhaps better to add a UpmixMonoToInterleave? See the new DownmixInterleavedToMono Michael added here:
https://chromium.googlesource.com/external/webrtc/+/master/webrtc/common_audi...
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_processing_impl.cc:724: int AudioProcessingImpl::AnalyzeReverseStream(AudioFrame* frame) {
The AudioFrame* and float* AnalyzeReverseStreams have different behavior wrt IE. Normally I'd ask you to add a ProcessReverseStream for AudioFrame as well, but we really want to get rid of this AudioFrame junk, so I suppose this is OK...
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_processing_impl.cc:763: render_audio_->InterleaveTo(frame, intelligibility_enabled_);
I think it's clearer if you add the check here:
if (intelligibility_enabled_) {
  InterleaveTo(...);
}
The bool parameter is just so the AudioFrame's VAD state can get updated, which you don't care about on the reverse stream.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
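The upmix helper Andrew suggests above would do the inverse of DownmixInterleavedToMono: copy one mono channel into every channel of an interleaved buffer. A sketch, with the name taken from his comment (the helper that actually landed in common_audio may differ):

```cpp
#include <cassert>
#include <cstdint>

// Copies a mono channel into every channel of a frame-major
// interleaved buffer (sketch of the suggested helper).
template <typename T>
void UpmixMonoToInterleaved(const T* mono, int num_frames,
                            int num_channels, T* interleaved) {
  for (int f = 0; f < num_frames; ++f) {
    for (int c = 0; c < num_channels; ++c) {
      interleaved[f * num_channels + c] = mono[f];
    }
  }
}

// Self-check helper: upmixes {1, 2} to stereo ({1, 1, 2, 2}) and
// returns the sample at |index|.
int16_t UpmixSample(int index) {
  const int16_t mono[2] = {1, 2};
  int16_t out[4] = {0, 0, 0, 0};
  UpmixMonoToInterleaved(mono, 2, 2, out);
  return out[index];
}
```

This is exactly the "reverse stream becomes an upmixed mono signal" behavior noted in the interface comment above.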
webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (rev_proc_format_.rate() == kSampleRate32kHz) {
&& intelligibility_enabled_
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:233: // efidata(n,:) = sqrt(b(n)) * fidata(n,:)
Remove if you don't want this.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:159: float SNR();
Sorry to do this, but could you please move these new features which are modifying the output results of IE to another CL? You don't even mention these in the CL description :)
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp...
webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame);
This is a big deal ;) First, you can't use the same resampler here, as it should be operating on the same continuous stream.
I think the right way to do this is remove the pre-RemixAndResample and let APM handle it. In the "modern" float interface that would be no problem. Unfortunately, this AudioFrame interface doesn't handle resampling in DeinterleaveFrom, as you've seen. So we'd need to add support for it there.
Another, and possibly superior, option is to remove support for IE through the AudioFrame interface in this CL.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test...
File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:237: bool intelligibility_enabled = false; Revert changes to this file now that SetExtraOptions doesn't work.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:231: UpdateActivity();
If we are deactivated, then the gains are all one, right? Then, if UpdateActivity() decides to set activity to true, we keep applying unit gains until |block_count_| is |analysis_rate_ - 1|, and even then we will ramp up from unit gains to the target gains. Obviously the perceived speed of the enhancer depends on |analysis_rate_|, but I am concerned that this might be a bit slow. Can we modify the logic so that, at a switch from inactive to active, we start ramping the gains up from unity to their target values?
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:403: float total_clear_var = std::accumulate(
These two variables can be const.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:159: float SNR();
This method is const.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:167: const int freqs_;  // Num frequencies in frequency domain.
As comments will eventually become misaligned, maybe better to stick to the 2-spaces rule.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:154: bool IsIdentity();
This method can be const.
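Turaj's ramp concern above can be made concrete: on an inactive-to-active switch, start each gain at unity and interpolate toward its target over a fixed number of blocks, instead of holding unit gains until the next |analysis_rate_| boundary. A sketch with hypothetical names (the actual enhancer's gain-change logic is more involved):

```cpp
#include <cassert>

// Linear ramp from unity gain toward |target| over |ramp_blocks|
// blocks after activation; returns |target| once the ramp completes.
float RampedGain(float target, int block, int ramp_blocks) {
  if (block >= ramp_blocks) {
    return target;
  }
  const float t = static_cast<float>(block) / ramp_blocks;
  return 1.f + t * (target - 1.f);  // 1.0 at block 0, |target| at the end.
}
```

Starting the ramp at the activation instant, rather than at the next analysis boundary, addresses the perceived-latency issue without changing how often the targets themselves are recomputed.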
Changes of note:
- Added support for resampling in InterleaveTo.
- Removed this responsibility from VoE (output_mixer).
- Still need to add rev_out_format (or similar) to apm for this to be effective; will do this after merge with https://codereview.webrtc.org/1226093007/ in a later patch.
- Removed the updated VAD functionality.
- Will re-add in a later CL, and address Turaj's newest related comments there.
- Removed voe_cmd_test toggleability, since SetExtraOptions is deprecated.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() {
On 2015/07/24 23:50:39, andrew wrote:
> On 2015/07/23 00:26:28, ekm wrote:
> > On 2015/07/21 19:29:22, andrew wrote:
> > > So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it
> > will
> > > also process the stream, but just not cause it to be copied back to the
> output
> > > audio.
> > >
> > > I don't think there's much benefit in maintaining separate Analyze and
> Process
> > > methods. Can you instead work towards replacing Analyze with Process? You'll
> > > have to keep the former around until you can update clients.
> >
> > Yep, sounds good. I've re-renamed AnalyzeReverseStreamLocked to
> > ProcessReverseStreamLocked since it processes. Is there anything else I can do
> > now to ease the transition to ProcessReverseStream?
>
> Yes, mark AnalyzeReverseStream as deprecated in audio_processing.h with a TODO
> to remove when all users are updated.
Done.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_buffer.cc:452: // Copy single AudioBuffer channel into all AudioFrame channels On 2015/07/24 23:50:39, andrew wrote: > Period at the end, and can you note that this is specifically for > intelligibility enhancement? > > This is a problem (amongst other things) with introducing this render-side > processing. APM wasn't designed to handle it and so you're kind of shoe-horning > it in. I don't have a better suggestion without large refactoring though. > > Ideally you would do your processing on however many channels you received > before downmixing to mono. But that would add a bunch of complexity to a system > that's already straining under the load. I think this upmixing is OK, but you > should note in the interface that if intelligibility enhancement is enabled the > reverse stream will become an upmixed mono signal. Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:454: new int16_t*[frame->num_channels_]); On 2015/07/24 23:50:39, andrew wrote: > Arg, why did you switch to using dynamic allocation? > > But anyway, perhaps better to add a UpmixMonoToInterleave? See the new > DownmixInterleavedToMono Michael added here: > https://chromium.googlesource.com/external/webrtc/+/master/webrtc/common_audi... Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:724: int AudioProcessingImpl::AnalyzeReverseStream(AudioFrame* frame) { On 2015/07/24 23:50:39, andrew wrote: > The AudioFrame* and float* AnalyzeReverseStreams have different behavior wrt IE. > Normally I'd ask you to add a ProcessReverseStream for AudioFrame as well, but > we really want to get rid of this AudioFrame junk, so I suppose this is OK... 
Acknowledged. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:763: render_audio_->InterleaveTo(frame, intelligibility_enabled_); On 2015/07/24 23:50:39, andrew wrote: > I think more clear if you add the check here: > if (intelligbility_enabled_) { > InterleaveTo(...); > } > > The bool parameter is just so the AudioFrame's VAD state can get updated, which > you don't care about on the reverse stream. Done. Added is_rev_processed() to clean up these checks. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (rev_proc_format_.rate() == kSampleRate32kHz) { On 2015/07/24 23:50:39, andrew wrote: > && intelligbility_enabled_ Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:233: // efidata(n,:) = sqrt(b(n)) * fidata(n,:) On 2015/07/24 23:50:40, andrew wrote: > Remove if you don't want this. Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:159: float SNR(); On 2015/07/24 23:50:40, andrew wrote: > Sorry to do this, but could you please move these new features which are > modifying the output results of IE to another CL? You don't even mention these > in the CL description :) Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp... 
File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); On 2015/07/24 23:50:40, andrew wrote: > This is a big deal ;) First, you can't use the same resampler here, as it should > be operating on the same continuous stream. > > I think the right way to do this is remove the pre-RemixAndResample and let APM > handle it. In the "modern" float interface that would be no problem. > Unfortunately, this AudioFrame interface doesn't handle resampling in > DeinterleaveFrom, as you've seen. So we'd need to add support for it there. > > Another, and possibly superior, option is to remove support for IE through the > AudioFrame interface in this CL. > Done (1st option): added support in InterleaveTo in audio_buffer. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test... File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:237: bool intelligibility_enabled = false; On 2015/07/24 23:50:40, andrew wrote: > Revert changes to this file now that SetExtraOptions doesn't work. Done.
Looking pretty good. Alex, could you take another quick look? This has changed considerably since you approved. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:763: render_audio_->InterleaveTo(frame, intelligibility_enabled_); On 2015/07/29 00:37:19, ekm wrote: > On 2015/07/24 23:50:39, andrew wrote: > > I think more clear if you add the check here: > > if (intelligbility_enabled_) { > > InterleaveTo(...); > > } > > > > The bool parameter is just so the AudioFrame's VAD state can get updated, > which > > you don't care about on the reverse stream. > > Done. Added is_rev_processed() to clean up these checks. Nice. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:104: void UpmixMonoToInterleaved(const T* mono, int samples_per_channel, Go with num_frames for new code rather than samples_per_channel. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:110: interleaved_idx++; nit: could put the post-increment in the above line. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:462: data_ptr = data_.get(); Initialize data_ptr to data_.get() when you declare it and you can skip this else block. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1006: return intelligibility_enabled_ && intelligibility_enhancer_->active(); Hmm, checking active() is nice to save some complexity, but can this flip often? You'll get a probably audible discontinuity when it flips if you don't smooth this out. One solution would be to have IE flip this on before it starts processing the signal (that is, it returns the unprocessed signal for at least one chunk) and flip it off after it stops (it returns at least one unprocessed chunk). It can then internally smooth between these states. If you don't have that feature today, I'd remove that chunk and leave a TODO here. Ah, reading further I see active() always returns true for now. So leave this, but make sure you add something like the feature I described when you make it non-const. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:332: virtual int AnalyzeReverseStream(AudioFrame* frame) = 0; Since we're now supporting this interface fully, you should similarly add a ProcessReverseStream for it and mark this as deprecated. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:27: #include "webrtc/common_audio/include/audio_util.h" Alpha order. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { This is where you need smoothing as you switch states. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:151: bool active_; // Whether render gains are being updated. Make this const for now. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audioproc_float.cc:190: LayoutFromChannels(in_rev_buf->num_channels()))); Is this indentation right? Run clang-format if you haven't. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { Can you run through a bunch of codecs in voe_cmd_test with AEC on/off intelligibility on/off and make sure everything sounds OK? I'll do the same once you land this.
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:109: interleaved[interleaved_idx] = mono[i]; How about: interleaved[i * num_channels + j] = mono[i]; And drop interleaved_idx completely? https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, Should this method be in an anonymous namespace in this file instead of in audio_util? Or even don't abstract this? Just to keep things together, since the interleaved format is deprecated and will eventually disappear? I don't have a strong opinion. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { Maybe this is a good time to make 32 kHz and 48 kHz consistent and use the splitting filter in both cases? Today 48 kHz down-samples and 32 kHz splits, since the 2-band splitting filter performance is better than the down-sampler, but the 3-band splitting filter isn't. But soon the intelligibility enhancer will require the higher bands anyway. WDYT? https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { There is a if(active_) just above where you can probably merge this into.
Patchset #10 (id:200001) has been deleted
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:167: const int freqs_; // Num frequencies in frequency domain. On 2015/07/27 20:01:05, turaj wrote: > As comments eventually will be mis-aligned, maybe better to stick to the > 2-spaces rule. 'git cl format' keeps re-aligning these. I'll just stick with whatever it does. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:104: void UpmixMonoToInterleaved(const T* mono, int samples_per_channel, On 2015/07/29 03:52:27, andrew wrote: > Go with num_frames for new code rather than samples_per_channel. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:109: interleaved[interleaved_idx] = mono[i]; On 2015/07/29 22:17:10, aluebs-webrtc wrote: > How about: > > interleaved[i * num_channels + j] = mono[i]; > > And drop interleaved_idx completely? I'd like to avoid the inner multiplication, though I'm not sure of the tradeoff in practice vs. declaring the int every call. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:110: interleaved_idx++; On 2015/07/29 03:52:27, andrew wrote: > nit: could put the post-increment in the above line. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_buffer.cc:462: data_ptr = data_.get(); On 2015/07/29 03:52:27, andrew wrote: > Initialize data_ptr to data_.get() when you declare it and you can skip this > else block. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, On 2015/07/29 22:17:10, aluebs-webrtc wrote: > Should this method be in a anonymous namespace in this file instead in > audio_util? Or even don't abstract this? Just to keep things together, since the > interleaved format is deprecated and will eventually disappear? I don't have a > strong opinion. Hmm.... it seems cleaner and clearer as is for now, though I can't really weigh in on the deprecation issue. Andrew? https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { On 2015/07/29 22:17:10, aluebs-webrtc wrote: > Maybe this is a good time to make 32kHz and 48kHz consistent and use splitting > filter in both cases? Today 48Khz down-samples and 32kHz splits since the 2-band > splitting filter performance is better than the down-sampler, but the 3-band > splitting filter isn't. But soon the ineligibility enhancer will require the > higher bands anyway. WDYT? I'd prefer to save this for a later cl, since this cl is pretty monstrous as is. Also, with respect to intelligibility, this falls more under 'tuning' as we're not sure yet how necessary this is for intelligibility; we might be able to get away with just the 16k. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/audio_processing_impl.cc:1006: return intelligibility_enabled_ && intelligibility_enhancer_->active(); On 2015/07/29 03:52:27, andrew wrote: > Hmm, checking active() is nice to save some complexity, but can this flip often? > You'll get a probably audible discontinuity when it flips if you don't smooth > this out. > > One solution would be to have IE flip this on before it starts processing the > signal (that is, it returns the unprocessed signal for at least one chunk) and > flip it off after it stops (it returns at least one unprocessed chunk). It can > then internally smooth between these states. > > If you don't have that feature today, I'd remove that chunk and leave a TODO > here. > > Ah, reading further I see active() always returns true for now. So leave this, > but make sure you add something like the feature I described when you make it > non-const. Yep, we have this smoothing feature implemented, and will add it in a later cl. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:332: virtual int AnalyzeReverseStream(AudioFrame* frame) = 0; On 2015/07/29 03:52:27, andrew wrote: > Since we're now supporting this interface fully, you should similarly add a > ProcessReverseStream for it and mark this as deprecated. Done. Also added to the StreamConfig interface. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:27: #include "webrtc/common_audio/include/audio_util.h" On 2015/07/29 03:52:27, andrew wrote: > Alpha order. 
Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { On 2015/07/29 03:52:27, andrew wrote: > This is where you need smoothing as you switch states. Acknowledged. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { On 2015/07/29 22:17:10, aluebs-webrtc wrote: > There is a if(active_) just above where you can probably merge this into. Although active_ is const in this cl, in the next it will be non-const and may change between these two blocks, and the two conditions will be different, so I'd prefer to keep them separate if that's ok. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:151: bool active_; // Whether render gains are being updated. On 2015/07/29 03:52:27, andrew wrote: > Make this const for now. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audioproc_float.cc:190: LayoutFromChannels(in_rev_buf->num_channels()))); On 2015/07/29 03:52:27, andrew wrote: > Is this indentation right? Run clang-format if you haven't. It's strange, but that's what 'git cl format' gives me. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... 
webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { On 2015/07/29 03:52:28, andrew wrote: > Can you run through a bunch of codecs in voe_cmd_test with AEC on/off > intelligibility on/off and make sure everything sounds OK? I'll do the same once > you land this. Tried a bunch on loopback. AEC is effective for all with or without intelligibility enabled, except for opus. For opus, I couldn't get AEC to be effective in any case, even using AnalyzeReverseStream instead of ProcessReverseStream. I'm guessing this is a separate issue.
Thanks for uploading the merge separately; makes reviewing easier. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 03:52:28, andrew wrote: > > Can you run through a bunch of codecs in voe_cmd_test with AEC on/off > > intelligibility on/off and make sure everything sounds OK? I'll do the same > once > > you land this. > > Tried a bunch on loopback. AEC is effective for all with or without > intelligibility enabled, except for opus. For opus, I couldn't get AEC to be > effective in any case, even using AnalyzeReverseStream instead of > ProcessReverseStream. I'm guessing this is a separate issue. Hmm, OK, I'll try it once this lands. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:367: virtual int ProcessReverseStream(float* const* data, You don't need this one now... https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:376: virtual int AnalyzeReverseStream(const float* const* data, I don't think any clients are using this yet besides tests. You could remove it in this CL. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... 
webrtc/voice_engine/output_mixer.cc:554: WEBRTC_TRACE(kTraceWarning, kTraceVoice, VoEId(_instanceId, -1), Since we're here, can you make this a kTraceError?
Patchset #11 (id:230001) has been deleted
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { On 2015/07/30 03:53:16, andrew wrote: > On 2015/07/29 23:35:06, ekm wrote: > > On 2015/07/29 03:52:28, andrew wrote: > > > Can you run through a bunch of codecs in voe_cmd_test with AEC on/off > > > intelligibility on/off and make sure everything sounds OK? I'll do the same > > once > > > you land this. > > > > Tried a bunch on loopback. AEC is effective for all with or without > > intelligibility enabled, except for opus. For opus, I couldn't get AEC to be > > effective in any case, even using AnalyzeReverseStream instead of > > ProcessReverseStream. I'm guessing this is a separate issue. > > Hmm, OK, I'll try it once this lands. Acknowledged. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:367: virtual int ProcessReverseStream(float* const* data, On 2015/07/30 03:53:17, andrew wrote: > You don't need this one now... Done. Removed. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:376: virtual int AnalyzeReverseStream(const float* const* data, On 2015/07/30 03:53:17, andrew wrote: > I don't think any clients are using this yet besides tests. You could remove it > in this CL. Done. Since we're still supporting AnalyzeReverseStream for the other float interface, and that uses this, I've moved this AnalyzeReverseStream to private. 
Also, since this was only used in tests, I've updated this interface to support a distinct reverse output stream. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:554: WEBRTC_TRACE(kTraceWarning, kTraceVoice, VoEId(_instanceId, -1), On 2015/07/30 03:53:17, andrew wrote: > Since we're here, can you make this a kTraceError? Done.
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:109: interleaved[interleaved_idx] = mono[i]; On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > How about: > > > > interleaved[i * num_channels + j] = mono[i]; > > > > And drop interleaved_idx completely? > > I'd like to avoid the inner multiplication, though not of the tradeoff in > practice vs declaring the int every call. Agreed. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > Should this method be in a anonymous namespace in this file instead in > > audio_util? Or even don't abstract this? Just to keep things together, since > the > > interleaved format is deprecated and will eventually disappear? I don't have a > > strong opinion. > > Hmm.... it seems cleaner and clearer as is for now, though I can't really weigh > in on the deprecation issue. Andrew? As I said, I don't have a strong opinion and I agree it is cleaner. I was just suggesting to avoid spreading this interface code. If you find it clearer, leave it as is. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > Maybe this is a good time to make 32kHz and 48kHz consistent and use splitting > > filter in both cases? Today 48Khz down-samples and 32kHz splits since the > 2-band > > splitting filter performance is better than the down-sampler, but the 3-band > > splitting filter isn't. But soon the ineligibility enhancer will require the > > higher bands anyway. WDYT? > > I'd prefer to save this for a later cl, since this cl is pretty monstrous as is. > Also, with respect to intelligibility, this falls more under 'tuning' as we're > not sure yet how necessary this is for intelligibility; we might be able to get > away with just the 16k. Leaving it for another CL sounds reasonable. But I think sooner or later the IE will need this. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > There is a if(active_) just above where you can probably merge this into. > > Although active_ is const in this cl, in the next it will be non-const and may > change between these two blocks, and the two conditions will be different, so > I'd prefer to keep them separate if that's ok. Agreed. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/include/audio_processing.h:492: int num_channels = 1, I think this zero-initialization here is on purpose to force the user to set reasonable values. Actually not setting them is a way of signalling that that stream is not necessary, for example when a reverse stream is not available. Why do you need to have these default values?
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.h:131: int AnalyzeReverseStream(const float* const* src, Add a TODO for removing this. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:352: // Same as |AnalyzeReverseStream|, but may modify |data| if intelligibility modify |frame| https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, I'm not sure we want to do this. The way you have it now makes the interface inconsistent: the user can specify an output config, but they'll only get it if intelligibility is enabled. We either need to always support converting the stream, or remove the output config. I'm a bit divided on this. Alex what do you think? https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:492: int num_channels = 1, On 2015/07/30 15:28:07, aluebs-webrtc wrote: > I think this zero-initialization here is on purpose to force the user to set > reasonable values. Actually not setting them is a way of signalling that that > stream is not necessary, for example when a reverse stream is not available. Why > do you need to have these default values? Agreed.
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 18:48:53, andrew wrote: > I'm not sure we want to do this. The way you have it now makes the interface > inconsistent: the user can specify an output config, but they'll only get it if > intelligibility is enabled. > > We either need to always support converting the stream, or remove the output > config. I'm a bit divided on this. Alex what do you think? You bring a good point here. I think it makes sense to always convert (except when they are the same of course), so that the reverse stream is similar to the capture stream. Then enabling IE (or any other future component) doesn't change the behavior of the APM.
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, On 2015/07/30 15:28:07, aluebs-webrtc wrote: > On 2015/07/29 23:35:06, ekm wrote: > > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > > Should this method be in a anonymous namespace in this file instead in > > > audio_util? Or even don't abstract this? Just to keep things together, since > > the > > > interleaved format is deprecated and will eventually disappear? I don't have > a > > > strong opinion. > > > > Hmm.... it seems cleaner and clearer as is for now, though I can't really > weigh > > in on the deprecation issue. Andrew? > > As I said, I don't have a strong opinion and I agree it is cleaner. I was just > suggesting to avoid spreading this interface code. If you find it clearer, leave > it as is. Acknowledged. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { On 2015/07/30 15:28:07, aluebs-webrtc wrote: > On 2015/07/29 23:35:06, ekm wrote: > > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > > Maybe this is a good time to make 32kHz and 48kHz consistent and use > splitting > > > filter in both cases? Today 48Khz down-samples and 32kHz splits since the > > 2-band > > > splitting filter performance is better than the down-sampler, but the 3-band > > > splitting filter isn't. But soon the ineligibility enhancer will require the > > > higher bands anyway. WDYT? 
> > > > I'd prefer to save this for a later cl, since this cl is pretty monstrous as > is. > > Also, with respect to intelligibility, this falls more under 'tuning' as we're > > not sure yet how necessary this is for intelligibility; we might be able to > get > > away with just the 16k. > > Leaving it for another CL sounds reasonable. But I think sooner or later the IE > will need this. Yes. I'm interested to hear what the difference sounds like. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 18:48:53, andrew wrote: > I'm not sure we want to do this. The way you have it now makes the interface > inconsistent: the user can specify an output config, but they'll only get it if > intelligibility is enabled. > > We either need to always support converting the stream, or remove the output > config. I'm a bit divided on this. Alex what do you think? Are we talking about converting reverse to capture, or reverse output to reverse input? Since before IE apm left reverse input as is, it makes sense to me to convert reverse out to reverse in. In that case, I can revert the expansion of ProcessingConfig, etc. What about adding back the AnalyzeReverseStream that only takes the input stream+config? Is that too much interface?
Patchset #12 (id:270001) has been deleted
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.h:131: int AnalyzeReverseStream(const float* const* src, On 2015/07/30 18:48:53, andrew wrote: > Add a TODO for removing this. Done. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:352: // Same as |AnalyzeReverseStream|, but may modify |data| if intelligibility On 2015/07/30 18:48:53, andrew wrote: > modify |frame| Done. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:492: int num_channels = 1, On 2015/07/30 15:28:07, aluebs-webrtc wrote: > I think this zero-initialization here is on purpose to force the user to set > reasonable values. Actually not setting them is a way of signalling that that > stream is not necessary, for example when a reverse stream is not available. Why > do you need to have these default values? That makes sense. I thought it'd be a nice way to get around the asserts in audio_buffer, but I see it defeats the purpose. I re-zeroed these and added a new conditional at the beginning of InitializeLocked() instead.
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 21:23:50, ekm wrote: > On 2015/07/30 18:48:53, andrew wrote: > > I'm not sure we want to do this. The way you have it now makes the interface > > inconsistent: the user can specify an output config, but they'll only get it > if > > intelligibility is enabled. > > > > We either need to always support converting the stream, or remove the output > > config. I'm a bit divided on this. Alex what do you think? > > Are we talking about converting reverse to capture, or reverse output to reverse > input? Since before IE apm left reverse input as is, it makes sense to me to > convert reverse out to reverse in. In that case, I can revert the expansion of > ProcessingConfig, etc. > > What about adding back the AnalyzeReverseStream that only takes the input > stream+config? Is that too much interface? I would prefer to avoid adding yet another interface. So the input would be converted to the output format, as it happens in the capture stream. Andrew?
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 23:09:51, aluebs-webrtc wrote: > On 2015/07/30 21:23:50, ekm wrote: > > On 2015/07/30 18:48:53, andrew wrote: > > > I'm not sure we want to do this. The way you have it now makes the interface > > > inconsistent: the user can specify an output config, but they'll only get it > > if > > > intelligibility is enabled. > > > > > > We either need to always support converting the stream, or remove the output > > > config. I'm a bit divided on this. Alex what do you think? > > > > Are we talking about converting reverse to capture, or reverse output to > reverse > > input? Since before IE apm left reverse input as is, it makes sense to me to > > convert reverse out to reverse in. In that case, I can revert the expansion of > > ProcessingConfig, etc. > > > > What about adding back the AnalyzeReverseStream that only takes the input > > stream+config? Is that too much interface? > > I would prefer to avoid adding yet another interface. So the input would be > converted to the output format, as it happens in the capture stream. Andrew? Ok. Why are the two configs in the ProcessStream interface necessary?
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 23:20:17, ekm wrote: > On 2015/07/30 23:09:51, aluebs-webrtc wrote: > > On 2015/07/30 21:23:50, ekm wrote: > > > On 2015/07/30 18:48:53, andrew wrote: > > > > I'm not sure we want to do this. The way you have it now makes the > interface > > > > inconsistent: the user can specify an output config, but they'll only get > it > > > if > > > > intelligibility is enabled. > > > > > > > > We either need to always support converting the stream, or remove the > output > > > > config. I'm a bit divided on this. Alex what do you think? > > > > > > Are we talking about converting reverse to capture, or reverse output to > > reverse > > > input? Since before IE apm left reverse input as is, it makes sense to me to > > > convert reverse out to reverse in. In that case, I can revert the expansion > of > > > ProcessingConfig, etc. > > > > > > What about adding back the AnalyzeReverseStream that only takes the input > > > stream+config? Is that too much interface? > > > > I would prefer to avoid adding yet another interface. So the input would be > > converted to the output format, as it happens in the capture stream. Andrew? > > Ok. Why are the two configs in the ProcessStream interface necessary? So that the APM knows what the user expects as output and does the right thing.
Unfortunately, you're going to need to add tests verifying the reverse conversions. https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:726: if (is_rev_processed()) { So, we need to change this part. This is what I meant about being consistent: if the user requests a different output config, you should provide that regardless of processing. However, if we have no processing and the user requests a different output config, we need to convert from src not render_audio_. Otherwise, say the user passes in 48 kHz and they request 44.1 kHz, they'll actually get 16 kHz audio upsampled to 44.1. Perhaps my olde AudioConverter can finally see the light of day? if (is_rev_processed() { ... } else { render_converter_->Convert(src, ..., dest); } where you only create render_converter_ in InitializeLocked() when appropriate. Alex, what do you think? We could back out the output_config addition, but then we'd still then need at least a copy from src to dest here, so it doesn't help a lot. https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:773: if (is_rev_processed()) { You can leave this one, because the user has no way to specify an output config.
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:726: if (is_rev_processed()) { On 2015/07/31 01:22:42, andrew wrote: > So, we need to change this part. This is what I meant about being consistent: if > the user requests a different output config, you should provide that regardless > of processing. > > However, if we have no processing and the user requests a different output > config, we need to convert from src not render_audio_. Otherwise, say the user > passes in 48 kHz and they request 44.1 kHz, they'll actually get 16 kHz audio > upsampled to 44.1. > > Perhaps my olde AudioConverter can finally see the light of day? > > if (is_rev_processed() { > ... > } else { > render_converter_->Convert(src, ..., dest); > } > > where you only create render_converter_ in InitializeLocked() when appropriate. > > Alex, what do you think? We could back out the output_config addition, but then > we'd still then need at least a copy from src to dest here, so it doesn't help a > lot. I like your proposal. I don't think that dropping the out config helps, so I would keep it, but I don't have a strong opinion.
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:726: if (is_rev_processed()) { On 2015/07/31 16:51:03, aluebs-webrtc wrote: > On 2015/07/31 01:22:42, andrew wrote: > > So, we need to change this part. This is what I meant about being consistent: > if > > the user requests a different output config, you should provide that > regardless > > of processing. > > > > However, if we have no processing and the user requests a different output > > config, we need to convert from src not render_audio_. Otherwise, say the user > > passes in 48 kHz and they request 44.1 kHz, they'll actually get 16 kHz audio > > upsampled to 44.1. > > > > Perhaps my olde AudioConverter can finally see the light of day? > > > > if (is_rev_processed() { > > ... > > } else { > > render_converter_->Convert(src, ..., dest); > > } > > > > where you only create render_converter_ in InitializeLocked() when > appropriate. > > > > Alex, what do you think? We could back out the output_config addition, but > then > > we'd still then need at least a copy from src to dest here, so it doesn't help > a > > lot. > > I like your proposal. I don't think that dropping the out config helps, so I > would keep it, but I don't have a strong opinion. I was intrigued and added the AudioConverter method, but kept interface the same for now. https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:773: if (is_rev_processed()) { On 2015/07/31 01:22:42, andrew wrote: > You can leave this one, because the user has no way to specify an output config. Acknowledged. https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... 
File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == Added these constraints because CommonFormats/AudioProcessingTest tests 2 input 0 output channels case. Maybe better to just declare this case invalid in audio_processing.h as in ProcessStream case, and fix the test.
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:10: Re-add blank line. https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == On 2015/08/01 04:21:37, ekm wrote: > Added these constraints because CommonFormats/AudioProcessingTest tests 2 input > 0 output channels case. Maybe better to just declare this case invalid in > audio_processing.h as in ProcessStream case, and fix the test. I think 0 channels just means it doesn't need any output, so you are doing the right thing. Andrew?
Patchset #14 (id:330001) has been deleted
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:10: On 2015/08/03 16:30:49, aluebs-webrtc wrote: > Re-add blank line. Done.
Had a first look at the new stuff. I still think we need to unit test the render side conversion. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:72: void CopyAudio(const T* const* src, Perhaps call this CopyAudioIfNeeded() and check if the ptrs point to different memory. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:73: int samples_per_channel, num_frames is the new way. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:80: dest_channel[j] = src_channel[j]; Do a memcpy (or std::copy) instead of individual assignment, so if (src[i] != dest[i]) { std::copy(src[i], src[i] + num_frames, dest[i]); } https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:743: CopyAudio<float>(src, reverse_input_config.num_frames(), You shouldn't need the explicit float; can be deduced from parameter types.
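The helper discussed above, renamed CopyAudioIfNeeded() and skipping channels whose pointers already match, could look roughly like this (a sketch following the review discussion; the exact landed signature may differ):

```cpp
#include <algorithm>

// Copies each channel from |src| to |dest| only when the two channel
// pointers differ, using std::copy per Andrew's suggestion above.
// |num_frames| is the number of samples per channel.
template <typename T>
void CopyAudioIfNeeded(const T* const* src,
                       int num_frames,
                       int num_channels,
                       T* const* dest) {
  for (int i = 0; i < num_channels; ++i) {
    if (src[i] != dest[i]) {
      std::copy(src[i], src[i] + num_frames, dest[i]);
    }
  }
}
```

Because the per-channel pointer comparison happens inside the helper, callers can pass the same buffers for input and output and the copy becomes a no-op.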
https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == On 2015/08/03 16:30:50, aluebs-webrtc wrote: > On 2015/08/01 04:21:37, ekm wrote: > > Added these constraints because CommonFormats/AudioProcessingTest tests 2 > input > > 0 output channels case. Maybe better to just declare this case invalid in > > audio_processing.h as in ProcessStream case, and fix the test. > > I think 0 channels just means it doesn't need any output, so you are doing the > right thing. Andrew? If we want to be consistent, I think the test is invalid. If you don't want any conversion, you should supply the same output as input config.
Patchset #15 (id:370001) has been deleted
Patchset #15 (id:390001) has been deleted
Extended (AudioProcessingTest, Formats) to include render stream conversion test. https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == On 2015/08/05 06:22:39, andrew wrote: > On 2015/08/03 16:30:50, aluebs-webrtc wrote: > > On 2015/08/01 04:21:37, ekm wrote: > > > Added these constraints because CommonFormats/AudioProcessingTest tests 2 > > input > > > 0 output channels case. Maybe better to just declare this case invalid in > > > audio_processing.h as in ProcessStream case, and fix the test. > > > > I think 0 channels just means it doesn't need any output, so you are doing the > > right thing. Andrew? > > If we want to be consistent, I think the test is invalid. If you don't want any > conversion, you should supply the same output as input config. Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:72: void CopyAudio(const T* const* src, On 2015/08/05 06:18:18, andrew wrote: > Perhaps call this CopyAudioIfNeeded() and check if the ptrs point to different > memory. Nice. Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:73: int samples_per_channel, On 2015/08/05 06:18:17, andrew wrote: > num_frames is the new way. Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... 
webrtc/common_audio/include/audio_util.h:80: dest_channel[j] = src_channel[j]; On 2015/08/05 06:18:18, andrew wrote: > Do a memcpy (or std::copy) instead of individual assignment, so > > if (src[i] != dest[i]) { > std::copy(src[i], src[i] + num_frames, dest[i]); > } > Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:743: CopyAudio<float>(src, reverse_input_config.num_frames(), On 2015/08/05 06:18:18, andrew wrote: > You shouldn't need the explicit float; can be deduced from parameter types. Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { The following 124 lines should be indented, but codereview couldn't figure out how to match up the diffs. So to make reviewing easier, I left unindented for now.
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:70: // allocated in the |dest| channels. Comment something about that there is no copy done when they are the same pointer. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:65: enum StreamDirection { kFwd = 0, kRev }; How about spelling these out: kForward and kReverse? https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2488: {2, 2, 2, 2}, If this is supposed to be an exhaustive list, you are missing the cases {1, 1, 2, 2}, {2, 1, 2, 2}, {2, 2, 2, 1}, right? But I am not sure if it should be exhaustive, since it is already a slow test. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { On 2015/08/07 06:06:00, ekm wrote: > The following 124 lines should be indented, but codereview couldn't figure out > how to match up the diffs. So to make reviewing easier, I left unindented for > now. Thanks :) Just remember to change it before landing. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2512: } I think it is clearer to declare and define each variable in the same line as: int in_rate = file_direction == kFwd ? input_rate_ : reverse_input_rate_; But I don't have a strong opinion. 
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2732: std::tr1::make_tuple(16000, 16000, 16000, 16000, 0, 0))); Here I have the same question about making them exhaustive. Andrew?
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:70: // allocated in the |dest| channels. Add a comment here about what "needed" means. "Skips the copy when a |src| and |dest| channel point to the same address." https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:740: render_converter_->Convert(src, reverse_input_config.num_frames() * Perhaps add a helper to StreamConfig? int num_samples() const { return num_channels_ * num_frames_; } https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2449: ChannelBuffer<float>* cb_write = file_direction ? &rev_out_cb : &out_cb; nit: could make this a const reference. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { On 2015/08/07 06:06:00, ekm wrote: > The following 124 lines should be indented, but codereview couldn't figure out > how to match up the diffs. So to make reviewing easier, I left unindented for > now. Nice, thanks. This looks pretty good, but I see one issue: you're doubling the number of test cases and only verifying forward or reverse on any particular case, right? Would it be possible to verify both at once without doubling the test duration? 
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2595: // of reference vs output. Move to previous line.
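The StreamConfig::num_samples() helper suggested above could be sketched as follows (a simplified stand-in: the field names and the 10 ms frame assumption mirror APM conventions but are not the exact class):

```cpp
// Simplified sketch of APM's StreamConfig with the proposed helper.
// APM processes audio in 10 ms chunks, so frames per chunk is rate / 100.
struct StreamConfig {
  int sample_rate_hz;
  int num_channels;
  int num_frames() const { return sample_rate_hz / 100; }
  // Proposed helper: total samples across all channels in one chunk.
  int num_samples() const { return num_channels * num_frames(); }
};
```

With this, the Convert() call site can pass config.num_samples() instead of multiplying frames by channels inline.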
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2488: {2, 2, 2, 2}, On 2015/08/10 19:14:18, aluebs-webrtc wrote: > If this is supposed to be an exhaustive list, you are missing the cases {1, 1, > 2, 2}, {2, 1, 2, 2}, {2, 2, 2, 1}, right? But I am not sure if it should be > exhaustive, since it is already a slow test. Right, I don't think there's a lot of value in making it exhaustive, since the two paths are fairly independent for the purposes of this test. He has all the permitted reverse combinations {1, 1}, {2, 1}, {2, 2}, so I think we're good. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2732: std::tr1::make_tuple(16000, 16000, 16000, 16000, 0, 0))); On 2015/08/10 19:14:18, aluebs-webrtc wrote: > Here I have the same question about making them exhaustive. Andrew? Right, it's not supposed to be exhaustive. It looks like what you have covers the interesting cases.
Patchset #16 (id:430001) has been deleted
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:70: // allocated in the |dest| channels. On 2015/08/10 19:14:18, aluebs-webrtc wrote: > Comment something about that there is no copy done when they are the same > pointer. Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:740: render_converter_->Convert(src, reverse_input_config.num_frames() * On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > Perhaps add a helper to StreamConfig? > int num_samples() const { return num_channels_ * num_frames_; } Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:65: enum StreamDirection { kFwd = 0, kRev }; On 2015/08/10 19:14:18, aluebs-webrtc wrote: > How about spelling these out: kForward and kReverse? Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2449: ChannelBuffer<float>* cb_write = file_direction ? &rev_out_cb : &out_cb; On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > nit: could make this a const reference. Removed. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2488: {2, 2, 2, 2}, On 2015/08/10 19:30:51, andrew_ooo_to_aug14 wrote: > On 2015/08/10 19:14:18, aluebs-webrtc wrote: > > If this is supposed to be an exhaustive list, you are missing the cases {1, 1, > > 2, 2}, {2, 1, 2, 2}, {2, 2, 2, 1}, right? But I am not sure if it should be > > exhaustive, since it is already a slow test. > > Right, I don't think there's a lot of value in making it exhaustive, since the > two paths are fairly independent for the purposes of this test. He has all the > permitted reverse combinations {1, 1}, {2, 1}, {2, 2}, so I think we're good. Acknowledged. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > On 2015/08/07 06:06:00, ekm wrote: > > The following 124 lines should be indented, but codereview couldn't figure out > > how to match up the diffs. So to make reviewing easier, I left unindented for > > now. > > Nice, thanks. > > This looks pretty good, but I see one issue: you're doubling the number of test > cases and only verifying forward or reverse on any particular case, right? Would > it be possible to verify both at once without doubling the test duration? Done. Yep, switched up ProcessFormat to write both files at once, and moved the first for loop here after ProcessFormat so the processing is done only once for each case. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2512: } On 2015/08/10 19:14:18, aluebs-webrtc wrote: > I think it is clearer to declare and define each variable in the same line as: > int in_rate = file_direction == kFwd ? input_rate_ : reverse_input_rate_; > But I don't have a strong opinion. Done. 
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2595: // of reference vs output. On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > Move to previous line. Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2732: std::tr1::make_tuple(16000, 16000, 16000, 16000, 0, 0))); On 2015/08/10 19:30:51, andrew_ooo_to_aug14 wrote: > On 2015/08/10 19:14:18, aluebs-webrtc wrote: > > Here I have the same question about making them exhaustive. Andrew? > > Right, it's not supposed to be exhaustive. It looks like what you have covers > the interesting cases. Acknowledged.
lgtm
Sorry for the long delay. I also just browsed the CL. I think you got plenty of feedback from the other reviewers. I have only one concern, and that is the "pointer to a const pointer to a float" and similar. I have a hard time wrapping my head around those. Is it possible to simplify somewhat? Const references instead? Anyway, when the others approve, you have my rubber-stamp lgtm. https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:751: const float* const* src, So, this is a "pointer to a const pointer to a float"? The mind boggles... Any chance this can be simplified with a reference?
lgtm. Nice work Elliot. https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2510: int in_rate = file_direction ? reverse_input_rate_ : input_rate_; You can make these const now. If it makes things overall more readable, consider extracting the contents of this loop to a function VerifyProcessedAudio or something. Up to you.
https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:751: const float* const* src, On 2015/08/12 19:23:58, hlundin-webrtc wrote: > So, this is a "pointer to a const pointer to a float"? The mind boggles... pointer to a const pointer to a const float :) > > Any chance this can be simplified with a reference? No, because both pointees are arrays which should be dereferenced. We could use const float* const src[], but not sure that makes anything more clear. If we were going to change this, we should probably consider taking a vector<vector<float>> instead. The current approach is more flexible, and is used elsewhere in the codebase (see e.g. common_audio/include/audio_util.h) https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:754: const float* const* dest) { Ah, I missed this. This function doesn't need the dest parameter, right? (The type is wrong anyway.)
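To illustrate the type being discussed: a hypothetical read-only helper (ChannelSum is invented for this example, not APM code) takes `const float* const*` so that it can modify neither the array of channel pointers nor the samples they point to:

```cpp
#include <cstddef>

// |channels| points to num_channels arrays of num_frames samples each.
// The double const guarantees this function is purely a reader: neither
// the channel pointers nor the sample data can be written through it.
float ChannelSum(const float* const* channels,
                 size_t num_channels,
                 size_t num_frames) {
  float sum = 0.0f;
  for (size_t c = 0; c < num_channels; ++c) {
    for (size_t i = 0; i < num_frames; ++i) {
      sum += channels[c][i];
    }
  }
  return sum;
}
```

A `float**` or `const float**` argument converts implicitly to this parameter type, which is why the deinterleaved-audio APIs in common_audio can use it without forcing casts on callers.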
On 2015/08/12 20:06:25, andrew_ooo_to_aug14 wrote: > https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... > File webrtc/modules/audio_processing/audio_processing_impl.cc (right): > > https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... > webrtc/modules/audio_processing/audio_processing_impl.cc:751: const float* > const* src, > On 2015/08/12 19:23:58, hlundin-webrtc wrote: > > So, this is a "pointer to a const pointer to a float"? The mind boggles... > > pointer to a const pointer to a const float :) > > > > Any chance this can be simplified with a reference? > > No, because both pointees are arrays which should be dereferenced. Fair enough. Just wanted to ask. > > We could use const float* const src[], but not sure that makes anything more > clear. > > If we were going to change this, we should probably consider taking a > vector<vector<float>> instead. The current approach is more flexible, and is > used elsewhere in the codebase (see e.g. common_audio/include/audio_util.h) > > https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... > webrtc/modules/audio_processing/audio_processing_impl.cc:754: const float* > const* dest) { > Ah, I missed this. This function doesn't need the dest parameter, right? (The > type is wrong anyway.)
https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:754: const float* const* dest) { On 2015/08/12 20:06:25, andrew_ooo_to_aug14 wrote: > Ah, I missed this. This function doesn't need the dest parameter, right? (The type is wrong anyway.) Done. https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2510: int in_rate = file_direction ? reverse_input_rate_ : input_rate_; On 2015/08/12 19:47:45, andrew_ooo_to_aug14 wrote: > You can make these const now. > > If it makes things overall more readable, consider extracting the contents of this loop to a function VerifyProcessedAudio or something. Up to you. Done. Decided against extracting to a function; added a comment instead. I think readability is fine, and it's nicer to access members of AudioProcessingTest as is. Extracting a function may become the cleaner option once we switch to StreamConfigs and ProcessingConfigs throughout. https://codereview.webrtc.org/1234463003/diff/470001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/470001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2522: const int min_ref_rate = std::min(in_rate, out_rate); Changed above to const. The only change below is indenting. Not sure why it can't figure it out...
The CQ bit was checked by ekmeyerson@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrika@webrtc.org, henrik.lundin@webrtc.org, andrew@webrtc.org, aluebs@webrtc.org Link to the patchset: https://codereview.webrtc.org/1234463003/#ps470001 (title: "Addressed comments; format")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1234463003/470001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1234463003/470001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: presubmit on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/presubmit/builds/346)
ekmeyerson@webrtc.org changed reviewers: + pthatcher@webrtc.org
Adding pthatcher as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h. Hi Peter, just a small interface change affecting you here.
ekmeyerson@webrtc.org changed reviewers: + pbos@webrtc.org
+pbos as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h. If either of you could take a look, that'd be great.
On 2015/08/13 18:49:58, ekm wrote: > +pbos as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h > > If either of you could take a look that'd be great. fyi: pbos is OOO for another week.
On 2015/08/14 09:34:05, hlundin-webrtc wrote: > On 2015/08/13 18:49:58, ekm wrote: > > +pbos as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h > > > > If either of you could take a look that'd be great. > > fyi: pbos is OOO for another week. Just TBR this to him and land it. The talk/ change is trivial.
The CQ bit was checked by ekmeyerson@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, andrew@webrtc.org, aluebs@webrtc.org, henrika@webrtc.org Link to the patchset: https://codereview.webrtc.org/1234463003/#ps530001 (title: "Fix Mac Error (3)")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1234463003/530001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1234463003/530001
Message was sent while issue was closed.
Committed patchset #20 (id:530001)
Message was sent while issue was closed.
Patchset 20 (id:??) landed as https://crrev.com/60d9b332a5391045439bfb6a3a5447973e3d5603 Cr-Commit-Position: refs/heads/master@{#9713}
Message was sent while issue was closed.
On 2015/08/14 17:36:11, commit-bot: I haz the power wrote: > Patchset 20 (id:??) landed as > https://crrev.com/60d9b332a5391045439bfb6a3a5447973e3d5603 > Cr-Commit-Position: refs/heads/master@{#9713} talk/ rs-lgtm
Message was sent while issue was closed.
henrika@webrtc.org changed reviewers: + henrika@webrtc.org
Message was sent while issue was closed.
Posting old comment. https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); Are we adding complexity for all cases here?
Message was sent while issue was closed.
https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); On 2015/09/25 10:44:53, henrika_webrtc wrote: > Are we adding complexity for all cases here? Yes, this was a problem, but this is an old patchset and no longer relevant. |