Created: 5 years, 5 months ago by ekm
Modified: 5 years, 2 months ago
Reviewers: henrika_webrtc, Andrew MacDonald, hlundin-webrtc, pthatcher1, turaj, pbos-webrtc, aluebs-webrtc
CC: webrtc-reviews_webrtc.org, interface-changes_webrtc.org, rwolff_gocast.it, yujie_mao (webrtc), tterriberry_mozilla.com, qiang.lu, niklas.enbom, kwiberg-webrtc, aluebs-webrtc, bjornv1
Base URL: https://chromium.googlesource.com/external/webrtc.git@master
Target Ref: refs/pending/heads/master
Project: webrtc
Visibility: Public.
Description: Integrate Intelligibility with APM
- Integrates intelligibility into audio_processing.
- Allows modification of reverse stream if intelligibility enabled.
- Makes intelligibility available in audioproc_float test.
- Adds reverse stream processing to audioproc_float.
- (removed) Makes intelligibility toggleable in real time in voe_cmd_test.
- Cleans up intelligibility construction, parameters, constants and dead code.
TBR=pbos@webrtc.org
Committed: https://crrev.com/60d9b332a5391045439bfb6a3a5447973e3d5603
Cr-Commit-Position: refs/heads/master@{#9713}
Patch Set 1 #
Total comments: 48
Patch Set 2 : Updated interface, how VAD is used, other issues #
Total comments: 36
Patch Set 3 : Addr. comments from aluebs (incl. made ProcessReverseStream nicer) #
Total comments: 13
Patch Set 4 : Addr. comments from aluebs (incl. support diff num fwd/rev channels) #
Total comments: 2
Patch Set 5 : Addressed comments #
Total comments: 2
Patch Set 6 : Fixed memcpy #
Total comments: 25
Patch Set 7 : Added resampling support to InterleaveTo; removed VAD logic #
Total comments: 36
Patch Set 8 : merge #
Patch Set 9 : git cl format #
Patch Set 10 : Addressed comments from Patch Set 7 #
Total comments: 6
Patch Set 11 : Added support for distinct reverse input/output streams. #
Total comments: 13
Patch Set 12 : Reverted ProcessingConfig defaults #
Total comments: 7
Patch Set 13 : Added rev conversion for unprocessed case #
Total comments: 4
Patch Set 14 : Copy on no conversion or processing #
Total comments: 8
Patch Set 15 : Added render stream conversion unittest #
Total comments: 23
Patch Set 16 : Updated render stream conversion unittest #
Total comments: 6
Patch Set 17 : Addressed comments; format #
Total comments: 1
Patch Set 18 : Include <initializer_list> for Mac error #
Patch Set 19 : Fix Mac error (2) #
Patch Set 20 : Fix Mac Error (3) #
Messages
Total messages: 80 (17 generated)
ekmeyerson@webrtc.org changed reviewers: + aluebs@webrtc.org, andrew@webrtc.org, henrik.lundin@webrtc.org, turaj@webrtc.org
I don't know any background here and don't know if I am the best person to review. Any documentation or link to bug/issue?
voice_engine parts: LGTM
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/audio_processing_impl.cc:614: agc_manager_.get() ? agc_manager_->voice_probability() : 0.f;
Shouldn't we turn the activity detector part of AGC on when intelligibility is enabled? Otherwise we will process all frames as inactive.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (intelligibility_enabled_) {
I suppose we want to do all the modifications to the playout signal before echo cancellation/echo control; otherwise they have to compensate for the changes the enhancer makes to the playout signal.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:157: void IntelligibilityEnhancer::ProcessRenderAudio(float* const* audio,
I did not comprehend the logic here. |voice_probability| specifies whether the current chunk is active (see below for more explanation of voice probability). Therefore, it has to be used as an indicator of whether the current chunk should be considered in the variance computation or not. If we have |voice_probability|, we should not need the WebRTC VAD decision.

Furthermore, why is the call to |render_mangler_| based on |voice_probability|? I don't think we should stop applying enhancement gains as soon as a chunk has a low activity probability. I think we should keep on applying the enhancer, but discard low-activity chunks from the variance computation.

We need yet another threshold which specifies whether enhancement should be applied at all. So I think the flow of the program is:
1) Update the variance of the to-be-rendered signal on active chunks.
2) Update the variance of the captured signal on inactive chunks.
3) Decide, based on the current state of the enhancer (on or off) and the current signal-to-noise ratio, whether the enhancer should be on or off.

Obviously steps 1) and 2) are done through different API calls.

I like the idea of having two ProcessRenderAudio() overloads, but I thought the caller is supposed to call ProcessRenderAudio(float* const* audio) if they do not have any information regarding the activity of the given audio chunk; then we call the WebRTC VAD to get an activity flag. On the other hand, if the caller has info regarding the activity, they call ProcessRenderAudio(float* const* audio, float voice_probability), and the given voice_probability is compared with a threshold to decide whether the given frame should be included in the variance computation.

Same thing for ProcessCaptureAudio().

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:160: vad_tmp_buffer_[i] = (int16_t)audio[0][i];
You had better check with the APM guys, but I suppose you get audio in the range -1 to 1; therefore, a cast to int16_t is not a good idea.
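A minimal sketch of the three-step flow proposed above: active render chunks feed the speech estimate, inactive capture chunks feed the noise estimate, and a separate SNR test gates whether the gains apply at all. `EnhancerState`, the thresholds, and the smoothing constants are all illustrative stand-ins, not the real IntelligibilityEnhancer API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative control flow for VAD-gated variance tracking.
// Names and constants are hypothetical.
struct EnhancerState {
  double speech_var = 0.0;
  double noise_var = 1e-6;  // Avoid division by zero before any update.
  bool enabled = false;

  static constexpr float kVadThresh = 0.5f;
  static constexpr double kSnrThresh = 2.0;

  static double ChunkPower(const std::vector<float>& chunk) {
    double power = 0.0;
    for (float s : chunk) power += static_cast<double>(s) * s;
    return chunk.empty() ? 0.0 : power / chunk.size();
  }

  // Step 1: active render (to-be-played) chunks feed the speech estimate.
  void AnalyzeRender(const std::vector<float>& chunk, float voice_prob) {
    if (voice_prob >= kVadThresh)
      speech_var = 0.9 * speech_var + 0.1 * ChunkPower(chunk);
    UpdateDecision();
  }

  // Step 2: inactive capture chunks feed the noise estimate.
  void AnalyzeCapture(const std::vector<float>& chunk, float voice_prob) {
    if (voice_prob < kVadThresh)
      noise_var = 0.9 * noise_var + 0.1 * ChunkPower(chunk);
    UpdateDecision();
  }

  // Step 3: enable enhancement only when the render speech is weak
  // relative to the capture noise (low SNR at the listener's ear).
  void UpdateDecision() { enabled = speech_var / noise_var < kSnrThresh; }
};
```

Real code tracks per-frequency variances rather than a scalar chunk power, but the gating logic is the same.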
This will probably break when rolled into Chrome because the const-ness needs to be fixed in media_stream_audio_processor. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:614: agc_manager_.get() ? agc_manager_->voice_probability() : 0.f; On 2015/07/14 18:28:51, turaj wrote: > Shouldn't we turn activity detector part of AGC on when intelligibility is > enabled? Otherwise we will process all frames as inactive. Also, to avoid being out of sync, this should probably run after agc_manager, right? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:766: if (intelligibility_enabled_) { You don't need this if statement, because InterleaveTo gets intelligibility_enabled_ and only copies when it is true. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); Maybe not for this CL, but at some point this needs to do something with the higher bands, probably something simpler in the time domain. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1038: if (!intelligibility_enhancer_) { We probably want to reset the intelligibility_enhancer_ every time, since the split_rate_ or fwd_proc_format_.num_channels() could have changed, right? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); Shouldn't this be fwd_proc_format_? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:160: vad_tmp_buffer_[i] = (int16_t)audio[0][i];
On 2015/07/14 18:28:51, turaj wrote:
> You better check with APM guys, but I suppose you get audio in range -1 to 1,
> therefore, a cast to int16_t is not a good idea.
No, you get audio in the int16_t range. But for this there are beautiful tools under audio_util. Also, if you need the audio in both int16_t and float, you can receive an IFChannelBuffer and let it take care of this? Not sure if it makes it easier or not.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right):

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability);
Does it actually process the capture audio or does it only analyze it? Maybe we need a naming change.

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability);
In all of these methods you assume the sample rate and number of channels did not change since the constructor. Maybe you want to have them as parameters and assert on that?

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:73: void ProcessCaptureAudio(float* const* audio); // Assumes noise.
Do we want to surface both interfaces to the user?

https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:78: void ProcessRenderAudio(float* const* audio); // Assumes speech. Do we want to surface both interfaces to the user? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc:95: config_.sample_rate_hz = kSampleRate; Shouldn't the step_type be set here as well? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:1: /* How about changing the audioproc (process_test) as well? Or are we deprecating that one, now that support for the reverse stream is added here? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:47: DEFINE_bool(ie, false, "Enable intelligibility."); "Enable intelligibility enhancer."? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:111: bool process_reverse = false; bool process_reverse = !FLAGS_i_rev.empty(); https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:425: printf("%i. Toggle intelligibility\n", option_index++); Toggle intelligibility enhancer? https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... 
webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:514: base1->audio_processing()->SetExtraOptions(config);
Now that there are two calls to SetExtraOptions, you need to set both configs each time so as not to overwrite the other feature with the default value.
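The hazard described in this comment can be shown with a toy stand-in for the config type (the struct and field names below are hypothetical, not the real webrtc::Config): because SetExtraOptions replaces the whole option set, a call site that passes only its own flag silently resets the other feature to its default.

```cpp
#include <cassert>

// Simplified stand-in for an "extra options" bundle. Field names are
// illustrative; the real webrtc::Config stores typed options generically.
struct Config {
  bool delay_agnostic_aec = false;  // default
  bool intelligibility = false;     // default
};

struct Apm {
  Config extra;
  // Replaces the entire option set, exactly like the hazard above:
  // any flag the caller omits falls back to its default.
  void SetExtraOptions(const Config& c) { extra = c; }
};
```

The fix the review asks for is that each call site re-states every experimental option, not just the one it is toggling.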
Patchset #2 (id:20001) has been deleted
Changes of note: - Added ProcessReverseStream to APM instead of changing interface of AnalyzeReverseStream. This should simplify integration with Chrome. - Finished adding internal VAD capabilities to enhancer. - Updated AudioBuffer.InterleaveTo to allow for case when src AudioBuffer has single channel and target AudioFrame has multiple. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:614: agc_manager_.get() ? agc_manager_->voice_probability() : 0.f; On 2015/07/15 01:02:04, aluebs-webrtc wrote: > On 2015/07/14 18:28:51, turaj wrote: > > Shouldn't we turn activity detector part of AGC on when intelligibility is > > enabled? Otherwise we will process all frames as inactive. > > Also, to avoid being out of sync, this should probably run after agc_manager, > right? To avoid these problems, and simplify things, switched to enhancer using its own capture VAD. Maybe optimize later. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:766: if (intelligibility_enabled_) { On 2015/07/15 01:02:04, aluebs-webrtc wrote: > You don't need this if statement, because InterleaveTo gets > intelligibility_enabled_ and only copies when it is true. Done. As a result, I updated audio_buffer.InterleaveTo to allow for case of single source channel and >1 dest channels. This should make InterleaveTo make more sense, and keeps it possible to test intelligibility in voe_cmd_test. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (intelligibility_enabled_) { On 2015/07/14 18:28:51, turaj wrote: > I suppose we want to do all the modifications to playout signal before echo > cancellation/echo control, otherwise they have to compensate for the changes > that enhancer makes to playout signal. Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Maybe not for this CL, but at some point this needs to do something with the > higher bands, probably something simpler in the time domain. Acknowledged. I think we'll save this for a later CL. Since the enhancer will only be on in cases of very high noise and it's targeting speech, 0-8kHz should be ok, but you're right we might get some strange behavior if we do nothing to the higher bands. Do you have an idea of what might go wrong here? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1038: if (!intelligibility_enhancer_) { On 2015/07/15 01:02:04, aluebs-webrtc wrote: > We probably want to reset the intelligibility_enhancer_ every time, since the > split_rate_ or fwd_proc_format_.num_channels() could have changed, right? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Shouldn't this be fwd_proc_format_? Just set it to single channel for now, since that's the primary use case, and we haven't yet decided on how to handle multiple channels. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:157: void IntelligibilityEnhancer::ProcessRenderAudio(float* const* audio, On 2015/07/14 18:28:51, turaj wrote: > I did not comprehend the logic here. > > |voice_probability| specifies if the current chunk is active (see below for more > explanation on voice probability). Therefore, it has to be used as an indicator > whether the current chunk should be considered in variance computation or not. > If we have |voice_probability| we should not need WebRTC VAD decision. > > Furthermore, why a call to |render_mangler_| is based on |voice_probability|? I > don't think we should stop applying enhancement gains as soon as a chunk has low > activity probability. I think we should keep on applying enhancer, but low > activity chunks are discarded from variance computation. > > We need yet another threshold which specifies if enhancement should be applied > at all. So I think the flow of the program is; > 1) Update variance of to-be-rendered signal on active chunks. > 2) Update variance of captured signal on in-active chunks. > 3) Make a decision based on the current state of enhancer (on or off) and the > current signal-to-noise ratio whether enhancer should be on or off. > > Obviously steps 1) and 2) are done through different API calls. > > I like the idea of having two ProcessRenderAudio(), but I thought the caller is > supposed to call ProcessRenderAudio(float* const* audio) if they do not have any > information regarding the activity the given audio chunk, then we call WebRTC > VAD to get activity flag. 
On the other hand, if the caller has info regarding > the activity, they call ProcessRenderAudio(float* const* audio, float > voice_probability), then the given voice_probability is compared with a > threshold to decide if the given frame should be included in variance > computation. > > Same thing for ProcessCaptureAudio() Done. You're right, the logic was off and not fully implemented. I've updated everything to meet this spec and use the new VAD. It also uses smoothing when deactivating. There may be a more intuitive logic than what I have now. See UpdateActivity(). https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:160: vad_tmp_buffer_[i] = (int16_t)audio[0][i]; On 2015/07/14 18:28:51, turaj wrote: > You better check with APM guys, but I suppose you get audio in range -1 to 1, > therefore, a cast to int16_t is not a good idea. Done. audio_util is great! https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability); On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Does it actually processes the capture audio or does it only analyze it? Maybe > we need a naming change. Agreed. Similarly, in APM reverted AnalyzeReverseStream to const and added ProcessReverseStream. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:72: void ProcessCaptureAudio(float* const* audio, const float voice_probability); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > In all of these methods you assume the sample rate and number of channels did > not change since the constructor. Maybe you want to have them as parameters and > assert on that? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:73: void ProcessCaptureAudio(float* const* audio); // Assumes noise. On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Do we want to surface both interfaces to the user? I think it's nice to give the user the option of using their own voice probabilities if they want (especially if they have really good ones). May be useful if there is some very expensive VAD in the future that we only want to run once for all APM. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:78: void ProcessRenderAudio(float* const* audio); // Assumes speech. On 2015/07/15 01:02:04, aluebs-webrtc wrote: > Do we want to surface both interfaces to the user? See above. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer_unittest.cc:95: config_.sample_rate_hz = kSampleRate; On 2015/07/15 01:02:05, aluebs-webrtc wrote: > Shouldn't the step_type be set here as well? Done. Nice catch. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:1: /* On 2015/07/15 01:02:05, aluebs-webrtc wrote: > How about changing the audioproc (process_test) as well? Or are we deprecating > that one, now that support for the reverse stream is added here? Not sure. I'd be happy to add it if it'd be useful. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:47: DEFINE_bool(ie, false, "Enable intelligibility."); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > "Enable intelligibility enhancer."? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:111: bool process_reverse = false; On 2015/07/15 01:02:05, aluebs-webrtc wrote: > bool process_reverse = !FLAGS_i_rev.empty(); Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:425: printf("%i. Toggle intelligibility\n", option_index++); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > Toggle intelligibility enhancer? Done. https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/test/cmd_... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:514: base1->audio_processing()->SetExtraOptions(config); On 2015/07/15 01:02:05, aluebs-webrtc wrote: > Now that there is 2 calls to SetExtraOptions, you need to set both configs each > time to not overwrite the other feature with the default value. Done.
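The cast issue resolved in this message ("audio_util is great!") boils down to rounding and clamping: per the thread, the float buffers already carry samples in the int16_t range, so the conversion needs no rescaling, only a round and a clamp, which a plain cast does not do. The helper below is an illustrative re-implementation of that behavior, not the actual audio_util function:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Convert a float sample already scaled to the int16_t range:
// clamp out-of-range values, then round to nearest instead of truncating.
inline int16_t FloatS16ToInt16(float v) {
  v = std::max(v, static_cast<float>(std::numeric_limits<int16_t>::min()));
  v = std::min(v, static_cast<float>(std::numeric_limits<int16_t>::max()));
  return static_cast<int16_t>(std::lround(v));
}
```

A raw `(int16_t)` cast instead truncates toward zero and has undefined behavior for values outside the int16_t range.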
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:766: if (intelligibility_enabled_) { On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > You don't need this if statement, because InterleaveTo gets > > intelligibility_enabled_ and only copies when it is true. > > Done. As a result, I updated audio_buffer.InterleaveTo to allow for case of > single source channel and >1 dest channels. This should make InterleaveTo make > more sense, and keeps it possible to test intelligibility in voe_cmd_test. Ack https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > Maybe not for this CL, but at some point this needs to do something with the > > higher bands, probably something simpler in the time domain. > > Acknowledged. I think we'll save this for a later CL. Since the enhancer will > only be on in cases of very high noise and it's targeting speech, 0-8kHz should > be ok, but you're right we might get some strange behavior if we do nothing to > the higher bands. Do you have an idea of what might go wrong here? Agreed on leaving for another CL. I don't think anything will go explicitly wrong, but if you are not processing the higher bands, you lose the ability to shape the spectrum above 8kHz, which probably improves your enhancement. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... 
webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > Shouldn't this be fwd_proc_format_? > > Just set it to single channel for now, since that's the primary use case, and we > haven't yet decided on how to handle multiple channels. Then I think it is better to set the correct value here and handle the multiple channel cases in the enhancer, even if it is a CHECK for now. WDYT? https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:73: void ProcessCaptureAudio(float* const* audio); // Assumes noise. On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > Do we want to surface both interfaces to the user? > > I think it's nice to give the user the option of using their own voice > probabilities if they want (especially if they have really good ones). May be > useful if there is some very expensive VAD in the future that we only want to > run once for all APM. Agreed. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/test/audioproc_float.cc:1: /* On 2015/07/17 19:59:38, ekm wrote: > On 2015/07/15 01:02:05, aluebs-webrtc wrote: > > How about changing the audioproc (process_test) as well? Or are we deprecating > > that one, now that support for the reverse stream is added here? > > Not sure. I'd be happy to add it if it'd be useful. 
I would like to have Andrew's opinion here :)

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_buffer.cc (right):

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:453: int16_t* const* channel_ptr_copies[frame->num_channels_];
int16_t* channel_ptr_copies[frame->num_channels_];

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:454: for (int i = 0; i < frame->num_channels_; i++) {
++i

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:455: channel_ptr_copies[i] = data_->ibuf()->channels();
data->ibuf()->channels()[0];

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_buffer.cc:457: Interleave(data_->ibuf()->channels(), proc_num_frames_, num_channels_,
channel_ptr_copies

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/audio_processing_impl.cc (right):

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:686: CHECK(!intelligibility_enabled_);
I don't like that AnalyzeReverseStream explodes if you are running intelligibility. I would rather just do what it was doing before, just don't copy to the output.

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/include/mock_audio_processing.h (right):

https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/include/mock_audio_processing.h:230: int samples_per_channel,
Maybe change both to num_frames, since it is the naming convention.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/mock_audio_processing.h:232: ChannelLayout layout)); input_layout, to be consistent. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:162: CHECK_EQ(num_channels, channels_); Use num_channels_ for the member variable as well? It is clearer. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) { I am not sure why the voice probability affects if the chunk gets processed. Maybe I am just missing something. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) { I am not sure why the voice probability affects if the chunk gets processed. Maybe I am just missing something. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:230: if (render_voice_probability_ >= render_vad_thresh_) { This changed from has_voice_low to checking if the render_voice_probability is above the threshold. Isn't that just the opposite? https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... 
webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:333: for (int i = 0; i < freqs_; i++) { ++i Also, maybe use num_freqs_? It is clearer. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:334: target_[i] = 1.0f; 1.f https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:340: for (int i = 0; i < freqs_; i++) { ++i https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:341: if (current_[i] != 1.0f || target_[i] != 1.0f) { 1.f both times.
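The current_/target_ arrays under discussion implement the "smoothing when deactivating" behavior mentioned earlier in the thread: instead of dropping the enhancement gains to flat (1.f) at once, each band's gain ramps toward its target over several chunks. The sketch below is illustrative only; the class name, ramp constant, and tolerance are assumptions, not the intelligibility_utils code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-band gain ramp: gains ease toward the target,
// and a flat target of 1.f corresponds to "enhancer off".
class GainRamp {
 public:
  explicit GainRamp(size_t num_freqs)
      : current_(num_freqs, 1.f), target_(num_freqs, 1.f) {}

  void SetTargetFlat() { std::fill(target_.begin(), target_.end(), 1.f); }
  void SetTarget(const std::vector<float>& t) { target_ = t; }

  // Move each gain a step toward its target; returns true while any band
  // is still away from a flat response (i.e. smoothing is in progress).
  bool Update() {
    bool active = false;
    for (size_t i = 0; i < current_.size(); ++i) {
      current_[i] += 0.25f * (target_[i] - current_[i]);
      if (std::fabs(current_[i] - 1.f) > 1e-3f ||
          std::fabs(target_[i] - 1.f) > 1e-3f)
        active = true;
    }
    return active;
  }

  const std::vector<float>& current() const { return current_; }

 private:
  std::vector<float> current_;
  std::vector<float> target_;
};
```

The `current_[i] != 1.f || target_[i] != 1.f` check in the reviewed code plays the same role as the `active` flag here: it detects whether any band still deviates from flat.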
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:453: int16_t* const* channel_ptr_copies[frame->num_channels_]; On 2015/07/20 19:33:43, aluebs-webrtc wrote: > int16_t* channel_ptr_copies[frame->num_channels_]; Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:454: for (int i = 0; i < frame->num_channels_; i++) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > ++i Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:455: channel_ptr_copies[i] = data_->ibuf()->channels(); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > data->ibuf()->channels()[0]; Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_buffer.cc:457: Interleave(data_->ibuf()->channels(), proc_num_frames_, num_channels_, On 2015/07/20 19:33:43, aluebs-webrtc wrote: > channel_ptr_copies Done. Phew, nice catch. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:686: CHECK(!intelligibility_enabled_); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > I don't like that AnalyzeReverseStream explodes if you are running > intelligibility. I would rather just do wah it was doing before, just don't copy > to the output. Done. Thanks, this is MUCH better. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... 
File webrtc/modules/audio_processing/include/mock_audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/mock_audio_processing.h:230: int samples_per_channel, On 2015/07/20 19:33:43, aluebs-webrtc wrote: > Maybe change both to num_frames, since it is the naming convention. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/mock_audio_processing.h:232: ChannelLayout layout)); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > input_layout, to be consistent. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:162: CHECK_EQ(num_channels, channels_); On 2015/07/20 19:33:43, aluebs-webrtc wrote: > Use num_channels_ for the member variable as well? It is clearer. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > I am not sure why the voice probability affects if the chunk gets processed. > Maybe I am just missing something. If we're pretty sure the far-end chunk contains speech, we'd like to use it in our estimation of the speech signal. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > I am not sure why the voice probability affects if the chunk gets processed. 
> Maybe I am just missing something. If we're pretty sure the near-end chunk contains noise, we'd like to use it in our estimation of the noise signal. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:230: if (render_voice_probability_ >= render_vad_thresh_) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > This changed from has_voice_low to checking if the render_voice_probability is > above the threshold. Isn't that just the opposite? The has_voice_low_ was referring to detection of voice with 'low' aggressiveness with the common_audio vad. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:333: for (int i = 0; i < freqs_; i++) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > ++i > Also, maybe use num_freqs_? It is clearer. Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:334: target_[i] = 1.0f; On 2015/07/20 19:33:43, aluebs-webrtc wrote: > 1.f Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:340: for (int i = 0; i < freqs_; i++) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > ++i Done. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:341: if (current_[i] != 1.0f || target_[i] != 1.0f) { On 2015/07/20 19:33:43, aluebs-webrtc wrote: > 1.f both times. Done.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) {
On 2015/07/21 01:02:44, ekm wrote:
> On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > I am not sure why the voice probability affects if the chunk gets processed.
> > Maybe I am just missing something.
>
> If we're pretty sure the far-end chunk contains speech, we'd like to use it in
> our estimation of the speech signal.
Oh! ProcessChunk doesn't really process the chunk; it just uses it for estimation, right? But then why are all chunks used for estimation when active? I feel I am still missing something.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) {
On 2015/07/21 01:02:44, ekm wrote:
> On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > I am not sure why the voice probability affects if the chunk gets processed.
> > Maybe I am just missing something.
>
> If we're pretty sure the near-end chunk contains noise, we'd like to use it in
> our estimation of the noise signal.
That makes sense. I was not aware that ProcessChunk was only estimating.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:230: if (render_voice_probability_ >= render_vad_thresh_) {
On 2015/07/21 01:02:44, ekm wrote:
> On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > This changed from has_voice_low to checking if the render_voice_probability is
> > above the threshold. Isn't that just the opposite?
> > The has_voice_low_ was referring to detection of voice with 'low' aggressiveness > with the common_audio vad. Oh, I see. Then it makes sense. https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:10: #include <iostream> Left from debugging, right?
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:787: ra->split_channels_f(kBand0To8kHz)); On 2015/07/20 19:33:42, aluebs-webrtc wrote: > On 2015/07/17 19:59:38, ekm wrote: > > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > > Maybe not for this CL, but at some point this needs to do something with the > > > higher bands, probably something simpler in the time domain. > > > > Acknowledged. I think we'll save this for a later CL. Since the enhancer will > > only be on in cases of very high noise and it's targeting speech, 0-8kHz > should > > be ok, but you're right we might get some strange behavior if we do nothing to > > the higher bands. Do you have an idea of what might go wrong here? > > Agreed on leaving for another CL. I don't think anything will go explicitly > wrong, but if you are not processing the higher bands, you lose the ability to > shape the spectrum above 8kHz, which probably improves your enhancement. Acknowledged. https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/20 19:33:42, aluebs-webrtc wrote: > On 2015/07/17 19:59:38, ekm wrote: > > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > > Shouldn't this be fwd_proc_format_? > > > > Just set it to single channel for now, since that's the primary use case, and > we > > haven't yet decided on how to handle multiple channels. > > Then I think it is better to set the correct value here and handle the multiple > channel cases in the enhancer, even if it is a CHECK for now. WDYT? I agree. 
The problem was that the enhancer only had one num_channels member, so it didn't support the case where render and capture have different numbers of channels. I've now added those members, so further changes to the multi-channel case can be done in the enhancer. The enhancer actually can handle these cases; it's just not particularly smart about it yet.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) {
On 2015/07/21 01:50:55, aluebs-webrtc wrote:
> On 2015/07/21 01:02:44, ekm wrote:
> > On 2015/07/20 19:33:43, aluebs-webrtc wrote:
> > > I am not sure why the voice probability affects if the chunk gets processed.
> > > Maybe I am just missing something.
> >
> > If we're pretty sure the far-end chunk contains speech, we'd like to use it in
> > our estimation of the speech signal.
>
> Oh! ProcessChunk doesn't really processes the chunk, it just uses it for
> estimation, right? But then why are all chunks used for estimation when active?
> I feel I am still missing something.
ProcessChunk transforms audio to the frequency domain, which we need for both estimation and processing. We overload the logic a bit here so we only have to do the transform once. In this case, render_mangler_->ProcessChunk does the transform and then goes to ProcessClearBlock, where it decides what to do with the freq data: either estimate, process, or both.
https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:203: if (capture_voice_probability_ <= capture_vad_thresh_) { On 2015/07/21 01:50:55, aluebs-webrtc wrote: > On 2015/07/21 01:02:44, ekm wrote: > > On 2015/07/20 19:33:43, aluebs-webrtc wrote: > > > I am not sure why the voice probability affects if the chunk gets processed. > > > Maybe I am just missing something. > > > > If we're pretty sure the near-end chunk contains noise, we'd like to use it in > > our estimation of the noise signal. > > That makes sense. I was not aware that ProcessChunk was only estimating. Yeah, in the capture case, the freq data is only used for estimation. https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:10: #include <iostream> On 2015/07/21 01:50:55, aluebs-webrtc wrote: > Left from debugging, right? Yep. Done.
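The per-block decision ekm describes above (one transform, then estimate, process, or both, gated by voice probability) can be sketched as follows. This is only an illustrative simplification with hypothetical names; the real logic lives in IntelligibilityEnhancer::ProcessClearBlock and differs in detail.

```cpp
#include <cassert>

// What a block may be used for after the frequency-domain transform.
enum class BlockAction { kSkip, kEstimateOnly, kProcess };

// Render side: once the enhancer is active, every block is processed;
// before that, blocks that likely contain speech feed the clear-signal
// estimate and everything else is skipped.
BlockAction DecideRenderAction(float voice_probability,
                               float vad_threshold,
                               bool active) {
  if (active) {
    return BlockAction::kProcess;
  }
  return voice_probability >= vad_threshold ? BlockAction::kEstimateOnly
                                            : BlockAction::kSkip;
}

// Capture side: blocks that likely contain only noise feed the noise
// estimate; capture audio is never modified.
BlockAction DecideCaptureAction(float voice_probability,
                                float vad_threshold) {
  return voice_probability <= vad_threshold ? BlockAction::kEstimateOnly
                                            : BlockAction::kSkip;
}
```

This matches the `render_voice_probability_ >= render_vad_thresh_ || active_` and `capture_voice_probability_ <= capture_vad_thresh_` checks discussed above, at the level of intent rather than exact code.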
Some high-level comments. In the future, please break up these kinds of changes more. Ideally, this CL would only include changes necessary for APM integration. You could have at least two other CLs for the Config and tuning changes. Since this has been reviewed for a while already, don't break them up now though...
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:435: if (intelligibility_enabled_ != config.Get<Intelligibility>().enabled) {
So, remove this.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() {
So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it will also process the stream, but just not cause it to be copied back to the output audio.
I don't think there's much benefit in maintaining separate Analyze and Process methods. Can you instead work towards replacing Analyze with Process? You'll have to keep the former around until you can update clients.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/include/audio_processing.h:126: // in the constructor or using AudioProcessing::SetExtraOptions().
Please don't allow it to be set through SetExtraOptions. We want to remove this method. I realize this won't allow you to enable it through voice engine for now, but I think that's an acceptable compromise.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu...
File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); Remove this if we decide to drop voice engine support.
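Andrew's suggestion above (replace Analyze with Process, but keep the former until clients migrate) is a standard deprecation pattern. A sketch of what the transitional interface might look like, with abbreviated stand-in signatures rather than the real audio_processing.h declarations:

```cpp
#include <cassert>

// Sketch of a transitional reverse-stream interface (hypothetical
// class name and signatures; not the actual WebRTC header).
class ReverseStreamApi {
 public:
  virtual ~ReverseStreamApi() {}

  // New entry point: may modify the reverse stream in place, e.g.
  // when the intelligibility enhancer is enabled.
  virtual int ProcessReverseStream(float* const* data) = 0;

  // DEPRECATED: analysis-only variant kept for existing callers.
  // TODO: Remove once all clients call ProcessReverseStream().
  virtual int AnalyzeReverseStream(const float* const* data) = 0;
};

// Trivial stand-in implementation, only so the sketch can be exercised.
class NoopApm : public ReverseStreamApi {
 public:
  int ProcessReverseStream(float* const* /*data*/) override { return 0; }
  int AnalyzeReverseStream(const float* const* /*data*/) override { return 0; }
};
```

Keeping both entry points lets callers move over one at a time, after which the deprecated method can be deleted in a follow-up CL.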
https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:119: malloc(sizeof(*temp_render_out_buffer_) * num_render_channels_ + There is probably a smarter way of allocating this: vector, scoped_ptr<[]> or something similar.
https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/audio_processing_impl.cc:1041: config.channels = fwd_in_format_.num_channels(); On 2015/07/21 19:22:13, ekm wrote: > On 2015/07/20 19:33:42, aluebs-webrtc wrote: > > On 2015/07/17 19:59:38, ekm wrote: > > > On 2015/07/15 01:02:04, aluebs-webrtc wrote: > > > > Shouldn't this be fwd_proc_format_? > > > > > > Just set it to single channel for now, since that's the primary use case, > and > > we > > > haven't yet decided on how to handle multiple channels. > > > > Then I think it is better to set the correct value here and handle the > multiple > > channel cases in the enhancer, even if it is a CHECK for now. WDYT? > > I agree. The problem was that the enhancer only had one num_channels member, so > didn't support case where render and capture have different number. I've now > added those members, so further changes to multi-channel case can be done in > enhancer. The enhancer actually can handle these cases, it's just not > particularly smart about it yet. This is better. https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:171: if (render_voice_probability_ >= render_vad_thresh_ || active_) { On 2015/07/21 19:22:13, ekm wrote: > On 2015/07/21 01:50:55, aluebs-webrtc wrote: > > On 2015/07/21 01:02:44, ekm wrote: > > > On 2015/07/20 19:33:43, aluebs-webrtc wrote: > > > > I am not sure why the voice probability affects if the chunk gets > processed. > > > > Maybe I am just missing something. 
> > > > > > If we're pretty sure the far-end chunk contains speech, we'd like to use it > in > > > our estimation of the speech signal. > > > > Oh! ProcessChunk doesn't really processes the chunk, it just uses it for > > estimation, right? But then why are all chunks used for estimation when > active? > > I feel I am still missing something. > > ProcessChunk transforms audio to frequency domain, which we need for both > estimation and processing. We overload the logic a bit here so we only have to > do the transform once. In this case, render_mangler_->ProcessChunk does the > transform then goes to ProcessClearBlock, where it decides what to do with the > freq data, either estimate, process or both. Oh, I see. Thanks for clarifying.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/audio_processing.h:126: // in the constructor or using AudioProcessing::SetExtraOptions(). On 2015/07/21 19:29:22, andrew wrote: > Please don't allow it to be set through SetExtraOptions. We want to remove this > method. I realize this won't allow you to enable it through voice engine for > now, but I think that's an acceptable compromise. Ah, my mistake. You can still enable it in voice engine, but it makes it harder through the crappy voe_cmd_test. Put up a CL here which shows how you could do it through agc_harness at least: https://codereview.webrtc.org/1247033006/
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:435: if (intelligibility_enabled_ != config.Get<Intelligibility>().enabled) { On 2015/07/21 19:29:21, andrew wrote: > So, remove this. Done. https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() { On 2015/07/21 19:29:22, andrew wrote: > So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it will > also process the stream, but just not cause it to be copied back to the output > audio. > > I don't think there's much benefit in maintaining separate Analyze and Process > methods. Can you instead work towards replacing Analyze with Process? You'll > have to keep the former around until you can update clients. Yep, sounds good. I've re-renamed AnalyzeReverseStreamLocked to ProcessReverseStreamLocked since it processes. Is there anything else I can do now to ease the transition to ProcessReverseStream? https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/include/audio_processing.h:126: // in the constructor or using AudioProcessing::SetExtraOptions(). On 2015/07/22 02:09:48, andrew wrote: > On 2015/07/21 19:29:22, andrew wrote: > > Please don't allow it to be set through SetExtraOptions. We want to remove > this > > method. I realize this won't allow you to enable it through voice engine for > > now, but I think that's an acceptable compromise. > > Ah, my mistake. 
You can still enable it in voice engine, but it makes it harder > through the crappy voe_cmd_test. Put up a CL here which shows how you could do > it through agc_harness at least: > https://codereview.webrtc.org/1247033006/ Got it. Thanks for the pointer. Can also at least enable it in voe_cmd_test at compile time. On a similar note, what's the status of the Enable method that several of the apm components provide? https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/voice_engine/outpu... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); On 2015/07/21 19:29:22, andrew wrote: > Remove this if we decide to drop voice engine support. Acknowledged. https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/80001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:119: malloc(sizeof(*temp_render_out_buffer_) * num_render_channels_ + On 2015/07/21 21:27:00, aluebs-webrtc wrote: > There is probably a smarter way of allocating this: vector, scoped_ptr<[]> or > something similar. Definitely. Changed to ChannelBuffer.
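The allocation aluebs flagged above (a single malloc holding both the channel-pointer table and the sample data) is what ChannelBuffer encapsulates. A minimal RAII analogue, purely for illustration (WebRTC's real ChannelBuffer<T> in common_audio has a richer API):

```cpp
#include <cassert>
#include <vector>

// Minimal sketch of a deinterleaved channel buffer: one contiguous,
// automatically freed slab of samples, plus per-channel pointer views.
template <typename T>
class SimpleChannelBuffer {
 public:
  SimpleChannelBuffer(int num_frames, int num_channels)
      : data_(num_frames * num_channels), channels_(num_channels) {
    for (int i = 0; i < num_channels; ++i) {
      channels_[i] = &data_[i * num_frames];  // Views into the slab.
    }
  }
  T* const* channels() { return channels_.data(); }
  int num_channels() const { return static_cast<int>(channels_.size()); }

 private:
  std::vector<T> data_;       // Owned storage; freed automatically.
  std::vector<T*> channels_;  // Per-channel pointers into |data_|.
};

// Self-check helper: writes one sample and reads it back.
inline float WriteReadSample(float v) {
  SimpleChannelBuffer<float> buf(160, 2);
  buf.channels()[1][3] = v;
  return buf.channels()[1][3];
}
```

Compared to the raw malloc, nothing can leak on an early return, and the pointer table can never get out of sync with the data it points into.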
lgtm % one small comment https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:166: chunk_length_ * sizeof(float)); sizeof(**audio)
https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/100001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:166: chunk_length_ * sizeof(float)); On 2015/07/23 00:35:26, aluebs-webrtc wrote: > sizeof(**audio) Done.
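The `sizeof(**audio)` fix above is a small but real robustness improvement: the element size is derived from the buffer's own type, so a later change of sample type cannot silently leave a stale `sizeof(float)` behind. A sketch of the pattern (hypothetical function names):

```cpp
#include <cassert>
#include <cstring>

// Copies each channel of a deinterleaved chunk. Note the element size
// comes from sizeof(**audio), not a hard-coded sizeof(float).
void CopyChunk(float* const* audio, float* const* out,
               int num_channels, int chunk_length) {
  for (int i = 0; i < num_channels; ++i) {
    // sizeof(**audio) == sizeof(float) here, but it tracks the type
    // of |audio| automatically if that type ever changes.
    std::memcpy(out[i], audio[i], chunk_length * sizeof(**audio));
  }
}

// Self-check helper: copies a two-sample mono chunk and returns the
// second copied value.
float CopyOneValue(float v) {
  float src[2] = {v, v};
  float dst[2] = {0.f, 0.f};
  float* src_ch[1] = {src};
  float* dst_ch[1] = {dst};
  CopyChunk(src_ch, dst_ch, 1, 2);
  return dst[1];
}
```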
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() { On 2015/07/23 00:26:28, ekm wrote: > On 2015/07/21 19:29:22, andrew wrote: > > So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it > will > > also process the stream, but just not cause it to be copied back to the output > > audio. > > > > I don't think there's much benefit in maintaining separate Analyze and Process > > methods. Can you instead work towards replacing Analyze with Process? You'll > > have to keep the former around until you can update clients. > > Yep, sounds good. I've re-renamed AnalyzeReverseStreamLocked to > ProcessReverseStreamLocked since it processes. Is there anything else I can do > now to ease the transition to ProcessReverseStream? Yes, mark AnalyzeReverseStream as deprecated in audio_processing.h with a TODO to remove when all users are updated. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:452: // Copy single AudioBuffer channel into all AudioFrame channels Period at the end, and can you note that this is specifically for intelligibility enhancement? This is a problem (amongst other things) with introducing this render-side processing. APM wasn't designed to handle it and so you're kind of shoe-horning it in. I don't have a better suggestion without large refactoring though. Ideally you would do your processing on however many channels you received before downmixing to mono. 
But that would add a bunch of complexity to a system that's already straining under the load. I think this upmixing is OK, but you should note in the interface that if intelligibility enhancement is enabled the reverse stream will become an upmixed mono signal.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_buffer.cc:454: new int16_t*[frame->num_channels_]);
Arg, why did you switch to using dynamic allocation?
But anyway, perhaps better to add a UpmixMonoToInterleave? See the new DownmixInterleavedToMono Michael added here:
https://chromium.googlesource.com/external/webrtc/+/master/webrtc/common_audi...
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_processing_impl.cc:724: int AudioProcessingImpl::AnalyzeReverseStream(AudioFrame* frame) {
The AudioFrame* and float* AnalyzeReverseStreams have different behavior wrt IE. Normally I'd ask you to add a ProcessReverseStream for AudioFrame as well, but we really want to get rid of this AudioFrame junk, so I suppose this is OK...
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_processing_impl.cc:763: render_audio_->InterleaveTo(frame, intelligibility_enabled_);
I think it's clearer if you add the check here:
if (intelligibility_enabled_) {
  InterleaveTo(...);
}
The bool parameter is just so the AudioFrame's VAD state can get updated, which you don't care about on the reverse stream.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
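The upmix helper Andrew suggests above would do the inverse of DownmixInterleavedToMono: copy one mono channel into every channel of an interleaved buffer. A sketch, with the name taken from his comment (the helper that actually landed in common_audio may differ):

```cpp
#include <cassert>
#include <cstdint>

// Copies a mono channel into every channel of a frame-major
// interleaved buffer (sketch of the suggested helper).
template <typename T>
void UpmixMonoToInterleaved(const T* mono, int num_frames,
                            int num_channels, T* interleaved) {
  for (int f = 0; f < num_frames; ++f) {
    for (int c = 0; c < num_channels; ++c) {
      interleaved[f * num_channels + c] = mono[f];
    }
  }
}

// Self-check helper: upmixes {1, 2} to stereo ({1, 1, 2, 2}) and
// returns the sample at |index|.
int16_t UpmixSample(int index) {
  const int16_t mono[2] = {1, 2};
  int16_t out[4] = {0, 0, 0, 0};
  UpmixMonoToInterleaved(mono, 2, 2, out);
  return out[index];
}
```

This is exactly the "reverse stream becomes an upmixed mono signal" behavior noted in the interface comment above.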
webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (rev_proc_format_.rate() == kSampleRate32kHz) {
&& intelligibility_enabled_
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:233: // efidata(n,:) = sqrt(b(n)) * fidata(n,:)
Remove if you don't want this.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:159: float SNR();
Sorry to do this, but could you please move these new features which are modifying the output results of IE to another CL? You don't even mention these in the CL description :)
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp...
webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame);
This is a big deal ;) First, you can't use the same resampler here, as it should be operating on the same continuous stream.
I think the right way to do this is remove the pre-RemixAndResample and let APM handle it. In the "modern" float interface that would be no problem. Unfortunately, this AudioFrame interface doesn't handle resampling in DeinterleaveFrom, as you've seen. So we'd need to add support for it there.
Another, and possibly superior, option is to remove support for IE through the AudioFrame interface in this CL.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test...
File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:237: bool intelligibility_enabled = false; Revert changes to this file now that SetExtraOptions doesn't work.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:231: UpdateActivity();
If we are deactivated, then the gains are all one, right? Then, if UpdateActivity() decides to set activity to true, we keep applying unit gains until |block_count_| is |analysis_rate_ - 1|, and even then we will ramp up from unit gains to the target gains. Obviously the perceived speed of the enhancer depends on |analysis_rate_|, but I am concerned that this might be a bit slow. Can we modify the logic so that, at a switch from inactive to active, we start ramping the gains up from unity to their target values?
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:403: float total_clear_var = std::accumulate(
These two variables can be const.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:159: float SNR();
This method is const.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:167: const int freqs_;  // Num frequencies in frequency domain.
As comments will eventually become misaligned, maybe better to stick to the 2-spaces rule.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:154: bool IsIdentity();
This method can be const.
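Turaj's ramp concern above can be made concrete: on an inactive-to-active switch, start each gain at unity and interpolate toward its target over a fixed number of blocks, instead of holding unit gains until the next |analysis_rate_| boundary. A sketch with hypothetical names (the actual enhancer's gain-change logic is more involved):

```cpp
#include <cassert>

// Linear ramp from unity gain toward |target| over |ramp_blocks|
// blocks after activation; returns |target| once the ramp completes.
float RampedGain(float target, int block, int ramp_blocks) {
  if (block >= ramp_blocks) {
    return target;
  }
  const float t = static_cast<float>(block) / ramp_blocks;
  return 1.f + t * (target - 1.f);  // 1.0 at block 0, |target| at the end.
}
```

Starting the ramp at the activation instant, rather than at the next analysis boundary, addresses the perceived-latency issue without changing how often the targets themselves are recomputed.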
Changes of note:
- Added support for resampling in InterleaveTo.
- Removed this responsibility from VoE (output_mixer).
- Still need to add rev_out_format (or similar) to apm for this to be effective; will do this after merge with https://codereview.webrtc.org/1226093007/ in a later patch.
- Removed the updated VAD functionality.
- Will re-add in a later CL, and address Turaj's newest related comments there.
- Removed voe_cmd_test toggleability, since SetExtraOptions is deprecated.
https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/60001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/audio_processing_impl.cc:773: int AudioProcessingImpl::AnalyzeReverseStreamLocked() {
On 2015/07/24 23:50:39, andrew wrote:
> On 2015/07/23 00:26:28, ekm wrote:
> > On 2015/07/21 19:29:22, andrew wrote:
> > > So AnalyzeReverseStream is no longer just for analysis; if IE is enabled it
> > will
> > > also process the stream, but just not cause it to be copied back to the
> output
> > > audio.
> > >
> > > I don't think there's much benefit in maintaining separate Analyze and
> Process
> > > methods. Can you instead work towards replacing Analyze with Process? You'll
> > > have to keep the former around until you can update clients.
> >
> > Yep, sounds good. I've re-renamed AnalyzeReverseStreamLocked to
> > ProcessReverseStreamLocked since it processes. Is there anything else I can do
> > now to ease the transition to ProcessReverseStream?
>
> Yes, mark AnalyzeReverseStream as deprecated in audio_processing.h with a TODO
> to remove when all users are updated.
Done.
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_buffer.cc:452: // Copy single AudioBuffer channel into all AudioFrame channels On 2015/07/24 23:50:39, andrew wrote: > Period at the end, and can you note that this is specifically for > intelligibility enhancement? > > This is a problem (amongst other things) with introducing this render-side > processing. APM wasn't designed to handle it and so you're kind of shoe-horning > it in. I don't have a better suggestion without large refactoring though. > > Ideally you would do your processing on however many channels you received > before downmixing to mono. But that would add a bunch of complexity to a system > that's already straining under the load. I think this upmixing is OK, but you > should note in the interface that if intelligibility enhancement is enabled the > reverse stream will become an upmixed mono signal. Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:454: new int16_t*[frame->num_channels_]); On 2015/07/24 23:50:39, andrew wrote: > Arg, why did you switch to using dynamic allocation? > > But anyway, perhaps better to add a UpmixMonoToInterleave? See the new > DownmixInterleavedToMono Michael added here: > https://chromium.googlesource.com/external/webrtc/+/master/webrtc/common_audi... Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:724: int AudioProcessingImpl::AnalyzeReverseStream(AudioFrame* frame) { On 2015/07/24 23:50:39, andrew wrote: > The AudioFrame* and float* AnalyzeReverseStreams have different behavior wrt IE. > Normally I'd ask you to add a ProcessReverseStream for AudioFrame as well, but > we really want to get rid of this AudioFrame junk, so I suppose this is OK... 
Acknowledged. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:763: render_audio_->InterleaveTo(frame, intelligibility_enabled_); On 2015/07/24 23:50:39, andrew wrote: > I think more clear if you add the check here: > if (intelligbility_enabled_) { > InterleaveTo(...); > } > > The bool parameter is just so the AudioFrame's VAD state can get updated, which > you don't care about on the reverse stream. Done. Added is_rev_processed() to clean up these checks. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:785: if (rev_proc_format_.rate() == kSampleRate32kHz) { On 2015/07/24 23:50:39, andrew wrote: > && intelligbility_enabled_ Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:233: // efidata(n,:) = sqrt(b(n)) * fidata(n,:) On 2015/07/24 23:50:40, andrew wrote: > Remove if you don't want this. Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:159: float SNR(); On 2015/07/24 23:50:40, andrew wrote: > Sorry to do this, but could you please move these new features which are > modifying the output results of IE to another CL? You don't even mention these > in the CL description :) Done. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp... 
File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); On 2015/07/24 23:50:40, andrew wrote: > This is a big deal ;) First, you can't use the same resampler here, as it should > be operating on the same continuous stream. > > I think the right way to do this is remove the pre-RemixAndResample and let APM > handle it. In the "modern" float interface that would be no problem. > Unfortunately, this AudioFrame interface doesn't handle resampling in > DeinterleaveFrom, as you've seen. So we'd need to add support for it there. > > Another, and possibly superior, option is to remove support for IE through the > AudioFrame interface in this CL. > Done (1st option): added support in InterleaveTo in audio_buffer. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test... File webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/voice_engine/test... webrtc/voice_engine/test/cmd_test/voe_cmd_test.cc:237: bool intelligibility_enabled = false; On 2015/07/24 23:50:40, andrew wrote: > Revert changes to this file now that SetExtraOptions doesn't work. Done.
Looking pretty good. Alex, could you take another quick look? This has changed considerably since you approved. https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:763: render_audio_->InterleaveTo(frame, intelligibility_enabled_); On 2015/07/29 00:37:19, ekm wrote: > On 2015/07/24 23:50:39, andrew wrote: > > I think more clear if you add the check here: > > if (intelligbility_enabled_) { > > InterleaveTo(...); > > } > > > > The bool parameter is just so the AudioFrame's VAD state can get updated, > which > > you don't care about on the reverse stream. > > Done. Added is_rev_processed() to clean up these checks. Nice. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:104: void UpmixMonoToInterleaved(const T* mono, int samples_per_channel, Go with num_frames for new code rather than samples_per_channel. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:110: interleaved_idx++; nit: could put the post-increment in the above line. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:462: data_ptr = data_.get(); Initialize data_ptr to data_.get() when you declare it and you can skip this else block. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1006: return intelligibility_enabled_ && intelligibility_enhancer_->active(); Hmm, checking active() is nice to save some complexity, but can this flip often? You'll get a probably audible discontinuity when it flips if you don't smooth this out. One solution would be to have IE flip this on before it starts processing the signal (that is, it returns the unprocessed signal for at least one chunk) and flip it off after it stops (it returns at least one unprocessed chunk). It can then internally smooth between these states. If you don't have that feature today, I'd remove that chunk and leave a TODO here. Ah, reading further I see active() always returns true for now. So leave this, but make sure you add something like the feature I described when you make it non-const. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:332: virtual int AnalyzeReverseStream(AudioFrame* frame) = 0; Since we're now supporting this interface fully, you should similarly add a ProcessReverseStream for it and mark this as deprecated. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:27: #include "webrtc/common_audio/include/audio_util.h" Alpha order. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { This is where you need smoothing as you switch states. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:151: bool active_; // Whether render gains are being updated. Make this const for now. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audioproc_float.cc:190: LayoutFromChannels(in_rev_buf->num_channels()))); Is this indentation right? Run clang-format if you haven't. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { Can you run through a bunch of codecs in voe_cmd_test with AEC on/off intelligibility on/off and make sure everything sounds OK? I'll do the same once you land this.
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:109: interleaved[interleaved_idx] = mono[i]; How about: interleaved[i * num_channels + j] = mono[i]; And drop interleaved_idx completely? https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, Should this method be in an anonymous namespace in this file instead of in audio_util? Or even don't abstract this? Just to keep things together, since the interleaved format is deprecated and will eventually disappear? I don't have a strong opinion. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { Maybe this is a good time to make 32 kHz and 48 kHz consistent and use the splitting filter in both cases? Today 48 kHz down-samples and 32 kHz splits, since the 2-band splitting filter performance is better than the down-sampler, but the 3-band splitting filter isn't. But soon the intelligibility enhancer will require the higher bands anyway. WDYT? https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { There is a if(active_) just above where you can probably merge this into.
Patchset #10 (id:200001) has been deleted
https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/120001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:167: const int freqs_; // Num frequencies in frequency domain. On 2015/07/27 20:01:05, turaj wrote: > As comments eventually will be mis-aligned, maybe better to stick to the > 2-spaces rule. 'git cl format' keeps re-aligning these. I'll just stick with whatever it does. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:104: void UpmixMonoToInterleaved(const T* mono, int samples_per_channel, On 2015/07/29 03:52:27, andrew wrote: > Go with num_frames for new code rather than samples_per_channel. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:109: interleaved[interleaved_idx] = mono[i]; On 2015/07/29 22:17:10, aluebs-webrtc wrote: > How about: > > interleaved[i * num_channels + j] = mono[i]; > > And drop interleaved_idx completely? I'd like to avoid the inner multiplication, though I'm not sure of the tradeoff in practice vs. declaring the int every call. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:110: interleaved_idx++; On 2015/07/29 03:52:27, andrew wrote: > nit: could put the post-increment in the above line. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro...
webrtc/modules/audio_processing/audio_buffer.cc:462: data_ptr = data_.get(); On 2015/07/29 03:52:27, andrew wrote: > Initialize data_ptr to data_.get() when you declare it and you can skip this > else block. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, On 2015/07/29 22:17:10, aluebs-webrtc wrote: > Should this method be in a anonymous namespace in this file instead in > audio_util? Or even don't abstract this? Just to keep things together, since the > interleaved format is deprecated and will eventually disappear? I don't have a > strong opinion. Hmm.... it seems cleaner and clearer as is for now, though I can't really weigh in on the deprecation issue. Andrew? https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { On 2015/07/29 22:17:10, aluebs-webrtc wrote: > Maybe this is a good time to make 32kHz and 48kHz consistent and use splitting > filter in both cases? Today 48Khz down-samples and 32kHz splits since the 2-band > splitting filter performance is better than the down-sampler, but the 3-band > splitting filter isn't. But soon the ineligibility enhancer will require the > higher bands anyway. WDYT? I'd prefer to save this for a later cl, since this cl is pretty monstrous as is. Also, with respect to intelligibility, this falls more under 'tuning' as we're not sure yet how necessary this is for intelligibility; we might be able to get away with just the 16k. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/audio_processing_impl.cc:1006: return intelligibility_enabled_ && intelligibility_enhancer_->active(); On 2015/07/29 03:52:27, andrew wrote: > Hmm, checking active() is nice to save some complexity, but can this flip often? > You'll get a probably audible discontinuity when it flips if you don't smooth > this out. > > One solution would be to have IE flip this on before it starts processing the > signal (that is, it returns the unprocessed signal for at least one chunk) and > flip it off after it stops (it returns at least one unprocessed chunk). It can > then internally smooth between these states. > > If you don't have that feature today, I'd remove that chunk and leave a TODO > here. > > Ah, reading further I see active() always returns true for now. So leave this, > but make sure you add something like the feature I described when you make it > non-const. Yep, we have this smoothing feature implemented, and will add it in a later cl. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:332: virtual int AnalyzeReverseStream(AudioFrame* frame) = 0; On 2015/07/29 03:52:27, andrew wrote: > Since we're now supporting this interface fully, you should similarly add a > ProcessReverseStream for it and mark this as deprecated. Done. Also added to the StreamConfig interface. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:27: #include "webrtc/common_audio/include/audio_util.h" On 2015/07/29 03:52:27, andrew wrote: > Alpha order. 
Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { On 2015/07/29 03:52:27, andrew wrote: > This is where you need smoothing as you switch states. Acknowledged. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { On 2015/07/29 22:17:10, aluebs-webrtc wrote: > There is a if(active_) just above where you can probably merge this into. Although active_ is const in this cl, in the next it will be non-const and may change between these two blocks, and the two conditions will be different, so I'd prefer to keep them separate if that's ok. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:151: bool active_; // Whether render gains are being updated. On 2015/07/29 03:52:27, andrew wrote: > Make this const for now. Done. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audioproc_float.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audioproc_float.cc:190: LayoutFromChannels(in_rev_buf->num_channels()))); On 2015/07/29 03:52:27, andrew wrote: > Is this indentation right? Run clang-format if you haven't. It's strange, but that's what 'git cl format' gives me. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... 
webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { On 2015/07/29 03:52:28, andrew wrote: > Can you run through a bunch of codecs in voe_cmd_test with AEC on/off > intelligibility on/off and make sure everything sounds OK? I'll do the same once > you land this. Tried a bunch on loopback. AEC is effective for all with or without intelligibility enabled, except for opus. For opus, I couldn't get AEC to be effective in any case, even using AnalyzeReverseStream instead of ProcessReverseStream. I'm guessing this is a separate issue.
Thanks for uploading the merge separately; makes reviewing easier. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 03:52:28, andrew wrote: > > Can you run through a bunch of codecs in voe_cmd_test with AEC on/off > > intelligibility on/off and make sure everything sounds OK? I'll do the same > once > > you land this. > > Tried a bunch on loopback. AEC is effective for all with or without > intelligibility enabled, except for opus. For opus, I couldn't get AEC to be > effective in any case, even using AnalyzeReverseStream instead of > ProcessReverseStream. I'm guessing this is a separate issue. Hmm, OK, I'll try it once this lands. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:367: virtual int ProcessReverseStream(float* const* data, You don't need this one now... https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:376: virtual int AnalyzeReverseStream(const float* const* data, I don't think any clients are using this yet besides tests. You could remove it in this CL. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... 
webrtc/voice_engine/output_mixer.cc:554: WEBRTC_TRACE(kTraceWarning, kTraceVoice, VoEId(_instanceId, -1), Since we're here, can you make this a kTraceError?
Patchset #11 (id:230001) has been deleted
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:578: if (_audioProcessingModulePtr->AnalyzeReverseStream(&_audioFrame) == -1) { On 2015/07/30 03:53:16, andrew wrote: > On 2015/07/29 23:35:06, ekm wrote: > > On 2015/07/29 03:52:28, andrew wrote: > > > Can you run through a bunch of codecs in voe_cmd_test with AEC on/off > > > intelligibility on/off and make sure everything sounds OK? I'll do the same > > once > > > you land this. > > > > Tried a bunch on loopback. AEC is effective for all with or without > > intelligibility enabled, except for opus. For opus, I couldn't get AEC to be > > effective in any case, even using AnalyzeReverseStream instead of > > ProcessReverseStream. I'm guessing this is a separate issue. > > Hmm, OK, I'll try it once this lands. Acknowledged. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:367: virtual int ProcessReverseStream(float* const* data, On 2015/07/30 03:53:17, andrew wrote: > You don't need this one now... Done. Removed. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:376: virtual int AnalyzeReverseStream(const float* const* data, On 2015/07/30 03:53:17, andrew wrote: > I don't think any clients are using this yet besides tests. You could remove it > in this CL. Done. Since we're still supporting AnalyzeReverseStream for the other float interface, and that uses this, I've moved this AnalyzeReverseStream to private. 
Also, since this was only used in tests, I've updated this interface to support a distinct reverse output stream. https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/210018/webrtc/voice_engine/outp... webrtc/voice_engine/output_mixer.cc:554: WEBRTC_TRACE(kTraceWarning, kTraceVoice, VoEId(_instanceId, -1), On 2015/07/30 03:53:17, andrew wrote: > Since we're here, can you make this a kTraceError? Done.
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:109: interleaved[interleaved_idx] = mono[i]; On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > How about: > > > > interleaved[i * num_channels + j] = mono[i]; > > > > And drop interleaved_idx completely? > > I'd like to avoid the inner multiplication, though not of the tradeoff in > practice vs declaring the int every call. Agreed. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > Should this method be in a anonymous namespace in this file instead in > > audio_util? Or even don't abstract this? Just to keep things together, since > the > > interleaved format is deprecated and will eventually disappear? I don't have a > > strong opinion. > > Hmm.... it seems cleaner and clearer as is for now, though I can't really weigh > in on the deprecation issue. Andrew? As I said, I don't have a strong opinion and I agree it is cleaner. I was just suggesting to avoid spreading this interface code. If you find it clearer, leave it as is. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > Maybe this is a good time to make 32kHz and 48kHz consistent and use splitting > > filter in both cases? Today 48Khz down-samples and 32kHz splits since the > 2-band > > splitting filter performance is better than the down-sampler, but the 3-band > > splitting filter isn't. But soon the ineligibility enhancer will require the > > higher bands anyway. WDYT? > > I'd prefer to save this for a later cl, since this cl is pretty monstrous as is. > Also, with respect to intelligibility, this falls more under 'tuning' as we're > not sure yet how necessary this is for intelligibility; we might be able to get > away with just the 16k. Leaving it for another CL sounds reasonable. But I think sooner or later the IE will need this. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:139: if (active_) { On 2015/07/29 23:35:06, ekm wrote: > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > There is a if(active_) just above where you can probably merge this into. > > Although active_ is const in this cl, in the next it will be non-const and may > change between these two blocks, and the two conditions will be different, so > I'd prefer to keep them separate if that's ok. Agreed. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/include/audio_processing.h:492: int num_channels = 1, I think this zero-initialization here is on purpose to force the user to set reasonable values. Actually not setting them is a way of signalling that that stream is not necessary, for example when a reverse stream is not available. Why do you need to have these default values?
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.h:131: int AnalyzeReverseStream(const float* const* src, Add a TODO for removing this. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:352: // Same as |AnalyzeReverseStream|, but may modify |data| if intelligibility modify |frame| https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, I'm not sure we want to do this. The way you have it now makes the interface inconsistent: the user can specify an output config, but they'll only get it if intelligibility is enabled. We either need to always support converting the stream, or remove the output config. I'm a bit divided on this. Alex what do you think? https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:492: int num_channels = 1, On 2015/07/30 15:28:07, aluebs-webrtc wrote: > I think this zero-initialization here is on purpose to force the user to set > reasonable values. Actually not setting them is a way of signalling that that > stream is not necessary, for example when a reverse stream is not available. Why > do you need to have these default values? Agreed.
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 18:48:53, andrew wrote: > I'm not sure we want to do this. The way you have it now makes the interface > inconsistent: the user can specify an output config, but they'll only get it if > intelligibility is enabled. > > We either need to always support converting the stream, or remove the output > config. I'm a bit divided on this. Alex what do you think? You bring a good point here. I think it makes sense to always convert (except when they are the same of course), so that the reverse stream is similar to the capture stream. Then enabling IE (or any other future component) doesn't change the behavior of the APM.
https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_buffer.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_buffer.cc:469: UpmixMonoToInterleaved(data_ptr->ibuf()->channels()[0], proc_num_frames_, On 2015/07/30 15:28:07, aluebs-webrtc wrote: > On 2015/07/29 23:35:06, ekm wrote: > > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > > Should this method be in a anonymous namespace in this file instead in > > > audio_util? Or even don't abstract this? Just to keep things together, since > > the > > > interleaved format is deprecated and will eventually disappear? I don't have > a > > > strong opinion. > > > > Hmm.... it seems cleaner and clearer as is for now, though I can't really > weigh > > in on the deprecation issue. Andrew? > > As I said, I don't have a strong opinion and I agree it is cleaner. I was just > suggesting to avoid spreading this interface code. If you find it clearer, leave > it as is. Acknowledged. https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/140001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:787: if (rev_proc_format_.rate() == kSampleRate32kHz && is_rev_processed()) { On 2015/07/30 15:28:07, aluebs-webrtc wrote: > On 2015/07/29 23:35:06, ekm wrote: > > On 2015/07/29 22:17:10, aluebs-webrtc wrote: > > > Maybe this is a good time to make 32kHz and 48kHz consistent and use > splitting > > > filter in both cases? Today 48Khz down-samples and 32kHz splits since the > > 2-band > > > splitting filter performance is better than the down-sampler, but the 3-band > > > splitting filter isn't. But soon the ineligibility enhancer will require the > > > higher bands anyway. WDYT? 
> > > > I'd prefer to save this for a later cl, since this cl is pretty monstrous as > is. > > Also, with respect to intelligibility, this falls more under 'tuning' as we're > > not sure yet how necessary this is for intelligibility; we might be able to > get > > away with just the 16k. > > Leaving it for another CL sounds reasonable. But I think sooner or later the IE > will need this. Yes. I'm interested to hear what the difference sounds like. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 18:48:53, andrew wrote: > I'm not sure we want to do this. The way you have it now makes the interface > inconsistent: the user can specify an output config, but they'll only get it if > intelligibility is enabled. > > We either need to always support converting the stream, or remove the output > config. I'm a bit divided on this. Alex what do you think? Are we talking about converting reverse to capture, or reverse output to reverse input? Since before IE apm left reverse input as is, it makes sense to me to convert reverse out to reverse in. In that case, I can revert the expansion of ProcessingConfig, etc. What about adding back the AnalyzeReverseStream that only takes the input stream+config? Is that too much interface?
Patchset #12 (id:270001) has been deleted
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.h:131: int AnalyzeReverseStream(const float* const* src, On 2015/07/30 18:48:53, andrew wrote: > Add a TODO for removing this. Done. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:352: // Same as |AnalyzeReverseStream|, but may modify |data| if intelligibility On 2015/07/30 18:48:53, andrew wrote: > modify |frame| Done. https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:492: int num_channels = 1, On 2015/07/30 15:28:07, aluebs-webrtc wrote: > I think this zero-initialization here is on purpose to force the user to set > reasonable values. Actually not setting them is a way of signalling that that > stream is not necessary, for example when a reverse stream is not available. Why > do you need to have these default values? That makes sense. I thought it'd be a nice way to get around the asserts in audio_buffer, but I see it defeats the purpose. I re-zeroed these and added a new conditional at the beginning of InitializeLocked() instead.
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 21:23:50, ekm wrote: > On 2015/07/30 18:48:53, andrew wrote: > > I'm not sure we want to do this. The way you have it now makes the interface > > inconsistent: the user can specify an output config, but they'll only get it > if > > intelligibility is enabled. > > > > We either need to always support converting the stream, or remove the output > > config. I'm a bit divided on this. Alex what do you think? > > Are we talking about converting reverse to capture, or reverse output to reverse > input? Since before IE apm left reverse input as is, it makes sense to me to > convert reverse out to reverse in. In that case, I can revert the expansion of > ProcessingConfig, etc. > > What about adding back the AnalyzeReverseStream that only takes the input > stream+config? Is that too much interface? I would prefer to avoid adding yet another interface. So the input would be converted to the output format, as it happens in the capture stream. Andrew?
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 23:09:51, aluebs-webrtc wrote: > On 2015/07/30 21:23:50, ekm wrote: > > On 2015/07/30 18:48:53, andrew wrote: > > > I'm not sure we want to do this. The way you have it now makes the interface > > > inconsistent: the user can specify an output config, but they'll only get it > > if > > > intelligibility is enabled. > > > > > > We either need to always support converting the stream, or remove the output > > > config. I'm a bit divided on this. Alex what do you think? > > > > Are we talking about converting reverse to capture, or reverse output to > reverse > > input? Since before IE apm left reverse input as is, it makes sense to me to > > convert reverse out to reverse in. In that case, I can revert the expansion of > > ProcessingConfig, etc. > > > > What about adding back the AnalyzeReverseStream that only takes the input > > stream+config? Is that too much interface? > > I would prefer to avoid adding yet another interface. So the input would be > converted to the output format, as it happens in the capture stream. Andrew? Ok. Why are the two configs in the ProcessStream interface necessary?
https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/include/audio_processing.h (right): https://codereview.webrtc.org/1234463003/diff/250001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/include/audio_processing.h:368: const StreamConfig& reverse_output_config, On 2015/07/30 23:20:17, ekm wrote: > On 2015/07/30 23:09:51, aluebs-webrtc wrote: > > On 2015/07/30 21:23:50, ekm wrote: > > > On 2015/07/30 18:48:53, andrew wrote: > > > > I'm not sure we want to do this. The way you have it now makes the > interface > > > > inconsistent: the user can specify an output config, but they'll only get > it > > > if > > > > intelligibility is enabled. > > > > > > > > We either need to always support converting the stream, or remove the > output > > > > config. I'm a bit divided on this. Alex what do you think? > > > > > > Are we talking about converting reverse to capture, or reverse output to > > reverse > > > input? Since before IE apm left reverse input as is, it makes sense to me to > > > convert reverse out to reverse in. In that case, I can revert the expansion > of > > > ProcessingConfig, etc. > > > > > > What about adding back the AnalyzeReverseStream that only takes the input > > > stream+config? Is that too much interface? > > > > I would prefer to avoid adding yet another interface. So the input would be > > converted to the output format, as it happens in the capture stream. Andrew? > > Ok. Why are the two configs in the ProcessStream interface necessary? So that the APM knows what the user expects as output and does the right thing.
Unfortunately, you're going to need to add tests verifying the reverse conversions. https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:726: if (is_rev_processed()) { So, we need to change this part. This is what I meant about being consistent: if the user requests a different output config, you should provide that regardless of processing. However, if we have no processing and the user requests a different output config, we need to convert from src not render_audio_. Otherwise, say the user passes in 48 kHz and they request 44.1 kHz, they'll actually get 16 kHz audio upsampled to 44.1. Perhaps my olde AudioConverter can finally see the light of day? if (is_rev_processed() { ... } else { render_converter_->Convert(src, ..., dest); } where you only create render_converter_ in InitializeLocked() when appropriate. Alex, what do you think? We could back out the output_config addition, but then we'd still then need at least a copy from src to dest here, so it doesn't help a lot. https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:773: if (is_rev_processed()) { You can leave this one, because the user has no way to specify an output config.
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:726: if (is_rev_processed()) { On 2015/07/31 01:22:42, andrew wrote: > So, we need to change this part. This is what I meant about being consistent: if > the user requests a different output config, you should provide that regardless > of processing. > > However, if we have no processing and the user requests a different output > config, we need to convert from src not render_audio_. Otherwise, say the user > passes in 48 kHz and they request 44.1 kHz, they'll actually get 16 kHz audio > upsampled to 44.1. > > Perhaps my olde AudioConverter can finally see the light of day? > > if (is_rev_processed() { > ... > } else { > render_converter_->Convert(src, ..., dest); > } > > where you only create render_converter_ in InitializeLocked() when appropriate. > > Alex, what do you think? We could back out the output_config addition, but then > we'd still then need at least a copy from src to dest here, so it doesn't help a > lot. I like your proposal. I don't think that dropping the out config helps, so I would keep it, but I don't have a strong opinion.
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:726: if (is_rev_processed()) { On 2015/07/31 16:51:03, aluebs-webrtc wrote: > On 2015/07/31 01:22:42, andrew wrote: > > So, we need to change this part. This is what I meant about being consistent: > if > > the user requests a different output config, you should provide that > regardless > > of processing. > > > > However, if we have no processing and the user requests a different output > > config, we need to convert from src not render_audio_. Otherwise, say the user > > passes in 48 kHz and they request 44.1 kHz, they'll actually get 16 kHz audio > > upsampled to 44.1. > > > > Perhaps my olde AudioConverter can finally see the light of day? > > > > if (is_rev_processed() { > > ... > > } else { > > render_converter_->Convert(src, ..., dest); > > } > > > > where you only create render_converter_ in InitializeLocked() when > appropriate. > > > > Alex, what do you think? We could back out the output_config addition, but > then > > we'd still then need at least a copy from src to dest here, so it doesn't help > a > > lot. > > I like your proposal. I don't think that dropping the out config helps, so I > would keep it, but I don't have a strong opinion. I was intrigued and added the AudioConverter method, but kept interface the same for now. https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:773: if (is_rev_processed()) { On 2015/07/31 01:22:42, andrew wrote: > You can leave this one, because the user has no way to specify an output config. Acknowledged. https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... 
File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == Added these constraints because CommonFormats/AudioProcessingTest tests 2 input 0 output channels case. Maybe better to just declare this case invalid in audio_processing.h as in ProcessStream case, and fix the test.
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:10: Re-add blank line. https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == On 2015/08/01 04:21:37, ekm wrote: > Added these constraints because CommonFormats/AudioProcessingTest tests 2 input > 0 output channels case. Maybe better to just declare this case invalid in > audio_processing.h as in ProcessStream case, and fix the test. I think 0 channels just means it doesn't need any output, so you are doing the right thing. Andrew?
Patchset #14 (id:330001) has been deleted
https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/290001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:10: On 2015/08/03 16:30:49, aluebs-webrtc wrote: > Re-add blank line. Done.
Had a first look at the new stuff. I still think we need to unit test the render side conversion. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:72: void CopyAudio(const T* const* src, Perhaps call this CopyAudioIfNeeded() and check if the ptrs point to different memory. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:73: int samples_per_channel, num_frames is the new way. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:80: dest_channel[j] = src_channel[j]; Do a memcpy (or std::copy) instead of individual assignment, so if (src[i] != dest[i]) { std::copy(src[i], src[i] + num_frames, dest[i]); } https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:743: CopyAudio<float>(src, reverse_input_config.num_frames(), You shouldn't need the explicit float; can be deduced from parameter types.
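The helper discussed above, renamed CopyAudioIfNeeded() and skipping channels whose pointers already match, could look roughly like this (a sketch following the review discussion; the exact landed signature may differ):

```cpp
#include <algorithm>

// Copies each channel from |src| to |dest| only when the two channel
// pointers differ, using std::copy per Andrew's suggestion above.
// |num_frames| is the number of samples per channel.
template <typename T>
void CopyAudioIfNeeded(const T* const* src,
                       int num_frames,
                       int num_channels,
                       T* const* dest) {
  for (int i = 0; i < num_channels; ++i) {
    if (src[i] != dest[i]) {
      std::copy(src[i], src[i] + num_frames, dest[i]);
    }
  }
}
```

Because the per-channel pointer comparison happens inside the helper, callers can pass the same buffers for input and output and the copy becomes a no-op.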
https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == On 2015/08/03 16:30:50, aluebs-webrtc wrote: > On 2015/08/01 04:21:37, ekm wrote: > > Added these constraints because CommonFormats/AudioProcessingTest tests 2 > input > > 0 output channels case. Maybe better to just declare this case invalid in > > audio_processing.h as in ProcessStream case, and fix the test. > > I think 0 channels just means it doesn't need any output, so you are doing the > right thing. Andrew? If we want to be consistent, I think the test is invalid. If you don't want any conversion, you should supply the same output as input config.
Patchset #15 (id:370001) has been deleted
Patchset #15 (id:390001) has been deleted
Extended (AudioProcessingTest, Formats) to include render stream conversion test. https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/310001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:1090: ((api_format_.reverse_output_stream().num_channels() == On 2015/08/05 06:22:39, andrew wrote: > On 2015/08/03 16:30:50, aluebs-webrtc wrote: > > On 2015/08/01 04:21:37, ekm wrote: > > > Added these constraints because CommonFormats/AudioProcessingTest tests 2 > > input > > > 0 output channels case. Maybe better to just declare this case invalid in > > > audio_processing.h as in ProcessStream case, and fix the test. > > > > I think 0 channels just means it doesn't need any output, so you are doing the > > right thing. Andrew? > > If we want to be consistent, I think the test is invalid. If you don't want any > conversion, you should supply the same output as input config. Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:72: void CopyAudio(const T* const* src, On 2015/08/05 06:18:18, andrew wrote: > Perhaps call this CopyAudioIfNeeded() and check if the ptrs point to different > memory. Nice. Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:73: int samples_per_channel, On 2015/08/05 06:18:17, andrew wrote: > num_frames is the new way. Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/common_audio/incl... 
webrtc/common_audio/include/audio_util.h:80: dest_channel[j] = src_channel[j]; On 2015/08/05 06:18:18, andrew wrote: > Do a memcpy (or std::copy) instead of individual assignment, so > > if (src[i] != dest[i]) { > std::copy(src[i], src[i] + num_frames, dest[i]); > } > Done. https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/350001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:743: CopyAudio<float>(src, reverse_input_config.num_frames(), On 2015/08/05 06:18:18, andrew wrote: > You shouldn't need the explicit float; can be deduced from parameter types. Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { The following 124 lines should be indented, but codereview couldn't figure out how to match up the diffs. So to make reviewing easier, I left unindented for now.
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:70: // allocated in the |dest| channels. Comment something about that there is no copy done when they are the same pointer. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:65: enum StreamDirection { kFwd = 0, kRev }; How about spelling these out: kForward and kReverse? https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2488: {2, 2, 2, 2}, If this is supposed to be an exhaustive list, you are missing the cases {1, 1, 2, 2}, {2, 1, 2, 2}, {2, 2, 2, 1}, right? But I am not sure if it should be exhaustive, since it is already a slow test. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { On 2015/08/07 06:06:00, ekm wrote: > The following 124 lines should be indented, but codereview couldn't figure out > how to match up the diffs. So to make reviewing easier, I left unindented for > now. Thanks :) Just remember to change it before landing. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2512: } I think it is clearer to declare and define each variable in the same line as: int in_rate = file_direction == kFwd ? input_rate_ : reverse_input_rate_; But I don't have a strong opinion. 
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2732: std::tr1::make_tuple(16000, 16000, 16000, 16000, 0, 0))); Here I have the same question about making them exhaustive. Andrew?
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:70: // allocated in the |dest| channels. Add a comment here about what "needed" means. "Skips the copy when a |src| and |dest| channel point to the same address." https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:740: render_converter_->Convert(src, reverse_input_config.num_frames() * Perhaps add a helper to StreamConfig? int num_samples() const { return num_channels_ * num_frames_; } https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2449: ChannelBuffer<float>* cb_write = file_direction ? &rev_out_cb : &out_cb; nit: could make this a const reference. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { On 2015/08/07 06:06:00, ekm wrote: > The following 124 lines should be indented, but codereview couldn't figure out > how to match up the diffs. So to make reviewing easier, I left unindented for > now. Nice, thanks. This looks pretty good, but I see one issue: you're doubling the number of test cases and only verifying forward or reverse on any particular case, right? Would it be possible to verify both at once without doubling the test duration? 
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2595: // of reference vs output. Move to previous line.
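The StreamConfig::num_samples() helper suggested above could be sketched as follows (a simplified stand-in: the field names and the 10 ms frame assumption mirror APM conventions but are not the exact class):

```cpp
// Simplified sketch of APM's StreamConfig with the proposed helper.
// APM processes audio in 10 ms chunks, so frames per chunk is rate / 100.
struct StreamConfig {
  int sample_rate_hz;
  int num_channels;
  int num_frames() const { return sample_rate_hz / 100; }
  // Proposed helper: total samples across all channels in one chunk.
  int num_samples() const { return num_channels * num_frames(); }
};
```

With this, the Convert() call site can pass config.num_samples() instead of multiplying frames by channels inline.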
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2488: {2, 2, 2, 2}, On 2015/08/10 19:14:18, aluebs-webrtc wrote: > If this is supposed to be an exhaustive list, you are missing the cases {1, 1, > 2, 2}, {2, 1, 2, 2}, {2, 2, 2, 1}, right? But I am not sure if it should be > exhaustive, since it is already a slow test. Right, I don't think there's a lot of value in making it exhaustive, since the two paths are fairly independent for the purposes of this test. He has all the permitted reverse combinations {1, 1}, {2, 1}, {2, 2}, so I think we're good. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2732: std::tr1::make_tuple(16000, 16000, 16000, 16000, 0, 0))); On 2015/08/10 19:14:18, aluebs-webrtc wrote: > Here I have the same question about making them exhaustive. Andrew? Right, it's not supposed to be exhaustive. It looks like what you have covers the interesting cases.
Patchset #16 (id:430001) has been deleted
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... File webrtc/common_audio/include/audio_util.h (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/common_audio/incl... webrtc/common_audio/include/audio_util.h:70: // allocated in the |dest| channels. On 2015/08/10 19:14:18, aluebs-webrtc wrote: > Comment something about that there is no copy done when they are the same > pointer. Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:740: render_converter_->Convert(src, reverse_input_config.num_frames() * On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > Perhaps add a helper to StreamConfig? > int num_samples() const { return num_channels_ * num_frames_; } Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:65: enum StreamDirection { kFwd = 0, kRev }; On 2015/08/10 19:14:18, aluebs-webrtc wrote: > How about spelling these out: kForward and kReverse? Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2449: ChannelBuffer<float>* cb_write = file_direction ? &rev_out_cb : &out_cb; On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > nit: could make this a const reference. Removed. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... 
webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2488: {2, 2, 2, 2}, On 2015/08/10 19:30:51, andrew_ooo_to_aug14 wrote: > On 2015/08/10 19:14:18, aluebs-webrtc wrote: > > If this is supposed to be an exhaustive list, you are missing the cases {1, 1, > > 2, 2}, {2, 1, 2, 2}, {2, 2, 2, 1}, right? But I am not sure if it should be > > exhaustive, since it is already a slow test. > > Right, I don't think there's a lot of value in making it exhaustive, since the > two paths are fairly independent for the purposes of this test. He has all the > permitted reverse combinations {1, 1}, {2, 1}, {2, 2}, so I think we're good. Acknowledged. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2492: for (auto file_direction : {kFwd, kRev}) { On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > On 2015/08/07 06:06:00, ekm wrote: > > The following 124 lines should be indented, but codereview couldn't figure out > > how to match up the diffs. So to make reviewing easier, I left unindented for > > now. > > Nice, thanks. > > This looks pretty good, but I see one issue: you're doubling the number of test > cases and only verifying forward or reverse on any particular case, right? Would > it be possible to verify both at once without doubling the test duration? Done. Yep, switched up ProcessFormat to write both files at once, and moved the first for loop here after ProcessFormat so the processing is done only once for each case. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2512: } On 2015/08/10 19:14:18, aluebs-webrtc wrote: > I think it is clearer to declare and define each variable in the same line as: > int in_rate = file_direction == kFwd ? input_rate_ : reverse_input_rate_; > But I don't have a strong opinion. Done. 
https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2595: // of reference vs output. On 2015/08/10 19:24:05, andrew_ooo_to_aug14 wrote: > Move to previous line. Done. https://codereview.webrtc.org/1234463003/diff/410001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2732: std::tr1::make_tuple(16000, 16000, 16000, 16000, 0, 0))); On 2015/08/10 19:30:51, andrew_ooo_to_aug14 wrote: > On 2015/08/10 19:14:18, aluebs-webrtc wrote: > > Here I have the same question about making them exhaustive. Andrew? > > Right, it's not supposed to be exhaustive. It looks like what you have covers > the interesting cases. Acknowledged.
lgtm
Sorry for the long delay. I also just browsed the CL. I think you got plenty of feedback from the other reviewers. I have only one concern, and that is the "pointer to a const pointer to a float" and similar. I have a hard time wrapping my head around those. Is it possible to simplify somewhat? Const references instead? Anyway, when the others approve, you have my rubber-stamp lgtm. https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:751: const float* const* src, So, this is a "pointer to a const pointer to a float"? The mind boggles... Any chance this can be simplified with a reference?
lgtm. Nice work Elliot. https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2510: int in_rate = file_direction ? reverse_input_rate_ : input_rate_; You can make these const now. If it makes things overall more readable, consider extracting the contents of this loop to a function VerifyProcessedAudio or something. Up to you.
https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:751: const float* const* src, On 2015/08/12 19:23:58, hlundin-webrtc wrote: > So, this is a "pointer to a const pointer to a float"? The mind boggles... pointer to a const pointer to a const float :) > > Any chance this can be simplified with a reference? No, because both pointees are arrays which should be dereferenced. We could use const float* const src[], but not sure that makes anything more clear. If we were going to change this, we should probably consider taking a vector<vector<float>> instead. The current approach is more flexible, and is used elsewhere in the codebase (see e.g. common_audio/include/audio_util.h) https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:754: const float* const* dest) { Ah, I missed this. This function doesn't need the dest parameter, right? (The type is wrong anyway.)
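To illustrate the type being discussed: a hypothetical read-only helper (ChannelSum is invented for this example, not APM code) takes `const float* const*` so that it can modify neither the array of channel pointers nor the samples they point to:

```cpp
#include <cstddef>

// |channels| points to num_channels arrays of num_frames samples each.
// The double const guarantees this function is purely a reader: neither
// the channel pointers nor the sample data can be written through it.
float ChannelSum(const float* const* channels,
                 size_t num_channels,
                 size_t num_frames) {
  float sum = 0.0f;
  for (size_t c = 0; c < num_channels; ++c) {
    for (size_t i = 0; i < num_frames; ++i) {
      sum += channels[c][i];
    }
  }
  return sum;
}
```

A `float**` or `const float**` argument converts implicitly to this parameter type, which is why the deinterleaved-audio APIs in common_audio can use it without forcing casts on callers.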
On 2015/08/12 20:06:25, andrew_ooo_to_aug14 wrote: > https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... > File webrtc/modules/audio_processing/audio_processing_impl.cc (right): > > https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... > webrtc/modules/audio_processing/audio_processing_impl.cc:751: const float* > const* src, > On 2015/08/12 19:23:58, hlundin-webrtc wrote: > > So, this is a "pointer to a const pointer to a float"? The mind boggles... > > pointer to a const pointer to a const float :) > > > > Any chance this can be simplified with a reference? > > No, because both pointees are arrays which should be dereferenced. Fair enough. Just wanted to ask. > > We could use const float* const src[], but not sure that makes anything more > clear. > > If we were going to change this, we should probably consider taking a > vector<vector<float>> instead. The current approach is more flexible, and is > used elsewhere in the codebase (see e.g. common_audio/include/audio_util.h) > > https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... > webrtc/modules/audio_processing/audio_processing_impl.cc:754: const float* > const* dest) { > Ah, I missed this. This function doesn't need the dest parameter, right? (The > type is wrong anyway.)
https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/audio_processing_impl.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/audio_processing_impl.cc:754: const float* const* dest) { On 2015/08/12 20:06:25, andrew_ooo_to_aug14 wrote: > Ah, I missed this. This function doesn't need the dest parameter, right? (The type is wrong anyway.) Done. https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/450001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2510: int in_rate = file_direction ? reverse_input_rate_ : input_rate_; On 2015/08/12 19:47:45, andrew_ooo_to_aug14 wrote: > You can make these const now. > > If it makes things overall more readable, consider extracting the contents of this loop to a function VerifyProcessedAudio or something. Up to you. Done. Decided against extracting to a function; added a comment instead. I think readability is fine, and it's nicer to access members of AudioProcessingTest as is. Extracting a function may become the cleaner option once we switch to StreamConfigs and ProcessingConfigs throughout. https://codereview.webrtc.org/1234463003/diff/470001/webrtc/modules/audio_pro... File webrtc/modules/audio_processing/test/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1234463003/diff/470001/webrtc/modules/audio_pro... webrtc/modules/audio_processing/test/audio_processing_unittest.cc:2522: const int min_ref_rate = std::min(in_rate, out_rate); Changed above to const. The only change below is indenting. Not sure why it can't figure it out...
The CQ bit was checked by ekmeyerson@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrika@webrtc.org, henrik.lundin@webrtc.org, andrew@webrtc.org, aluebs@webrtc.org Link to the patchset: https://codereview.webrtc.org/1234463003/#ps470001 (title: "Addressed comments; format")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1234463003/470001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1234463003/470001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: presubmit on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/presubmit/builds/346)
ekmeyerson@webrtc.org changed reviewers: + pthatcher@webrtc.org
Adding pthatcher as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h. Hi Peter, just a small interface change affecting you here.
ekmeyerson@webrtc.org changed reviewers: + pbos@webrtc.org
+pbos as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h. If either of you could take a look, that'd be great.
On 2015/08/13 18:49:58, ekm wrote: > +pbos as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h > > If either of you could take a look that'd be great. fyi: pbos is OOO for another week.
On 2015/08/14 09:34:05, hlundin-webrtc wrote: > On 2015/08/13 18:49:58, ekm wrote: > > +pbos as an owner of talk/media/webrtc/fakewebrtcvoiceengine.h > > > > If either of you could take a look that'd be great. > > fyi: pbos is OOO for another week. Just TBR this to him and land it. The talk/ change is trivial.
The CQ bit was checked by ekmeyerson@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, andrew@webrtc.org, aluebs@webrtc.org, henrika@webrtc.org Link to the patchset: https://codereview.webrtc.org/1234463003/#ps530001 (title: "Fix Mac Error (3)")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1234463003/530001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1234463003/530001
Message was sent while issue was closed.
Committed patchset #20 (id:530001)
Message was sent while issue was closed.
Patchset 20 (id:??) landed as https://crrev.com/60d9b332a5391045439bfb6a3a5447973e3d5603 Cr-Commit-Position: refs/heads/master@{#9713}
Message was sent while issue was closed.
On 2015/08/14 17:36:11, commit-bot: I haz the power wrote: > Patchset 20 (id:??) landed as > https://crrev.com/60d9b332a5391045439bfb6a3a5447973e3d5603 > Cr-Commit-Position: refs/heads/master@{#9713} talk/ rs-lgtm
Message was sent while issue was closed.
henrika@webrtc.org changed reviewers: + henrika@webrtc.org
Message was sent while issue was closed.
Posting old comment. https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); Are we adding complexity for all cases here?
Message was sent while issue was closed.
https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... File webrtc/voice_engine/output_mixer.cc (right): https://codereview.webrtc.org/1234463003/diff/1/webrtc/voice_engine/output_mi... webrtc/voice_engine/output_mixer.cc:589: RemixAndResample(frame, &audioproc_resampler_, &_audioFrame); On 2015/09/25 10:44:53, henrika_webrtc wrote: > Are we adding complexity for all cases here? Yes, this was a problem, but this is an old patchset and no longer relevant. |