webrtc/modules/audio_processing/include/audio_processing.h - Issue 1234463003: Integrate Intelligibility with APM

Side by Side Diff: webrtc/modules/audio_processing/include/audio_processing.h

Issue 1234463003: Integrate Intelligibility with APM (Closed) Base URL: https://chromium.googlesource.com/external/webrtc.git@master

Patch Set: Addr. comments from aluebs (incl. made ProcessReverseStream nicer) Created 5 years, 5 months ago


OLD	NEW
1 /*	1 /*

2 * Copyright (c) 2012 The WebRTC project authors. All Rights Reserved.	2 * Copyright (c) 2012 The WebRTC project authors. All Rights Reserved.

3 *	3 *

4 * Use of this source code is governed by a BSD-style license	4 * Use of this source code is governed by a BSD-style license

5 * that can be found in the LICENSE file in the root of the source	5 * that can be found in the LICENSE file in the root of the source

6 * tree. An additional intellectual property rights grant can be found	6 * tree. An additional intellectual property rights grant can be found

7 * in the file PATENTS. All contributing project authors may	7 * in the file PATENTS. All contributing project authors may

8 * be found in the AUTHORS file in the root of the source tree.	8 * be found in the AUTHORS file in the root of the source tree.

9 */	9 */

10	10

(...skipping 104 matching lines...)
115	115

116 // Use to enable 48kHz support in audio processing. Must be provided through the	116 // Use to enable 48kHz support in audio processing. Must be provided through the

117 // constructor. It will have no impact if used with	117 // constructor. It will have no impact if used with

118 // AudioProcessing::SetExtraOptions().	118 // AudioProcessing::SetExtraOptions().

119 struct AudioProcessing48kHzSupport {	119 struct AudioProcessing48kHzSupport {

120 AudioProcessing48kHzSupport() : enabled(true) {}	120 AudioProcessing48kHzSupport() : enabled(true) {}

121 explicit AudioProcessing48kHzSupport(bool enabled) : enabled(enabled) {}	121 explicit AudioProcessing48kHzSupport(bool enabled) : enabled(enabled) {}

122 bool enabled;	122 bool enabled;

123 };	123 };

124	124

	125 // Use to enable intelligibility enhancer in audio processing. It can be set

	126 // in the constructor or using AudioProcessing::SetExtraOptions().
	Andrew MacDonald 2015/07/21 19:29:22
	Please don't allow it to be set through SetExtraOptions. We want to remove this method. I realize this won't allow you to enable it through voice engine for now, but I think that's an acceptable compromise.

	Andrew MacDonald 2015/07/22 02:09:48
	Ah, my mistake. You can still enable it in voice engine, but it makes it harder through the crappy voe_cmd_test. Put up a CL here which shows how you could do it through agc_harness at least: https://codereview.webrtc.org/1247033006/

	ekm 2015/07/23 00:26:28
	Got it. Thanks for the pointer. Can also at least enable it in voe_cmd_test at compile time. On a similar note, what's the status of the Enable method that several of the apm components provide?
	127 struct Intelligibility {

	128 Intelligibility() : enabled(false) {}

	129 explicit Intelligibility(bool enabled) : enabled(enabled) {}

	130 bool enabled;

	131 };
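
	The option struct above defaults to disabled, so a caller that never sets it gets the old behaviour. A minimal sketch of how such a typed option could be looked up with a default fallback follows. `SimpleConfig` is a hypothetical stand-in written for this illustration and is not the WebRTC configuration class; only the `Intelligibility` struct is taken from the patch.

```cpp
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>

// Copied from the patch: the enhancer is off unless explicitly requested.
struct Intelligibility {
  Intelligibility() : enabled(false) {}
  explicit Intelligibility(bool enabled) : enabled(enabled) {}
  bool enabled;
};

// Hypothetical typed option container (an assumption for this sketch,
// NOT the real WebRTC class). Options are keyed by their C++ type.
class SimpleConfig {
 public:
  // Stores a copy of |value|, replacing any previous option of type T.
  template <typename T>
  void Set(const T& value) {
    options_[std::type_index(typeid(T))] = std::make_shared<T>(value);
  }

  // Returns the stored option, or a default-constructed T if none was set.
  template <typename T>
  T Get() const {
    auto it = options_.find(std::type_index(typeid(T)));
    if (it == options_.end()) return T();
    return *std::static_pointer_cast<T>(it->second);
  }

 private:
  std::map<std::type_index, std::shared_ptr<void>> options_;
};
```

	With this shape, `config.Get<Intelligibility>().enabled` is false until `config.Set(Intelligibility(true))` has been called, which is the behaviour the default constructor is meant to give.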

	132

125 static const int kAudioProcMaxNativeSampleRateHz = 32000;	133 static const int kAudioProcMaxNativeSampleRateHz = 32000;

126	134

127 // The Audio Processing Module (APM) provides a collection of voice processing	135 // The Audio Processing Module (APM) provides a collection of voice processing

128 // components designed for real-time communications software.	136 // components designed for real-time communications software.

129 //	137 //

130 // APM operates on two audio streams on a frame-by-frame basis. Frames of the	138 // APM operates on two audio streams on a frame-by-frame basis. Frames of the

131 // primary stream, on which all processing is applied, are passed to	139 // primary stream, on which all processing is applied, are passed to

132 // |ProcessStream()|. Frames of the reverse direction stream, which are used for	140 // |ProcessStream()|. Frames of the reverse direction stream, which are used for

133 // analysis by some components, are passed to |AnalyzeReverseStream()|. On the	141 // analysis by some components, are passed to |AnalyzeReverseStream()|. On the

134 // client-side, this will typically be the near-end (capture) and far-end	142 // client-side, this will typically be the near-end (capture) and far-end

(...skipping 181 matching lines...)
316 // members of |frame| must be valid. |sample_rate_hz_| must correspond to	324 // members of |frame| must be valid. |sample_rate_hz_| must correspond to

317 // |input_sample_rate_hz()|	325 // |input_sample_rate_hz()|

318 //	326 //

319 // TODO(ajm): add const to input; requires an implementation fix.	327 // TODO(ajm): add const to input; requires an implementation fix.

320 virtual int AnalyzeReverseStream(AudioFrame* frame) = 0;	328 virtual int AnalyzeReverseStream(AudioFrame* frame) = 0;

321	329

322 // Accepts deinterleaved float audio with the range [-1, 1]. Each element	330 // Accepts deinterleaved float audio with the range [-1, 1]. Each element

323 // of |data| points to a channel buffer, arranged according to |layout|.	331 // of |data| points to a channel buffer, arranged according to |layout|.

324 virtual int AnalyzeReverseStream(const float* const* data,	332 virtual int AnalyzeReverseStream(const float* const* data,

325 int samples_per_channel,	333 int samples_per_channel,

326 int sample_rate_hz,	334 int rev_sample_rate_hz,

	335 ChannelLayout layout) = 0;

	336

	337 // Same as AnalyzeReverseStream, but may modify |data| if intelligibility

	338 // is enabled.

	339 virtual int ProcessReverseStream(float* const* data,

	340 int samples_per_channel,

	341 int rev_sample_rate_hz,

327 ChannelLayout layout) = 0;	342 ChannelLayout layout) = 0;
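
	The two declarations above differ only in the constness of the sample data: `const float* const*` for analysis versus `float* const*` for processing. The sketch below illustrates that contract with free functions. The function names, the energy measure and the fixed gain are all placeholders invented for this example; they are not the APM implementation or the intelligibility algorithm.

```cpp
// Placeholder gain, chosen only so the in-place change is easy to see.
const float kPlaceholderGain = 0.5f;

// Read-only pass: the samples are const, so render audio cannot be altered.
// Returns the summed energy over all channels (a stand-in for "analysis").
float AnalyzeReverseSketch(const float* const* data,
                           int num_channels,
                           int samples_per_channel) {
  float energy = 0.f;
  for (int ch = 0; ch < num_channels; ++ch) {
    for (int i = 0; i < samples_per_channel; ++i) {
      energy += data[ch][i] * data[ch][i];
    }
  }
  return energy;
}

// In-place pass: same analysis, but the samples may be rewritten first when
// the (hypothetical) enhancer flag is on. With the flag off, the buffers are
// left untouched and the call behaves like the read-only version.
float ProcessReverseSketch(float* const* data,
                           int num_channels,
                           int samples_per_channel,
                           bool enhancer_enabled) {
  if (enhancer_enabled) {
    for (int ch = 0; ch < num_channels; ++ch) {
      for (int i = 0; i < samples_per_channel; ++i) {
        data[ch][i] *= kPlaceholderGain;
      }
    }
  }
  // float* const* converts implicitly to const float* const*.
  return AnalyzeReverseSketch(data, num_channels, samples_per_channel);
}
```

	Taking the data as `float* const*` lets the caller play out the modified buffers directly, at the cost of no longer being able to pass read-only audio.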

328	343

329 // This must be called if and only if echo processing is enabled.	344 // This must be called if and only if echo processing is enabled.

330 //	345 //

331 // Sets the \|delay\| in ms between AnalyzeReverseStream() receiving a far-end	346 // Sets the \|delay\| in ms between AnalyzeReverseStream() receiving a far-end

332 // frame and ProcessStream() receiving a near-end frame containing the	347 // frame and ProcessStream() receiving a near-end frame containing the

333 // corresponding echo. On the client-side this can be expressed as	348 // corresponding echo. On the client-side this can be expressed as

334 // delay = (t_render - t_analyze) + (t_process - t_capture)	349 // delay = (t_render - t_analyze) + (t_process - t_capture)

335 // where,	350 // where,

336 // - t_analyze is the time a frame is passed to AnalyzeReverseStream() and	351 // - t_analyze is the time a frame is passed to AnalyzeReverseStream() and

(...skipping 450 matching lines...)
787 // This does not impact the size of frames passed to |ProcessStream()|.	802 // This does not impact the size of frames passed to |ProcessStream()|.
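
	The delay expression quoted in the header comment before this skipped region is plain arithmetic on four timestamps. A minimal sketch, assuming all four are available in milliseconds on a common clock, is given below. The struct and function names are hypothetical, and the definitions of the individual timestamps are in the elided lines, so they are not restated here.

```cpp
#include <cstdint>

// Hypothetical container for the four timestamps named in the header
// comment, all in milliseconds on the same clock (an assumption).
struct FrameTimestampsMs {
  int64_t t_analyze;
  int64_t t_render;
  int64_t t_capture;
  int64_t t_process;
};

// delay = (t_render - t_analyze) + (t_process - t_capture)
int64_t StreamDelayMs(const FrameTimestampsMs& t) {
  return (t.t_render - t.t_analyze) + (t.t_process - t.t_capture);
}
```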

788 virtual int set_frame_size_ms(int size) = 0;	803 virtual int set_frame_size_ms(int size) = 0;

789 virtual int frame_size_ms() const = 0;	804 virtual int frame_size_ms() const = 0;

790	805

791 protected:	806 protected:

792 virtual ~VoiceDetection() {}	807 virtual ~VoiceDetection() {}

793 };	808 };

794 } // namespace webrtc	809 } // namespace webrtc

795	810

796 #endif // WEBRTC_MODULES_AUDIO_PROCESSING_INCLUDE_AUDIO_PROCESSING_H_	811 #endif // WEBRTC_MODULES_AUDIO_PROCESSING_INCLUDE_AUDIO_PROCESSING_H_
