webrtc/modules/audio_processing/vad/voice_activity_detector_unittest.cc - Issue 1181933002: Pull the Voice Activity Detector out from the AGC


Issue 1181933002: Pull the Voice Activity Detector out from the AGC (Closed) Base URL: https://chromium.googlesource.com/external/webrtc.git@master

Patch Set: Test without relying on golden output Created 5 years, 6 months ago


OLD	NEW
(Empty)
	1 /*

	2 * Copyright (c) 2015 The WebRTC project authors. All Rights Reserved.

	3 *

	4 * Use of this source code is governed by a BSD-style license

	5 * that can be found in the LICENSE file in the root of the source

	6 * tree. An additional intellectual property rights grant can be found

	7 * in the file PATENTS. All contributing project authors may

	8 * be found in the AUTHORS file in the root of the source tree.

	9 */

	10

	11 #include "webrtc/modules/audio_processing/vad/voice_activity_detector.h"

	12

	13 #include <algorithm>

	14

	15 #include "testing/gtest/include/gtest/gtest.h"

	16 #include "webrtc/test/testsupport/fileutils.h"

	17

	18 namespace webrtc {

	19 namespace {

	20

	21 const double kDefaultVoiceValue = 1.0;

	22 const float kMeanSpeechProbability = 0.3f;

	23 const float kMaxNoiseProbability = 0.05f;
	Andrew MacDonald (2015/06/17 04:14:57): Move these three to the tests where they're used. And just curious, do you actually see values above 0.0?
	aluebs-webrtc (2015/06/17 17:22:03): Done. And yes, there are a few (2 or 3) values which come up to 0.04 or something.
	24 const size_t kNumChunks = 100u;

	25 const size_t kNumChunksPerIsacBlock = 3;

	26

	27 void GenerateNoise(int16_t* data, size_t length) {

	28 for (size_t i = 0; i < length; ++i) {

	29 data[i] = std::rand();
	Andrew MacDonald (2015/06/17 04:14:57): std::rand returns between 0 and RAND_MAX. I guess this will work because it wraps into some random place. Might want to add a comment though.
	aluebs-webrtc (2015/06/17 17:22:04): Yes, I know, but I assumed the wrapping would still give me noise. Added comment.
	30 }

	31 }
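
The wrap-around discussed on line 29 can be made explicit. The following is a minimal sketch, not the code that landed: `GenerateNoiseSketch` is a hypothetical name, and it also takes a vector, as the nit on line 75 further down suggests, so that no separate length is passed. It assumes two's complement conversion when reinterpreting the low 16 bits as a signed sample.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical variant of GenerateNoise, for illustration only.
// Fills every element of |data| with pseudo-random 16-bit samples.
void GenerateNoiseSketch(std::vector<std::int16_t>* data) {
  assert(data != nullptr);
  for (std::int16_t& sample : *data) {
    // std::rand() returns an int in [0, RAND_MAX]. Keep only the low 16 bits
    // so the wrap-around is explicit instead of an implicit narrowing.
    const std::uint16_t low_bits =
        static_cast<std::uint16_t>(std::rand() & 0xFFFF);
    // Reinterpret as signed. Two's complement is assumed (guaranteed since
    // C++20, implementation-defined before that).
    sample = static_cast<std::int16_t>(low_bits);
  }
}
```

Seeding with `std::srand(42)` before the call, as the test does, keeps the sequence repeatable on a given platform. On platforms where RAND_MAX is 32767, bit 15 is never set, so this sketch would only produce non-negative samples there.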

	32

	33 } // namespace

	34

	35 TEST(VoiceActivityDetectorTest, ConstructorSetsDefaultValues) {

	36 VoiceActivityDetector vad;

	37

	38 std::vector<double> p = vad.chunkwise_voice_probabilities();

	39 std::vector<double> rms = vad.chunkwise_rms();

	40

	41 EXPECT_EQ(p.size(), 0u);

	42 EXPECT_EQ(rms.size(), 0u);

	43

	44 EXPECT_DOUBLE_EQ(vad.last_voice_probability(), kDefaultVoiceValue);
	Andrew MacDonald (2015/06/17 04:14:57): EXPECT_FLOAT_EQ
	aluebs-webrtc (2015/06/17 17:22:03): I forgot to change this after the double->float change. Thanks for catching that.
	45 }
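
As background for the EXPECT_FLOAT_EQ comment on line 44: the default value 1.0 is exactly representable as both float and double, so that particular check would pass with either macro. The mismatch matters for values that are not exactly representable. The sketch below illustrates this in plain C++ without gtest; `WidensExactlyTo` and `DemonstrateWidening` are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: true when |f|, widened to double, is exactly the same
// value as |d|.
bool WidensExactlyTo(float f, double d) {
  return static_cast<double>(f) == d;
}

void DemonstrateWidening() {
  // 1.0 has an exact binary representation in both types.
  assert(WidensExactlyTo(1.0f, 1.0));
  // 0.3 does not, and float and double round it to different values.
  assert(!WidensExactlyTo(0.3f, 0.3));
}
```

The gap between `0.3f` and `0.3` is on the order of 1e-8, far beyond the few-ULP double tolerance that EXPECT_DOUBLE_EQ allows, which is why matching the macro to the returned type is the safer habit.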

	46

	47 TEST(VoiceActivityDetectorTest, SpeechHasHighVoiceProbabilities) {

	48 VoiceActivityDetector vad;

	49

	50 int16_t data[kLength10Ms];

	51 float mean_probability = 0.f;

	52

	53 FILE* pcm_file =

	54 fopen(test::ResourcePath("audio_processing/agc/agc_audio", "pcm").c_str(),
	Andrew MacDonald (2015/06/17 04:14:57): What is actually in this file? Is it mostly speech? I'm a bit surprised you can't make the threshold higher.
	aluebs-webrtc (2015/06/17 17:22:03): Yes, there are some utterances with some silence in between. Because of the silences, noise and non-voiced speech I am actually not surprised that the threshold is that low. It actually is more like 0.39, but I used 0.3 to be more robust. Maybe I can tweak that a little more, WDYT? Also, I could measure max_probability instead of mean_probability, which will definitely be higher, but I think this gives a better picture. WDYT?
	55 "rb");

	56 ASSERT_TRUE(pcm_file != NULL);

	57

	58 size_t num_chunks = 0;

	59 while (fread(data, sizeof(*data), kLength10Ms, pcm_file) == kLength10Ms) {

	60 vad.ProcessChunk(data, kLength10Ms, kSampleRateHz);

	61

	62 mean_probability += vad.last_voice_probability();

	63

	64 ++num_chunks;

	65 }

	66

	67 mean_probability /= num_chunks;

	68

	69 EXPECT_GT(mean_probability, kMeanSpeechProbability);

	70 }
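
The trade-off raised in the thread on line 54, mean versus maximum probability, can be sketched as follows. `ProbabilityStats` and `Summarize` are hypothetical names, and any input values used with them are made up for illustration rather than measured from the PCM file.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical summary of per-chunk voice probabilities.
struct ProbabilityStats {
  float mean;
  float max;
};

ProbabilityStats Summarize(const std::vector<float>& probabilities) {
  // Guard the division: with zero chunks the mean is undefined.
  assert(!probabilities.empty());
  float sum = 0.f;
  float max = 0.f;
  for (float p : probabilities) {
    sum += p;
    max = std::max(max, p);
  }
  return {sum / static_cast<float>(probabilities.size()), max};
}
```

For an input such as `{0.9f, 0.0f, 0.0f, 0.9f}`, the silent chunks pull the mean down to about 0.45 while the maximum stays at 0.9. That is consistent with the author's point: the mean reflects the silences in the recording, whereas the maximum would pass as long as a single chunk scored high.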

	71

	72 TEST(VoiceActivityDetectorTest, NoiseHasLowVoiceProbabilities) {

	73 VoiceActivityDetector vad;

	74

	75 int16_t data[kLength10Ms];
	Andrew MacDonald (2015/06/17 04:14:56): nit: Use a vector? Then you don't need to pass the length separately to GenerateNoise.
	aluebs-webrtc (2015/06/17 17:22:03): Done.
	76 float max_probability = 0.f;

	77

	78 std::srand(42);

	79

	80 for (size_t i = 0; i < kNumChunks; ++i) {

	81 GenerateNoise(data, kLength10Ms);

	82

	83 vad.ProcessChunk(data, kLength10Ms, kSampleRateHz);

	84

	85 if (i > kNumChunksPerIsacBlock) {
	Andrew MacDonald (2015/06/17 04:14:57): Do you need to know about this? Why not just check every chunk?
	aluebs-webrtc (2015/06/17 17:22:04): Because before the vad has enough data to process an ISAC block it will return the default value, 1.f, which would ruin the max_probability value.
	Andrew MacDonald (2015/06/17 21:15:14): Ah OK. Perhaps add a comment to that effect.
	aluebs-webrtc (2015/06/18 00:49:21): Done.
	86 max_probability = std::max(max_probability, vad.last_voice_probability());

	87 }

	88 }

	89

	90 EXPECT_LT(max_probability, kMaxNoiseProbability);

	91 }
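
The reason for skipping the first chunks, explained in the thread on line 85, can be illustrated with a small sketch. `MaxAfterWarmup` and `DemonstrateWarmup` are hypothetical names and the probability values are made up. The only assumption taken from the review is that the detector reports the default 1.f until it has processed its first ISAC block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maximum over |probabilities|, ignoring the first
// |warmup_chunks| entries. Returns 0 if nothing is left after the warm-up.
float MaxAfterWarmup(const std::vector<float>& probabilities,
                     std::size_t warmup_chunks) {
  float max_probability = 0.f;
  for (std::size_t i = warmup_chunks; i < probabilities.size(); ++i) {
    max_probability = std::max(max_probability, probabilities[i]);
  }
  return max_probability;
}

void DemonstrateWarmup() {
  // Made-up values: three chunks at the 1.f default, then low noise scores.
  const std::vector<float> p = {1.f, 1.f, 1.f, 0.02f, 0.04f, 0.01f};
  assert(MaxAfterWarmup(p, 0) == 1.f);    // The default value dominates.
  assert(MaxAfterWarmup(p, 3) == 0.04f);  // Warm-up chunks are skipped.
}
```

Note that the condition in the test, `i > kNumChunksPerIsacBlock`, skips indices 0 through 3, which is one chunk more than the three in an ISAC block.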

	92

	93 } // namespace webrtc
