|
|
Created:
4 years, 10 months ago by aluebs-webrtc Modified:
4 years, 10 months ago CC:
webrtc-reviews_webrtc.org, peah-webrtc, Andrew MacDonald, tterriberry_mozilla.com, audio-team_agora.io, kwiberg-webrtc, minyue-webrtc, the sun, bjornv1 Base URL:
https://chromium.googlesource.com/external/webrtc.git@pow Target Ref:
refs/pending/heads/master Project:
webrtc Visibility:
Public. |
DescriptionUse VAD to get a better speech power estimation in the IntelligibilityEnhancer
R=henrik.lundin@webrtc.org, turaj@webrtc.org
Committed: https://chromium.googlesource.com/external/webrtc/+/18fcbcf48c190b6248cd16ef044d1edb79cba040
Patch Set 1 #
Total comments: 40
Patch Set 2 : Rebasing #Patch Set 3 : Make gain change limit relative #
Total comments: 6
Patch Set 4 : Make sure the gain never becomes zero #Patch Set 5 : Make PowerEstimator a template class #Patch Set 6 : Rebasing #Patch Set 7 : Returned to the horrible solution of having a constant for a size_t one #Patch Set 8 : Use f for float #Dependent Patchsets: Messages
Total messages: 31 (12 generated)
aluebs@webrtc.org changed reviewers: + henrik.lundin@webrtc.org, turaj@webrtc.org
https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:38: const float kGainChangeLimit = 0.1f; // Maximum change in gain. Is kGainChangeLimit relative to current value, or absolute? The naming indicates absolute limit (I haven't checked the implementation), but doesn't relative limit make more sense? If so, please change the name. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:155: MapToErbBands(noise_power_estimator_->power(), I'm confused that why we are back to using PowerEstimator for captured signal. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:323: if (vad_.last_voice_probability() > kVoiceProbabilityThreshold) { I thought we gonna use the energy-based VAD with has binary output. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:325: } else if (chunks_since_voice_ < kSpeechOffsetDelay) { If energy-based VAD is used, do we still need this guard? https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (left): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:51: explicit IntelligibilityEnhancer(const Config& config); ctor with a config struct might come handy for tuning, but it is totally up to you. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:39: void PowerEstimator::Step(const float* data) { This is a utility class so I guess it is fine to be implemented like this. But this is a bit dangerous implementation in the sense that one might call both Step() methods and mess up with averaging. I guess what you need here is a template class. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:40: for (size_t i = 0; i < power_.size(); ++i) { maybe a DCHECK that power_ and data are of the same size. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:46: for (size_t i = 0; i < power_.size(); ++i) { same comment as line 40.
https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:37: const float kDecayRate = 0.97f; // Power estimation decay rate. Two spaces before comment. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:37: const float kDecayRate = 0.97f; // Power estimation decay rate. You change this value from 0.9 to 0.97. Can you explain why? https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:38: const float kGainChangeLimit = 0.1f; // Maximum change in gain. Two spaces before comment. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:39: const float kRho = 0.0004f; This value is also changed... https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:110: float freqs_khz = kClipFreq / 1000.f; This is const too. And it should probably be named kFreqsKhz, since it is compile-time defined. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:111: size_t erb_index = static_cast<size_t>(ceilf( const https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:116: std::vector<int16_t> audio_s16_; What does "s" mean in audio_s16_? https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:39: void PowerEstimator::Step(const float* data) { On 2016/02/13 00:09:43, turaj wrote: > This is a utility class so I guess it is fine to be implemented like this. But > this is a bit dangerous implementation in the sense that one might call both > Step() methods and mess up with averaging. I guess what you need here is a > template class. Acknowledged. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:40: for (size_t i = 0; i < power_.size(); ++i) { On 2016/02/13 00:09:43, turaj wrote: > maybe a DCHECK that power_ and data are of the same size. You don't know the size of |data|. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:35: const float* power() { return &power_[0]; }; Why not return a const std::vector<float>&? https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc:63: size_t samples = std::min(in_stat.st_size, noise_stat.st_size) / 2; const?
https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:37: const float kDecayRate = 0.97f; // Power estimation decay rate. On 2016/02/15 13:05:11, hlundin-webrtc wrote: > Two spaces before comment. Done. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:37: const float kDecayRate = 0.97f; // Power estimation decay rate. On 2016/02/15 13:05:11, hlundin-webrtc wrote: > You change this value from 0.9 to 0.97. Can you explain why? For this algorithm we care about the long-time psd of the speech and not individual phonemes, which is what was happening before this patch. Using 0.97 and knowing that each step is 8ms long we can estimate the time constant in 182ms instead of 52ms. Now I increased it to 0.98, which gives us a time constant of 275ms. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:38: const float kGainChangeLimit = 0.1f; // Maximum change in gain. On 2016/02/15 13:05:11, hlundin-webrtc wrote: > Two spaces before comment. Done. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:38: const float kGainChangeLimit = 0.1f; // Maximum change in gain. On 2016/02/13 00:09:42, turaj wrote: > Is kGainChangeLimit relative to current value, or absolute? The naming indicates > absolute limit (I haven't checked the implementation), but doesn't relative > limit make more sense? If so, please change the name. It was an absolute limit, but I agree that a relative limit makes much more sense, so I changed it to be that way and fixed the name. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:39: const float kRho = 0.0004f; On 2016/02/15 13:05:11, hlundin-webrtc wrote: > This value is also changed... It changed to be squared, which is the only way it is used. I didn't change the naming because the variable containing the squared value before was also called rho_. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:110: float freqs_khz = kClipFreq / 1000.f; On 2016/02/15 13:05:11, hlundin-webrtc wrote: > This is const too. And it should probably be named kFreqsKhz, since it is > compile-time defined. I removed it and updated the constant directly. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:111: size_t erb_index = static_cast<size_t>(ceilf( On 2016/02/15 13:05:11, hlundin-webrtc wrote: > const Done. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:155: MapToErbBands(noise_power_estimator_->power(), On 2016/02/13 00:09:42, turaj wrote: > I'm confused that why we are back to using PowerEstimator for captured signal. To be consistent with the PSD estimation from the spectrum magnitude which looks something like this: mean(x^2). Also, it adds an additional smoothing, which will help to avoid some spurious peaks in the noise estimation. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:323: if (vad_.last_voice_probability() > kVoiceProbabilityThreshold) { On 2016/02/13 00:09:42, turaj wrote: > I thought we gonna use the energy-based VAD with has binary output. As discussed offline, having the pitch-based VAD will have less false positives and by adding a guard we can estimate the PSD for voiced and unvoiced speech, while not adapting to the noise during long periods of silence. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:325: } else if (chunks_since_voice_ < kSpeechOffsetDelay) { On 2016/02/13 00:09:42, turaj wrote: > If energy-based VAD is used, do we still need this guard? No, but I think we should use the pitch-based VAD. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (left): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:51: explicit IntelligibilityEnhancer(const Config& config); On 2016/02/13 00:09:43, turaj wrote: > ctor with a config struct might come handy for tuning, but it is totally up to > you. I dropped num_capture_channels and analysis_rate and tuned decay_rate in this CL. That only leaves the gain_change_limit and rho to be tuned, because sample_rate_hz and num_render_channels are settable parameters. I think in this case this constructor adds more complexity than the simplicity for tuning. But if you have a strong opinion I am happy to add it back. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:116: std::vector<int16_t> audio_s16_; On 2016/02/15 13:05:11, hlundin-webrtc wrote: > What does "s" mean in audio_s16_? Not sure, but I am following the same naming than in audio_util. I am happy to rename it if you suggest a better name for it :) https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:39: void PowerEstimator::Step(const float* data) { On 2016/02/13 00:09:43, turaj wrote: > This is a utility class so I guess it is fine to be implemented like this. But > this is a bit dangerous implementation in the sense that one might call both > Step() methods and mess up with averaging. I guess what you need here is a > template class. Done. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:40: for (size_t i = 0; i < power_.size(); ++i) { On 2016/02/15 13:05:12, hlundin-webrtc wrote: > On 2016/02/13 00:09:43, turaj wrote: > > maybe a DCHECK that power_ and data are of the same size. > > You don't know the size of |data|. As Henrik points out, the size of data is unknown. If you think it is worth it I can pass in the size as another parameter and DCHECK. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:35: const float* power() { return &power_[0]; }; On 2016/02/15 13:05:12, hlundin-webrtc wrote: > Why not return a const std::vector<float>&? Done. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc:63: size_t samples = std::min(in_stat.st_size, noise_stat.st_size) / 2; On 2016/02/15 13:05:12, hlundin-webrtc wrote: > const? Done.
https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (left): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:51: explicit IntelligibilityEnhancer(const Config& config); On 2016/02/19 03:56:31, aluebs-webrtc wrote: > On 2016/02/13 00:09:43, turaj wrote: > > ctor with a config struct might come handy for tuning, but it is totally up to > > you. > > I dropped num_capture_channels and analysis_rate and tuned decay_rate in this > CL. That only leaves the gain_change_limit and rho to be tuned, because > sample_rate_hz and num_render_channels are settable parameters. I think in this > case this constructor adds more complexity than the simplicity for tuning. But > if you have a strong opinion I am happy to add it back. My point was that to tune for rho or gain_change_limit (maybe decay_factor of noise should be different than clear speech) one needs to recompile, but if that is fine with you I have no complain. I agree that the final code should look like what you have here. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:116: std::vector<int16_t> audio_s16_; On 2016/02/19 03:56:31, aluebs-webrtc wrote: > On 2016/02/15 13:05:11, hlundin-webrtc wrote: > > What does "s" mean in audio_s16_? > > Not sure, but I am following the same naming than in audio_util. I am happy to > rename it if you suggest a better name for it :) Perhaps it means signed, like sox uses similar convention. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:40: for (size_t i = 0; i < power_.size(); ++i) { On 2016/02/19 03:56:31, aluebs-webrtc wrote: > On 2016/02/15 13:05:12, hlundin-webrtc wrote: > > On 2016/02/13 00:09:43, turaj wrote: > > > maybe a DCHECK that power_ and data are of the same size. > > > > You don't know the size of |data|. > > As Henrik points out, the size of data is unknown. If you think it is worth it I > can pass in the size as another parameter and DCHECK. Sorry, my mistake, no I don't think it is necessary to pass the size. https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:34: return current * gain; I'm not sure if |current| could ever be zero, but if it is then this way of update will never recover it, how about just changing the last line of the previous implementation to return current + sign * fminf(delta, fmaxf(limit * current, epsilon())) https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:33: template <typename T> I'm sure have compiled this code, just out of my curiosity, usually templates need instantiations, wasn't that necessary here? And why implementation is in header file, while implementation of constructor is in .cc file? I'm not experienced with templates, but I would have templated the class and instantiate the possible templates. I guess in this case instantiations would be <float> and <std::complex<float>>.
https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (left): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:51: explicit IntelligibilityEnhancer(const Config& config); On 2016/02/19 16:48:47, turaj wrote: > On 2016/02/19 03:56:31, aluebs-webrtc wrote: > > On 2016/02/13 00:09:43, turaj wrote: > > > ctor with a config struct might come handy for tuning, but it is totally up > to > > > you. > > > > I dropped num_capture_channels and analysis_rate and tuned decay_rate in this > > CL. That only leaves the gain_change_limit and rho to be tuned, because > > sample_rate_hz and num_render_channels are settable parameters. I think in > this > > case this constructor adds more complexity than the simplicity for tuning. But > > if you have a strong opinion I am happy to add it back. > > My point was that to tune for rho or gain_change_limit (maybe decay_factor of > noise should be different than clear speech) one needs to recompile, but if that > is fine with you I have no complain. I agree that the final code should look > like what you have here. I will leave this as is and will add a more flexible constructor locally when tuning the remaining parameters. https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc (right): https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.cc:34: return current * gain; On 2016/02/19 16:48:47, turaj wrote: > I'm not sure if |current| could ever be zero, but if it is then this way of > update will never recover it, how about just changing the last line of the > previous implementation to > > return current + sign * fminf(delta, fmaxf(limit * current, epsilon())) |current| should never be zero, since it starts in 1.f and gets multiplied by gains between (1.f - limit) and (1.f + limit), but I see how it could get really close. What I fear from your suggestion is that it could return negative numbers when |target| is zero and |current| is smaller than epsilon. I added epsilon to the return value to my proposal to make sure current never becomes zero. WDYT? https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:33: template <typename T> On 2016/02/19 16:48:47, turaj wrote: > I'm sure have compiled this code, just out of my curiosity, usually templates > need instantiations, wasn't that necessary here? And why implementation is in > header file, while implementation of constructor is in .cc file? > > I'm not experienced with templates, but I would have templated the class and > instantiate the possible templates. I guess in this case instantiations would be > <float> and <std::complex<float>>. I am not experienced with templates either, but I decided to only templatize the function, since the implementation is the same and power_ has the same type, changing only the input type of that function. But maybe it is preferred to templatize the whole class, I am not sure. Templates only need explicit instantiations when implemented in the cc file instead of the header, so I implemented the template function in the header to get the additional flexibility of not having to explicitly instantiate. If you prefer a class template, I am happy to change it, since I am not sure myself what the criteria is to chose between function and class template. Or if you prefer to have the implementation in the cc file and restrict the template with explicit instantations.
The diff got horrible by the rebase you did in patch set 2. Please, avoid mid-review rebasing in the future if you can. But, as far as I can see, this LGTM. https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h (right): https://codereview.webrtc.org/1693823004/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h:116: std::vector<int16_t> audio_s16_; On 2016/02/19 16:48:47, turaj wrote: > On 2016/02/19 03:56:31, aluebs-webrtc wrote: > > On 2016/02/15 13:05:11, hlundin-webrtc wrote: > > > What does "s" mean in audio_s16_? > > > > Not sure, but I am following the same naming than in audio_util. I am happy to > > rename it if you suggest a better name for it :) > > Perhaps it means signed, like sox uses similar convention. Acknowledged.
LGTM, with an optional change. https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:33: template <typename T> Like you, I really don't know what is preferred with respect to templating member function vs templating class, nor header-file implementation versus source-file implementation. However, the reason I proposed to have template, beside saving couple of lines of code, was to prevent a potential bug my mixing complex and float averaging. If that concern is valid (which I'm not sure as this is an internal class and only intelligibility is using it) then templating like this is not serving the purpose. Things can still go wrong. Templating class, on the other hand, prevents the above issue as intelligibility will have PowerEstimator<float> noise_power_; PowerEstimator<std::complex<float>> speech_power_; Given the above, I a bit prefer class templating, but I don't have strong opinion, so I leave it up to you.
Sorry if the diff got horrible, I find that as long as it is kept in its own separate patch it is easier to handle. Will try to not do it next time. https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h (right): https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:33: template <typename T> On 2016/02/22 17:36:53, turaj wrote: > > Like you, I really don't know what is preferred with respect to templating > member function vs templating class, nor header-file implementation versus > source-file implementation. However, the reason I proposed to have template, > beside saving couple of lines of code, was to prevent a potential bug my mixing > complex and float averaging. If that concern is valid (which I'm not sure as > this is an internal class and only intelligibility is using it) then templating > like this is not serving the purpose. Things can still go wrong. > > Templating class, on the other hand, prevents the above issue as intelligibility > will have > > PowerEstimator<float> noise_power_; > PowerEstimator<std::complex<float>> speech_power_; > > Given the above, I a bit prefer class templating, but I don't have strong > opinion, so I leave it up to you. That is an excellent point! Changed it to a template class.
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, turaj@webrtc.org Link to the patchset: https://codereview.webrtc.org/1693823004/#ps80001 (title: "Make PowerEstimator a template class")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1693823004/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1693823004/80001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios64_sim_dbg on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/ios64_sim_dbg/builds/5317) ios_arm64_rel on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/ios_arm64_rel/builds/7562) ios_rel on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/ios_rel/builds/11470) mac_compile_dbg on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/mac_compile_dbg/builds/...) mac_gn_dbg on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/mac_gn_dbg/builds/979)
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, turaj@webrtc.org Link to the patchset: https://codereview.webrtc.org/1693823004/#ps100001 (title: "Rebasing")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1693823004/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1693823004/100001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: ios32_sim_dbg on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/ios32_sim_dbg/builds/5320) ios_arm64_dbg on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/ios_arm64_dbg/builds/7646) ios_rel on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/ios_rel/builds/11471)
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, turaj@webrtc.org Link to the patchset: https://codereview.webrtc.org/1693823004/#ps120001 (title: "Returned to the horrible solution of having a constant for a size_t one")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1693823004/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1693823004/120001
The CQ bit was unchecked by commit-bot@chromium.org
Exceeded global retry quota
Message was sent while issue was closed.
Description was changed from ========== Use VAD to get a better speech power estimation in the IntelligibilityEnhancer ========== to ========== Use VAD to get a better speech power estimation in the IntelligibilityEnhancer R=henrik.lundin@webrtc.org, turaj@webrtc.org Committed: https://crrev.com/18fcbcf48c190b6248cd16ef044d1edb79cba040 Cr-Commit-Position: refs/heads/master@{#11713} ==========
Message was sent while issue was closed.
Patchset 8 (id:??) landed as https://crrev.com/18fcbcf48c190b6248cd16ef044d1edb79cba040 Cr-Commit-Position: refs/heads/master@{#11713}
Message was sent while issue was closed.
Description was changed from ========== Use VAD to get a better speech power estimation in the IntelligibilityEnhancer R=henrik.lundin@webrtc.org, turaj@webrtc.org Committed: https://crrev.com/18fcbcf48c190b6248cd16ef044d1edb79cba040 Cr-Commit-Position: refs/heads/master@{#11713} ========== to ========== Use VAD to get a better speech power estimation in the IntelligibilityEnhancer R=henrik.lundin@webrtc.org, turaj@webrtc.org Committed: https://chromium.googlesource.com/external/webrtc/+/18fcbcf48c190b6248cd16ef0... ==========
Message was sent while issue was closed.
Committed patchset #8 (id:140001) manually as 18fcbcf48c190b6248cd16ef044d1edb79cba040 (presubmit successful).
Message was sent while issue was closed.
On 2016/02/22 20:41:03, aluebs-webrtc wrote: > Sorry if the diff got horrible, I find that as long as it is kept in its own > separate patch it is easier to handle. Will try to not do it next time. Yes, rebasing in a separate patch set is better than baking it with your own changes. But if I have reviewed patch set 1, and then you make a rebase in patch set 2, followed by patch set 3 containing updates in response to my review, what I want to see really is the diff between 1 and 3. I will of course look at the diff between 2 and 3, but since my memory is weak and poor, I really need my own comments in patch set 1 on the left hand to be able to see if you did what I told you to. :) > > https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h > (right): > > https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:33: > template <typename T> > On 2016/02/22 17:36:53, turaj wrote: > > > > Like you, I really don't know what is preferred with respect to templating > > member function vs templating class, nor header-file implementation versus > > source-file implementation. However, the reason I proposed to have template, > > beside saving couple of lines of code, was to prevent a potential bug my > mixing > > complex and float averaging. If that concern is valid (which I'm not sure as > > this is an internal class and only intelligibility is using it) then > templating > > like this is not serving the purpose. Things can still go wrong. > > > > Templating class, on the other hand, prevents the above issue as > intelligibility > > will have > > > > PowerEstimator<float> noise_power_; > > PowerEstimator<std::complex<float>> speech_power_; > > > > Given the above, I a bit prefer class templating, but I don't have strong > > opinion, so I leave it up to you. > > That is an excellent point! Changed it to a template class.
Message was sent while issue was closed.
On 2016/02/23 14:22:50, hlundin-webrtc wrote: > On 2016/02/22 20:41:03, aluebs-webrtc wrote: > > Sorry if the diff got horrible, I find that as long as it is kept in its own > > separate patch it is easier to handle. Will try to not do it next time. > > Yes, rebasing in a separate patch set is better than baking it with your own > changes. But if I have reviewed patch set 1, and then you make a rebase in patch > set 2, followed by patch set 3 containing updates in response to my review, what > I want to see really is the diff between 1 and 3. I will of course look at the > diff between 2 and 3, but since my memory is weak and poor, I really need my own > comments in patch set 1 on the left hand to be able to see if you did what I > told you to. :) > > > > > > https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... > > File webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h > > (right): > > > > > https://codereview.webrtc.org/1693823004/diff/40001/webrtc/modules/audio_proc... > > webrtc/modules/audio_processing/intelligibility/intelligibility_utils.h:33: > > template <typename T> > > On 2016/02/22 17:36:53, turaj wrote: > > > > > > Like you, I really don't know what is preferred with respect to templating > > > member function vs templating class, nor header-file implementation versus > > > source-file implementation. However, the reason I proposed to have template, > > > beside saving couple of lines of code, was to prevent a potential bug my > > mixing > > > complex and float averaging. If that concern is valid (which I'm not sure as > > > this is an internal class and only intelligibility is using it) then > > templating > > > like this is not serving the purpose. Things can still go wrong. > > > > > > Templating class, on the other hand, prevents the above issue as > > intelligibility > > > will have > > > > > > PowerEstimator<float> noise_power_; > > > PowerEstimator<std::complex<float>> speech_power_; > > > > > > Given the above, I a bit prefer class templating, but I don't have strong > > > opinion, so I leave it up to you. > > > > That is an excellent point! Changed it to a template class. I see how that can be frustrating. Will do my best to not do it again :) |