|
|
Created:
4 years, 9 months ago by aluebs-webrtc Modified:
4 years, 8 months ago CC:
webrtc-reviews_webrtc.org, Andrew MacDonald, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, minyue-webrtc, the sun, bjornv1 Base URL:
https://chromium.googlesource.com/external/webrtc.git@master Target Ref:
refs/pending/heads/master Project:
webrtc Visibility:
Public. |
DescriptionFix normalization of noise estimate in NoiseSuppressor
R=henrik.lundin@webrtc.org, peah@webrtc.org, turaj@webrtc.org
Committed: https://chromium.googlesource.com/external/webrtc/+/3b149960466f08b5351f5d55a54d517838d7310d
Patch Set 1 #
Total comments: 11
Patch Set 2 : Normalize dynamically #
Total comments: 11
Patch Set 3 : Moved normalization and drop float usage #
Total comments: 4
Patch Set 4 : Combine in a function #
Total comments: 12
Patch Set 5 : Avoid divisions #Patch Set 6 : Rebasing #Patch Set 7 : Disable noise suppressor bit-exactness unittests #
Messages
Total messages: 31 (9 generated)
aluebs@webrtc.org changed reviewers: + henrik.lundin@webrtc.org, peah@webrtc.org, turaj@webrtc.org
https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:32: const double kLambdaBot = -1.0 / (1 << 30); // Extreme values in bisection Why is double precision needed here? https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:270: void IntelligibilityEnhancer::SolveForGainsGivenLambda(double lambda, What is the motivation for this change? Is really a double power precision needed? https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc (left): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc:59: FloatS16ToFloat(in.data(), in.size(), in.data()); Is this change due to that the conversion was redundant? https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/noise_suppression_impl.cc:184: noise_estimate[i] += noise[i] / suppressors_.size(); This change basically removes the normalization of the noise, right? Is that because it is no longer needed and instead done elsewhere? I guess that for the IE functionality to work properly, there should be a corresponding change in IE, but I cannot see that in this CL.
https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/noise_suppression_impl.cc:184: noise_estimate[i] += noise[i] / suppressors_.size(); On 2016/03/21 12:25:48, peah-webrtc wrote: > This change basically removes the normalization of the noise, right? Is that > because it is no longer needed and instead done elsewhere? I guess that for the > IE functionality to work properly, there should be a corresponding change in IE, > but I cannot see that in this CL. My concern is that both forward and reverse streams to be in the same range, not one in int16 and the other one in [-1, 1] range. But I don't know how to enforce it.
https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/noise_suppression_impl.cc:188: const float kNormalizationFactor = 1.f / (1 << 9); Why change from 23 to 9 (a reduction by 14 shifts)? I would have guessed a reduction by 15 to match the deleted normalization in the float path.
https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:32: const double kLambdaBot = -1.0 / (1 << 30); // Extreme values in bisection On 2016/03/21 12:25:48, peah-webrtc wrote: > Why is double precision needed here? The double precision is needed for the quadratic solution below to be stable, so I made lambda also double to be consistent. https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:270: void IntelligibilityEnhancer::SolveForGainsGivenLambda(double lambda, On 2016/03/21 12:25:48, peah-webrtc wrote: > What is the motivation for this change? Is really a double power precision > needed? The double precision is needed for the quadratic solution below to be stable, so I made lambda also double to be consistent. https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc (left): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/intelligibility/test/intelligibility_proc.cc:59: FloatS16ToFloat(in.data(), in.size(), in.data()); On 2016/03/21 12:25:48, peah-webrtc wrote: > Is this change due to that the conversion was redundant? This conversion is not present in the APM, so that is why the normalization was done wrong. Now the audioproc_float and intelligibility_proc have similar results. https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/noise_suppression_impl.cc:184: noise_estimate[i] += noise[i] / suppressors_.size(); On 2016/03/21 13:50:23, turaj wrote: > On 2016/03/21 12:25:48, peah-webrtc wrote: > > This change basically removes the normalization of the noise, right? Is that > > because it is no longer needed and instead done elsewhere? I guess that for > the > > IE functionality to work properly, there should be a corresponding change in > IE, > > but I cannot see that in this CL. > > My concern is that both forward and reverse streams to be in the same range, not > one in int16 and the other one in [-1, 1] range. But I don't know how to enforce > it. Exactly, I was testing it in the intelligibility_proc, which had one normalization to much (removed on this CL) compared to the APM, so the results were different. So for the IE to work properly once the normalization in both tests were the same, this had to be corrected here. The good thing is that the APM always works with int16 range in both directions. https://codereview.webrtc.org/1821443003/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/noise_suppression_impl.cc:188: const float kNormalizationFactor = 1.f / (1 << 9); On 2016/03/29 19:19:36, hlundin-webrtc wrote: > Why change from 23 to 9 (a reduction by 14 shifts)? I would have guessed a > reduction by 15 to match the deleted normalization in the float path. Good catch! I was just checking when the output was more similar in the float and fixed cases. But digging some deeper, I realized that the domain is not fixed between chunks (although it tends to stabilize towards a value). So now I fixed the problem by normalizing it dynamically inside the noise_suppressor_x and not having this module need to know about it.
https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:54: self->prevNoiseU16[i] = self->prevNoiseU32[i] >> (self->prevQNoise + 11); Can you update prevNoiseU16 where prevNoiseU32 is updated, and retain the const-ness (w.r.t. nsxInst) of this function?
https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:170: const double lambda = (lambda_bot + lambda_top) / 2.0; lambda is always inbetween lambda_bot and lambda_top, right? And both of these are bounded by kLambdaBot and kLambdaTop, right? Then I don't see why the double precision is needed for lambda as kLambdaBot kLambdaTop are sufficiently close to be possible to represent using floats. Or am I missing something? https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:287: const double gamma0 = 0.5 * kRho * pow_x0[n] * pow_n0[n] + I cannot see from this equations what difference the double precision should make. Is it because beta, alpha and gamma are of highly different magnitudes? https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:54: self->prevNoiseU16[i] = self->prevNoiseU32[i] >> (self->prevQNoise + 11); This change transitions from a noise level of 32 bits with adaptive Q value to a 16 bit value with a fixed Q value of 11, right? How has it been verified that this value is sufficient to avoid saturation/ ensure a proper level? An alternative would be to convert the spectrum to floating point (Q0) as it is anyway only used in floating point format. https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:54: self->prevNoiseU16[i] = self->prevNoiseU32[i] >> (self->prevQNoise + 11); I don't see why the prevNoiseU16 is used here. It is only used as a static buffer now, right? And this buffer will remain on the state of the noise suppressor. I think a better solution is to have the WebRtcNsx_noise_estimate and NoiseEstimate() methods accept an array/vector that is owned by the outside which it copies the noise estimate to. In that way: 1) There is no need for a contract in which the noise suppressor guarantees that this internal buffer is not accessed meanwhile an external user is reading it. 2) There is no space wasted on the noise supressor state for doing this.
https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:32: const double kLambdaBot = -1.0 / (1 << 30); // Extreme values in bisection This is relatively large change which scales the search reason by 1/2^30. Is the the change needed because of the change in noise level representation? Does it mean that gain are 2^30 times more sensitive to the value of \lambda? https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:54: self->prevNoiseU16[i] = self->prevNoiseU32[i] >> (self->prevQNoise + 11); On 2016/03/30 13:44:46, peah-webrtc wrote: > This change transitions from a noise level of 32 bits with adaptive Q value to a > 16 bit value with a fixed Q value of 11, right? > > How has it been verified that this value is sufficient to avoid saturation/ > ensure a proper level? > > An alternative would be to convert the spectrum to floating point (Q0) as it is > anyway only used in floating point format. Is the new Q value of noise level 11 or -11? I suppose you want to keep this interface in fixed-point because it is fixed-point implementation of the noise suppressor, however, currently the only module using it is IE which is in floating point. So I really don't see of going through some scaling here that is not meaningful for the caller. I suppose there is a top level API that calls either fixed-point or floating-point NS, depending on the build. I suggest that a conversion from fixed-point (with adaptive Q number) to float has to happen in that top-level API. Then, this API would return the noise level as it is with the associated Q value, and let the top level API do a safe fixed-point to floating-point conversion. This way, the method will be const, as henrik suggested, and we do not need extra buffer, as Per concerned. IMPORTANT: there should be a agreement that whether the noise level is computed for int16 range of input signal or [-1 1] range.
https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:32: const double kLambdaBot = -1.0 / (1 << 30); // Extreme values in bisection On 2016/03/30 14:51:55, turaj wrote: > This is relatively large change which scales the search reason by 1/2^30. Is > the the change needed because of the change in noise level representation? Does > it mean that gain are 2^30 times more sensitive to the value of \lambda? Yes, it needed to be normalized by the new noise scale. But now I changed the implementation to have a different normalization factor that is applied independently. https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:170: const double lambda = (lambda_bot + lambda_top) / 2.0; On 2016/03/30 13:44:45, peah-webrtc wrote: > lambda is always inbetween lambda_bot and lambda_top, right? And both of these > are bounded by kLambdaBot and kLambdaTop, right? Then I don't see why the double > precision is needed for lambda as kLambdaBot kLambdaTop are sufficiently close > to be possible to represent using floats. Or am I missing something? Good point, although I was trying to keep consistency. In any way, I found a way to normalize the power independently so that double precision is not necessary. https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:287: const double gamma0 = 0.5 * kRho * pow_x0[n] * pow_n0[n] + On 2016/03/30 13:44:46, peah-webrtc wrote: > I cannot see from this equations what difference the double precision should > make. Is it because beta, alpha and gamma are of highly different magnitudes? Basically because each alpha0, beta0 or gamma0 has a product of 3 pow_, which are the PSD of a 256 length FFT of a signal of 2^15 maximum amplitude, so it can go up to ((256 * 2^15)^2)^3 = 2^138, which is approximately 256e39 which is above the float maximum (3.40282e38). But now I am normalizing the power independtly, so that double precision is not necessary anymore. https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): https://codereview.webrtc.org/1821443003/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:54: self->prevNoiseU16[i] = self->prevNoiseU32[i] >> (self->prevQNoise + 11); On 2016/03/30 14:51:55, turaj wrote: > On 2016/03/30 13:44:46, peah-webrtc wrote: > > This change transitions from a noise level of 32 bits with adaptive Q value to > a > > 16 bit value with a fixed Q value of 11, right? > > > > How has it been verified that this value is sufficient to avoid saturation/ > > ensure a proper level? > > > > An alternative would be to convert the spectrum to floating point (Q0) as it > is > > anyway only used in floating point format. > > Is the new Q value of noise level 11 or -11? I suppose you want to keep this > interface in fixed-point because it is fixed-point implementation of the noise > suppressor, however, currently the only module using it is IE which is in > floating point. So I really don't see of going through some scaling here that is > not meaningful for the caller. > > I suppose there is a top level API that calls either fixed-point or > floating-point NS, depending on the build. I suggest that a conversion from > fixed-point (with adaptive Q number) to float has to happen in that top-level > API. Then, this API would return the noise level as it is with the associated Q > value, and let the top level API do a safe fixed-point to floating-point > conversion. > > This way, the method will be const, as henrik suggested, and we do not need > extra buffer, as Per concerned. > > IMPORTANT: there should be a agreement that whether the noise level is computed > for int16 range of input signal or [-1 1] range. > I agree with turaj, that we want to keep the interface fixed-point, since this is the fixed-point implementation. So as suggested by turaj I moved the normalization to the noise_suppression_impl. I tried initially to avoid that because it means adding an API to get the dynamical domain and have the caller handle something that should be implementation details of the component. But the bright side is that now the normalization is done safely after converting to float and no additional buffer is necessary. And of course, then the method can be const, as turaj pointed out.
lgtm
LGTM with comments to consider. https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/noise_suppression_impl.cc:190: const uint32_t* noise = WebRtcNsx_noise_estimate(suppressor->state()); Just a suggestion, up to you if you want to change, noise estimate can be implemented const uint32_t* WebRtcNsx_noise_estimate(const ns_state*, int* domain) So there will be only one function call. As the domain is adaptive anytime one asks for the noise estimate has to ask for domain as well. And it avoids making mistake of not accounting for the domain. https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.h (right): https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.h:96: const uint32_t* WebRtcNsx_noise_estimate(NsxHandle* nsxInst); If you decide to keep this as it is (not going with my suggestion to output domain in this function) then please add a comment that the domain is adaptive and can be retrieved by calling the following API.
peah, do you have any additional comments? https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/noise_suppression_impl.cc:190: const uint32_t* noise = WebRtcNsx_noise_estimate(suppressor->state()); On 2016/03/31 15:20:48, turaj wrote: > Just a suggestion, up to you if you want to change, noise estimate can be > implemented > > const uint32_t* WebRtcNsx_noise_estimate(const ns_state*, int* domain) > > So there will be only one function call. As the domain is adaptive anytime one > asks for the noise estimate has to ask for domain as well. And it avoids making > mistake of not accounting for the domain. In general I think it can be confusing when passing in an int*, because it is not clear if it is an input or output. And also, I tried to keep both interfaces, fixed and float, the most similar possible. But I see your point, that it is really easy for a user of this component to forget to get the domain and normalize if it is a separate function, so I implemented your suggestion. https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.h (right): https://codereview.webrtc.org/1821443003/diff/40001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.h:96: const uint32_t* WebRtcNsx_noise_estimate(NsxHandle* nsxInst); On 2016/03/31 15:20:48, turaj wrote: > If you decide to keep this as it is (not going with my suggestion to output > domain in this function) then please add a comment that the domain is adaptive > and can be retrieved by calling the following API. I accepted your suggestion and also added the respective comment here.
https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:145: const float power_target = std::accumulate( Is this related to the normalization of the noise? https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/noise_suppression_impl.cc:184: noise_estimate[i] += noise[i] / suppressors_.size(); In order to avoid the divisions, please precompute 1/suppressors_.size() and use that to multiply with. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/noise_suppression_impl.cc:195: noise_estimate[i] += kNormalizationFactor * In order to avoid the divisions, 1/suppressors_.size() should be included in normalization factor, which would also eliminate the need for the cast. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:55: *domain += self->prevQNoise; What is the Q value of prevNoiseU32? From the name I thought it would be self->prevQNoise but the value in domain is self->prevQNoise+11 https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.h (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.h:91: * - domain : Domain of the noise estimate, which is the number of Why is this called domain? It is the Q-value, right? I think that it would be more clear just to call it something with Q (Q_noise or something similar). https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.h:99: const uint32_t* WebRtcNsx_noise_estimate(const NsxHandle* nsxInst, int* domain); I think it is great to both get the spectrum and the Q value at the same time, but it is a bit confusing having them returned in different ways in the function call.
https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:145: const float power_target = std::accumulate( On 2016/03/31 20:50:57, peah-webrtc wrote: > Is this related to the normalization of the noise? Yes, I need to accumulate over the normalized signal. But also, it makes more sense to accumulate on the ERB bands for the target power, since later the gains are applied and the power to compare is calculated on that domain. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/noise_suppression_impl.cc:184: noise_estimate[i] += noise[i] / suppressors_.size(); On 2016/03/31 20:50:57, peah-webrtc wrote: > In order to avoid the divisions, please precompute 1/suppressors_.size() and use > that to multiply with. Good point. Done. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/noise_suppression_impl.cc:195: noise_estimate[i] += kNormalizationFactor * On 2016/03/31 20:50:57, peah-webrtc wrote: > In order to avoid the divisions, 1/suppressors_.size() should be included in > normalization factor, which would also eliminate the need for the cast. Good point. Done. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.c:55: *domain += self->prevQNoise; On 2016/03/31 20:50:57, peah-webrtc wrote: > What is the Q value of prevNoiseU32? From the name I thought it would be > self->prevQNoise but the value in domain is self->prevQNoise+11 Yes, I agree that it is confusing, but it is in fact self->prevQNoise+11, which can be seen in the documentation of nsx_core. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/ns/noise_suppression_x.h (right): https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.h:91: * - domain : Domain of the noise estimate, which is the number of On 2016/03/31 20:50:57, peah-webrtc wrote: > Why is this called domain? It is the Q-value, right? I think that it would be > more clear just to call it something with Q (Q_noise or something similar). I couldn't come up with a better naming, but I agree it is not clear. It is in fact the q-value. Accepted your suggestion. https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/ns/noise_suppression_x.h:99: const uint32_t* WebRtcNsx_noise_estimate(const NsxHandle* nsxInst, int* domain); On 2016/03/31 20:50:57, peah-webrtc wrote: > I think it is great to both get the spectrum and the Q value at the same time, > but it is a bit confusing having them returned in different ways in the function > call. As I said in turaj's comment, I also find this way of returning things slightly confusing, but I see how it forces the user to not forget about normalizing. I can go back to my original solution of having a separate API for the q-value if you want, but it is hard to make all reviewers happy when they have opposite opinions. If you have a better idea, please suggest a specific solution, but keep in mind that this is a C (not C++) file.
On 2016/04/01 01:52:42, aluebs-webrtc wrote: > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc > (right): > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:145: > const float power_target = std::accumulate( > On 2016/03/31 20:50:57, peah-webrtc wrote: > > Is this related to the normalization of the noise? > > Yes, I need to accumulate over the normalized signal. But also, it makes more > sense to accumulate on the ERB bands for the target power, since later the gains > are applied and the power to compare is calculated on that domain. > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/noise_suppression_impl.cc (right): > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/noise_suppression_impl.cc:184: noise_estimate[i] > += noise[i] / suppressors_.size(); > On 2016/03/31 20:50:57, peah-webrtc wrote: > > In order to avoid the divisions, please precompute 1/suppressors_.size() and > use > > that to multiply with. > > Good point. Done. > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/noise_suppression_impl.cc:195: noise_estimate[i] > += kNormalizationFactor * > On 2016/03/31 20:50:57, peah-webrtc wrote: > > In order to avoid the divisions, 1/suppressors_.size() should be included in > > normalization factor, which would also eliminate the need for the cast. > > Good point. Done. > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/ns/noise_suppression_x.c (right): > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/ns/noise_suppression_x.c:55: *domain += > self->prevQNoise; > On 2016/03/31 20:50:57, peah-webrtc wrote: > > What is the Q value of prevNoiseU32? From the name I thought it would be > > self->prevQNoise but the value in domain is self->prevQNoise+11 > > Yes, I agree that it is confusing, but it is in fact self->prevQNoise+11, which > can be seen in the documentation of nsx_core. > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > File webrtc/modules/audio_processing/ns/noise_suppression_x.h (right): > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/ns/noise_suppression_x.h:91: * - domain > : Domain of the noise estimate, which is the number of > On 2016/03/31 20:50:57, peah-webrtc wrote: > > Why is this called domain? It is the Q-value, right? I think that it would be > > more clear just to call it something with Q (Q_noise or something similar). > > I couldn't come up with a better naming, but I agree it is not clear. It is in > fact the q-value. Accepted your suggestion. > > https://codereview.webrtc.org/1821443003/diff/60001/webrtc/modules/audio_proc... > webrtc/modules/audio_processing/ns/noise_suppression_x.h:99: const uint32_t* > WebRtcNsx_noise_estimate(const NsxHandle* nsxInst, int* domain); > On 2016/03/31 20:50:57, peah-webrtc wrote: > > I think it is great to both get the spectrum and the Q value at the same time, > > but it is a bit confusing having them returned in different ways in the > function > > call. > > As I said in turaj's comment, I also find this way of returning things slightly > confusing, but I see how it forces the user to not forget about normalizing. I > can go back to my original solution of having a separate API for the q-value if > you want, but it is hard to make all reviewers happy when they have opposite > opinions. If you have a better idea, please suggest a specific solution, but > keep in mind that this is a C (not C++) file. Nice!!! lgtm but with the disclaimer that I cannot vouch for the change in IntelligibilityEnhance::ProcessAudioBlock as I don't really understand the underlying algorithm yet.
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, turaj@webrtc.org Link to the patchset: https://codereview.webrtc.org/1821443003/#ps80001 (title: "Avoid divisions")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1821443003/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1821443003/80001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_asan on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/linux_asan/builds/13649)
I am disabling all the noise suppression bit-exactness tests for a couple of reasons: * I have no idea how to update the ARM and ARM64 values, it would be great to add some documentation on that. * On top of changing the noise estimate reference (which makes sense) it seems to change the output reference, which I can't see why it could have changed. This exactly the reason I am opposed to these bit-exactness unit-tests. peah, if you could have a look at it and re-enable, that would be great.
The CQ bit was checked by aluebs@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org, peah@webrtc.org, turaj@webrtc.org Link to the patchset: https://codereview.webrtc.org/1821443003/#ps120001 (title: "Disable noise suppressor bit-exactness unittests")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1821443003/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1821443003/120001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: android_arm64_rel on tryserver.webrtc (JOB_TIMED_OUT, no build URL)
Message was sent while issue was closed.
Description was changed from ========== Fix normalization of noise estimate in NoiseSuppressor ========== to ========== Fix normalization of noise estimate in NoiseSuppressor R=henrik.lundin@webrtc.org, peah@webrtc.org, turaj@webrtc.org Committed: https://crrev.com/3b149960466f08b5351f5d55a54d517838d7310d Cr-Commit-Position: refs/heads/master@{#12201} ==========
Message was sent while issue was closed.
Patchset 7 (id:??) landed as https://crrev.com/3b149960466f08b5351f5d55a54d517838d7310d Cr-Commit-Position: refs/heads/master@{#12201}
Message was sent while issue was closed.
Description was changed from ========== Fix normalization of noise estimate in NoiseSuppressor R=henrik.lundin@webrtc.org, peah@webrtc.org, turaj@webrtc.org Committed: https://crrev.com/3b149960466f08b5351f5d55a54d517838d7310d Cr-Commit-Position: refs/heads/master@{#12201} ========== to ========== Fix normalization of noise estimate in NoiseSuppressor R=henrik.lundin@webrtc.org, peah@webrtc.org, turaj@webrtc.org Committed: https://chromium.googlesource.com/external/webrtc/+/3b149960466f08b5351f5d55a... ==========
Message was sent while issue was closed.
Committed patchset #7 (id:120001) manually as 3b149960466f08b5351f5d55a54d517838d7310d (presubmit successful). |