Issue 1878133002: Disable Intelligibility Enhancer for high SNRs (Closed)

Created:
4 years, 8 months ago by aluebs-webrtc

Modified:
4 years, 8 months ago

Reviewers:
hlundin-webrtc, peah-webrtc, turaj

CC:
webrtc-reviews_webrtc.org, peah-webrtc, Andrew MacDonald, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, minyue-webrtc, the sun, bjornv1

Base URL:
https://chromium.googlesource.com/external/webrtc.git@master

Target Ref:
refs/pending/heads/master

Project:
webrtc

Visibility:
Public.

Description

Disable Intelligibility Enhancer for high SNRs

Committed: https://crrev.com/2fae89ed0d7dd54d4649b6dcbf5a6f0a33804469
Cr-Commit-Position: refs/heads/master@{#12352}

Patch Set 1 #

Total comments: 6

Patch Set 2 : Naming #

Patch Set 3 : Fix division by zero #

Created: 4 years, 8 months ago

Stats (+54 lines, -18 lines)

M webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.h: 2 chunks, +6 lines, -0 lines, 0 comments
M webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc: 3 chunks, +48 lines, -18 lines, 0 comments

Messages

Total messages: 19 (7 generated)

peah-webrtc

4 years, 8 months ago (2016-04-12 13:39:21 UTC) #3

lgtm, but with the additional comment that I have concerns about whether it
makes sense to control the IE effect based on the estimate of the digital SNR
without having any mapping from this to the acoustic SNR at the ear of the
listener.

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc
(right):

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:168:
void IntelligibilityEnhancer::UpdateActivity() {
What you are updating here is the is_active flag and gains, right? And the
is_active flag is a flag for whether the effect of the IE should be active or
not, and not for whether there is activity in the render signal, right?

If I did not get that wrong, I think this method should be modified or renamed,
since to me, activity means speech activity.
What about renaming it to ControlEffectApplication or SnrBasedEffectActivation,
which in my mind describes in more detail what is being done?

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:175:
snr_ = kDecayRate * snr_ + (1.f - kDecayRate) * clear_power / noise_power;
This SNR estimate is an average of the instantaneous SNR. An alternative could
be to use the average of the overall SNR. Have you considered that? (I'm not
saying this is wrong.)

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:175:
snr_ = kDecayRate * snr_ + (1.f - kDecayRate) * clear_power / noise_power;
This SNR estimate is assuming that the ratio of the clear_psd and noise_psd
matches the ratio at the ear of the listener.  What happens if the listener is
using headphones? Then this ratio is very different from what the SNR is at the
ear of the listener. The same is the case if a device is used in speaker mode.

I don't really see any point in adding functionality for tuning the application
of the IE effect based on the digital SNR until a mapping is in place to map
this to the acoustic SNR at the ear. 

But I'm fine with the code change.

aluebs-webrtc

4 years, 8 months ago (2016-04-12 18:34:28 UTC) #4

hlundin, turaj, could you please take a look when you have some time? Thanks! :)

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
File webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc
(right):

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:168:
void IntelligibilityEnhancer::UpdateActivity() {
On 2016/04/12 13:39:21, peah-webrtc wrote:
> What you are updating here is the is_active flag and gains, right? And the
> is_active flag is a flag for whether the effect of the IE should be active or
> not, and not for whether there is activity in the render signal, right?
>
> If I did not get that wrong, I think this method should be modified or
> renamed. since to me, activity means speech activity.
> What about renaming it to ControlEffectApplication, or
> SnrBasedEffectActivation which in my mind describe in more detail what is
> being done.

Yes, your understanding is completely right. And I agree that your naming
suggestion is more intuitive. Done.

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:175:
snr_ = kDecayRate * snr_ + (1.f - kDecayRate) * clear_power / noise_power;
On 2016/04/12 13:39:21, peah-webrtc wrote:
> This SNR estimate is an average of the instantaneous SNR. An alternative could
> be to use the average of the overall SNR. Have you considered that (I'm not
> saying this is wrong).

That is an interesting point. Because the PSDs are already filtered over time,
this is not exactly an average of the instantaneous SNR, but rather an average
of an already-averaged SNR, if that makes sense. This additional filtering
is just to ensure its smoothness.
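
For readers following the thread, the two estimators being compared can be sketched as below. This is a minimal illustration and not the code under review: the struct names and `kSketchDecayRate` are hypothetical, the actual value of `kDecayRate` is not shown in this thread, and the inputs are treated as plain per-frame powers rather than time-filtered PSD sums.

```cpp
#include <cassert>

// Hypothetical smoothing constant, chosen only for this sketch.
constexpr float kSketchDecayRate = 0.9f;

// Recursive average of the per-frame ratio, the form quoted from line 175.
struct AveragedRatioSnr {
  float snr = 0.f;

  void Update(float clear_power, float noise_power) {
    assert(noise_power > 0.f);  // The caller must rule out a zero divisor.
    snr = kSketchDecayRate * snr +
          (1.f - kSketchDecayRate) * clear_power / noise_power;
  }
};

// Alternative: smooth each power separately, then take a single ratio.
struct RatioOfAveragesSnr {
  float clear = 0.f;
  float noise = 0.f;

  void Update(float clear_power, float noise_power) {
    clear = kSketchDecayRate * clear + (1.f - kSketchDecayRate) * clear_power;
    noise = kSketchDecayRate * noise + (1.f - kSketchDecayRate) * noise_power;
  }

  // Returns 0 until some noise power has been accumulated.
  float snr() const { return noise > 0.f ? clear / noise : 0.f; }
};
```

With stationary inputs both estimators converge to the same value. They diverge when the powers fluctuate: for frames alternating between (clear, noise) = (4, 1) and (1, 4), the first settles slightly above 2 because the high-ratio frames dominate the mean, while the second stays close to 1.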

https://codereview.webrtc.org/1878133002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/intelligibility/intelligibility_enhancer.cc:175:
snr_ = kDecayRate * snr_ + (1.f - kDecayRate) * clear_power / noise_power;
On 2016/04/12 13:39:21, peah-webrtc wrote:
> This SNR estimate is assuming that the ratio of the clear_psd and noise_psd
> matches the ratio at the ear of the listener.  What happens if the listener is
> using headphones? Then this ratio is very different from what the SNR is at
> the ear of the listener. The same is the case if a device is used in speaker
> mode.
>
> I don't really see any point in adding functionality for tuning the
> application of the IE effect based on the digital SNR until a mapping is in
> place to map this to the acoustic SNR at the ear.
>
> But I'm fine with the code change.

As discussed offline at the beginning of this project, with some broad
assumptions we can estimate the acoustic SNR from the digital one well enough
for the IE to improve the intelligibility of the signal. I am also testing on a
real device right now whether this holds true, and at the same time working on
the mapping I suggested to see if it improves the relation between the two SNRs.
On the other hand, we decided to enable this feature first only for headphones
and phone mode (no speakerphone), so we can focus on that and defer the
additional tweaking until later in the process.
I think this code is valuable as of today, but I agree that the thresholds will
need to be adjusted if we apply any mapping.
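
As a rough illustration of the behaviour under discussion (disabling the effect when the smoothed SNR is high), a threshold gate could look like the sketch below. Everything in it is an assumption made for illustration: the class name `SnrGate`, its parameters, the single-threshold rule, and the choice to hold the previous estimate when the noise power is zero. The thread does not show how the patch implements the thresholds or how Patch Set 3 fixes the division by zero.

```cpp
#include <cassert>

// Hypothetical gate: the effect is active only while the smoothed SNR stays
// below max_active_snr. Not the implementation from the patch.
class SnrGate {
 public:
  SnrGate(float decay_rate, float max_active_snr)
      : decay_rate_(decay_rate), max_active_snr_(max_active_snr) {
    assert(decay_rate >= 0.f && decay_rate < 1.f);
  }

  // Updates the smoothed SNR and returns whether the effect should be active.
  bool Update(float clear_power, float noise_power) {
    // Assumption: on zero noise power, keep the previous estimate instead of
    // dividing by zero. Other policies (e.g. an epsilon floor) are possible.
    if (noise_power > 0.f) {
      snr_ = decay_rate_ * snr_ +
             (1.f - decay_rate_) * clear_power / noise_power;
    }
    is_active_ = snr_ < max_active_snr_;
    return is_active_;
  }

  float snr() const { return snr_; }
  bool is_active() const { return is_active_; }

 private:
  const float decay_rate_;
  const float max_active_snr_;
  float snr_ = 0.f;
  bool is_active_ = true;
};
```

If a mapping from digital to acoustic SNR were introduced, it would be applied to the estimate before the comparison, which is why the threshold would then need retuning.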

turaj

I thought the method you added should be called within intelligibility to automatically switch the ...

4 years, 8 months ago (2016-04-12 19:33:30 UTC) #5

aluebs-webrtc

On 2016/04/12 19:33:30, turaj wrote: > I thought the method you added should be called ...

4 years, 8 months ago (2016-04-12 21:10:19 UTC) #6

turaj

Sorry, I missed that line. I was almost sure I was missing something. LGTM.

4 years, 8 months ago (2016-04-12 21:46:21 UTC) #7

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1878133002/20001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1878133002/20001

4 years, 8 months ago (2016-04-13 16:27:56 UTC) #11

commit-bot: I haz the power

Try jobs failed on following builders: linux_ubsan on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/linux_ubsan/builds/1194)

4 years, 8 months ago (2016-04-13 16:47:30 UTC) #13

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1878133002/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1878133002/40001

4 years, 8 months ago (2016-04-13 17:24:17 UTC) #16

commit-bot: I haz the power

4 years, 8 months ago (2016-04-13 18:24:19 UTC) #19

Message was sent while issue was closed.

Patchset 3 (id:??) landed as
https://crrev.com/2fae89ed0d7dd54d4649b6dcbf5a6f0a33804469
Cr-Commit-Position: refs/heads/master@{#12352}
