Issue 1542573002: Calculate audio levels in AEC in time domain.

minyue-webrtc

Description was changed from ========== Merge branch 'master' into UpdateLevelInTime Merge branch 'ace_farend_timedomain' into UpdateLevelInTime ...

5 years ago (2015-12-21 11:22:12 UTC) #1

minyue-webrtc

Hi Per, I changed the way of calculating audio level in AEC from frequency domain ...

5 years ago (2015-12-21 13:02:14 UTC) #4

peah-webrtc

https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_processing/aec/aec_core.c File webrtc/modules/audio_processing/aec/aec_core.c (right): https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_processing/aec/aec_core.c#newcode575 webrtc/modules/audio_processing/aec/aec_core.c:575: return energy / num_samples; I'm a bit concerned with ...

5 years ago (2015-12-22 10:54:32 UTC) #5

minyue-webrtc

5 years ago (2015-12-22 11:20:01 UTC) #6

https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/aec/aec_core.c (right):

https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/aec/aec_core.c:575: return energy / num_samples;
On 2015/12/22 10:54:32, peah-webrtc wrote:
> I'm a bit concerned with this computation, while it is an accurate power
> computation, it differs very much from how it was used before in the sense
that:
> 1) Before it was an energy computation, now it is a power computation.
> 2) The energy was not correctly computed before, and neither does not match
the
> way the power is computed.
> 
> While this may be fine, I would like a more clear explanation to why the
current
> implementation differs, why it is ok that it does that, and what any potential
> differences would be.
> 
> 
> 
> In matlab:
> x = [1:8];
> X = fft(x);
> true_time_domain_power = sum(x.^2)/8
> true_time_domain_energy = sum(x.^2)
> true_frequency_domain_energy = sum(abs(X).^2)/8
> previous_frequency_domain_energy_approximation = (sum(abs(X(1:4).^2)) +
> abs(X(5).^2))/8
> 
> To summarize:
> -In the new function, you compute
> true_time_domain_power
> -Before 
> previous_frequency_domain_energy_approximation
> was used

Your finding is correct, and it is the reason why, as you can see in patch set
1, the power I calculate now is (2.0f / length) times the old value. The factor
2.0f is there because the old calculation yields half the energy, which is
unnatural.

Using power (rather than energy of a predefined length) for various audio levels
makes more sense to me.

The only problem is that these audio levels will have different values. But
looking at where they are used, we can see that they are all used in the form
"A / B" to obtain various metrics. That is why normalizing these audio levels
in this CL does not cause any unittest to fail. I have also tested applying
different normalizations to A and B, and in that case our unittest fails.

peah-webrtc

4 years, 11 months ago (2016-01-08 13:12:59 UTC) #7

https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/aec/aec_core.c (right):

https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/aec/aec_core.c:575: return energy / num_samples;
On 2015/12/22 11:20:01, minyue-webrtc wrote:
> On 2015/12/22 10:54:32, peah-webrtc wrote:
> > I'm a bit concerned with this computation, while it is an accurate power
> > computation, it differs very much from how it was used before in the sense
> that:
> > 1) Before it was an energy computation, now it is a power computation.
> > 2) The energy was not correctly computed before, and neither does not match
> the
> > way the power is computed.
> > 
> > While this may be fine, I would like a more clear explanation to why the
> current
> > implementation differs, why it is ok that it does that, and what any
potential
> > differences would be.
> > 
> > 
> > 
> > In matlab:
> > x = [1:8];
> > X = fft(x);
> > true_time_domain_power = sum(x.^2)/8
> > true_time_domain_energy = sum(x.^2)
> > true_frequency_domain_energy = sum(abs(X).^2)/8
> > previous_frequency_domain_energy_approximation = (sum(abs(X(1:4).^2)) +
> > abs(X(5).^2))/8
> > 
> > To summarize:
> > -In the new function, you compute
> > true_time_domain_power
> > -Before 
> > previous_frequency_domain_energy_approximation
> > was used
> 
> Your finding is very true, and that is the reason, if you look at the patch
set
> 1, the power I calculate now is (2.0f / length) times the old value. 2.0f is
> because that old calculation is half the energy, which is unnatural.
> 
> Using power (rather than energy of a predefined length) for various audio
levels
> makes more sense to me.
> 
> The only problem is that these audio levels will have different values. But
> looking at where they are used, we can see that they all are used in a form of
> "A / B" to obtain various metrics. And that is why normalizing these audio
> levels in this CL does not cause any unittest to fail. I have tested using
> different normalization on A and B, and by that our unittest fails.
> 
> 
> 

I totally agree that it is better to use the power measure. And it is also good
that the unittests do not fail with this change.

I have some concerns, however:

1) Do you have a good understanding of the unittest coverage? Is the coverage
sufficient for passing unittests to stand as a good guarantee that this is a
valid change?

2) Regarding the former validation code (that was removed by this patch):
float power = CalculatePower(e, PART_LEN) * PART_LEN2 / 4.0f;	      
assert(fabs(CalculatePowerOld(e_fft) - power) <= power * kEpsilonRatio);	
UpdateLevel(linout_level, power);

What kind of test data did you validate it on? CalculatePowerOld(e_fft) 
and 
CalculatePower(e, PART_LEN) * PART_LEN2 / 4.0f 
compute the powers based on two different data sets, one 64 and the other 128
samples long, and this is correctly compensated for by the scaling of the
output of CalculatePower. I cannot, however, see how it can be ensured that the
difference is lower than kEpsilonRatio, as the data sets are different
(albeit overlapping). I would expect the assert to trigger on any speech signal.

Are you sure you have tested it with speech data using aec->metricsMode == 1?

3) Regarding the statement that they are all used in the form A/B, I found one
place where that is not the case: line 620
if (aec->farlevel.minlevel < noisyPower) {
noisyPower needs to be corrected to ensure that this comparison remains correct
with the new code.

2)

https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/aec/aec_core.c:1291: if (aec->metricsMode == 1)
{
On 2015/12/22 10:54:32, peah-webrtc wrote:
> I think it is better to bundle the UpdateLevel calls to happen in the same
place
> as it improves readability. It will also save some lines in the code.

PTAL

minyue-webrtc

4 years, 11 months ago (2016-01-08 13:52:22 UTC) #8

On 2016/01/08 13:12:59, peah-webrtc wrote:
>
https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
> File webrtc/modules/audio_processing/aec/aec_core.c (right):
> 
>
https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
> webrtc/modules/audio_processing/aec/aec_core.c:575: return energy /
num_samples;
> On 2015/12/22 11:20:01, minyue-webrtc wrote:
> > On 2015/12/22 10:54:32, peah-webrtc wrote:
> > > I'm a bit concerned with this computation, while it is an accurate power
> > > computation, it differs very much from how it was used before in the sense
> > that:
> > > 1) Before it was an energy computation, now it is a power computation.
> > > 2) The energy was not correctly computed before, and neither does not
match
> > the
> > > way the power is computed.
> > > 
> > > While this may be fine, I would like a more clear explanation to why the
> > current
> > > implementation differs, why it is ok that it does that, and what any
> potential
> > > differences would be.
> > > 
> > > 
> > > 
> > > In matlab:
> > > x = [1:8];
> > > X = fft(x);
> > > true_time_domain_power = sum(x.^2)/8
> > > true_time_domain_energy = sum(x.^2)
> > > true_frequency_domain_energy = sum(abs(X).^2)/8
> > > previous_frequency_domain_energy_approximation = (sum(abs(X(1:4).^2)) +
> > > abs(X(5).^2))/8
> > > 
> > > To summarize:
> > > -In the new function, you compute
> > > true_time_domain_power
> > > -Before 
> > > previous_frequency_domain_energy_approximation
> > > was used
> > 
> > Your finding is very true, and that is the reason, if you look at the patch
> set
> > 1, the power I calculate now is (2.0f / length) times the old value. 2.0f is
> > because that old calculation is half the energy, which is unnatural.
> > 
> > Using power (rather than energy of a predefined length) for various audio
> levels
> > makes more sense to me.
> > 
> > The only problem is that these audio levels will have different values. But
> > looking at where they are used, we can see that they all are used in a form
of
> > "A / B" to obtain various metrics. And that is why normalizing these audio
> > levels in this CL does not cause any unittest to fail. I have tested using
> > different normalization on A and B, and by that our unittest fails.
> > 
> > 
> > 
> 
> I totally agree that it is better to use the power measure. And it is also
good
> that the unittests do not fail with this change.
> 
> I have some concerns, however:
> 
> 1) Do you have a good understanding of the unittest coverage? Is it sufficient
> to ensure that the fact that the unittests don't fail stands as a good
guarantee
> that this is a valid change?

Good point. I will consider trying more signals in the unittest.

> 
> 2) Regarding the former validation code (that was removed by this patch):
> float power = CalculatePower(e, PART_LEN) * PART_LEN2 / 4.0f;	      
> assert(fabs(CalculatePowerOld(e_fft) - power) <= power * kEpsilonRatio);	
> UpdateLevel(linout_level, power);
> 
> What kind of test data did you validate it on? CalculatePowerOld(e_fft) 
> and 
> CalculatePower(e, PART_LEN) * PART_LEN2 / 4.0f 
> computes the powers based on two different data sets, one 64 and the other 128
> samples long, and this is correctly compensated for using the scaling of the
> output of CalculatePower. I cannot, however, see how it can be ensured that
the
> difference could be lower than kEpsilonRatio as the data sets are different
> (albeit overlapping). I would expect the assert to trigger on any speech
signal.
> 
It holds because we know that e_fft is the FFT of a zero-padded version of e.


> Are you sure you have tested it with speech data using aec->metricsMode == 1?
Yes, changing kEpsilonRatio can make the test fail.

> 
> 
> 3) Regarding the statement that they are all used in a form of A/B I found one
> place where that is not the case: line 620
> if (aec->farlevel.minlevel < noisyPower) {
> noisePower needs to be corrected to ensure that that statement is correct with
> the new code.
Good finding. It seems noisyPower was a heuristically chosen large number. In
any case, I will be careful there.

> 
> 2)
> 
>
https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_proc...
> webrtc/modules/audio_processing/aec/aec_core.c:1291: if (aec->metricsMode ==
1)
> {
> On 2015/12/22 10:54:32, peah-webrtc wrote:
> > I think it is better to bundle the UpdateLevel calls to happen in the same
> place
> > as it improves readability. It will also save some lines in the code.
> 
> PTAL

peah-webrtc

> > > It holds because we know that e_fft is FFT on a zero ...

4 years, 11 months ago (2016-01-08 13:59:16 UTC) #9

minyue-webrtc

4 years, 11 months ago (2016-01-14 09:48:51 UTC) #10

On 2016/01/08 13:59:16, peah-webrtc wrote:
> > > 
> > It holds because we know that e_fft is FFT on a zero padded version of e.
> > 
> 
> Ah, good point. Then it should be fine, and I expect that any need for
> kEpsilonRatio larger than 0 (approx) 
> should be due to computational inaccuracies in the FFT.

Thanks for the comments. I checked more carefully how the current unittests
protect the changes. I also normalized |noisyPower| so that the change should
give the same result almost every time.

Unit tests (ApmTest.Process) have good coverage of the metrics.
They run nontrivial far-end and near-end signals (resources/far**_stereo.pcm and
resources/near**_stereo.pcm) as input and check against stored references
(data/audio_processing/output_data_float.pb). The test can only pass when the
metrics are equal to the reference.

~~~~
Why don't tests fail after this CL?
This CL makes only tiny changes to the audio levels (without normalization). The
changes are within 0.00001 times the original numbers, as shown in Patch set 1.

The metrics are converted to integers, and therefore it is rare that the metrics
change.

To verify that the test actually protects the metrics, I tried introducing an
error in the calculation, and the test then fails.

~~~~
Why don't tests fail even when audio levels are normalized?
This is because the metrics are based on division between audio levels. The
normalization factor cancels out.

To verify that the test protects this, I tried normalizing them with different
factors, and the test then fails.

~~~~
Why don't tests fail even when |noisyPower| is not normalized accordingly?

|noisyPower| acts as a threshold for noise. The number is very large, so in
normal cases it makes no difference whether it is normalized.

To keep |noisyPower| effectively unchanged, I have applied the same
normalization to it as to the audio levels. It is now a true "noisePower"
instead of a "noiseEnergy".

In addition, to verify that the test protects this, I tried making |noisyPower|
very small, and the test then fails.
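
A sketch of why the threshold has to follow the levels when their unit changes. The function names, the numbers in the usage note and the scale factor are illustrative assumptions, not values from aec_core.c.

```c
#include <assert.h>

/* Returns nonzero when the far-end minimum level is below the noise
 * threshold. Both arguments must be in the same unit (both energies or
 * both powers) for the comparison to be meaningful. */
int far_end_is_quiet(float min_level, float noise_threshold) {
  return min_level < noise_threshold;
}

/* Converts a level or threshold to the new unit with a common scale factor. */
float rescale(float value, float scale) {
  assert(scale > 0.0f);
  return value * scale;
}
```

For example, with a level of 2000 and a threshold of 1500 the comparison is false. Rescaling both by 1/64 keeps it false, while rescaling only the level makes it true, which would silently change behavior. When the threshold is far above every level that occurs in practice, the mismatch stays invisible, which fits the observation that the tests still pass without the normalization.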

minyue-webrtc

BTW, putting level updates in one "if (aec->metricsMode == 1)" is also made. https://codereview.webrtc.org/1542573002/diff/40001/webrtc/modules/audio_processing/aec/aec_core.c File ...

4 years, 11 months ago (2016-01-14 16:53:51 UTC) #11

minyue-webrtc

On 2016/01/14 16:53:51, minyue-webrtc wrote: > BTW, putting level updates in one "if (aec->metricsMode == ...

4 years, 11 months ago (2016-01-21 16:58:33 UTC) #12

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1542573002/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1542573002/60001

4 years, 11 months ago (2016-01-22 12:46:47 UTC) #15

commit-bot: I haz the power

Description was changed from ========== Calculate audio levels in AEC in time domain. In AEC, ...

4 years, 11 months ago (2016-01-22 13:46:46 UTC) #16

commit-bot: I haz the power

Description was changed from ========== Calculate audio levels in AEC in time domain. In AEC, ...

4 years, 11 months ago (2016-01-22 13:46:54 UTC) #18

commit-bot: I haz the power

4 years, 11 months ago (2016-01-22 13:46:55 UTC) #19

Message was sent while issue was closed.

Patchset 3 (id:??) landed as
https://crrev.com/9846845da6ee88bf16cb5fc62c6839ed7aafe04c
Cr-Commit-Position: refs/heads/master@{#11357}

Issue 1542573002: Calculate audio levels in AEC in time domain. (Closed)

Patch Set 1 #

Patch Set 2 : removing test code #

Patch Set 3 : normalizing |noisePower| #