Issue 2750413002: Improve stability of the echo detector complexity perf tests.

ivoc

Description was changed from ========== Improve stability of the echo detector complexity perf tests. The ...

3 years, 9 months ago (2017-03-16 17:02:16 UTC) #1

ivoc

Description was changed from ========== Improve stability of the echo detector complexity perf tests. The ...

3 years, 9 months ago (2017-03-16 17:07:21 UTC) #2

ivoc

ivoc@webrtc.org changed reviewers: + henrik.lundin@webrtc.org

3 years, 9 months ago (2017-03-16 17:07:22 UTC) #3

ivoc

Description was changed from ========== Improve stability of the echo detector complexity perf tests. The ...

3 years, 9 months ago (2017-03-16 17:08:01 UTC) #5

hlundin-webrtc

3 years, 9 months ago (2017-03-17 08:48:03 UTC) #6

Nice. But I have some suggestions for you.

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
File
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc
(right):

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:29:
const size_t kNumFramesToProcess = 20000;
Nit: constexprs, please.

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:49:
test::PerformanceTimer timer(kNumFramesToProcessStandalone /
I suggest you modify PerformanceTimer to help you with the warm-up. You could
either set the warm-up number in the ctor and let it just ignore those first
samples, or add a method where you prune the first samples after the measurement
is done. I think I prefer the former, but can be persuaded otherwise...

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:62:
frame_no % kProcessingBatchSizeStandalone == 0) {
I don't know if this will skew the measurement, but the modulo operator is
surprisingly expensive. You could for instance use the following instead:

// Before the for loop
size_t next_start_frame = kWarmupBatchSizeStandalone;

// In the for loop
if (frame_no == next_start_frame) {
  timer.StartTimer();
  next_start_frame += kProcessingBatchSizeStandalone;
}

// ...

if (frame_no == next_start_frame - 1 && 
    frame_no > kWarmupBatchSizeStandalone) {
  timer.StopTimer();
}

Of course, if you follow my suggestion for adding the warm-up to the
PerformanceTimer, you won't have to worry about that here.
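
For reference, the counter-based batching above can be written as a standalone sketch. This assumes std::chrono in place of test::PerformanceTimer; the helper name TimeBatches and its parameters are hypothetical and are not part of the CL.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not WebRTC code): calls `work(frame_no)` once per frame
// and returns one duration in microseconds per complete batch of `batch_size`
// frames, skipping the first `warmup_frames` frames. The loop body uses only
// comparisons and additions, no modulo.
template <typename Work>
std::vector<std::int64_t> TimeBatches(std::size_t num_frames,
                                      std::size_t warmup_frames,
                                      std::size_t batch_size,
                                      Work work) {
  assert(batch_size > 0);
  using Clock = std::chrono::steady_clock;
  std::vector<std::int64_t> durations_us;
  std::size_t next_start_frame = warmup_frames;
  bool running = false;
  Clock::time_point start;
  for (std::size_t frame_no = 0; frame_no < num_frames; ++frame_no) {
    if (frame_no == next_start_frame) {
      start = Clock::now();
      running = true;
      next_start_frame += batch_size;
    }

    work(frame_no);

    // Stop on the last frame of the batch that is currently being timed.
    if (running && frame_no + 1 == next_start_frame) {
      const auto elapsed = Clock::now() - start;
      durations_us.push_back(static_cast<std::int64_t>(
          std::chrono::duration_cast<std::chrono::microseconds>(elapsed)
              .count()));
      running = false;
    }
  }
  return durations_us;
}
```

The explicit `running` flag replaces the `frame_no > kWarmupBatchSizeStandalone` guard, so a batch size of 1 is handled as well; a batch that is cut off by the end of the loop is simply not recorded.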

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:81:
EXPECT_EQ(0.0f, sum);
Do we know that this will be exactly 0.0? Are there any uncertainties that would
motivate using EXPECT_FLOAT_EQ instead?

ivoc

3 years, 9 months ago (2017-03-17 13:53:20 UTC) #7

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
File
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc
(right):

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:29:
const size_t kNumFramesToProcess = 20000;
On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> Nit: constexprs, please.

Done.

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:49:
test::PerformanceTimer timer(kNumFramesToProcessStandalone /
On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> I suggest you modify PerformanceTimer to help you with the warm-up. You could
> either set the warm-up number in the ctor and let it just ignore those first
> samples, or add a method where you prune the first samples after the
> measurement is done. I think I prefer the former, but can be persuaded
> otherwise...

Good idea. I implemented it in a bit of a different way, by passing an argument
to the GetAverageDuration/GetStandardDeviation functions. That way we don't have
to store any extra state in the PerformanceTimer, I don't have to actually
delete any of the measurements stored in there and none of the other code that
uses it has to be modified. Is that okay?
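
A rough sketch of that approach, for illustration only: the class DurationLog and its method names are hypothetical stand-ins rather than the actual PerformanceTimer interface, and the standard deviation shown is the population form, which may or may not match the real implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the timer's sample storage (not the WebRTC class).
// The warm-up count is passed to the statistics functions, so no extra state
// is stored and no recorded sample is ever deleted.
class DurationLog {
 public:
  void AddSample(std::int64_t duration_us) {
    samples_us_.push_back(duration_us);
  }

  // Mean of all samples except the first `number_of_warmup_samples`.
  double Average(std::size_t number_of_warmup_samples) const {
    assert(samples_us_.size() > number_of_warmup_samples);
    const std::size_t n = samples_us_.size() - number_of_warmup_samples;
    const auto first =
        samples_us_.begin() +
        static_cast<std::ptrdiff_t>(number_of_warmup_samples);
    const std::int64_t sum =
        std::accumulate(first, samples_us_.end(), std::int64_t{0});
    return static_cast<double>(sum) / static_cast<double>(n);
  }

  // Population standard deviation over the same range.
  double StandardDeviation(std::size_t number_of_warmup_samples) const {
    const double mean = Average(number_of_warmup_samples);
    const std::size_t n = samples_us_.size() - number_of_warmup_samples;
    double sum_of_squares = 0.0;
    for (std::size_t i = number_of_warmup_samples; i < samples_us_.size();
         ++i) {
      const double diff = static_cast<double>(samples_us_[i]) - mean;
      sum_of_squares += diff * diff;
    }
    return std::sqrt(sum_of_squares / static_cast<double>(n));
  }

 private:
  std::vector<std::int64_t> samples_us_;
};
```

Existing callers that want all samples would pass 0, which is why none of the other code needs to change beyond the extra argument (or a default value for it).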

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:62:
frame_no % kProcessingBatchSizeStandalone == 0) {
On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> I don't know if this will skew the measurement, but the modulo operator is
> surprisingly expensive. You could for instance use the following instead:
> 
> // Before the for loop
> size_t next_start_frame = kWarmupBatchSizeStandalone;
> 
> // In the for loop
> if (frame_no == next_start_frame) {
>   timer.StartTimer();
>   next_start_frame += kProcessingBatchSizeStandalone;
> }
> 
> // ...
> 
> if (frame_no == next_start_frame - 1 && 
>     frame_no > kWarmupBatchSizeStandalone) {
>   timer.StopTimer();
> }
> 
> Of course, if you follow my suggestion for adding the warm-up to the
> PerformanceTimer, you won't have to worry about that here.

Although I agree that modulo operations are pretty expensive (should be similar
to integer division), I don't think it should be very significant compared to
all of the calculations that are happening in the echo detector (which include
plenty of modulo operations, sqrts and divisions). 
Also, if the modulo operator is equally slow in each iteration, it will not
affect the usefulness of this perf test.

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:81:
EXPECT_EQ(0.0f, sum);
On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> Do we know that this will be exactly 0.0? Are there any uncertainties that
> would motivate using EXPECT_FLOAT_EQ instead?

In this test both signals are filled with zeros, so a non-zero result would be
highly unexpected. The reason I added this is that I was afraid that if I don't
use the sum it may get optimized out which means the entire benchmark loop could
be optimized out.
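
To illustrate the idea, here is a minimal sketch in which the benchmark loop accumulates a value that is checked afterwards. ProcessFrame and RunBenchmarkLoop are hypothetical stand-ins, not the echo detector API, and consuming the result is a common precaution rather than a guarantee against every compiler optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the code under test: returns one value per frame.
float ProcessFrame(const std::vector<float>& render,
                   const std::vector<float>& capture) {
  assert(render.size() == capture.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < render.size(); ++i) {
    acc += render[i] * capture[i];
  }
  return acc;
}

// Runs the loop on all-zero input and returns the accumulated output so that
// the caller can consume it (for example in an EXPECT_EQ).
float RunBenchmarkLoop(std::size_t num_frames, std::size_t frame_size) {
  const std::vector<float> render(frame_size, 0.0f);
  const std::vector<float> capture(frame_size, 0.0f);
  float sum = 0.0f;
  for (std::size_t frame_no = 0; frame_no < num_frames; ++frame_no) {
    sum += ProcessFrame(render, capture);
  }
  return sum;
}
```

With all-zero input every product and every partial sum in this sketch is exactly 0.0f, so an exact comparison is safe here; with non-trivial input a tolerance-based check would be the more usual choice.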

hlundin-webrtc

3 years, 9 months ago (2017-03-17 14:16:37 UTC) #8

Good. Now some comments on performance_timer.cc.

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
File
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc
(right):

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:49:
test::PerformanceTimer timer(kNumFramesToProcessStandalone /
On 2017/03/17 13:53:20, ivoc wrote:
> On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> > I suggest you modify PerformanceTimer to help you with the warm-up. You
> > could either set the warm-up number in the ctor and let it just ignore
> > those first samples, or add a method where you prune the first samples
> > after the measurement is done. I think I prefer the former, but can be
> > persuaded otherwise...
>
> Good idea. I implemented it in a bit of a different way, by passing an
> argument to the GetAverageDuration/GetStandardDeviation functions. That way
> we don't have to store any extra state in the PerformanceTimer, I don't have
> to actually delete any of the measurements stored in there and none of the
> other code that uses it has to be modified. Is that okay?

Good solution!

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:62:
frame_no % kProcessingBatchSizeStandalone == 0) {
On 2017/03/17 13:53:20, ivoc wrote:
> On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> > I don't know if this will skew the measurement, but the modulo operator is
> > surprisingly expensive. You could for instance use the following instead:
> > 
> > // Before the for loop
> > size_t next_start_frame = kWarmupBatchSizeStandalone;
> > 
> > // In the for loop
> > if (frame_no == next_start_frame) {
> >   timer.StartTimer();
> >   next_start_frame += kProcessingBatchSizeStandalone;
> > }
> > 
> > // ...
> > 
> > if (frame_no == next_start_frame - 1 && 
> >     frame_no > kWarmupBatchSizeStandalone) {
> >   timer.StopTimer();
> > }
> > 
> > Of course, if you follow my suggestion for adding the warm-up to the
> > PerformanceTimer, you won't have to worry about that here.
> 
> Although I agree that modulo operations are pretty expensive (should be
> similar to integer division), I don't think it should be very significant
> compared to all of the calculations that are happening in the echo detector
> (which include plenty of modulo operations, sqrts and divisions).
> Also, if the modulo operator is equally slow in each iteration, it will not
> affect the usefulness of this perf test.

Acknowledged.

https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi...
webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:81:
EXPECT_EQ(0.0f, sum);
On 2017/03/17 13:53:20, ivoc wrote:
> On 2017/03/17 08:48:03, hlundin-webrtc wrote:
> > Do we know that this will be exactly 0.0? Are there any uncertainties that
> > would motivate using EXPECT_FLOAT_EQ instead?
>
> In this test both signals are filled with zeros, so a non-zero result would be
> highly unexpected. The reason I added this is that I was afraid that if I
> don't use the sum it may get optimized out which means the entire benchmark
> loop could be optimized out.

Acknowledged.

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc...
File webrtc/modules/audio_processing/test/performance_timer.cc (right):

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/test/performance_timer.cc:47:
int number_of_warmup_samples) const {
This should be a size_t, imo.

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/test/performance_timer.cc:48:
const int number_of_samples =
RTC_DCHECK_GT(timestamps_us_.size(), number_of_warmup_samples);
size_t number_of_samples = ...

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/test/performance_timer.cc:50:
RTC_DCHECK_GT(number_of_samples, 0);
... and skip this.
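
A small sketch of why the order matters once the types are unsigned: subtracting a larger size_t from a smaller one wraps around to a huge positive value instead of going negative, so a "greater than 0" check after the subtraction no longer catches the error. The helper name is hypothetical, and plain assert stands in for RTC_DCHECK_GT here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of samples left after discarding the warm-up.
// The precondition is checked before the subtraction, because with unsigned
// operands `total_samples - number_of_warmup_samples` would wrap around
// rather than become negative.
std::size_t SamplesAfterWarmup(std::size_t total_samples,
                               std::size_t number_of_warmup_samples) {
  assert(total_samples > number_of_warmup_samples);
  return total_samples - number_of_warmup_samples;
}
```

Because the precondition already implies a strictly positive result, a separate check on the difference is redundant, which matches the "skip this" remark.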

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/test/performance_timer.cc:53:
timestamps_us_.end(), 0)) /
Is it a problem that the initial value is an int literal? If so, the problem is
the same in the old code.
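
On the int literal question: std::accumulate uses the type of the initial value as the type of the accumulator and of the result, so an init of 0 sums in int regardless of the element type. Whether that matters in performance_timer.cc depends on the element type of timestamps_us_ and the magnitude of the values, neither of which is shown in this thread. A sketch assuming 64-bit elements (SumInt64 is a hypothetical name):

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>

using Int64Iter = std::vector<std::int64_t>::const_iterator;

// With an int literal as the initial value, the result type is int ...
static_assert(
    std::is_same<decltype(std::accumulate(std::declval<Int64Iter>(),
                                          std::declval<Int64Iter>(), 0)),
                 int>::value,
    "accumulator type follows the init argument");

// ... while a 64-bit initial value yields a 64-bit result.
static_assert(
    std::is_same<decltype(std::accumulate(std::declval<Int64Iter>(),
                                          std::declval<Int64Iter>(),
                                          std::int64_t{0})),
                 std::int64_t>::value,
    "accumulator type follows the init argument");

// Hypothetical helper: sums 64-bit values without narrowing to int.
std::int64_t SumInt64(const std::vector<std::int64_t>& values) {
  return std::accumulate(values.begin(), values.end(), std::int64_t{0});
}
```

For short runs of microsecond durations an int accumulator may never overflow in practice, but spelling the initial value with the element type removes the question.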

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc...
webrtc/modules/audio_processing/test/performance_timer.cc:57:
double PerformanceTimer::GetDurationStandardDeviation(
Essentially the same comments as above.

ivoc

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_processing/test/performance_timer.cc File webrtc/modules/audio_processing/test/performance_timer.cc (right): https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_processing/test/performance_timer.cc#newcode47 webrtc/modules/audio_processing/test/performance_timer.cc:47: int number_of_warmup_samples) const { On 2017/03/17 14:16:37, hlundin-webrtc wrote: ...

3 years, 9 months ago (2017-03-17 15:14:48 UTC) #9

ivoc

https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_processing/test/performance_timer.cc File webrtc/modules/audio_processing/test/performance_timer.cc (right): https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_processing/test/performance_timer.cc#newcode64 webrtc/modules/audio_processing/test/performance_timer.cc:64: double variance = std::accumulate( On 2017/03/17 15:14:47, ivoc wrote: ...

3 years, 9 months ago (2017-03-17 15:16:35 UTC) #10