|
|
Chromium Code Reviews|
Created:
3 years, 9 months ago by ivoc Modified:
3 years, 9 months ago Reviewers:
hlundin-webrtc CC:
webrtc-reviews_webrtc.org, AleBzk, peah-webrtc, Andrew MacDonald, aleloi, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, minyue-webrtc, the sun, aluebs-webrtc, bjornv1 Target Ref:
refs/heads/master Project:
webrtc Visibility:
Public. |
DescriptionImprove stability of the echo detector complexity perf tests.
The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements are made in this CL:
- Significantly longer warmup time before starting the test
- More iterations and larger batches
- Different number of iterations for slow and fast tests
- Use the echo likelihood in the test so it cannot get optimized out
BUG=webrtc:7353
Review-Url: https://codereview.webrtc.org/2750413002
Cr-Commit-Position: refs/heads/master@{#17303}
Committed: https://chromium.googlesource.com/external/webrtc/+/1973ba6ef4909aad7c8bce307d8bf4180ce677fe
Patch Set 1 #
Total comments: 11
Patch Set 2 : Review comments. #
Total comments: 12
Patch Set 3 : More review comments. #Patch Set 4 : Fixed mistake in previous patchset. #Patch Set 5 : Changed int to size_t. #
Messages
Total messages: 21 (10 generated)
Description was changed from ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements have been made: - Significantly longer warmup time before starting the test - More iterations and larger batches - Use the echo likelihood in the test so it cannot get optimized out BUG= ========== to ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements have been made: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG= ==========
Description was changed from ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements have been made: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG= ========== to ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements have been made: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG=webrtc:7353 ==========
ivoc@webrtc.org changed reviewers: + henrik.lundin@webrtc.org
Hi Henrik, PTAL.
Description was changed from ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements have been made: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG=webrtc:7353 ========== to ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements are made in this CL: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG=webrtc:7353 ==========
Nice. But I have some suggestions for you. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc (right): https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:29: const size_t kNumFramesToProcess = 20000; Nit: constexprs, please. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:49: test::PerformanceTimer timer(kNumFramesToProcessStandalone / I suggest you modify PerformanceTimer to help you with the warm-up. You could either set the warm-up number in the ctor and let it just ignore those first samples, or add a method where you prune the first samples after the measurement is done. I think I prefer the former, but can be persuaded otherwise... https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:62: frame_no % kProcessingBatchSizeStandalone == 0) { I don't know if this will skew the measurement, but the modulo operator is surprisingly expensive. You could for instance use the following instead: // Before the for loop size_t next_start_frame = kWarmupBatchSizeStandalone; // In the for loop if (frame_no == next_start_frame) { time.StartTimer(); next_start_frame += kProcessingBatchSizeStandalone; } // ... if (frame_no == next_start_frame - 1 && frame_no > kWarmupBatchSizeStandalone) { timer.StopTimer(); } Of course, if you follow my suggestion for adding the warm-up to the PerformanceTimer, you won't have to worry about that here. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:81: EXPECT_EQ(0.0f, sum); Do we know that this will be exactly 0.0? Are there any uncertainties that would motivate using EXPECT_FLOAT_EQ instead?
https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc (right): https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:29: const size_t kNumFramesToProcess = 20000; On 2017/03/17 08:48:03, hlundin-webrtc wrote: > Nit: constexprs, please. Done. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:49: test::PerformanceTimer timer(kNumFramesToProcessStandalone / On 2017/03/17 08:48:03, hlundin-webrtc wrote: > I suggest you modify PerformanceTimer to help you with the warm-up. You could > either set the warm-up number in the ctor and let it just ignore those first > samples, or add a method where you prune the first samples after the measurement > is done. I think I prefer the former, but can be persuaded otherwise... Good idea. I implemented it in a bit of a different way, by passing an argument to the GetAverageDuration/GetStandardDeviation functions. That way we don't have to store any extra state in the PerformanceTimer, I don't have to actually delete any of the measurements stored in there and none of the other code that uses it has to be modified. Is that okay? https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:62: frame_no % kProcessingBatchSizeStandalone == 0) { On 2017/03/17 08:48:03, hlundin-webrtc wrote: > I don't know if this will skew the measurement, but the modulo operator is > surprisingly expensive. You could for instance use the following instead: > > // Before the for loop > size_t next_start_frame = kWarmupBatchSizeStandalone; > > // In the for loop > if (frame_no == next_start_frame) { > time.StartTimer(); > next_start_frame += kProcessingBatchSizeStandalone; > } > > // ... > > if (frame_no == next_start_frame - 1 && > frame_no > kWarmupBatchSizeStandalone) { > timer.StopTimer(); > } > > Of course, if you follow my suggestion for adding the warm-up to the > PerformanceTimer, you won't have to worry about that here. Although I agree that modulo operations are pretty expensive (should be similar to integer division), I don't think it should be very significant compared to all of the calculations that are happening in the echo detector (which include plenty of modulo operations, sqrts and divisions). Also, if the modulo operator is equally slow in each iteration, it will not affect the usefulness of this perf test. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:81: EXPECT_EQ(0.0f, sum); On 2017/03/17 08:48:03, hlundin-webrtc wrote: > Do we know that this will be exactly 0.0? Are there any uncertainties that would > motivate using EXPECT_FLOAT_EQ instead? In this test both signals are filled with zeros, so a non-zero result would be highly unexpected. The reason I added this is that I was afraid that if I don't use the sum it may get optimized out which means the entire benchmark loop could be optimized out.
Good. Now some comments on performance_timer.cc. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... File webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc (right): https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:49: test::PerformanceTimer timer(kNumFramesToProcessStandalone / On 2017/03/17 13:53:20, ivoc wrote: > On 2017/03/17 08:48:03, hlundin-webrtc wrote: > > I suggest you modify PerformanceTimer to help you with the warm-up. You could > > either set the warm-up number in the ctor and let it just ignore those first > > samples, or add a method where you prune the first samples after the > measurement > > is done. I think I prefer the former, but can be persuaded otherwise... > > Good idea. I implemented it in a bit of a different way, by passing an argument > to the GetAverageDuration/GetStandardDeviation functions. That way we don't have > to store any extra state in the PerformanceTimer, I don't have to actually > delete any of the measurements stored in there and none of the other code that > uses it has to be modified. Is that okay? Good solution! https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:62: frame_no % kProcessingBatchSizeStandalone == 0) { On 2017/03/17 13:53:20, ivoc wrote: > On 2017/03/17 08:48:03, hlundin-webrtc wrote: > > I don't know if this will skew the measurement, but the modulo operator is > > surprisingly expensive. You could for instance use the following instead: > > > > // Before the for loop > > size_t next_start_frame = kWarmupBatchSizeStandalone; > > > > // In the for loop > > if (frame_no == next_start_frame) { > > time.StartTimer(); > > next_start_frame += kProcessingBatchSizeStandalone; > > } > > > > // ... > > > > if (frame_no == next_start_frame - 1 && > > frame_no > kWarmupBatchSizeStandalone) { > > timer.StopTimer(); > > } > > > > Of course, if you follow my suggestion for adding the warm-up to the > > PerformanceTimer, you won't have to worry about that here. > > Although I agree that modulo operations are pretty expensive (should be similar > to integer division), I don't think it should be very significant compared to > all of the calculations that are happening in the echo detector (which include > plenty of modulo operations, sqrts and divisions). > Also, if the modulo operator is equally slow in each iteration, it will not > affect the usefulness of this perf test. Acknowledged. https://codereview.webrtc.org/2750413002/diff/1/webrtc/modules/audio_processi... webrtc/modules/audio_processing/residual_echo_detector_complexity_unittest.cc:81: EXPECT_EQ(0.0f, sum); On 2017/03/17 13:53:20, ivoc wrote: > On 2017/03/17 08:48:03, hlundin-webrtc wrote: > > Do we know that this will be exactly 0.0? Are there any uncertainties that > would > > motivate using EXPECT_FLOAT_EQ instead? > > In this test both signals are filled with zeros, so a non-zero result would be > highly unexpected. The reason I added this is that I was afraid that if I don't > use the sum it may get optimized out which means the entire benchmark loop could > be optimized out. Acknowledged. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/test/performance_timer.cc (right): https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:47: int number_of_warmup_samples) const { This should be a size_t, imo. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:48: const int number_of_samples = RTC_DCHECK_GT(timestamps_us_.size(), number_of_warmup_samples); size_t number_of_samples = ... https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:50: RTC_DCHECK_GT(number_of_samples, 0); ... and skip this. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:53: timestamps_us_.end(), 0)) / Is it a problem that the initial value is an int literal? If so, the problem is the same in the old code. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:57: double PerformanceTimer::GetDurationStandardDeviation( Essentially the same comments as above.
https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/test/performance_timer.cc (right): https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:47: int number_of_warmup_samples) const { On 2017/03/17 14:16:37, hlundin-webrtc wrote: > This should be a size_t, imo. Done. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:48: const int number_of_samples = On 2017/03/17 14:16:37, hlundin-webrtc wrote: > RTC_DCHECK_GT(timestamps_us_.size(), number_of_warmup_samples); > size_t number_of_samples = ... Done. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:50: RTC_DCHECK_GT(number_of_samples, 0); On 2017/03/17 14:16:36, hlundin-webrtc wrote: > ... and skip this. Done. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:53: timestamps_us_.end(), 0)) / On 2017/03/17 14:16:37, hlundin-webrtc wrote: > Is it a problem that the initial value is an int literal? If so, the problem is > the same in the old code. Hmm, I think it makes sense to use int64_t's for the accumulation (more precise), and then cast to a double before the division. However, here it seems like the 0 is probably not considered to be an int64_t, so the accumulation will likely use a regular int as the accumulator. I'll add a cast to help the compiler. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:57: double PerformanceTimer::GetDurationStandardDeviation( On 2017/03/17 14:16:36, hlundin-webrtc wrote: > Essentially the same comments as above. Done. https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:64: double variance = std::accumulate( I used the same pattern of accumulating in int64_t's and then converting to double here.
https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/test/performance_timer.cc (right): https://codereview.webrtc.org/2750413002/diff/20001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/test/performance_timer.cc:64: double variance = std::accumulate( On 2017/03/17 15:14:47, ivoc wrote: > I used the same pattern of accumulating in int64_t's and then converting to > double here. Oh, oops, that seems like a bad idea on second thought (since the average is a double here). I'll remove this in a new patch set.
lgtm
The CQ bit was checked by ivoc@webrtc.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: win_x64_rel on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/win_x64_rel/builds/23884)
The CQ bit was checked by ivoc@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from henrik.lundin@webrtc.org Link to the patchset: https://codereview.webrtc.org/2750413002/#ps80001 (title: "Changed int to size_t.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
CQ is committing da patch.
Bot data: {"patchset_id": 80001, "attempt_start_ts": 1490001608375700,
"parent_rev": "5e1ca78705f9ac3319fdfc3084aefdea3ed6e675", "commit_rev":
"1973ba6ef4909aad7c8bce307d8bf4180ce677fe"}
Message was sent while issue was closed.
Description was changed from ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements are made in this CL: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG=webrtc:7353 ========== to ========== Improve stability of the echo detector complexity perf tests. The results of the echo detector complexity tests are currently notoriously spiky and unreliable. The following improvements are made in this CL: - Significantly longer warmup time before starting the test - More iterations and larger batches - Different number of iterations for slow and fast tests - Use the echo likelihood in the test so it cannot get optimized out BUG=webrtc:7353 Review-Url: https://codereview.webrtc.org/2750413002 Cr-Commit-Position: refs/heads/master@{#17303} Committed: https://chromium.googlesource.com/external/webrtc/+/1973ba6ef4909aad7c8bce307... ==========
Message was sent while issue was closed.
Committed patchset #5 (id:80001) as https://chromium.googlesource.com/external/webrtc/+/1973ba6ef4909aad7c8bce307... |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
