Created: 4 years, 7 months ago by minyue-webrtc
Modified: 4 years, 7 months ago
Reviewers: peah-webrtc
CC: webrtc-reviews_webrtc.org, Andrew MacDonald, tterriberry_mozilla.com, audio-team_agora.io, hlundin-webrtc, kwiberg-webrtc, the sun, aluebs-webrtc, bjornv1
Base URL: https://chromium.googlesource.com/external/webrtc.git@master
Target Ref: refs/pending/heads/master
Project: webrtc
Visibility: Public.
Description: Updating APM unittests on the echo metrics.
There were a series of changes in the calculation of echo metrics. These changes made the existing unittests loose, e.g., EXPECT_EQ became EXPECT_NEAR. It is a good time to protect the echo calculation more strictly.
The change is not simply generating a new reference file and changing EXPECT_NEAR back to EXPECT_EQ; it strengthens the test as well. The main changes are:
1. The old test only sampled a metric at the end of processing, while the new test takes metrics during the call at a certain time interval. This gives much stronger protection.
2. Added protection of a newly added metric, called divergent_filter_fraction.
3. As said, use EXPECT_EQ (actually ASSERT_EQ) instead of EXPECT_NEAR as much as possible, even for floating-point values. This may be too restrictive, but it can be good to be restrictive at the beginning.
BUG=
Committed: https://crrev.com/58530ed246b7d87e2208b96ab2c4836adb6fd5d7
Cr-Commit-Position: refs/heads/master@{#12871}
Patch Set 1 : #
Total comments: 9
Patch Set 2 : assert to expect #
Messages
Total messages: 31 (12 generated)
Patchset #1 (id:1) has been deleted
Patchset #1 (id:20001) has been deleted
Patchset #1 (id:40001) has been deleted
Description was changed from

==========
good updating APM unittests on the metrics.
BUG=
==========

to

==========
Updating APM unittests on the echo metrics.
There were a series of changes in the calculation of echo metrics. There changes made the existing unittests lose, e.g., EXPECT_EQ become EXPECT_NEAR. It is good time to protect the echo calculation more strictly.
The change is not simply generating a new reference file and change EXPECT_NEAR to EXPECT_EQ. It strengthens the test as well. Main changes are
1. the old test only sample a metric at the end of processing, while the new test takes metrics during the call with a certain time interval. This gives a much stronger protection.
2. added protection of a newly added metric, called divergent_filter_fraction.
3. as said, use EXPECT_EQ (actually ASSERT_EQ) instead of EXPECT_NEAR as much as possible, even for float point values. This may be too restrictive. But it can be good to be restrictive at the beginning.
BUG=
==========
minyue@webrtc.org changed reviewers: + peah@webrtc.org
Hi Peah, Would you help review this?
The CQ bit was checked by minyue@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1969403003/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1969403003/60001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_unittest.cc (right):

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:207: ASSERT_EQ(reference.instant(), test.instant);
Why not use EXPECT_EQ? Does it make sense to continue the test if this fails?

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:2187: ASSERT_EQ(reference_delay.median(), median);
Why have you changed this to ASSERT_EQ? That does not make the test stronger, right? The change makes sense if the test would not make sense at all after one change, but I think in these tests it would make sense to continue the test even though one of these fails. WDYT?

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:2253: #if defined(WEBRTC_AUDIOPROC_FLOAT_PROFILE)
Why were only parts of the stats moved to test for each frame? Would it not be possible to change to testing everything for each frame?
Thanks for the feedback! Please find my reply inline.

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_unittest.cc (right):

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:207: ASSERT_EQ(reference.instant(), test.instant);
On 2016/05/20 12:21:23, peah-webrtc wrote:
> Why not use EXPECT_EQ? Does it make sense to continue the test if this fails?

It was EXPECT_EQ because the test checked the values only once; now that we have added multiple checks, it is better to quit early.

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:2187: ASSERT_EQ(reference_delay.median(), median);
On 2016/05/20 12:21:23, peah-webrtc wrote:
> Why have you changed this to ASSERT_EQ? That does not make the test stronger,
> right? The change makes sense if the test will not make sense at all after one
> change, but I think it these tests it would make sense to continue the test even
> though one of these fail. WDYT?

I think if it fails once, no benefit is gained by letting it continue to run. In such a situation, the maker of the change should investigate whether the change is valid, and a new reference file should be created if it is.

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:2253: #if defined(WEBRTC_AUDIOPROC_FLOAT_PROFILE)
On 2016/05/20 12:21:23, peah-webrtc wrote:
> Why were only parts of the stats moved to test for each frame? Would it not be
> possible to change to testing everything for each frame?

Because has_echo_count, has_voice_count, analog_level_average etc. are per call.
https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_unittest.cc (right):

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:207: ASSERT_EQ(reference.instant(), test.instant);
On 2016/05/23 03:19:45, minyue-webrtc wrote:
> On 2016/05/20 12:21:23, peah-webrtc wrote:
> > Why not use EXPECT_EQ? Does it make sense to continue the test if this fails?
>
> It was EXPECT_EQ because the test checked the values only once, now we added
> multiple checks, therefore, it is better to quit early.

I don't agree with that. Please see the related comment below.

https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:2187: ASSERT_EQ(reference_delay.median(), median);
On 2016/05/23 03:19:45, minyue-webrtc wrote:
> On 2016/05/20 12:21:23, peah-webrtc wrote:
> > Why have you changed this to ASSERT_EQ? That does not make the test stronger,
> > right? The change makes sense if the test will not make sense at all after one
> > change, but I think it these tests it would make sense to continue the test even
> > though one of these fail. WDYT?
>
> I think if it fails one time, no benefits can be obtained by letting it continue
> to run. In such situation, the maker of the change should investigate whether
> the change is valid and a new reference file should be created if the change is
> valid.

That depends on how it is to be used. If it is to be used as a bitexactness test for the metrics, I partly agree, as there you are not interested in the actual data points. But even though you are not interested in the actual data points of the metrics, you actually lose all the information about what goes wrong this way.

For instance, if the test fails due to the echo_metrics.echo_return_loss stats, it is not possible to tell whether echo_metrics.echo_return_loss_enhancement failed as well. I think it is better to use EXPECT_EQ here, as you will gain more information when the test fails. The drawback is the extra complexity of running more checks than needed when it fails, but I think that is definitely worth it if it gives us more information about the failing test.
https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... File webrtc/modules/audio_processing/audio_processing_unittest.cc (right): https://codereview.webrtc.org/1969403003/diff/60001/webrtc/modules/audio_proc... webrtc/modules/audio_processing/audio_processing_unittest.cc:2253: #if defined(WEBRTC_AUDIOPROC_FLOAT_PROFILE) On 2016/05/23 03:19:45, minyue-webrtc wrote: > On 2016/05/20 12:21:23, peah-webrtc wrote: > > Why were only parts of the stats moved to test for each frame? Would it not be > > possible to change to testing everything for each frame? > > Because has_echo_count, has_voice_count, analog_level_average etc. are per call. > Acknowledged.
Hi, I changed Assert to Expect. See patch set 2.
On 2016/05/23 08:20:17, minyue-webrtc wrote: > Hi, > > I changed Assert to Expect. See patch set 2. Great! Thanks! Then I have only one remaining question: Why did the .pb files change due to the changes in patch set 2? The transition from ASSERT to EXPECT should be bitexact, right?
On 2016/05/23 11:27:00, peah-webrtc wrote: > On 2016/05/23 08:20:17, minyue-webrtc wrote: > > Hi, > > > > I changed Assert to Expect. See patch set 2. > > Great! Thanks! Then I have only one remaining question: > Why did the .pb files change due to the changes in patch set 2? The transition > from ASSERT to EXPECT should be bitexact, right? The files in Patchset 2 are actually untouched. But since I work on a different machine, the review system was fulled.
On 2016/05/24 01:49:56, minyue-webrtc wrote: > On 2016/05/23 11:27:00, peah-webrtc wrote: > > On 2016/05/23 08:20:17, minyue-webrtc wrote: > > > Hi, > > > > > > I changed Assert to Expect. See patch set 2. > > > > Great! Thanks! Then I have only one remaining question: > > Why did the .pb files change due to the changes in patch set 2? The transition > > from ASSERT to EXPECT should be bitexact, right? > > The files in Patchset 2 are actually untouched. But since I work on a different > machine, the review system was fulled. fooled I mean
On 2016/05/24 01:50:07, minyue-webrtc wrote:
> On 2016/05/24 01:49:56, minyue-webrtc wrote:
> > On 2016/05/23 11:27:00, peah-webrtc wrote:
> > > On 2016/05/23 08:20:17, minyue-webrtc wrote:
> > > > Hi,
> > > >
> > > > I changed Assert to Expect. See patch set 2.
> > >
> > > Great! Thanks! Then I have only one remaining question:
> > > Why did the .pb files change due to the changes in patch set 2? The transition
> > > from ASSERT to EXPECT should be bitexact, right?
> >
> > The files in Patchset 2 are actually untouched. But since I work on a different
> > machine, the review system was fulled.
>
> fooled I mean

lgtm with the nit below.

It seems quite strange that the review system is fooled when you work on another machine. Have you applied the patch on the latest master? I do this very often, on different machines, and this has never happened to me. Please double check that they do not differ from patch set 1. (It would be much simpler if the reference files were in text format, as then the diff would be visible. Eventually we should transition to having them in text format instead.)
On 2016/05/24 05:19:00, peah-webrtc wrote:
> On 2016/05/24 01:50:07, minyue-webrtc wrote:
> > On 2016/05/24 01:49:56, minyue-webrtc wrote:
> > > On 2016/05/23 11:27:00, peah-webrtc wrote:
> > > > On 2016/05/23 08:20:17, minyue-webrtc wrote:
> > > > > Hi,
> > > > >
> > > > > I changed Assert to Expect. See patch set 2.
> > > >
> > > > Great! Thanks! Then I have only one remaining question:
> > > > Why did the .pb files change due to the changes in patch set 2? The transition
> > > > from ASSERT to EXPECT should be bitexact, right?
> > >
> > > The files in Patchset 2 are actually untouched. But since I work on a different
> > > machine, the review system was fulled.
> >
> > fooled I mean
>
> lgtm with the nit below
>
> It seems quite strange that the review system is fooled when you work on another
> machine. Have you applied the patch on the latest master? I do this very often,
> on different machines, and this has never happened to me. Please double check
> that they do not differ from the patch set 1.
> (it would be much simpler if the reference files were in text format as then the
> diff would be visible. Eventually we should transition to have them in text
> format instead).

I agree that this would be solved if they were in text format. I think the reason the review system is fooled is that it does not check the content of binary files; it relies on the commit info or something like that. When we apply the patch on a different machine, the two files are marked as changed, and the review system considers them changed. Anyway, to verify that the binary files did not revert to the old version, I compared them against the master branch and verified that they are different.
The CQ bit was checked by minyue@webrtc.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1969403003/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1969403003/80001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_msan on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/linux_msan/builds/10357)
The CQ bit was checked by minyue@webrtc.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1969403003/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1969403003/80001
Message was sent while issue was closed.
Description was changed from

==========
Updating APM unittests on the echo metrics. There were a series of changes in the calculation of echo metrics. There changes made the existing unittests lose, e.g., EXPECT_EQ become EXPECT_NEAR. It is good time to protect the echo calculation more strictly. The change is not simply generating a new reference file and change EXPECT_NEAR to EXPECT_EQ. It strengthens the test as well. Main changes are 1. the old test only sample a metric at the end of processing, while the new test takes metrics during the call with a certain time interval. This gives a much stronger protection. 2. added protection of a newly added metric, called divergent_filter_fraction. 3. as said, use EXPECT_EQ (actually ASSERT_EQ) instead of EXPECT_NEAR as much as possible, even for float point values. This may be too restrictive. But it can be good to be restrictive at the beginning. BUG=
==========

to

==========
Updating APM unittests on the echo metrics. There were a series of changes in the calculation of echo metrics. There changes made the existing unittests lose, e.g., EXPECT_EQ become EXPECT_NEAR. It is good time to protect the echo calculation more strictly. The change is not simply generating a new reference file and change EXPECT_NEAR to EXPECT_EQ. It strengthens the test as well. Main changes are 1. the old test only sample a metric at the end of processing, while the new test takes metrics during the call with a certain time interval. This gives a much stronger protection. 2. added protection of a newly added metric, called divergent_filter_fraction. 3. as said, use EXPECT_EQ (actually ASSERT_EQ) instead of EXPECT_NEAR as much as possible, even for float point values. This may be too restrictive. But it can be good to be restrictive at the beginning. BUG=
==========
Message was sent while issue was closed.
Committed patchset #2 (id:80001)
Message was sent while issue was closed.
Description was changed from

==========
Updating APM unittests on the echo metrics. There were a series of changes in the calculation of echo metrics. There changes made the existing unittests lose, e.g., EXPECT_EQ become EXPECT_NEAR. It is good time to protect the echo calculation more strictly. The change is not simply generating a new reference file and change EXPECT_NEAR to EXPECT_EQ. It strengthens the test as well. Main changes are 1. the old test only sample a metric at the end of processing, while the new test takes metrics during the call with a certain time interval. This gives a much stronger protection. 2. added protection of a newly added metric, called divergent_filter_fraction. 3. as said, use EXPECT_EQ (actually ASSERT_EQ) instead of EXPECT_NEAR as much as possible, even for float point values. This may be too restrictive. But it can be good to be restrictive at the beginning. BUG=
==========

to

==========
Updating APM unittests on the echo metrics. There were a series of changes in the calculation of echo metrics. There changes made the existing unittests lose, e.g., EXPECT_EQ become EXPECT_NEAR. It is good time to protect the echo calculation more strictly. The change is not simply generating a new reference file and change EXPECT_NEAR to EXPECT_EQ. It strengthens the test as well. Main changes are 1. the old test only sample a metric at the end of processing, while the new test takes metrics during the call with a certain time interval. This gives a much stronger protection. 2. added protection of a newly added metric, called divergent_filter_fraction. 3. as said, use EXPECT_EQ (actually ASSERT_EQ) instead of EXPECT_NEAR as much as possible, even for float point values. This may be too restrictive. But it can be good to be restrictive at the beginning. BUG=
Committed: https://crrev.com/58530ed246b7d87e2208b96ab2c4836adb6fd5d7
Cr-Commit-Position: refs/heads/master@{#12871}
==========
Message was sent while issue was closed.
Patchset 2 (id:??) landed as https://crrev.com/58530ed246b7d87e2208b96ab2c4836adb6fd5d7 Cr-Commit-Position: refs/heads/master@{#12871}