webrtc/modules/audio_processing/agc/legacy/analog_agc.c - Issue 2003623003: Re-enable UBsan on AGC.

Unified Diff: webrtc/modules/audio_processing/agc/legacy/analog_agc.c

Issue 2003623003: Re-enable UBsan on AGC. (Closed) Base URL: https://chromium.googlesource.com/external/webrtc.git@master

Patch Set: Created 4 years, 7 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View side-by-side diff with in-line comments

Download patch

« no previous file with comments | « tools/ubsan/blacklist.txt ('k') | webrtc/modules/audio_processing/agc/legacy/digital_agc.c » ('j') | webrtc/modules/audio_processing/agc/legacy/digital_agc.c » ('J')
Expand Comments ('e') | Collapse Comments ('c') | Hide Comments ('s')

Index: webrtc/modules/audio_processing/agc/legacy/analog_agc.c

diff --git a/webrtc/modules/audio_processing/agc/legacy/analog_agc.c b/webrtc/modules/audio_processing/agc/legacy/analog_agc.c

index 36c67c282a02a1b9096f01aa8bda572624f8b67b..030077afe30aae60d2a611670cbaf0a0323e3015 100644

--- a/webrtc/modules/audio_processing/agc/legacy/analog_agc.c

+++ b/webrtc/modules/audio_processing/agc/legacy/analog_agc.c

@@ -476,16 +476,20 @@ void WebRtcAgc_ZeroCtrl(LegacyAgc* stt, int32_t* inMicLevel, int32_t* env) {

int16_t i;

int32_t tmp32 = 0;

int32_t midVal;

+ const int kZeroThreshold = 500;

peah-webrtc 2016/05/23 04:59:36 This solution definitely works, but another varian

minyue-webrtc 2016/05/23 07:40:48 Yes, I think early exiting the loop is more effici

peah-webrtc 2016/05/23 12:00:19 I think that this would be a clear case where a 64

kwiberg-webrtc 2016/05/23 12:44:40 Are the operations int64 += int32 and int64 >= int

minyue-webrtc 2016/05/24 01:32:26 Loop in 481-488 is really to check if env[0..9] ar

peah-webrtc 2016/05/24 06:16:20 What I mean by pipelining is that as the loop is s

What I mean by pipelining is that as the loop is so predictable for the compiler that it should be extremely efficient to perform for the CPU and it utilizes the CPU pipeline in a really good manner. Possibly it could even be done using SIMD instructions. Since the loop limit is fixed, and as low as 10, I expect the compiler would probably do loop-unrolling on the loop, e.g, replace the loop by. tmp32 += env[0]; tmp32 += env[1]; ... tmp32 += env[9]; Therefore, this loop is extremely cheap and doing an early return is probably not worth it, and would potentially even increase the complexity. The compiler would possibly be able to do good pipelining of the early-return variant of the loop but it is at least not as straightforward to do that, and the break statement will anyway break the pipeline for the loop. Also, loop unrolling would be less nice, e.g., if (env[0] >= kZeroThreshold || tmp32 >= kZeroThreshold) { goto early_return_label; } tmp32 += env[0]; if (env[1] >= kZeroThreshold || tmp32 >= kZeroThreshold) { goto early_return_label; } tmp32 += env[1]; ... if (env[9] >= kZeroThreshold || tmp32 >= kZeroThreshold) { goto early_return_label; } tmp32 += env[9]; early_return_label: (The compiler would probably optimize the loop unrolling above to look very different though). If you expect the test to often be invalidated at index 0, I'd then instead change the code to be, int64_t tmp64 = env[0]; if (tmp64 < kZeroThreshold) { for (i = 1; i < 10; ++i) { tmp64 = env[i]; } } That would provide a really efficient early return, and would give a very nice optimize able loop. I don't think it is worth the extra code though, as this is an extremely cheap loop for the compiler to perform, and as these types of optimization only affect the average level of the complexity which means that it is mainly a way to save electrical power of the CPU (i.e., does not affect the algorithmic complexity). As I see it, the only reason for not going for 64 bit variables should be a high cost of the 64 bit additions. Do you know if that is the case?

minyue-webrtc 2016/05/24 07:16:12 I can basically follow your suggestion, but I thin

On 2016/05/24 06:16:20, peah-webrtc wrote: > On 2016/05/24 01:32:26, minyue-webrtc wrote: > > On 2016/05/23 12:00:19, peah-webrtc wrote: > > > On 2016/05/23 07:40:48, minyue-webrtc wrote: > > > > On 2016/05/23 04:59:36, peah-webrtc wrote: > > > > > This solution definitely works, but another variant would have been just > > to > > > > > replace > > > > > int32_t tmp32 = 0; > > > > > by > > > > > int64_t tmp64 = 0; > > > > > and then do the additions in 64 bit instead. > > > > > > > > > > that should probably also be faster than having the if-statement inside > > the > > > > > for-loop. > > > > > > > > > > Or is there a reason not to use 64 bit values here? > > > > > > > > Yes, I think early exiting the loop is more efficient, too. > > > > > > I think that this would be a clear case where a 64 bit value should be used. > > It > > > will lead to cleaner code and I doubt that the early return will save > > anything, > > > as this is a very short loop of fixed size, which should be ideal to > pipeline > > > (meaning that an early return won't save computations as it anyway breaks > the > > > pipeline). > > > > > > The only reason I can see for not using a 64 bit value for the sum is if it > is > > > significantly more complex. Do you know if that is the case? > > > > Loop in 481-488 is really to check if env[0..9] are generally zeros. In most > > cases, this should be invalidated at index 0, it is nice that we return early. > I > > do not fully understand the pipeline argument. > > What I mean by pipelining is that as the loop is so predictable for the compiler > that it should be extremely efficient to perform for the CPU and it utilizes the > CPU pipeline in a really good manner. Possibly it could even be done using SIMD > instructions. Since the loop limit is fixed, and as low as 10, I expect the > compiler would probably do loop-unrolling on the loop, e.g, replace the loop by. > tmp32 += env[0]; > tmp32 += env[1]; > ... > tmp32 += env[9]; > Therefore, this loop is extremely cheap and doing an early return is probably > not worth it, and would potentially even increase the complexity. > > The compiler would possibly be able to do good pipelining of the early-return > variant of the loop but it is at least not as straightforward to do that, and > the break statement will anyway break the pipeline for the loop. Also, loop > unrolling would be less nice, e.g., > if (env[0] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > goto early_return_label; > } > tmp32 += env[0]; > if (env[1] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > goto early_return_label; > } > tmp32 += env[1]; > > ... > if (env[9] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > goto early_return_label; > } > tmp32 += env[9]; > early_return_label: > > > (The compiler would probably optimize the loop unrolling above to look very > different though). > > If you expect the test to often be invalidated at index 0, I'd then instead > change the code to be, > int64_t tmp64 = env[0]; > if (tmp64 < kZeroThreshold) { > for (i = 1; i < 10; ++i) { > tmp64 = env[i]; > } > } > > That would provide a really efficient early return, and would give a very nice > optimize able loop. I don't think it is worth the extra code though, as this is > an extremely cheap loop for the compiler to perform, and as these types of > optimization only affect the average level of the complexity which means that it > is mainly a way to save electrical power of the CPU (i.e., does not affect the > algorithmic complexity). > > > As I see it, the only reason for not going for 64 bit variables should be a high > cost of the 64 bit additions. Do you know if that is the case? > >

I can basically follow your suggestion, but I think it is a chance to learn something. Let me continue the discussion a bit. I think pipelining is useful if the lines in the loop can be parallized, this wiki page has good explanation. https://en.wikipedia.org/wiki/Software_pipelining For summation, parallization is not possible.

peah-webrtc 2016/05/24 10:46:29 I don't know this as well as I should do, but I me

On 2016/05/24 07:16:12, minyue-webrtc wrote: > On 2016/05/24 06:16:20, peah-webrtc wrote: > > On 2016/05/24 01:32:26, minyue-webrtc wrote: > > > On 2016/05/23 12:00:19, peah-webrtc wrote: > > > > On 2016/05/23 07:40:48, minyue-webrtc wrote: > > > > > On 2016/05/23 04:59:36, peah-webrtc wrote: > > > > > > This solution definitely works, but another variant would have been > just > > > to > > > > > > replace > > > > > > int32_t tmp32 = 0; > > > > > > by > > > > > > int64_t tmp64 = 0; > > > > > > and then do the additions in 64 bit instead. > > > > > > > > > > > > that should probably also be faster than having the if-statement > inside > > > the > > > > > > for-loop. > > > > > > > > > > > > Or is there a reason not to use 64 bit values here? > > > > > > > > > > Yes, I think early exiting the loop is more efficient, too. > > > > > > > > I think that this would be a clear case where a 64 bit value should be > used. > > > It > > > > will lead to cleaner code and I doubt that the early return will save > > > anything, > > > > as this is a very short loop of fixed size, which should be ideal to > > pipeline > > > > (meaning that an early return won't save computations as it anyway breaks > > the > > > > pipeline). > > > > > > > > The only reason I can see for not using a 64 bit value for the sum is if > it > > is > > > > significantly more complex. Do you know if that is the case? > > > > > > Loop in 481-488 is really to check if env[0..9] are generally zeros. In most > > > cases, this should be invalidated at index 0, it is nice that we return > early. > > I > > > do not fully understand the pipeline argument. > > > > What I mean by pipelining is that as the loop is so predictable for the > compiler > > that it should be extremely efficient to perform for the CPU and it utilizes > the > > CPU pipeline in a really good manner. Possibly it could even be done using > SIMD > > instructions. Since the loop limit is fixed, and as low as 10, I expect the > > compiler would probably do loop-unrolling on the loop, e.g, replace the loop > by. > > tmp32 += env[0]; > > tmp32 += env[1]; > > ... > > tmp32 += env[9]; > > Therefore, this loop is extremely cheap and doing an early return is probably > > not worth it, and would potentially even increase the complexity. > > > > The compiler would possibly be able to do good pipelining of the early-return > > variant of the loop but it is at least not as straightforward to do that, and > > the break statement will anyway break the pipeline for the loop. Also, loop > > unrolling would be less nice, e.g., > > if (env[0] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > > goto early_return_label; > > } > > tmp32 += env[0]; > > if (env[1] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > > goto early_return_label; > > } > > tmp32 += env[1]; > > > > ... > > if (env[9] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > > goto early_return_label; > > } > > tmp32 += env[9]; > > early_return_label: > > > > > > (The compiler would probably optimize the loop unrolling above to look very > > different though). > > > > If you expect the test to often be invalidated at index 0, I'd then instead > > change the code to be, > > int64_t tmp64 = env[0]; > > if (tmp64 < kZeroThreshold) { > > for (i = 1; i < 10; ++i) { > > tmp64 = env[i]; > > } > > } > > > > That would provide a really efficient early return, and would give a very nice > > optimize able loop. I don't think it is worth the extra code though, as this > is > > an extremely cheap loop for the compiler to perform, and as these types of > > optimization only affect the average level of the complexity which means that > it > > is mainly a way to save electrical power of the CPU (i.e., does not affect the > > algorithmic complexity). > > > > > > As I see it, the only reason for not going for 64 bit variables should be a > high > > cost of the 64 bit additions. Do you know if that is the case? > > > > > I can basically follow your suggestion, but I think it is a chance to learn > something. Let me continue the discussion a bit. > > I think pipelining is useful if the lines in the loop can be parallized, this > wiki page has good explanation. > > https://en.wikipedia.org/wiki/Software_pipelining > > For summation, parallization is not possible.

I don't know this as well as I should do, but I meant primarily hardware pipelining https://en.wikipedia.org/wiki/Instruction_pipelining For summation, the lines in the loop can definitely be parallellized in hardware (probably in this case via loop-unrolling though. E.g. sum = A+B+C+D can be parallelized as E=A+B F=C+D sum=E+F where E and F can be computed in parallel independently of each other.

minyue-webrtc 2016/05/24 13:05:45 ok, I can do int64 sum.

On 2016/05/24 10:46:29, peah-webrtc wrote: > On 2016/05/24 07:16:12, minyue-webrtc wrote: > > On 2016/05/24 06:16:20, peah-webrtc wrote: > > > On 2016/05/24 01:32:26, minyue-webrtc wrote: > > > > On 2016/05/23 12:00:19, peah-webrtc wrote: > > > > > On 2016/05/23 07:40:48, minyue-webrtc wrote: > > > > > > On 2016/05/23 04:59:36, peah-webrtc wrote: > > > > > > > This solution definitely works, but another variant would have been > > just > > > > to > > > > > > > replace > > > > > > > int32_t tmp32 = 0; > > > > > > > by > > > > > > > int64_t tmp64 = 0; > > > > > > > and then do the additions in 64 bit instead. > > > > > > > > > > > > > > that should probably also be faster than having the if-statement > > inside > > > > the > > > > > > > for-loop. > > > > > > > > > > > > > > Or is there a reason not to use 64 bit values here? > > > > > > > > > > > > Yes, I think early exiting the loop is more efficient, too. > > > > > > > > > > I think that this would be a clear case where a 64 bit value should be > > used. > > > > It > > > > > will lead to cleaner code and I doubt that the early return will save > > > > anything, > > > > > as this is a very short loop of fixed size, which should be ideal to > > > pipeline > > > > > (meaning that an early return won't save computations as it anyway > breaks > > > the > > > > > pipeline). > > > > > > > > > > The only reason I can see for not using a 64 bit value for the sum is if > > it > > > is > > > > > significantly more complex. Do you know if that is the case? > > > > > > > > Loop in 481-488 is really to check if env[0..9] are generally zeros. In > most > > > > cases, this should be invalidated at index 0, it is nice that we return > > early. > > > I > > > > do not fully understand the pipeline argument. > > > > > > What I mean by pipelining is that as the loop is so predictable for the > > compiler > > > that it should be extremely efficient to perform for the CPU and it utilizes > > the > > > CPU pipeline in a really good manner. Possibly it could even be done using > > SIMD > > > instructions. Since the loop limit is fixed, and as low as 10, I expect the > > > compiler would probably do loop-unrolling on the loop, e.g, replace the loop > > by. > > > tmp32 += env[0]; > > > tmp32 += env[1]; > > > ... > > > tmp32 += env[9]; > > > Therefore, this loop is extremely cheap and doing an early return is > probably > > > not worth it, and would potentially even increase the complexity. > > > > > > The compiler would possibly be able to do good pipelining of the > early-return > > > variant of the loop but it is at least not as straightforward to do that, > and > > > the break statement will anyway break the pipeline for the loop. Also, loop > > > unrolling would be less nice, e.g., > > > if (env[0] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > > > goto early_return_label; > > > } > > > tmp32 += env[0]; > > > if (env[1] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > > > goto early_return_label; > > > } > > > tmp32 += env[1]; > > > > > > ... > > > if (env[9] >= kZeroThreshold || tmp32 >= kZeroThreshold) { > > > goto early_return_label; > > > } > > > tmp32 += env[9]; > > > early_return_label: > > > > > > > > > (The compiler would probably optimize the loop unrolling above to look very > > > different though). > > > > > > If you expect the test to often be invalidated at index 0, I'd then instead > > > change the code to be, > > > int64_t tmp64 = env[0]; > > > if (tmp64 < kZeroThreshold) { > > > for (i = 1; i < 10; ++i) { > > > tmp64 = env[i]; > > > } > > > } > > > > > > That would provide a really efficient early return, and would give a very > nice > > > optimize able loop. I don't think it is worth the extra code though, as this > > is > > > an extremely cheap loop for the compiler to perform, and as these types of > > > optimization only affect the average level of the complexity which means > that > > it > > > is mainly a way to save electrical power of the CPU (i.e., does not affect > the > > > algorithmic complexity). > > > > > > > > > As I see it, the only reason for not going for 64 bit variables should be a > > high > > > cost of the 64 bit additions. Do you know if that is the case? > > > > > > > > I can basically follow your suggestion, but I think it is a chance to learn > > something. Let me continue the discussion a bit. > > > > I think pipelining is useful if the lines in the loop can be parallized, this > > wiki page has good explanation. > > > > https://en.wikipedia.org/wiki/Software_pipelining > > > > For summation, parallization is not possible. > > I don't know this as well as I should do, but I meant primarily hardware > pipelining https://en.wikipedia.org/wiki/Instruction_pipelining > > For summation, the lines in the loop can definitely be parallellized in hardware > (probably in this case via loop-unrolling though. > > E.g. > sum = A+B+C+D can be parallelized as > E=A+B > F=C+D > sum=E+F > where E and F can be computed in parallel independently of each other. > >

ok, I can do int64 sum.

/* Is the input signal zero? */

for (i = 0; i < 10; i++) {

+ if (env[i] >= kZeroThreshold || tmp32 >= kZeroThreshold) {

+ break;

+ }

tmp32 += env[i];

}

/* Each block is allowed to have a few non-zero

* samples.

- if (tmp32 < 500) {

+ if (i == 10 && tmp32 < kZeroThreshold) {

stt->msZero += 10;

} else {

stt->msZero = 0;