Issue 1257833004: Prevent TimeUntilNextProcess log spam.

pbos-webrtc

pbos@webrtc.org changed reviewers: - tommi@webrtc.org

5 years, 4 months ago (2015-07-29 08:23:15 UTC) #1

stefan-webrtc

stefan@webrtc.org changed reviewers: + stefan@webrtc.org

5 years, 4 months ago (2015-07-29 08:40:15 UTC) #3

juberti1

juberti@webrtc.org changed reviewers: + juberti@webrtc.org

5 years, 4 months ago (2015-07-29 18:19:55 UTC) #5

juberti1

the comment seems to indicate that this is in fact an error, so not sure ...

5 years, 4 months ago (2015-07-29 18:19:55 UTC) #6

stefan-webrtc

On 2015/07/29 18:19:55, juberti1 wrote: > the comment seems to indicate that this is in ...

5 years, 4 months ago (2015-07-30 08:58:19 UTC) #7

pbos-webrtc

On 2015/07/30 08:58:19, stefan-webrtc (holmer) wrote: > On 2015/07/29 18:19:55, juberti1 wrote: > > the ...

5 years, 4 months ago (2015-07-30 09:15:40 UTC) #8

stefan-webrtc

On 2015/07/30 09:15:40, pbos-webrtc wrote: > On 2015/07/30 08:58:19, stefan-webrtc (holmer) wrote: > > On ...

5 years, 4 months ago (2015-07-30 09:29:02 UTC) #9

tommi

On 2015/07/30 09:29:02, stefan-webrtc (holmer) wrote: > On 2015/07/30 09:15:40, pbos-webrtc wrote: > > On ...

5 years, 4 months ago (2015-08-04 11:38:36 UTC) #10

On 2015/07/30 09:29:02, stefan-webrtc (holmer) wrote:
> On 2015/07/30 09:15:40, pbos-webrtc wrote:
> > On 2015/07/30 08:58:19, stefan-webrtc (holmer) wrote:
> > > On 2015/07/29 18:19:55, juberti1 wrote:
> > > > the comment seems to indicate that this is in fact an error, so not sure
> we
> > > > should stop logging completely, but I was hoping we would only log at
most
> N
> > > > times (using a member var to keep track of how often we have logged).
> > > > 
> > > > For now, perhaps N=10, or some such
> > > 
> > > Our thinking is that it isn't really an error that the scheduler is
falling
> > > behind. There could be some heavy task taking time, which results in other
> > tasks
> > > being delayed. We could log it as a warning, but I suspect that this is
> likely
> > > to happen every now and then in most calls.
> > > 
> > > I can however see that it could be useful to know if we frequently have
very
> > > delayed tasks. Maybe we could have a threshold and log every Nth time a
task
> > is
> > > delayed for more than, say, 20 ms?
> > 
> > I also don't think this is an error, a -1 here should not be an error code
as
> > far as I can tell. A common implementation of this could be:
> > 
> > Process(): {
> >   ...
> >   next_run_ms_ = Clock::NowMs() + 20;
> > }
> > 
> > TimeUntilNextProcess() {
> >   return next_run_ms_ - Clock::NowMs();
> > }
> > 
> > I don't think it's useful to enforce clamping by the module in this case,
but
> if
> > the process module falls behind this is something we'll log.
> 
> I agree. We'd prefer to let the process module know that tasks are falling
> behind than to leave it to each module to clamp the time.

There are implementations that are (or at least were) returning errors, and not
"back in time" values.  I suppose it may be a mix.
There's no point in telling the scheduler though that tasks are falling behind
and there's nothing the scheduler can do about it, so I don't think these are
right.  The main concern however was around the places that were returning error
values and not time values and that's why the log exists.

stefan-webrtc

On 2015/08/04 11:38:36, tommi (webrtc) wrote: > On 2015/07/30 09:29:02, stefan-webrtc (holmer) wrote: > > ...

5 years, 4 months ago (2015-08-04 14:43:04 UTC) #11

On 2015/08/04 11:38:36, tommi (webrtc) wrote:
> On 2015/07/30 09:29:02, stefan-webrtc (holmer) wrote:
> > On 2015/07/30 09:15:40, pbos-webrtc wrote:
> > > On 2015/07/30 08:58:19, stefan-webrtc (holmer) wrote:
> > > > On 2015/07/29 18:19:55, juberti1 wrote:
> > > > > the comment seems to indicate that this is in fact an error, so not
sure
> > we
> > > > > should stop logging completely, but I was hoping we would only log at
> most
> > N
> > > > > times (using a member var to keep track of how often we have logged).
> > > > > 
> > > > > For now, perhaps N=10, or some such
> > > > 
> > > > Our thinking is that it isn't really an error that the scheduler is
> falling
> > > > behind. There could be some heavy task taking time, which results in
other
> > > tasks
> > > > being delayed. We could log it as a warning, but I suspect that this is
> > likely
> > > > to happen every now and then in most calls.
> > > > 
> > > > I can however see that it could be useful to know if we frequently have
> very
> > > > delayed tasks. Maybe we could have a threshold and log every Nth time a
> task
> > > is
> > > > delayed for more than, say, 20 ms?
> > > 
> > > I also don't think this is an error, a -1 here should not be an error code
> as
> > > far as I can tell. A common implementation of this could be:
> > > 
> > > Process(): {
> > >   ...
> > >   next_run_ms_ = Clock::NowMs() + 20;
> > > }
> > > 
> > > TimeUntilNextProcess() {
> > >   return next_run_ms_ - Clock::NowMs();
> > > }
> > > 
> > > I don't think it's useful to enforce clamping by the module in this case,
> but
> > if
> > > the process module falls behind this is something we'll log.
> > 
> > I agree. We'd prefer to let the process module know that tasks are falling
> > behind than to leave it to each module to clamp the time.
> 
> There are implementations that are (or at least were) returning errors, and
not
> "back in time" values.  I suppose it may be a mix.
> There's no point in telling the scheduler though that tasks are falling behind
> and there's nothing the scheduler can do about it, so I don't think these are
> right.  The main concern however was around the places that were returning
error
> values and not time values and that's why the log exists.

I see, thanks for that input.

The main reason we would like to accept negative values is to avoid having code
for clamping to 0 in each module separately. Were the modules returning errors
fixed, or are they still around?

juberti1

On 2015/08/04 14:43:04, stefan-webrtc (holmer) wrote: > On 2015/08/04 11:38:36, tommi (webrtc) wrote: > > ...

5 years, 4 months ago (2015-08-13 20:49:32 UTC) #12

On 2015/08/04 14:43:04, stefan-webrtc (holmer) wrote:
> On 2015/08/04 11:38:36, tommi (webrtc) wrote:
> > On 2015/07/30 09:29:02, stefan-webrtc (holmer) wrote:
> > > On 2015/07/30 09:15:40, pbos-webrtc wrote:
> > > > On 2015/07/30 08:58:19, stefan-webrtc (holmer) wrote:
> > > > > On 2015/07/29 18:19:55, juberti1 wrote:
> > > > > > the comment seems to indicate that this is in fact an error, so not
> sure
> > > we
> > > > > > should stop logging completely, but I was hoping we would only log
at
> > most
> > > N
> > > > > > times (using a member var to keep track of how often we have
logged).
> > > > > > 
> > > > > > For now, perhaps N=10, or some such
> > > > > 
> > > > > Our thinking is that it isn't really an error that the scheduler is
> > falling
> > > > > behind. There could be some heavy task taking time, which results in
> other
> > > > tasks
> > > > > being delayed. We could log it as a warning, but I suspect that this
is
> > > likely
> > > > > to happen every now and then in most calls.
> > > > > 
> > > > > I can however see that it could be useful to know if we frequently
have
> > very
> > > > > delayed tasks. Maybe we could have a threshold and log every Nth time
a
> > task
> > > > is
> > > > > delayed for more than, say, 20 ms?
> > > > 
> > > > I also don't think this is an error, a -1 here should not be an error
code
> > as
> > > > far as I can tell. A common implementation of this could be:
> > > > 
> > > > Process(): {
> > > >   ...
> > > >   next_run_ms_ = Clock::NowMs() + 20;
> > > > }
> > > > 
> > > > TimeUntilNextProcess() {
> > > >   return next_run_ms_ - Clock::NowMs();
> > > > }
> > > > 
> > > > I don't think it's useful to enforce clamping by the module in this
case,
> > but
> > > if
> > > > the process module falls behind this is something we'll log.
> > > 
> > > I agree. We'd prefer to let the process module know that tasks are falling
> > > behind than to leave it to each module to clamp the time.
> > 
> > There are implementations that are (or at least were) returning errors, and
> not
> > "back in time" values.  I suppose it may be a mix.
> > There's no point in telling the scheduler though that tasks are falling
behind
> > and there's nothing the scheduler can do about it, so I don't think these
are
> > right.  The main concern however was around the places that were returning
> error
> > values and not time values and that's why the log exists.
> 
> I see, thanks for that input.
> 
> The main reason we would like to accept negative values is to avoid having
code
> for clamping to 0 in each module separately. Were the modules returning errors
> fixed, or are they still around?

Any updates on this? Agree we shouldn't log on falling behind, but would be good
to report when an error is (incorrectly) returned. Is there any way to do this?

pbos-webrtc

On 2015/08/13 20:49:32, juberti1 wrote: > On 2015/08/04 14:43:04, stefan-webrtc (holmer) wrote: > > On ...

5 years, 4 months ago (2015-08-19 00:30:02 UTC) #13

On 2015/08/13 20:49:32, juberti1 wrote:
> On 2015/08/04 14:43:04, stefan-webrtc (holmer) wrote:
> > On 2015/08/04 11:38:36, tommi (webrtc) wrote:
> > > On 2015/07/30 09:29:02, stefan-webrtc (holmer) wrote:
> > > > On 2015/07/30 09:15:40, pbos-webrtc wrote:
> > > > > On 2015/07/30 08:58:19, stefan-webrtc (holmer) wrote:
> > > > > > On 2015/07/29 18:19:55, juberti1 wrote:
> > > > > > > the comment seems to indicate that this is in fact an error, so
not
> > sure
> > > > we
> > > > > > > should stop logging completely, but I was hoping we would only log
> at
> > > most
> > > > N
> > > > > > > times (using a member var to keep track of how often we have
> logged).
> > > > > > > 
> > > > > > > For now, perhaps N=10, or some such
> > > > > > 
> > > > > > Our thinking is that it isn't really an error that the scheduler is
> > > falling
> > > > > > behind. There could be some heavy task taking time, which results in
> > other
> > > > > tasks
> > > > > > being delayed. We could log it as a warning, but I suspect that this
> is
> > > > likely
> > > > > > to happen every now and then in most calls.
> > > > > > 
> > > > > > I can however see that it could be useful to know if we frequently
> have
> > > very
> > > > > > delayed tasks. Maybe we could have a threshold and log every Nth
time
> a
> > > task
> > > > > is
> > > > > > delayed for more than, say, 20 ms?
> > > > > 
> > > > > I also don't think this is an error, a -1 here should not be an error
> code
> > > as
> > > > > far as I can tell. A common implementation of this could be:
> > > > > 
> > > > > Process(): {
> > > > >   ...
> > > > >   next_run_ms_ = Clock::NowMs() + 20;
> > > > > }
> > > > > 
> > > > > TimeUntilNextProcess() {
> > > > >   return next_run_ms_ - Clock::NowMs();
> > > > > }
> > > > > 
> > > > > I don't think it's useful to enforce clamping by the module in this
> case,
> > > but
> > > > if
> > > > > the process module falls behind this is something we'll log.
> > > > 
> > > > I agree. We'd prefer to let the process module know that tasks are
falling
> > > > behind than to leave it to each module to clamp the time.
> > > 
> > > There are implementations that are (or at least were) returning errors,
and
> > not
> > > "back in time" values.  I suppose it may be a mix.
> > > There's no point in telling the scheduler though that tasks are falling
> behind
> > > and there's nothing the scheduler can do about it, so I don't think these
> are
> > > right.  The main concern however was around the places that were returning
> > error
> > > values and not time values and that's why the log exists.
> > 
> > I see, thanks for that input.
> > 
> > The main reason we would like to accept negative values is to avoid having
> code
> > for clamping to 0 in each module separately. Were the modules returning
errors
> > fixed, or are they still around?
> 
> Any updates on this? Agree we shouldn't log on falling behind, but would be
good
> to report when an error is (incorrectly) returned. Is there any way to do
this?

I don't believe they were ever errors really. Stefan do you think that's the
case?

This means that all values are valid and as such we can't detect any of them as
erroneous (-1 which might have been used if that's the case is valid use). I'm
still on vacation this week so if you want it in like this then CQ it as my
email checking is very sparse.

tommi (sloooow) - chröme

tommi@chromium.org changed reviewers: + tommi@chromium.org

5 years, 4 months ago (2015-08-19 12:50:15 UTC) #14

tommi (sloooow) - chröme

The CQ bit was checked by tommi@chromium.org

5 years, 4 months ago (2015-08-19 12:50:15 UTC) #15

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1257833004/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1257833004/1

5 years, 4 months ago (2015-08-19 12:50:25 UTC) #17

tommi (sloooow) - chröme

On 2015/08/19 12:50:16, tommi wrote: > lgtm I figured we should just get this in ...

5 years, 4 months ago (2015-08-19 12:51:00 UTC) #18

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

5 years, 4 months ago (2015-08-19 12:52:01 UTC) #19

commit-bot: I haz the power

Try jobs failed on following builders: presubmit on tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/presubmit/builds/405)

5 years, 4 months ago (2015-08-19 12:52:02 UTC) #20

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1257833004/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1257833004/1

5 years, 4 months ago (2015-08-19 12:59:47 UTC) #24

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

5 years, 4 months ago (2015-08-19 14:50:35 UTC) #25

commit-bot: I haz the power

Try jobs failed on following builders: android on tryserver.webrtc (JOB_TIMED_OUT, no build URL)

5 years, 4 months ago (2015-08-19 14:50:36 UTC) #26

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1257833004/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1257833004/1

5 years, 4 months ago (2015-08-19 15:09:55 UTC) #28

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

5 years, 4 months ago (2015-08-19 17:10:10 UTC) #29

commit-bot: I haz the power

Try jobs failed on following builders: android on tryserver.webrtc (JOB_TIMED_OUT, no build URL)

5 years, 4 months ago (2015-08-19 17:10:10 UTC) #30

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1257833004/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1257833004/1

5 years, 4 months ago (2015-08-20 06:12:27 UTC) #33

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

5 years, 4 months ago (2015-08-20 08:12:34 UTC) #34

commit-bot: I haz the power

Try jobs failed on following builders: android on tryserver.webrtc (JOB_TIMED_OUT, no build URL)

5 years, 4 months ago (2015-08-20 08:12:35 UTC) #35

tommi (sloooow) - chröme

something is up with the cq. bringing out hammer.

5 years, 4 months ago (2015-08-20 08:22:45 UTC) #36

tommi (sloooow) - chröme

The CQ bit was checked by tommi@chromium.org

5 years, 4 months ago (2015-08-20 08:22:50 UTC) #37

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1257833004/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1257833004/1

5 years, 4 months ago (2015-08-20 08:23:04 UTC) #38

commit-bot: I haz the power

5 years, 4 months ago (2015-08-20 08:24:02 UTC) #40

Message was sent while issue was closed.

Patchset 1 (id:??) landed as
https://crrev.com/9e260f184b2ce0d4316267bf5464795dfd8ac883
Cr-Commit-Position: refs/heads/master@{#9740}

Issue 1257833004: Prevent TimeUntilNextProcess log spam. (Closed)

Description

Patch Set 1 #

Messages