Created: 3 years, 7 months ago by Taylor Brandstetter
Modified: 3 years, 4 months ago
Reviewers: kwiberg-webrtc
CC: webrtc-reviews_webrtc.org, tterriberry_mozilla.com, kwiberg-webrtc
Target Ref: refs/heads/master
Project: webrtc
Visibility: Public.
Description

Fixing race between ~AsyncInvoker and ~AsyncClosure, using ref-counting.
The AsyncInvoker destructor waits for all invoked tasks to be complete
(in other words, for all AsyncClosures to be destructed). The previous
implementation used an event to wake up the destructor, but a race made
it possible for this event to be dereferenced after it was destroyed.
This CL makes the event reference counted, such that if the destructor
runs right after AsyncClosure decrements "pending_invocations_",
setting the event will be a no-op, and the event will be destructed
in the AsyncClosure destructor.
This CL also fixes a deadlock that may occur for "re-entrant"
invocations. The deadlock occurs if the AsyncInvoker is destroyed on
thread A while a task on thread B is running, which AsyncInvokes a task
back on thread A.
This was causing pending_invocations_ to end up negative, because
an AsyncClosure that's never added to a thread's message queue (due to
the "destroying_" flag) caused the count to be decremented but not
incremented.
BUG=webrtc:7656
Review-Url: https://codereview.webrtc.org/2885143005
Cr-Commit-Position: refs/heads/master@{#19278}
Committed: https://chromium.googlesource.com/external/webrtc/+/3af63b0dc9d8444ca5480dfcccd1f87fa05eb232
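The ownership pattern the CL describes can be illustrated with a small, self-contained sketch. Everything below is hypothetical stand-in code, not the actual WebRTC implementation: std::shared_ptr plays the role of scoped_refptr<RefCountedObject<Event>>, and a tiny condition-variable Event plays the role of rtc::Event. The point is that the closure and the waiting destructor co-own the completion event, so signaling it after one side has finished can never touch freed memory.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Minimal stand-in for rtc::Event: one-shot, waitable.
class Event {
 public:
  void Set() {
    std::lock_guard<std::mutex> lock(mutex_);
    signaled_ = true;
    cv_.notify_all();
  }
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return signaled_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool signaled_ = false;
};

// Stand-in for AsyncClosure: shares ownership of the completion event,
// so its destructor can always safely signal it. The event is freed by
// whichever side drops the last reference.
class Closure {
 public:
  explicit Closure(std::shared_ptr<Event> done) : done_(std::move(done)) {}
  ~Closure() { done_->Set(); }  // Safe: we co-own the event.

 private:
  std::shared_ptr<Event> done_;
};

// One invocation: a worker runs (and destroys) the closure while the
// "destructor" side blocks until the closure is gone.
bool RunOneInvocation() {
  auto done = std::make_shared<Event>();
  std::thread worker([closure = std::make_unique<Closure>(done)]() mutable {
    closure.reset();  // Task finishes; ~Closure signals completion.
  });
  done->Wait();  // Invoker-destructor side waits for the task.
  worker.join();
  return true;
}
```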
Patch Set 1
Patch Set 2: Rebase on revision that still had CancelInvoke, which is a test we need.
Patch Set 3: .
Patch Set 4: Remove extra increments of pending_invocations_
Patch Set 5: Merging in https://codereview.webrtc.org/2885143006/
Patch Set 6: Only clear current thread's message queue in destructor loop
Total comments: 13
Patch Set 7: Adding comments and switching to std::atomic.
Total comments: 15
Patch Set 8: Adding comments and addressing nit.
Patch Set 9: Add another TODO.
Messages
Total messages: 18 (8 generated)
deadbeef@webrtc.org changed reviewers: + kwiberg@webrtc.org
kwiberg: I don't know how familiar you are with AsyncInvoker, but would you be able to review this? It fixes a potential crash and a potential deadlock (both of which people have reported encountering at least once), at the cost of an additional AtomicOps::Decrement (in RefCountedObject). There probably are ways to accomplish this with fewer locks/atomic operations, but I'm having a hard time coming up with anything elegant.
lgtm, but see comments

I'm not an AsyncInvoker expert by any means, but it certainly looks like this change does what you claim it does.

File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:33: Thread::Current()->Clear(this);
What does the change on this line do?

webrtc/rtc_base/asyncinvoker.cc:49: return;
A comment explaining why this is a no-op if we're being destroyed?

webrtc/rtc_base/asyncinvoker.cc:74: // to use it.
Doesn't that mean that the warning message is flaky? If we don't know if a situation is an error or not, should we warn about it?

File webrtc/rtc_base/asyncinvoker.h (right):

webrtc/rtc_base/asyncinvoker.h:129: scoped_refptr<RefCountedObject<Event>> invocation_complete_;
Umm... isn't this exactly how scoped_refptr isn't intended to be used? The ref counting is supposed to be an inalienable part of the class, so that it's impossible to create non-refcounted instances. But that isn't the case with Event.

Adding reference counting to a class that lacks native support for it is precisely the intended job of std::shared_ptr, on the other hand.

I'm thinking we should consider deviating from Chromium's ban on shared_ptr for cases where non-intrusive reference counting is what we want; I'll write something up. In the mean time, I guess using this construction in this CL is OK. Having to create a RefCountedEventContainer class would be painful.

webrtc/rtc_base/asyncinvoker.h:133: // terminates.
That's why you need it. Could you also explain what the value means? Is it a count of the number of currently running destructors, or what?

webrtc/rtc_base/asyncinvoker.h:134: int destroying_ = 0;
Ooohhh... when reading the .cc file, I found that you use atomic operations on this member. That isn't apparent from its type! Consider using std::atomic<int> instead. (Side note: We should probably remove our home-grown atomics and use the std:: atomics everywhere.)

Hmm. No, it should be std::atomic<bool>, right? It isn't a counter, but with our homegrown atomics, int is the only available type.
Switched to std::atomic, take another look.

File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:33: Thread::Current()->Clear(this);
On 2017/08/04 09:28:09, kwiberg-webrtc wrote:
> What does the change on this line do?

It only clears events on the current thread, which is all that's necessary (see comment above). This is something nisse@ pointed out in a previous review: https://codereview.webrtc.org/2876273002/diff/1/webrtc/base/asyncinvoker.cc#n...

webrtc/rtc_base/asyncinvoker.cc:49: return;
On 2017/08/04 09:28:09, kwiberg-webrtc wrote:
> A comment explaining why this is a no-op if we're being destroyed?

I don't know exactly. I went back to the CL where this was added, in 2014, but didn't find a clear explanation. Though it makes sense that if you're destroying, you don't want to execute even more tasks. The goal is to finish all tasks in progress, not start executing new ones. I'll add a comment.

webrtc/rtc_base/asyncinvoker.cc:74: // to use it.
On 2017/08/04 09:28:09, kwiberg-webrtc wrote:
> Doesn't that mean that the warning message is flaky? If we don't know if a situation is an error or not, should we warn about it?

My thinking was "I'd rather have a flaky warning that directs people to this comment than have no warning at all." Our guidelines say "warning" log levels should be used "for undesired but relatively expected events, which *may* indicate a problem." Though, are there any chromium-specific log level guidelines? I couldn't find any.

File webrtc/rtc_base/asyncinvoker.h (right):

webrtc/rtc_base/asyncinvoker.h:129: scoped_refptr<RefCountedObject<Event>> invocation_complete_;
On 2017/08/04 09:28:09, kwiberg-webrtc wrote:
> Umm... isn't this exactly how scoped_refptr isn't intended to be used? The ref counting is supposed to be an inalienable part of the class, so that it's impossible to create non-refcounted instances. But that isn't the case with Event.
>
> Adding reference counting to a class that lacks native support for it is precisely the intended job of std::shared_ptr, on the other hand.
>
> I'm thinking we should consider deviating from Chromium's ban on shared_ptr for cases where non-intrusive reference counting is what we want; I'll write something up. In the mean time, I guess using this construction in this CL is OK. Having to create a RefCountedEventContainer class would be painful.

Sorry, I still don't understand what the problem is. I'd use shared_ptr if I could, but I don't see what practical difference it makes whether Event inherits from RefCountInterface or not. Can you explain more/give an example?

webrtc/rtc_base/asyncinvoker.h:134: int destroying_ = 0;
On 2017/08/04 09:28:09, kwiberg-webrtc wrote:
> Ooohhh... when reading the .cc file, I found that you use atomic operations on this member. That isn't apparent from its type! Consider using std::atomic<int> instead. (Side note: We should probably remove our home-grown atomics and use the std:: atomics everywhere.)
>
> Hmm. No, it should be std::atomic<bool>, right? It isn't a counter, but with our homegrown atomics, int is the only available type.

Oh, apparently <atomic> is a feature chromium allows now! I'll switch to that.
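The switch discussed here, from an int driven through home-grown AtomicOps to a std::atomic<bool>, can be shown in isolation. This is a hypothetical sketch with made-up names, not the CL's code; its point is that the concurrent access is now visible at the declaration site, and relaxed ordering is enough when the flag only needs to propagate eventually (as argued later in the thread).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch: the destroying_ member as std::atomic<bool>.
// Relaxed ordering is used because the flag is not synchronizing any
// other data; it only has to become visible to other threads eventually.
class InvokerFlag {
 public:
  void MarkDestroying() {
    destroying_.store(true, std::memory_order_relaxed);
  }
  bool IsDestroying() const {
    return destroying_.load(std::memory_order_relaxed);
  }

 private:
  std::atomic<bool> destroying_{false};
};
```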
Lots of subtleties with std::atomic... :-) I'm by no means an expert, so I tried to flag everything I thought *might* be wrong, and provide a rationale; it's quite possible that all you need to do is explain why I'm wrong in each case.

File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:74: // to use it.
On 2017/08/04 19:14:22, Taylor Brandstetter wrote:
> My thinking was "I'd rather have a flaky warning that directs people to this comment than have no warning at all." Our guidelines say "warning" log levels should be used "for undesired but relatively expected events, which *may* indicate a problem."

Hmm, OK.

> Though, are there any chromium-specific log level guidelines? I couldn't find any.

Not that I'm aware of.

File webrtc/rtc_base/asyncinvoker.h (right):

webrtc/rtc_base/asyncinvoker.h:129: scoped_refptr<RefCountedObject<Event>> invocation_complete_;
On 2017/08/04 19:14:22, Taylor Brandstetter wrote:
> Sorry, I still don't understand what the problem is. I'd use shared_ptr if I could, but I don't see what practical difference it makes whether Event inherits from RefCountInterface or not. Can you explain more/give an example?

No, it's just me who's confused. The requirement on a refcounted type is that it must only be owned by folks who use its reference count correctly (which almost inevitably implies it must be on the heap). Event isn't such a type, but RefCountedObject<Event> is, and the fact that RefCountedObject<Event> is a subclass of Event doesn't matter. However,

1. The reference counter methods are virtual for no good reason. (This could be fixed by making a template like RefCountedObject that has non-virtual reference counter methods.)
2. The contained class is required to be non-final. (Not a problem in this case, because Event is non-final. Could be fixed by making a refcounted template that has the given class as a member instead of as a subclass.)
3. The contained class is required to be concrete. (Not a problem in this case, because Event is concrete. Could be fixed by having a unique_ptr<T> instead of a plain T as member.)

...at which point we've essentially reinvented a primitive form of std::shared_ptr. So keep doing what you're doing for now (it *is* safe), and we'll see if we can get a consensus on allowing std::shared_ptr in cases like this where it would obviously be a good fit.

File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:24: destroying_.store(true, std::memory_order_relaxed);
Doesn't this need to be std::memory_order_release, so that no reads or writes in the current thread can be reordered after this store?

webrtc/rtc_base/asyncinvoker.cc:30: while (pending_invocations_.load(std::memory_order_acquire)) {
For clarity, avoid implicit int -> bool conversion:

  while (pending_invocations_.load(std::memory_order_acquire) > 0)

webrtc/rtc_base/asyncinvoker.cc:52: if (destroying_.load(std::memory_order_relaxed))
std::memory_order_acquire, so that no reads or writes in the current thread can be reordered before this load?

webrtc/rtc_base/asyncinvoker.cc:74: if (destroying_.load(std::memory_order_relaxed)) {
Again, std::memory_order_acquire?

webrtc/rtc_base/asyncinvoker.cc:91: if (destroying_.load(std::memory_order_relaxed)) {
And again...

webrtc/rtc_base/asyncinvoker.cc:125: invoker_->pending_invocations_.fetch_add(1, std::memory_order_relaxed);
std::memory_order_acquire? (Because unlike a reference count, where you never increment from 0 to 1, pending_invocations_ can be incremented from 0 to 1 by this operation, which makes it unsafe to reorder reads and writes from this thread before this increment.)
File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:24: destroying_.store(true, std::memory_order_relaxed);
On 2017/08/06 04:17:12, kwiberg-webrtc wrote:
> Doesn't this need to be std::memory_order_release, so that no reads or writes in the current thread can be reordered after this store?

I don't think so. "destroying_" doesn't need to be synchronized with anything, it just needs to propagate *eventually* to every thread, to potentially break an endless loop of AsyncInvokes. See the comment in the header:

  // This flag is used to ensure that if an application AsyncInvokes tasks that
  // recursively AsyncInvoke other tasks ad infinitum, the cycle eventually
  // terminates.

For example, suppose I own an AsyncInvoker, but I need to do some work on a different thread, so I do:

  invoker_.AsyncInvoke(RTC_FROM_HERE, worker_thread, rtc::Bind(this, Foo::DoWork));

And DoWork does this:

  void Foo::DoWork() {
    work_item = work_queue_.pop();
    DoWorkInternal(work_item);
    if (!work_queue_.empty()) {
      invoker_.AsyncInvoke(RTC_FROM_HERE, worker_thread, rtc::Bind(this, Foo::DoWork));
    }
  }

If Foo tries to destroy the invoker on the "main" thread, then without the "destroying" flag, it would be blocked until the work queue is empty. I'm not saying this is a great pattern in the first place, but it's one AsyncInvoker supports.

webrtc/rtc_base/asyncinvoker.cc:30: while (pending_invocations_.load(std::memory_order_acquire)) {
On 2017/08/06 04:17:12, kwiberg-webrtc wrote:
> For clarity, avoid implicit int -> bool conversion:
>
>   while (pending_invocations_.load(std::memory_order_acquire) > 0)

Done.

webrtc/rtc_base/asyncinvoker.cc:52: if (destroying_.load(std::memory_order_relaxed))
On 2017/08/06 04:17:12, kwiberg-webrtc wrote:
> std::memory_order_acquire, so that no reads or writes in the current thread can be reordered before this load?

Not needed, see above. Same for below comments around destroying_.

webrtc/rtc_base/asyncinvoker.cc:125: invoker_->pending_invocations_.fetch_add(1, std::memory_order_relaxed);
On 2017/08/06 04:17:12, kwiberg-webrtc wrote:
> std::memory_order_acquire? (Because unlike a reference count, where you never increment from 0 to 1, pending_invocations_ can be incremented from 0 to 1 by this operation, which makes it unsafe to reorder reads and writes from this thread before this increment.)

This would only be an issue if one thread is incrementing the count from 0->1 while another thread is destroying the AsyncInvoker. But that means one thread is calling "AsyncInvoke" while another thread is destroying the invoker, which is inherently unsafe.

The only time when a "safe" usage of this class could result in AsyncInvoke being called during destruction is when the AsyncInvoke is happening from *within* an AsyncInvoked functor (the so-called "re-entrant"/"recursive" invocations). But in that case, the outer invocation is holding a reference, so it could only be incremented from 1->2. And it's not possible for the decrement of pending_invocations_ to be reordered relative to the increment, so we don't have to worry about, say, it being decremented from 1->0 before being incremented from 1->2.

I added some more comments to the header to make it more clear what things are "safe" and how the class should be properly used.
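The counter protocol being debated above can be sketched in isolation. This is our illustration, not the CL's code: the names are invented, and it assumes the completion-side decrement carries release semantics (acq_rel here) so that the destructor's acquire load synchronizes with the last completed task; only the increment uses relaxed ordering, per Taylor's argument that a safe caller can never race a 0->1 increment against destruction.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch of the pending_invocations_ protocol.
class PendingCounter {
 public:
  // Called before a closure is posted. Relaxed is enough: in any safe
  // usage a 0->1 increment cannot race with destruction, and re-entrant
  // invocations can only go 1->2 while the outer task holds a count.
  void OnClosureCreated() {
    pending_.fetch_add(1, std::memory_order_relaxed);
  }

  // Called when a closure completes. Returns true if this was the last
  // pending invocation. The release half of acq_rel publishes the
  // task's writes to whoever observes the count reaching zero.
  bool OnClosureDestroyed() {
    return pending_.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

  // The destructor loops on this until it returns false. The acquire
  // load pairs with the release decrement above. Explicit "> 0"
  // comparison avoids the implicit int -> bool conversion flagged in
  // the review.
  bool HasPending() const {
    return pending_.load(std::memory_order_acquire) > 0;
  }

 private:
  std::atomic<int> pending_{0};
};
```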
lgtm

File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:24: destroying_.store(true, std::memory_order_relaxed);
On 2017/08/07 20:28:12, Taylor Brandstetter wrote:
> I don't think so. "destroying_" doesn't need to be synchronized with anything, it just needs to propagate *eventually* to every thread, to potentially break an endless loop of AsyncInvokes. See the comment in the header:
>
>   // This flag is used to ensure that if an application AsyncInvokes tasks that
>   // recursively AsyncInvoke other tasks ad infinitum, the cycle eventually
>   // terminates.
>
> For example, suppose I own an AsyncInvoker, but I need to do some work on a different thread, so I do:
>
>   invoker_.AsyncInvoke(RTC_FROM_HERE, worker_thread, rtc::Bind(this, Foo::DoWork));
>
> And DoWork does this:
>
>   void Foo::DoWork() {
>     work_item = work_queue_.pop();
>     DoWorkInternal(work_item);
>     if (!work_queue_.empty()) {
>       invoker_.AsyncInvoke(RTC_FROM_HERE, worker_thread, rtc::Bind(this, Foo::DoWork));
>     }
>   }
>
> If Foo tries to destroy the invoker on the "main" thread, then without the "destroying" flag, it would be blocked until the work queue is empty.
>
> I'm not saying this is a great pattern in the first place, but it's one AsyncInvoker supports.

Ah, OK. Yeah, that seems good.

webrtc/rtc_base/asyncinvoker.cc:52: if (destroying_.load(std::memory_order_relaxed))
On 2017/08/07 20:28:12, Taylor Brandstetter wrote:
> Not needed, see above. Same for below comments around destroying_.

Acknowledged.

webrtc/rtc_base/asyncinvoker.cc:125: invoker_->pending_invocations_.fetch_add(1, std::memory_order_relaxed);
On 2017/08/07 20:28:12, Taylor Brandstetter wrote:
> This would only be an issue if one thread is incrementing the count from 0->1 while another thread is destroying the AsyncInvoker. But that means one thread is calling "AsyncInvoke" while another thread is destroying the invoker, which is inherently unsafe.
>
> The only time when a "safe" usage of this class could result in AsyncInvoke being called during destruction is when the AsyncInvoke is happening from *within* an AsyncInvoked functor (the so-called "re-entrant"/"recursive" invocations). But in that case, the outer invocation is holding a reference, so it could only be incremented from 1->2. And it's not possible for the decrement of pending_invocations_ to be reordered relative to the increment, so we don't have to worry about, say, it being decremented from 1->0 before being incremented from 1->2.
>
> I added some more comments to the header to make it more clear what things are "safe" and how the class should be properly used.

Acknowledged. Ach, this is subtle...
I have a strong feeling that the right thing to do from a code maintenance perspective is to not have naked atomic operations mixed in with a heap of other nontrivial code like this, but to package them up in a clearly documented class. Compare with the current best practice of handling heap object ownership in C++: always use a dedicated smart pointer class, except when implementing smart pointers. Not because raw new and delete are impossible to use correctly, but because they have sharp edges that we'd rather not deal with at the same time we're dealing with other nontrivial things.
File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:125: invoker_->pending_invocations_.fetch_add(1, std::memory_order_relaxed);
On 2017/08/08 22:22:39, kwiberg-webrtc wrote:
> On 2017/08/07 20:28:12, Taylor Brandstetter wrote:
> > On 2017/08/06 04:17:12, kwiberg-webrtc wrote:
> > > std::memory_order_acquire? (Because unlike a reference count, where you never increment from 0 to 1, pending_invocations_ can be incremented from 0 to 1 by this operation, which makes it unsafe to reorder reads and writes from this thread before this increment.)
> >
> > This would only be an issue if one thread is incrementing the count from 0->1 while another thread is destroying the AsyncInvoker. But that means one thread is calling "AsyncInvoke" while another thread is destroying the invoker, which is inherently unsafe.
> >
> > The only time when a "safe" usage of this class could result in AsyncInvoke being called during destruction is when the AsyncInvoke is happening from *within* an AsyncInvoked functor (the so-called "re-entrant"/"recursive" invocations). But in that case, the outer invocation is holding a reference, so it could only be incremented from 1->2. And it's not possible for the decrement of pending_invocations_ to be reordered relative to the increment, so we don't have to worry about, say, it being decremented from 1->0 before being incremented from 1->2.
> >
> > I added some more comments to the header to make it more clear what things are "safe" and how the class should be properly used.
>
> Acknowledged. Ach, this is subtle...
> I have a strong feeling that the right thing to do from a code maintenance perspective is to not have naked atomic operations mixed in with a heap of other nontrivial code like this, but to package them up in a clearly documented class. Compare with the current best practice of handling heap object ownership in C++: always use a dedicated smart pointer class, except when implementing smart pointers. Not because raw new and delete are impossible to use correctly, but because they have sharp edges that we'd rather not deal with at the same time we're dealing with other nontrivial things.

Agreed. I can't think of a great way to split this functionality into an independent class at the moment, but I'll add a TODO comment for now and think about it.
The CQ bit was checked by deadbeef@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from kwiberg@webrtc.org Link to the patchset: https://codereview.webrtc.org/2885143005/#ps160001 (title: "Add another TODO.")
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
CQ is committing da patch. Bot data: {"patchset_id": 160001, "attempt_start_ts": 1502235750569520, "parent_rev": "42f96d53f3acf7a721d456b6eae872bb6f37a0a0", "commit_rev": "3af63b0dc9d8444ca5480dfcccd1f87fa05eb232"}
Message was sent while issue was closed.
Committed patchset #9 (id:160001) as https://chromium.googlesource.com/external/webrtc/+/3af63b0dc9d8444ca5480dfcc...
Message was sent while issue was closed.
File webrtc/rtc_base/asyncinvoker.cc (right):

webrtc/rtc_base/asyncinvoker.cc:125: invoker_->pending_invocations_.fetch_add(1, std::memory_order_relaxed);
On 2017/08/08 23:42:16, Taylor Brandstetter wrote:
> On 2017/08/08 22:22:39, kwiberg-webrtc wrote:
> > On 2017/08/07 20:28:12, Taylor Brandstetter wrote:
> > > On 2017/08/06 04:17:12, kwiberg-webrtc wrote:
> > > > std::memory_order_acquire? (Because unlike a reference count, where you never increment from 0 to 1, pending_invocations_ can be incremented from 0 to 1 by this operation, which makes it unsafe to reorder reads and writes from this thread before this increment.)
> > >
> > > This would only be an issue if one thread is incrementing the count from 0->1 while another thread is destroying the AsyncInvoker. But that means one thread is calling "AsyncInvoke" while another thread is destroying the invoker, which is inherently unsafe.
> > >
> > > The only time when a "safe" usage of this class could result in AsyncInvoke being called during destruction is when the AsyncInvoke is happening from *within* an AsyncInvoked functor (the so-called "re-entrant"/"recursive" invocations). But in that case, the outer invocation is holding a reference, so it could only be incremented from 1->2. And it's not possible for the decrement of pending_invocations_ to be reordered relative to the increment, so we don't have to worry about, say, it being decremented from 1->0 before being incremented from 1->2.
> > >
> > > I added some more comments to the header to make it more clear what things are "safe" and how the class should be properly used.
> >
> > Acknowledged. Ach, this is subtle...
> > I have a strong feeling that the right thing to do from a code maintenance perspective is to not have naked atomic operations mixed in with a heap of other nontrivial code like this, but to package them up in a clearly documented class. Compare with the current best practice of handling heap object ownership in C++: always use a dedicated smart pointer class, except when implementing smart pointers. Not because raw new and delete are impossible to use correctly, but because they have sharp edges that we'd rather not deal with at the same time we're dealing with other nontrivial things.
>
> Agreed. I can't think of a great way to split this functionality into an independent class at the moment, but I'll add a TODO comment for now and think about it.

I was thinking along these lines:

  // Instances of this class are initially unset. The flag is
  // set and tested with atomic instructions, which makes it
  // thread-safe, but there are no memory barriers that prevent
  // the setting and testing of the flag being reordered relative
  // to surrounding memory accesses. Informally, use this to
  // signal a condition when it doesn't matter if the signal
  // arrives a little early or a little late.
  class NonsynchronizingAtomicOneWayFlag {
    void Set();    // Atomically set the flag.
    bool IsSet();  // Has the flag been set?
  };

  // Instances of this class initially have a reference count of 1.
  // Incref() and Decref() calls can be freely mixed and are
  // thread-safe. Once the refcount goes to zero, the counter is
  // spent, and you may no longer call either Incref() or Decref();
  // this means that the caller needs to statically ensure that
  // the refcount is at least 1.
  class RefCounter {
    void Incref();  // Atomically increases the refcount by 1.
    bool Decref();  // Atomically decreases the refcount by 1,
                    // returning true if the refcount reached zero.
  };

pending_invocations_ is more complicated, so I won't try to design it here.
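The RefCounter interface sketched in the message above can be filled in with std::atomic in a few lines. The interface is kwiberg's; the body below is our illustrative implementation, not anything committed to WebRTC.

```cpp
#include <atomic>
#include <cassert>

// One possible std::atomic-based implementation of the RefCounter
// interface sketched in the review comment above.
class RefCounter {
 public:
  // Relaxed is sufficient for increments: the caller already holds a
  // reference, so this can never race a 0->1 transition.
  void Incref() { count_.fetch_add(1, std::memory_order_relaxed); }

  // Returns true if the refcount reached zero. acq_rel ordering makes
  // the final Decref() synchronize with all earlier ones, so the caller
  // that sees true may safely tear down the shared state.
  bool Decref() {
    return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

 private:
  std::atomic<int> count_{1};  // Starts at 1, as the sketch specifies.
};
```

This mirrors how reference counts are conventionally implemented (e.g. the classic shared_ptr control-block pattern): cheap relaxed increments, and an acquire/release decrement that fences the destruction.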