|
|
DescriptionNew lock implementation for mac.
According to my measurements, it's about 100x faster than the native mutex implementation in OSX. Google "OSX mutex performance" for more info.
BUG=
Committed: https://crrev.com/ed281e9c9b2fb8562fbfa9737e72ab7c5493722e
Cr-Commit-Position: refs/heads/master@{#11352}
Patch Set 1 #
Total comments: 4
Patch Set 2 : Add unit test and address some comments #Patch Set 3 : Attempt to fix win build error #
Total comments: 6
Patch Set 4 : Rebase #Patch Set 5 : Changed the test for more accuracy, fixed a bug, redid measurements #Patch Set 6 : Disable the test again #Patch Set 7 : Address comments #
Total comments: 4
Messages
Total messages: 29 (9 generated)
tommi@webrtc.org changed reviewers: + torbjorng@webrtc.org
Hej Torbjorn - can you take a look? There might be alternate ways to go and I'd like to include a performance test too. So, consider this more of an fyi than a review request atm.
It is not clear to me that this is an improvement; but I wouldn't deny the possibility of that it is. :-) There is some overlap between the code conditioned under CS_DEBUG_CODE and the new code, such as the recursion count. For readability, perhaps split out system dependent functions rather than having #ifdef partitioning functions into system dependent parts? https://codereview.webrtc.org/1594723003/diff/1/webrtc/base/criticalsection.cc File webrtc/base/criticalsection.cc (right): https://codereview.webrtc.org/1594723003/diff/1/webrtc/base/criticalsection.c... webrtc/base/criticalsection.cc:26: RTC_DCHECK(semaphore_); DCHECK? Shouldn't we barf in all builds if this call fails? https://codereview.webrtc.org/1594723003/diff/1/webrtc/base/criticalsection.c... webrtc/base/criticalsection.cc:58: if (TryEnter()) TryEnter seems to perform a CompareAndSwap unconditionally (except for same thread). That will move the lock cache-line to this thread's current physical core, putting the cache-line in "dirty" state. This is bad when we fail to get the lock, since subsequently the lock owner will need to cache the lock cache-line before unlocking. I leaner poll just reads the lock, and then conditionally immediately attempts to lock it. The racing condition between the poll and attempted locking should not matter (at least not if all contenders use the same non-aggressive approach).
Add unit test and address some comments
Added a test and wrote down some findings from running that test in the test file. The test is disabled by default since I don't think there's a need to run it on the trybots, but it's useful to run locally when tweaking the code. I added a TODO to split the implementation up into separate per-platform files. https://codereview.webrtc.org/1594723003/diff/1/webrtc/base/criticalsection.cc File webrtc/base/criticalsection.cc (right): https://codereview.webrtc.org/1594723003/diff/1/webrtc/base/criticalsection.c... webrtc/base/criticalsection.cc:26: RTC_DCHECK(semaphore_); On 2016/01/20 15:46:43, torbjorng (webrtc) wrote: > DCHECK? Shouldn't we barf in all builds if this call fails? I'll just remove this check. It's almost akin to checking new(), which we don't do. We also don't check calls that instantiate OS based primitives such as locks in general. https://codereview.webrtc.org/1594723003/diff/1/webrtc/base/criticalsection.c... webrtc/base/criticalsection.cc:58: if (TryEnter()) On 2016/01/20 15:46:43, torbjorng (webrtc) wrote: > TryEnter seems to perform a CompareAndSwap unconditionally (except for same > thread). That will move the lock cache-line to this thread's current physical > core, putting the cache-line in "dirty" state. This is bad when we fail to get > the lock, since subsequently the lock owner will need to cache the lock > cache-line before unlocking. > > I leaner poll just reads the lock, and then conditionally immediately attempts > to lock it. The racing condition between the poll and attempted locking should > not matter (at least not if all contenders use the same non-aggressive > approach). > I rewrote the loop to not call TryEnter() but instead do this special case since I don't think it's relevant to the TryEnter case (i.e. there's no spinning in that case). In my timing measurements I didn't see a significant enough difference to be able to say with confidence that there's a clear win, but it's not slower at least and I understand why it makes sense to do this. Included cache line barriers in my test code btw.
oh and I also added a flag to make it easy to flip between the two possible implementations.
Attempt to fix win build error
Rebase
LGTM. Nice perf numbers, now I am convinced! https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... File webrtc/base/criticalsection.cc (right): https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... webrtc/base/criticalsection.cc:62: // operations, first a read-only one in order to avoid affecting the the Nit: The the the there should be fixed. https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... webrtc/base/criticalsection.cc:83: RTC_DCHECK(owning_thread_ != pthread_self()); pthread_self() -> self https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... File webrtc/base/criticalsection_unittest.cc (right): https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... webrtc/base/criticalsection_unittest.cc:399: // Cool. You cannot argue with numbers. Nit: Mention more about the timing platform to simplify future simplifications. E.g., list CPU type and Mac OS version.
Changed the test for more accuracy, fixed a bug, redid measurements
Description was changed from ========== New lock implementation for mac BUG= ========== to ========== New lock implementation for mac. According to my measurements, it's about 100x faster than the native mutex implementation in OSX. Google "OSX mutex performance" for more info. BUG= ==========
Disable the test again
Address comments
https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... File webrtc/base/criticalsection.cc (right): https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... webrtc/base/criticalsection.cc:62: // operations, first a read-only one in order to avoid affecting the the On 2016/01/21 16:57:23, torbjorng (webrtc) wrote: > Nit: The the the there should be fixed. addressed the the the the comment. https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... webrtc/base/criticalsection.cc:83: RTC_DCHECK(owning_thread_ != pthread_self()); On 2016/01/21 16:57:23, torbjorng (webrtc) wrote: > pthread_self() -> self Done. https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... File webrtc/base/criticalsection_unittest.cc (right): https://codereview.webrtc.org/1594723003/diff/40001/webrtc/base/criticalsecti... webrtc/base/criticalsection_unittest.cc:399: // On 2016/01/21 16:57:23, torbjorng (webrtc) wrote: > Cool. You cannot argue with numbers. > > Nit: Mention more about the timing platform to simplify future simplifications. > E.g., list CPU type and Mac OS version. Done.
The CQ bit was checked by tommi@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1594723003/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1594723003/120001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_arm64_rel on tryserver.webrtc (JOB_TIMED_OUT, no build URL) linux_baremetal on tryserver.webrtc (JOB_TIMED_OUT, no build URL) win_baremetal on tryserver.webrtc (JOB_TIMED_OUT, no build URL)
The CQ bit was checked by tommi@webrtc.org
The patchset sent to the CQ was uploaded after l-g-t-m from torbjorng@webrtc.org Link to the patchset: https://codereview.webrtc.org/1594723003/#ps120001 (title: "Address comments")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1594723003/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1594723003/120001
Message was sent while issue was closed.
Description was changed from ========== New lock implementation for mac. According to my measurements, it's about 100x faster than the native mutex implementation in OSX. Google "OSX mutex performance" for more info. BUG= ========== to ========== New lock implementation for mac. According to my measurements, it's about 100x faster than the native mutex implementation in OSX. Google "OSX mutex performance" for more info. BUG= ==========
Message was sent while issue was closed.
Committed patchset #7 (id:120001)
Message was sent while issue was closed.
Description was changed from ========== New lock implementation for mac. According to my measurements, it's about 100x faster than the native mutex implementation in OSX. Google "OSX mutex performance" for more info. BUG= ========== to ========== New lock implementation for mac. According to my measurements, it's about 100x faster than the native mutex implementation in OSX. Google "OSX mutex performance" for more info. BUG= Committed: https://crrev.com/ed281e9c9b2fb8562fbfa9737e72ab7c5493722e Cr-Commit-Position: refs/heads/master@{#11352} ==========
Message was sent while issue was closed.
Patchset 7 (id:??) landed as https://crrev.com/ed281e9c9b2fb8562fbfa9737e72ab7c5493722e Cr-Commit-Position: refs/heads/master@{#11352}
Message was sent while issue was closed.
jbauch@webrtc.org changed reviewers: + jbauch@webrtc.org
Message was sent while issue was closed.
Just driving by, nice change! Quick comment about comparing thread ids, not sure if really necessary here though... https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... File webrtc/base/criticalsection.cc (right): https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... webrtc/base/criticalsection.cc:64: if (owning_thread_ != self) { Maybe change this to use "pthread_equal" to compare the thread ids. According to the man page "The pthread_equal() function is necessary because thread IDs should be considered opaque: there is no portable way for applications to directly compare two pthread_t values." However as this is MAC-only, you can probably keep it as-is. https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... webrtc/base/criticalsection.cc:83: RTC_DCHECK(owning_thread_ != self); Same here. https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... webrtc/base/criticalsection.cc:114: if (owning_thread_ != pthread_self()) { And here.
Message was sent while issue was closed.
Thanks for taking a look joachim. https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... File webrtc/base/criticalsection.cc (right): https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... webrtc/base/criticalsection.cc:64: if (owning_thread_ != self) { On 2016/01/22 08:35:38, joachim wrote: > Maybe change this to use "pthread_equal" to compare the thread ids. > > According to the man page "The pthread_equal() function is necessary because > thread IDs should be considered opaque: there is no portable way for > applications to directly compare two pthread_t values." > > However as this is MAC-only, you can probably keep it as-is. Ah, thanks. I had meant to use PlatformThreadRef and IsThreadRefEqual from platform_thread.h. I'll do a follow up cl even though this is ok for Mac.
Message was sent while issue was closed.
On 2016/01/22 08:51:59, tommi-webrtc wrote: > Thanks for taking a look joachim. > > https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... > File webrtc/base/criticalsection.cc (right): > > https://codereview.webrtc.org/1594723003/diff/120001/webrtc/base/criticalsect... > webrtc/base/criticalsection.cc:64: if (owning_thread_ != self) { > On 2016/01/22 08:35:38, joachim wrote: > > Maybe change this to use "pthread_equal" to compare the thread ids. > > > > According to the man page "The pthread_equal() function is necessary because > > thread IDs should be considered opaque: there is no portable way for > > applications to directly compare two pthread_t values." > > > > However as this is MAC-only, you can probably keep it as-is. > > Ah, thanks. I had meant to use PlatformThreadRef and IsThreadRefEqual from > platform_thread.h. I'll do a follow up cl even though this is ok for Mac. https://codereview.webrtc.org/1619153003/ |