|
|
Created:
3 years, 8 months ago by nisse-webrtc Modified:
3 years, 7 months ago CC:
webrtc-reviews_webrtc.org, tterriberry_mozilla.com, mflodman, kjellander_webrtc Target Ref:
refs/heads/master Project:
webrtc Visibility:
Public. |
DescriptionDelete media type check in Call::NotifyBweOfReceivedPacket.
This was introduced in cl https://codereview.webrtc.org/2673523003/, and at the time needed to not break perf tests. It appears to no longer be needed.
BUG=webrtc:6847, webrtc:7062
Patch Set 1 #Patch Set 2 : Rebase. #Patch Set 3 : Fix/improve voe::Channel destruction race. #
Total comments: 2
Patch Set 4 : Speculative add of encode_stop_lock_. #Patch Set 5 : Update RTCStatsIntegrationTest to tolerate available_incoming_bitrate being set. #Patch Set 6 : Undo voe::Channel workarounds. #
Total comments: 3
Patch Set 7 : Adjust test of available_incoming_bitrate, as suggested by hbos. #Patch Set 8 : Comment update. #
Total comments: 1
Messages
Total messages: 41 (19 generated)
Description was changed from ========== Delete media type check in Call::NotifyBweOfReceivedPacket. BUG= ========== to ========== Delete media type check in Call::NotifyBweOfReceivedPacket. This was introduced in cl https://codereview.webrtc.org/2673523003/, and at the time needed to not break perf tests. It appears to no longer be needed. BUG=webrtc:6847 ==========
nisse@webrtc.org changed reviewers: + stefan@webrtc.org
The CQ bit was checked by nisse@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
PTAL. I get a few failures when running webrtc_perf_tests locally, FAILED TESTS (4/110): ./out/Release/webrtc_perf_tests: FullStackTest.ForemanCifPlr5H264 ./out/Release/webrtc_perf_tests: FullStackTest.ForemanCifPlr5H264Flexfec ./out/Release/webrtc_perf_tests: FullStackTest.ForemanCifWithoutPacketlossH264 ./out/Release/webrtc_perf_tests: RampUpTest.UpDownUpAbsSendTimeSimulcastRedRtx but those tests fail for me on master too.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_tsan2 on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/linux_tsan2/builds/22038)
The CQ bit was checked by nisse@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_tsan2 on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/linux_tsan2/builds/22040)
On 2017/04/25 13:35:17, nisse-webrtc wrote: > PTAL. > > I get a few failures when running webrtc_perf_tests locally, > > FAILED TESTS (4/110): > ./out/Release/webrtc_perf_tests: FullStackTest.ForemanCifPlr5H264 > ./out/Release/webrtc_perf_tests: FullStackTest.ForemanCifPlr5H264Flexfec > ./out/Release/webrtc_perf_tests: FullStackTest.ForemanCifWithoutPacketlossH264 > ./out/Release/webrtc_perf_tests: RampUpTest.UpDownUpAbsSendTimeSimulcastRedRtx > > but those tests fail for me on master too. I have some trouble understanding what goes wrong on the tsan bot.
nisse@webrtc.org changed reviewers: + henrika@webrtc.org
+ henrika@. Patchset #3 tries to fix the tsan failure.
Thanks for trying to improve. Hope it helps. lgtm
https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... File webrtc/voice_engine/channel.cc (right): https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... webrtc/voice_engine/channel.cc:2716: // other thread calls ChannelManager::DestroyChannel whch calls nit, which Also, in your failing test, why did you not hit the previous DCHECK?
On 2017/04/26 14:16:01, henrika_webrtc wrote: > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > File webrtc/voice_engine/channel.cc (right): > > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > webrtc/voice_engine/channel.cc:2716: // other thread calls > ChannelManager::DestroyChannel whch calls > nit, which > > Also, in your failing test, why did you not hit the previous DCHECK? No idea, does the tsan bot build with DCHECKs enabled? Anyway, it looks like it still fails :-( And since it's so consistent I guess it has to be this cl changing behavior in some way breaking expectations of the code, which for some reason no other tests catch...
Is it possible to modify your original CL somehow to circumvent the issue? Seems very complicated to sort out all consequences it has.
solenberg@webrtc.org changed reviewers: + solenberg@webrtc.org
https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... File webrtc/voice_engine/channel.cc (right): https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... webrtc/voice_engine/channel.cc:2718: // task, our task can get run at the same time as the destructor. Can we use ChannelOwner to allow the task to keep a reference to the Channel? Seems hard to avoid ref counting in this situation.
On 2017/04/26 20:29:03, the sun wrote: > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > File webrtc/voice_engine/channel.cc (right): > > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > webrtc/voice_engine/channel.cc:2718: // task, our task can get run at the same > time as the destructor. > Can we use ChannelOwner to allow the task to keep a reference to the Channel? > Seems hard to avoid ref counting in this situation. I'd like to try that, but it's not clear to me how. The reference count lives in the ChannelOwner::ChannelRef class (so a ChannelOwner is more like std::shared_ptr than rtc::ScopedRefPtr). So how can Channel::ProcessAndEncodeAudio get access to the ChannelOwner referring to itself? An alternative which may be easier to try out is to use a lock (either a new one, or somehow reuse the one in channel_state_). In both StopSend and ProcessEncodeAudio the lock must be held by the block of code which accesses the sending flag and posts a task to encoder_queue_.
On 2017/04/27 07:49:10, nisse-webrtc wrote: > On 2017/04/26 20:29:03, the sun wrote: > > > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > > File webrtc/voice_engine/channel.cc (right): > > > > > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > > webrtc/voice_engine/channel.cc:2718: // task, our task can get run at the same > > time as the destructor. > > Can we use ChannelOwner to allow the task to keep a reference to the Channel? > > Seems hard to avoid ref counting in this situation. > > I'd like to try that, but it's not clear to me how. The reference count lives in > the ChannelOwner::ChannelRef class (so a ChannelOwner is more like > std::shared_ptr than rtc::ScopedRefPtr). So how can > Channel::ProcessAndEncodeAudio get access to the ChannelOwner referring to > itself? The channel will have to ask the ChannelManager for itself to obtain a copy of the ChannelOwner, which will increase the ref count. Channel could learn about the manager in e.g. Channel::SetEngineInformation(). Needless to say, the relationship between ChannelManager and Channel will be a bit weird, but I think we could live with it until we get around to do the other necessary work on Channel. > > An alternative which may be easier to try out is to use a lock (either a new > one, or somehow reuse the one in channel_state_). In both StopSend and > ProcessEncodeAudio the lock must be held by the block of code which accesses the > sending flag and posts a task to encoder_queue_. Sounds more brittle though.
On 2017/04/27 09:10:58, the sun wrote: > On 2017/04/27 07:49:10, nisse-webrtc wrote: > > On 2017/04/26 20:29:03, the sun wrote: > > > > > > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > > > File webrtc/voice_engine/channel.cc (right): > > > > > > > > > https://codereview.webrtc.org/2840833002/diff/40001/webrtc/voice_engine/chann... > > > webrtc/voice_engine/channel.cc:2718: // task, our task can get run at the > same > > > time as the destructor. > > > Can we use ChannelOwner to allow the task to keep a reference to the > Channel? > > > Seems hard to avoid ref counting in this situation. > > > > I'd like to try that, but it's not clear to me how. The reference count lives > in > > the ChannelOwner::ChannelRef class (so a ChannelOwner is more like > > std::shared_ptr than rtc::ScopedRefPtr). So how can > > Channel::ProcessAndEncodeAudio get access to the ChannelOwner referring to > > itself? > > The channel will have to ask the ChannelManager for itself to obtain a copy of > the ChannelOwner, which will increase the ref count. Channel could learn about > the manager in e.g. Channel::SetEngineInformation(). > > Needless to say, the relationship between ChannelManager and Channel will be a > bit weird, but I think we could live with it until we get around to do the other > necessary work on Channel. > > > > > An alternative which may be easier to try out is to use a lock (either a new > > one, or somehow reuse the one in channel_state_). In both StopSend and > > ProcessEncodeAudio the lock must be held by the block of code which accesses > the > > sending flag and posts a task to encoder_queue_. > > Sounds more brittle though. Now I'm getting even more confused. The tsan bot keeps failing. But now I'm not able to find any race reports in the logs anymore, instead a failure of the RTCStatsIntegrationTest.GetStatsFromCallee.
The CQ bit was checked by nisse@webrtc.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.webrtc.org/...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_dbg on master.tryserver.webrtc (JOB_FAILED, http://build.chromium.org/p/tryserver.webrtc/builders/android_dbg/builds/23713)
nisse@webrtc.org changed reviewers: + hbos@webrtc.org
Hi, I believe the voe:Channel race (https://bugs.chromium.org/p/webrtc/issues/detail?id=7540) is unrelated to this cl. And the android failure seems to be problem with swarming and hot-word detection. So I'd like to try to get this landed soon. hbos: I had to make the stats tests some more tolerant. They used to assert that candidate_pair.available_incoming_bitrate was always unset. Now it happens to sometimes be set. I have only observed that on the tsan bot, so my best guess is that it's timing-related.
https://codereview.webrtc.org/2840833002/diff/100001/webrtc/pc/rtcstats_integ... File webrtc/pc/rtcstats_integrationtest.cc (right): https://codereview.webrtc.org/2840833002/diff/100001/webrtc/pc/rtcstats_integ... webrtc/pc/rtcstats_integrationtest.cc:402: candidate_pair.available_incoming_bitrate); Can you move this to inside "if (is_selected_pair)" and in the else do TestMemberIsUndefined? Ideally with that change TestMemberIsNonNegative should work but if its flaky it's OK to use TestMemberIsUndefinedOrNonNegative with a // TODO(hbos,nisse): Find out why this is sometimes undefined, flaky on TSAN. I don't know if the flake causes a candidate pair to be created that is not selected or if the flake is that the estimate is not available. Perhaps the flake can be avoided if we wait for something to happen before we getStats. Also with this change you seem to have fixed this stat being missing! Can you add BUG=7062 to the description and remove the comment about populating this value at https://cs.chromium.org/chromium/src/third_party/webrtc/api/stats/rtcstats_ob...
https://codereview.webrtc.org/2840833002/diff/100001/webrtc/pc/rtcstats_integ... File webrtc/pc/rtcstats_integrationtest.cc (right): https://codereview.webrtc.org/2840833002/diff/100001/webrtc/pc/rtcstats_integ... webrtc/pc/rtcstats_integrationtest.cc:402: candidate_pair.available_incoming_bitrate); On 2017/05/02 12:58:55, hbos wrote: > Can you move this to inside "if (is_selected_pair)" and in the else do > TestMemberIsUndefined? > > Ideally with that change TestMemberIsNonNegative should work but if its flaky > it's OK to use TestMemberIsUndefinedOrNonNegative with a // TODO(hbos,nisse): > Find out why this is sometimes undefined, flaky on TSAN. I don't know if the > flake causes a candidate pair to be created that is not selected or if the flake > is that the estimate is not available. Perhaps the flake can be avoided if we > wait for something to happen before we getStats. > > Also with this change you seem to have fixed this stat being missing! Can you > add BUG=7062 to the description and remove the comment about populating this > value at > https://cs.chromium.org/chromium/src/third_party/webrtc/api/stats/rtcstats_ob... (Or perhaps "sometimes is defined" is more accurate than "sometimes is undefined"?)
solenberg@webrtc.org changed reviewers: - solenberg@webrtc.org
Description was changed from ========== Delete media type check in Call::NotifyBweOfReceivedPacket. This was introduced in cl https://codereview.webrtc.org/2673523003/, and at the time needed to not break perf tests. It appears to no longer be needed. BUG=webrtc:6847 ========== to ========== Delete media type check in Call::NotifyBweOfReceivedPacket. This was introduced in cl https://codereview.webrtc.org/2673523003/, and at the time needed to not break perf tests. It appears to no longer be needed. BUG=webrtc:6847,webrtc:7062 ==========
https://codereview.webrtc.org/2840833002/diff/100001/webrtc/pc/rtcstats_integ... File webrtc/pc/rtcstats_integrationtest.cc (right): https://codereview.webrtc.org/2840833002/diff/100001/webrtc/pc/rtcstats_integ... webrtc/pc/rtcstats_integrationtest.cc:402: candidate_pair.available_incoming_bitrate); On 2017/05/02 12:58:55, hbos wrote: > Can you move this to inside "if (is_selected_pair)" and in the else do > TestMemberIsUndefined? Done. Please check if the change is like you intended. > Also with this change you seem to have fixed this stat being missing! Can you > add BUG=7062 to the description and remove the comment about populating this > value at > https://cs.chromium.org/chromium/src/third_party/webrtc/api/stats/rtcstats_ob... "Fixed" is exaggerating, I'm afraid... I've added a TODO and linked to the bug, but I didn't remove the comment in rtcstats_objects.h.
rtcstats_integrationtest.cc lgtm
Oh but can you update the comment int rtcstats_objects.h? Now it says its always undefined which isn't the case. I'm owner of that too so no wait for more reviews on that.
On 2017/05/05 10:53:18, hbos wrote: > Oh but can you update the comment int rtcstats_objects.h? Now it says its always > undefined which isn't the case. I'm owner of that too so no wait for more > reviews on that. I've tweaked the comment now.
https://codereview.webrtc.org/2840833002/diff/140001/webrtc/call/call.cc File webrtc/call/call.cc (left): https://codereview.webrtc.org/2840833002/diff/140001/webrtc/call/call.cc#oldc... webrtc/call/call.cc:1281: if (media_type == MediaType::VIDEO || I don't think it's safe to remove this. Let's say we're receiving audio without header extensions and video with abs-send-time. When receiving the video packets RemoteBitrateEstimatorAbsSendTime will be picked in ReceiveSideCongestionController::WrappingBitrateEstimator::PickEstimatorFromHeader. When we later receive audio packets without header extension RemoteBitrateEstimatorSingleStream will be picked instead. Therefore we may end up toggling between the two, when what we really want is to not include the audio packets in the estimation.
On 2017/05/05 12:26:40, stefan-webrtc wrote: > Let's say we're receiving audio without header extensions and video with > abs-send-time. When receiving the video packets > RemoteBitrateEstimatorAbsSendTime will be picked in > ReceiveSideCongestionController::WrappingBitrateEstimator::PickEstimatorFromHeader. > When we later receive audio packets without header extension > RemoteBitrateEstimatorSingleStream will be picked instead. Therefore we may end > up toggling between the two, when what we really want is to not include the > audio packets in the estimation. So we need to figure out how to handle the separate cases. And try to simplify the logic. There seem to be three different ways to handle received packet: 1. If the transport sequence number extension is present, use RemoteEstimatorProxy, which just generates the transport feedback ack-type messages. 2. If the abs send time extension is present, use RemoteBitrateEstimatorAbsSendTime. 3. Else, use RemoteBitrateEstimatorSingleStream. In the latter two cases, the estimate ends up in a REMB packet, right? And there's some hysteresis, implemented by WrappingBitrateEstimator. Could we base the choice on what's configured, rather than what extensions are present? Changing the rules to 1. If send side BWE is negotiated (both header extension and feedback message), always use RemoteEstimator proxy. And let the BWE ignore packets without transport sequence number. 2. Else, if abs send time has been negotiated, always use RemoteBitrateEstimatorAbsSendTime. Let the BWE ignore packets without abs send time (which in the above scenario would be all audio packets). 3. Else, always (or at least, if REMB is negotiated) use RemoteBitrateEstimatorSingleStream.
On 2017/05/08 11:17:57, nisse-webrtc wrote: > On 2017/05/05 12:26:40, stefan-webrtc wrote: > > Let's say we're receiving audio without header extensions and video with > > abs-send-time. When receiving the video packets > > RemoteBitrateEstimatorAbsSendTime will be picked in > > > ReceiveSideCongestionController::WrappingBitrateEstimator::PickEstimatorFromHeader. > > When we later receive audio packets without header extension > > RemoteBitrateEstimatorSingleStream will be picked instead. Therefore we may > end > > up toggling between the two, when what we really want is to not include the > > audio packets in the estimation. > > So we need to figure out how to handle the separate cases. And try to simplify > the logic. > > There seem to be three different ways to handle received packet: > > 1. If the transport sequence number extension is present, use > RemoteEstimatorProxy, which just generates the transport feedback ack-type > messages. > > 2. If the abs send time extension is present, use > RemoteBitrateEstimatorAbsSendTime. > > 3. Else, use RemoteBitrateEstimatorSingleStream. > > In the latter two cases, the estimate ends up in a REMB packet, right? And > there's some hysteresis, implemented by WrappingBitrateEstimator. > > Could we base the choice on what's configured, rather than what extensions are > present? Changing the rules to > > 1. If send side BWE is negotiated (both header extension and feedback message), > always use RemoteEstimator proxy. And let the BWE ignore packets without > transport sequence number. > > 2. Else, if abs send time has been negotiated, always use > RemoteBitrateEstimatorAbsSendTime. Let the BWE ignore packets without abs send > time (which in the above scenario would be all audio packets). > > 3. Else, always (or at least, if REMB is negotiated) use > RemoteBitrateEstimatorSingleStream. It could work, but it's incorrect from an sdp perspective since we're saying we can receive any of those three extensions, but in practice we only accept one of them. This could mean that someone who wants to use abs-send-time will have to modify chrome's sdp from GetOffer() before calling SetLocalDescription() to make sure only one of the header extensions are present (or at least that transport-seq-num isn't present), which is different from what they have to do today, right?
On 2017/05/08 11:53:01, stefan-webrtc wrote: > Changing the rules to > > > > 1. If send side BWE is negotiated (both header extension and feedback > message), > > always use RemoteEstimator proxy. And let the BWE ignore packets without > > transport sequence number. > > > > 2. Else, if abs send time has been negotiated, always use > > RemoteBitrateEstimatorAbsSendTime. Let the BWE ignore packets without abs send > > time (which in the above scenario would be all audio packets). > > > > 3. Else, always (or at least, if REMB is negotiated) use > > RemoteBitrateEstimatorSingleStream. > > It could work, but it's incorrect from an sdp perspective since we're saying we > can receive any of those three extensions, but in practice we only accept one of > them. This could mean that someone who wants to use abs-send-time will have to > modify chrome's sdp from GetOffer() before calling SetLocalDescription() to make > sure only one of the header extensions are present (or at least that > transport-seq-num isn't present), which is different from what they have to do > today, right? When would that happen? You have two parties, both supporting both transport sequence number and abs send time, so both extensions are negotiated. But one of the parties then prefers abs send time, so on transmitted packets it will only attach abs send time, not transport sequence number? |