Issue 2979393002: [string_util] fix bug in ReplaceSubstringsAfterOffset()

ncarter (slow)

Description was changed from ========== s# Enter a description of the change. Make base::ReplaceChars() run ...

3 years, 5 months ago (2017-07-20 23:14:58 UTC) #1

ncarter (slow)

Description was changed from ========== [string_util] Make base::ReplaceChars() run in linear time. BUG= ========== to ...

3 years, 5 months ago (2017-07-20 23:18:07 UTC) #2

ncarter (slow)

Description was changed from ========== [string_util] Make base::ReplaceChars() run in linear time. The old implementation ...

3 years, 5 months ago (2017-07-20 23:20:36 UTC) #3

ncarter (slow)

Description was changed from ========== [string_util] Make base::ReplaceChars() run in linear time. The old implementation ...

3 years, 5 months ago (2017-07-20 23:20:53 UTC) #4

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-21 17:38:03 UTC) #5

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/20001

3 years, 5 months ago (2017-07-21 17:38:11 UTC) #6

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-21 17:41:28 UTC) #7

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presubmit/builds/496622)

3 years, 5 months ago (2017-07-21 17:41:29 UTC) #8

ncarter (slow)

Description was changed from ========== [string_util] Make base::ReplaceChars() run in linear time. The old implementation ...

3 years, 5 months ago (2017-07-21 17:45:56 UTC) #9

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 5 months ago (2017-07-21 23:48:13 UTC) #10

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/40001

3 years, 5 months ago (2017-07-21 23:48:20 UTC) #11

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 5 months ago (2017-07-22 00:07:49 UTC) #12

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: android_cronet on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_cronet/builds/185075)

3 years, 5 months ago (2017-07-22 00:07:51 UTC) #13

ncarter (slow)

Description was changed from ========== [string_util] optimize base::ReplaceChars(); fix buffer overrun in base::ReplaceSubstringsAfterOffset() The old ...

3 years, 4 months ago (2017-07-26 17:20:13 UTC) #14

Description was changed from

==========
[string_util] optimize base::ReplaceChars(); fix buffer overrun in
base::ReplaceSubstringsAfterOffset()

The old implementation of base::ReplaceChars() and base::RemoveChars() was
potentially O(n^2). Fix the speed issue
by reusing the algorithm of DoReplaceSubstringsAfterOffset.

The problem with DoReplaceSubstringsAfterOffset was that it used a mix of find
and rfind in the expansion case, and they don't necessarily yield the same
matches. For a string such as the one in
the unittest, this would write past the beginning of the buffer, and never
terminate.

Fix this by using a temporary in the expansion case, and building the string in
the forward direction. string::resize is likely to reallocate anyway, so this is
probably actually fewer copies.

BUG=
==========

to

==========
[string_util] fix buffer overrun in base::ReplaceSubstringsAfterOffset();
optimize ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate, and overwrite the entire heap:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the
length-contraction case, with the shifted tail as the src.

Additonally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find().

BUG=
==========

ncarter (slow)

Description was changed from ========== [string_util] fix buffer overrun in base::ReplaceSubstringsAfterOffset(); optimize ReplaceChars() The problem ...

3 years, 4 months ago (2017-07-26 17:22:47 UTC) #15

Description was changed from

==========
[string_util] fix buffer overrun in base::ReplaceSubstringsAfterOffset();
optimize ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate, and overwrite the entire heap:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the
length-contraction case, with the shifted tail as the src.

Additonally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find().

BUG=
==========

to

==========
[string_util] fix overrun in ReplaceSubstringsAfterOffset() and perf issue in
ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate, and overwrite the entire heap:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=
==========

ncarter (slow)

Description was changed from ========== [string_util] fix overrun in ReplaceSubstringsAfterOffset() and perf issue in ReplaceChars() ...

3 years, 4 months ago (2017-07-26 17:23:08 UTC) #16

Description was changed from

==========
[string_util] fix overrun in ReplaceSubstringsAfterOffset() and perf issue in
ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate, and overwrite the entire heap:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=
==========

to

==========
[string_util] fix overrun in ReplaceSubstringsAfterOffset(); speed up
ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate, and overwrite the entire heap:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=
==========

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-26 17:40:21 UTC) #17

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/80001

3 years, 4 months ago (2017-07-26 17:40:31 UTC) #18

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-26 17:50:22 UTC) #19

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: android_cronet on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_cronet/builds/187492)

3 years, 4 months ago (2017-07-26 17:50:23 UTC) #20

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-26 19:17:28 UTC) #21

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/100001

3 years, 4 months ago (2017-07-26 19:17:51 UTC) #22

ncarter (slow)

Description was changed from ========== [string_util] fix overrun in ReplaceSubstringsAfterOffset(); speed up ReplaceChars() The problem ...

3 years, 4 months ago (2017-07-26 19:17:55 UTC) #23

Description was changed from

==========
[string_util] fix overrun in ReplaceSubstringsAfterOffset(); speed up
ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate, and overwrite the entire heap:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=
==========

to

==========
[string_util] fix bug in ReplaceSubstringsAfterOffset(); speed up ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=
==========

ncarter (slow)

Description was changed from ========== [string_util] fix bug in ReplaceSubstringsAfterOffset(); speed up ReplaceChars() The problem ...

3 years, 4 months ago (2017-07-26 19:20:03 UTC) #24

Description was changed from

==========
[string_util] fix bug in ReplaceSubstringsAfterOffset(); speed up ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=
==========

to

==========
[string_util] fix bug in ReplaceSubstringsAfterOffset(); speed up ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=749228
==========

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-26 19:29:50 UTC) #25

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_chromium_headless_rel on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_headless_rel/builds/134833)

3 years, 4 months ago (2017-07-26 19:29:51 UTC) #26

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-26 22:53:07 UTC) #27

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/120001

3 years, 4 months ago (2017-07-26 22:53:22 UTC) #28

ncarter (slow)

Description was changed from ========== [string_util] fix bug in ReplaceSubstringsAfterOffset(); speed up ReplaceChars() The problem ...

3 years, 4 months ago (2017-07-26 23:13:18 UTC) #29

Description was changed from

==========
[string_util] fix bug in ReplaceSubstringsAfterOffset(); speed up ReplaceChars()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate:

  ReplaceSubstringsAfterOffset(std::string('aaabaa'), 0, 'aa', 'ccc');

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

Additionally, this CL makes base::ReplaceChars() and base::RemoveChars() use the
new implementation of DoReplaceSubstringsAfterOffset, fixing a potential O(n^2)
performance issue. The algorithms here are identical, except that we use
string::find_first_of() instead of string::find(). This performance improvement
was the original purpose of this change, but testing it uncovered the
correctness bug.

BUG=749228
==========

to

==========
[string_util] fix bug in ReplaceSubstringsAfterOffset()

The problem with DoReplaceSubstringsAfterOffset was that the following search
would not terminate:

  ReplaceSubstringsAfterOffset(std::string("aaabaa"), 0, "aa", "ccc");

The root cause of this is an algorithmic bug, where it was assumed that
string::rfind and string::find would return the same matches in reversed order.

The fix is to only scan forward using find(). For the length-expansion case, if
capacity is insufficient, we swap with a temporary and build the result into new
memory. If existing capacity suffices, we'll shift the tail of the string down
to the new size, and then use left-to-right memmove loop of the length-
contraction case, with the shifted tail as the src.

BUG=749228
==========

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-26 23:18:15 UTC) #30

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/140001

3 years, 4 months ago (2017-07-26 23:18:33 UTC) #31

ncarter (slow)

nick@chromium.org changed reviewers: + danakj@chromium.org, pkasting@chromium.org

3 years, 4 months ago (2017-07-26 23:50:13 UTC) #32

ncarter (slow)

Peter, Dana -- Please review this bug fix https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if ...

3 years, 4 months ago (2017-07-26 23:50:14 UTC) #33

Peter, Dana --

Please review this bug fix

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
Using a temporary seems to result in fewer copies of the string overall. From
what I can tell, using a temporary can be more efficient than str->replace(_, _,
str, _, _) in MSVC's/dinkumware's STL, since when replace() needs to grow the
capacity, it does a memcpy from the old to the new buffer, and then a memmove
from the new buffer to itself.

So we could actually consider always using a temp and dropping the memmove
codepath, if we don't worry about the memory churn, or the effects of losing the
information embedded in str's capacity.

Alternatively, if we really want to preserve the exponental growth of str's
capacity, we could always do the memmove loop, even when the resize/replace loop
would need to realloc.

In other words: there's two implementations of lengthening here, they should
both be correct in all cases, and I'd be happy to drop one.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:773: str->reserve(final_length);
Should we worry that using a temporary here will break the 'momentum' of str's
capacity?

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:818: do {
Note that this is essentially the old 'string is shortening' loop (line 759 on
the LHS); it just got shifted down here so that it could be shared with the
in-place lengthening case.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:830: memmove(&(*str)[write_offset],
&(*str)[read_offset],
Any reason we preferred memmove over string::replace()?

Peter Kasting

It seems like the non-termination issue was because of the "if (current_match == first_match) return;" ...

3 years, 4 months ago (2017-07-27 01:09:57 UTC) #34

It seems like the non-termination issue was because of the "if (current_match ==
first_match) return;" in the final loop.  We could change that condition to
break out of the loop more accurately.

That would still leave the problem that the difference between find() and
rfind() handles "find 'aa' in 'aaa'" differently; it's not clear whether
"replace 'aa' with 'ccc' in 'aaa'" should return ccca, accc, or... something
else?  (ccccc?  :P)  So the function's behavior would still be a bit
indeterminate, which seems bad.

That said, this rewrite looks less efficient in a way I was specifically trying
to avoid (see first comment below), so I'm not too keen to do it either.  I'd
like to find a way to make the previous algorithm work, just tweak it.

I have to run, so don't have time to think about this more right now.  Maybe I
can think about it later if you haven't written a reply by then.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (left):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:783: // as we're reading it, needing to shift, or
having to copy to a second string
This "needing to shift" is, AFAICT, exactly what your current implementation
does, and it means that in the case of expansions, we basically double the
number of reads and writes of the old code.  I'd really like to avoid that.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:741: size_t offset = first_match;
Nit: It does one unnecessary comparison at the start of the loop, but I'm
wondering if it would be clearer to write this more like:

    for (size_t offset = first_match; offset != StringType::npos;
         offset = str->find(find_this.data(), offset + replace_length,
                            find_this.length()))
      str->replace(offset, find_length, replace_with.data(), replace_length);

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/26 23:50:14, ncarter (slow) wrote:
> Using a temporary seems to result in fewer copies of the string overall. From
> what I can tell, using a temporary can be more efficient than str->replace(_,
_,
> str, _, _) in MSVC's/dinkumware's STL, since when replace() needs to grow the
> capacity, it does a memcpy from the old to the new buffer, and then a memmove
> from the new buffer to itself.
> 
> So we could actually consider always using a temp and dropping the memmove
> codepath, if we don't worry about the memory churn, or the effects of losing
the
> information embedded in str's capacity.
> 
> Alternatively, if we really want to preserve the exponental growth of str's
> capacity, we could always do the memmove loop, even when the resize/replace
loop
> would need to realloc.
> 
> In other words: there's two implementations of lengthening here, they should
> both be correct in all cases, and I'd be happy to drop one.

I tried to optimize the original code to do as few memory accesses as possible. 
However, I did not look at (nor do I really want to look at) factors like "how
are these STL implementations built under the hood"; that can change at any
time.  The limit of how far I'll go with that is rough guesses like "memmove()
is exactly what I want, string::replace() has to handle other cases too" (see
reply lower down).

Preserving exponential capacity growth is a non-goal, IIUC; that's basically an
issue when people are repeatedly calling DoReplace... with longer strings,
right?  (Since the code right here is not in a loop.)

I am not a big fan of churning memory when we don't need to.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:773: str->reserve(final_length);
On 2017/07/26 23:50:14, ncarter (slow) wrote:
> Should we worry that using a temporary here will break the 'momentum' of str's
> capacity?

See reply above.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:830: memmove(&(*str)[write_offset],
&(*str)[read_offset],
On 2017/07/26 23:50:14, ncarter (slow) wrote:
> Any reason we preferred memmove over string::replace()?

I don't recall having one, just "I know I'm moving bytes, so I don't need the
string class to check whether it needs to expand/contract or the like -- memmove
is probably the most highly-optimized way of doing this".

Peter Kasting

LGTM https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (left): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#oldcode783 base/strings/string_util.cc:783: // as we're reading it, needing to shift, ...

3 years, 4 months ago (2017-07-27 05:25:14 UTC) #35

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-27 06:34:09 UTC) #36

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: win_chromium_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/498681)

3 years, 4 months ago (2017-07-27 06:34:10 UTC) #37

danakj

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode745 base/strings/string_util.cc:745: find_this.length()); nit: this could be written find_length? https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode760 base/strings/string_util.cc:760: ...

3 years, 4 months ago (2017-07-27 15:55:29 UTC) #38

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:745: find_this.length());
nit: this could be written find_length?

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:760: for (size_t match = first_match; match !=
StringType::npos;
we should be consistent and either for() always like this, or do{}while always
like ~L746

also we could start at first_match + find_length, with num_matches == 1, and
avoid reading the first match again.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:762: find_this.length())) {
nit: find_length?

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:765: expansion = (replace_length - find_length) *
num_matches;
this is nitty, but as per peter's comment below, it seems that we don't care to
track the num matches so much as we care to count so much to expand. ie inside
the loop we could expansion += replace_length - find_length (which could be
stored before the loop in a var) and then we don't need the multiply here.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/27 01:09:57, Peter Kasting wrote:
> On 2017/07/26 23:50:14, ncarter (slow) wrote:
> > Using a temporary seems to result in fewer copies of the string overall.
From
> > what I can tell, using a temporary can be more efficient than
str->replace(_,
> _,
> > str, _, _) in MSVC's/dinkumware's STL, since when replace() needs to grow
the
> > capacity, it does a memcpy from the old to the new buffer, and then a
memmove
> > from the new buffer to itself.
> > 
> > So we could actually consider always using a temp and dropping the memmove
> > codepath, if we don't worry about the memory churn, or the effects of losing
> the
> > information embedded in str's capacity.
> > 
> > Alternatively, if we really want to preserve the exponental growth of str's
> > capacity, we could always do the memmove loop, even when the resize/replace
> loop
> > would need to realloc.
> > 
> > In other words: there's two implementations of lengthening here, they should
> > both be correct in all cases, and I'd be happy to drop one.
> 
> I tried to optimize the original code to do as few memory accesses as
possible. 
> However, I did not look at (nor do I really want to look at) factors like "how
> are these STL implementations built under the hood"; that can change at any
> time.  The limit of how far I'll go with that is rough guesses like "memmove()
> is exactly what I want, string::replace() has to handle other cases too" (see
> reply lower down).
> 
> Preserving exponential capacity growth is a non-goal, IIUC; that's basically
an
> issue when people are repeatedly calling DoReplace... with longer strings,
> right?  (Since the code right here is not in a loop.)
> 
> I am not a big fan of churning memory when we don't need to.

I'd like to understand that we really want to use the memmove() one. Is the
claim that StrType::replace() is doing so with memmove()? In that case almost
always the src and dst are going to overlap for the shifting replace(), right?
Then the replace() involves a copy (and surely a malloc). By that time we've
done about as much work as this more concise path (maybe the number of bytes
copied is fewer).

>  The memory areas may overlap: copying takes place as
>  though the bytes in src are first copied into a temporary array that
>  does not overlap src or dest, and the bytes are then copied from the
>  temporary array to dest.
[http://man7.org/linux/man-pages/man3/memmove.3.html]

Then one each replace() to move characters up, in the worst case, many of them
overlap:

eg. replace ""aa" with "bbb"

cccccccaacccccccaacccccccaacccccccaa
->cccccccaa____cccccccaacccccccaacccccccaa
->cccccccbbb___cccccccaacccccccaacccccccaa
->cccccccbbbcccccccaa___cccccccaacccccccaa <- memmove overlaps = copy to temp
buffer (+ malloc?)
->cccccccbbbcccccccbbb__cccccccaacccccccaa
->cccccccbbbcccccccbbbcccccccaa__cccccccaa <- memmove overlaps = copy to temp
buffer (+ malloc?)
->cccccccbbbcccccccbbbcccccccbbb_cccccccaa
->cccccccbbbcccccccbbbcccccccbbbcccccccaa_ <- memmove overlaps = copy to temp
buffer (+ malloc?)
->cccccccbbbcccccccbbbcccccccbbbcccccccbbb

So this would be at least 3 copies into temp buffers, maybe 3 mallocs. Maybe
more if StrType::replace() is implemented with memmove().

Using the swap/reserve/copy seems better to me. Maybe we should
microbenchmark/measure what's faster?

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:779: : src.find(find_this.data(), pos,
find_this.length());
nit: find_length?

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:781: str->append(replace_with.data(),
replace_with.length());
nit: replace_length?

danakj

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/27 15:55:30, danakj ...

3 years, 4 months ago (2017-07-27 15:58:48 UTC) #39

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/27 15:55:30, danakj wrote:
> On 2017/07/27 01:09:57, Peter Kasting wrote:
> > On 2017/07/26 23:50:14, ncarter (slow) wrote:
> > > Using a temporary seems to result in fewer copies of the string overall.
> From
> > > what I can tell, using a temporary can be more efficient than
> str->replace(_,
> > _,
> > > str, _, _) in MSVC's/dinkumware's STL, since when replace() needs to grow
> the
> > > capacity, it does a memcpy from the old to the new buffer, and then a
> memmove
> > > from the new buffer to itself.
> > > 
> > > So we could actually consider always using a temp and dropping the memmove
> > > codepath, if we don't worry about the memory churn, or the effects of
losing
> > the
> > > information embedded in str's capacity.
> > > 
> > > Alternatively, if we really want to preserve the exponental growth of
str's
> > > capacity, we could always do the memmove loop, even when the
resize/replace
> > loop
> > > would need to realloc.
> > > 
> > > In other words: there's two implementations of lengthening here, they
should
> > > both be correct in all cases, and I'd be happy to drop one.
> > 
> > I tried to optimize the original code to do as few memory accesses as
> possible. 
> > However, I did not look at (nor do I really want to look at) factors like
"how
> > are these STL implementations built under the hood"; that can change at any
> > time.  The limit of how far I'll go with that is rough guesses like
"memmove()
> > is exactly what I want, string::replace() has to handle other cases too"
(see
> > reply lower down).
> > 
> > Preserving exponential capacity growth is a non-goal, IIUC; that's basically
> an
> > issue when people are repeatedly calling DoReplace... with longer strings,
> > right?  (Since the code right here is not in a loop.)
> > 
> > I am not a big fan of churning memory when we don't need to.
> 
> I'd like to understand that we really want to use the memmove() one. Is the
> claim that StrType::replace() is doing so with memmove()? In that case almost
> always the src and dst are going to overlap for the shifting replace(), right?
> Then the replace() involves a copy (and surely a malloc). By that time we've
> done about as much work as this more concise path (maybe the number of bytes
> copied is fewer).
> 
> >  The memory areas may overlap: copying takes place as
> >  though the bytes in src are first copied into a temporary array that
> >  does not overlap src or dest, and the bytes are then copied from the
> >  temporary array to dest.
> [http://man7.org/linux/man-pages/man3/memmove.3.html]
> 
> Then one each replace() to move characters up, in the worst case, many of them
> overlap:

Oops, I meant memmove() here as it's written explicitly as such for this case.
If I misunderstood the memove algo pls also correct me.

> 
> eg. replace ""aa" with "bbb"
> 
> cccccccaacccccccaacccccccaacccccccaa
> ->cccccccaa____cccccccaacccccccaacccccccaa
> ->cccccccbbb___cccccccaacccccccaacccccccaa
> ->cccccccbbbcccccccaa___cccccccaacccccccaa <- memmove overlaps = copy to temp
> buffer (+ malloc?)
> ->cccccccbbbcccccccbbb__cccccccaacccccccaa
> ->cccccccbbbcccccccbbbcccccccaa__cccccccaa <- memmove overlaps = copy to temp
> buffer (+ malloc?)
> ->cccccccbbbcccccccbbbcccccccbbb_cccccccaa
> ->cccccccbbbcccccccbbbcccccccbbbcccccccaa_ <- memmove overlaps = copy to temp
> buffer (+ malloc?)
> ->cccccccbbbcccccccbbbcccccccbbbcccccccbbb
> 
> So this would be at least 3 copies into temp buffers, maybe 3 mallocs. Maybe
> more if StrType::replace() is implemented with memmove().
> 
> Using the swap/reserve/copy seems better to me. Maybe we should
> microbenchmark/measure what's faster?

Peter Kasting

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { I don't think memmove() ...

3 years, 4 months ago (2017-07-27 22:09:47 UTC) #40

Peter Kasting

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/27 22:09:47, Peter ...

3 years, 4 months ago (2017-07-27 22:11:21 UTC) #41

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-28 01:38:46 UTC) #42

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/220001

3 years, 4 months ago (2017-07-28 01:39:02 UTC) #43

ncarter (slow)

ptal https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (left): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#oldcode783 base/strings/string_util.cc:783: // as we're reading it, needing to shift, ...

3 years, 4 months ago (2017-07-28 02:34:26 UTC) #44

ptal

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (left):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:783: // as we're reading it, needing to shift, or
having to copy to a second string
On 2017/07/27 05:25:14, Peter Kasting wrote:
> On 2017/07/27 01:09:57, Peter Kasting wrote:
> > This "needing to shift" is, AFAICT, exactly what your current implementation
> > does, and it means that in the case of expansions, we basically double the
> > number of reads and writes of the old code.  I'd really like to avoid that.
> 
> That said, the only way I can think of to do this that is behaviorally correct
> is to find() forwards through the string once and keep the match locations in
a
> vector (or similar).  This seems worse than your solution, as now we have to
> alloc separate memory for the vector and read/write to it, even if depending
on
> the inputs we may get faster performance.  The possible (not at all
guaranteed)
> perf win does not feel worth the memory churn to me.

I'd thought about this too. The best I came up with is to special-case it for
the cases where we know rfind is safe: when |find_this| is not self-overlapping
(not 100% sure how to compute that elegantly).

It seems like rfind is only really useful when the number of matches is low (as
the number of matches gets higher, we're more likely to need to realloc, and
then the append approach wins). So maybe a statically-sized, stack-allocated
array of forward-find positions, and use the append codepath when we run out of
slots?

First priority though is fixing the overrun, so I'm thinking that doing
something elaborate would be best left to a follow-on CL.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:741: size_t offset = first_match;
On 2017/07/27 01:09:57, Peter Kasting wrote:
> Nit: It does one unnecessary comparison at the start of the loop, but I'm
> wondering if it would be clearer to write this more like:
> 
>     for (size_t offset = first_match; offset != StringType::npos;
>          offset = str->find(find_this.data(), offset + replace_length,
>                             find_this.length()))
>       str->replace(offset, find_length, replace_with.data(), replace_length);

Done.

FWIW it looks like MSVC manages to omit the initial loop test -- it's known that
(first_match != npos), so it effectively compiles to the equivalent of the
do/while, with the "cmp rax,0FFFFFFFFFFFFFFFFh" happening at the bottom of the
loop body.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:745: find_this.length());
On 2017/07/27 15:55:30, danakj wrote:
> nit: this could be written find_length?

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:753: // O(n).
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: Somewhere, we need comments that say what the overarching strategies are,
> since it's not obvious if you don't know.  Basically, the idea is to use
> reader/writer indexes to replace-in-place, and if the new string is longer,
> first move most of the original string to the back of the buffer to buy enough
> scratch area to do this without overwriting the input.

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:760: for (size_t match = first_match; match !=
StringType::npos;
On 2017/07/27 15:55:30, danakj wrote:
> we should be consistent and either for() always like this, or do{}while always
> like ~L746
> 
> also we could start at first_match + find_length, with num_matches == 1, and
> avoid reading the first match again.

Switched to a for loop above, based on the conversation with pkasting above.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:762: find_this.length())) {
On 2017/07/27 15:55:30, danakj wrote:
> nit: find_length?

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:763: num_matches++;
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: Totally personal style, but postincrements really stick out to me since
> they're more costly with non-BTs.  Preincrement?

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:765: expansion = (replace_length - find_length) *
num_matches;
On 2017/07/27 15:55:30, danakj wrote:
> this is nitty, but as per peter's comment below, it seems that we don't care
to
> track the num matches so much as we care to count so much to expand. ie inside
> the loop we could expansion += replace_length - find_length (which could be
> stored before the loop in a var) and then we don't need the multiply here.

I've eliminated the imul instruction, by accumulating |expansion| inside the
loop.

However, I still have a num_matches-based loop, because it allows us to skip the
last redundant call to find() -- the one that searches the 
nonmatching suffix of the string. In this way, num_matches here serves the same
purpose that tracking/preserving the final match did in the original code.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/27 15:58:48, danakj wrote:
> On 2017/07/27 15:55:30, danakj wrote:
> > On 2017/07/27 01:09:57, Peter Kasting wrote:
> > > On 2017/07/26 23:50:14, ncarter (slow) wrote:
> > > > Using a temporary seems to result in fewer copies of the string overall.
> > From
> > > > what I can tell, using a temporary can be more efficient than
> > str->replace(_,
> > > _,
> > > > str, _, _) in MSVC's/dinkumware's STL, since when replace() needs to
grow
> > the
> > > > capacity, it does a memcpy from the old to the new buffer, and then a
> > memmove
> > > > from the new buffer to itself.
> > > > 
> > > > So we could actually consider always using a temp and dropping the
memmove
> > > > codepath, if we don't worry about the memory churn, or the effects of
> losing
> > > the
> > > > information embedded in str's capacity.
> > > > 
> > > > Alternatively, if we really want to preserve the exponental growth of
> str's
> > > > capacity, we could always do the memmove loop, even when the
> resize/replace
> > > loop
> > > > would need to realloc.
> > > > 
> > > > In other words: there's two implementations of lengthening here, they
> should
> > > > both be correct in all cases, and I'd be happy to drop one.
> > > 
> > > I tried to optimize the original code to do as few memory accesses as
> > possible. 
> > > However, I did not look at (nor do I really want to look at) factors like
> "how
> > > are these STL implementations built under the hood"; that can change at
any
> > > time.  The limit of how far I'll go with that is rough guesses like
> "memmove()
> > > is exactly what I want, string::replace() has to handle other cases too"
> (see
> > > reply lower down).
> > > 
> > > Preserving exponential capacity growth is a non-goal, IIUC; that's
basically
> > an
> > > issue when people are repeatedly calling DoReplace... with longer strings,
> > > right?  (Since the code right here is not in a loop.)
> > > 
> > > I am not a big fan of churning memory when we don't need to.
> > 
> > I'd like to understand that we really want to use the memmove() one. Is the
> > claim that StrType::replace() is doing so with memmove()? In that case
almost
> > always the src and dst are going to overlap for the shifting replace(),
right?
> > Then the replace() involves a copy (and surely a malloc). By that time we've
> > done about as much work as this more concise path (maybe the number of bytes
> > copied is fewer).
> > 
> > >  The memory areas may overlap: copying takes place as
> > >  though the bytes in src are first copied into a temporary array that
> > >  does not overlap src or dest, and the bytes are then copied from the
> > >  temporary array to dest.
> > [http://man7.org/linux/man-pages/man3/memmove.3.html]
> > 
> > Then one each replace() to move characters up, in the worst case, many of
them
> > overlap:
> 
> Oops, I meant memmove() here as it's written explicitly as such for this case.
> If I misunderstood the memove algo pls also correct me.
> 
> > 
> > eg. replace ""aa" with "bbb"
> > 
> > cccccccaacccccccaacccccccaacccccccaa
> > ->cccccccaa____cccccccaacccccccaacccccccaa
> > ->cccccccbbb___cccccccaacccccccaacccccccaa
> > ->cccccccbbbcccccccaa___cccccccaacccccccaa <- memmove overlaps = copy to
temp
> > buffer (+ malloc?)
> > ->cccccccbbbcccccccbbb__cccccccaacccccccaa
> > ->cccccccbbbcccccccbbbcccccccaa__cccccccaa <- memmove overlaps = copy to
temp
> > buffer (+ malloc?)
> > ->cccccccbbbcccccccbbbcccccccbbb_cccccccaa
> > ->cccccccbbbcccccccbbbcccccccbbbcccccccaa_ <- memmove overlaps = copy to
temp
> > buffer (+ malloc?)
> > ->cccccccbbbcccccccbbbcccccccbbbcccccccbbb
> > 
> > So this would be at least 3 copies into temp buffers, maybe 3 mallocs. Maybe
> > more if StrType::replace() is implemented with memmove().
> > 
> > Using the swap/reserve/copy seems better to me. Maybe we should
> > microbenchmark/measure what's faster?
> 

I agree with danakj's intuition that a temp seems faster overall, but it looks
like we count a different number of the malloc's in the memmove path.

From what I saw: the VC implementation of string::replace starts with a check to
see if the substring region overlaps with |str|, and if so, uses a more careful
path, where copies are done using the 'move' function from the string traits
(i.e. memmove, which is correct in the case of overlapping src/dst) rather than
the 'copy' function (i.e. memcpy). I didn't see any case where replace() creates
a temporary copy of the src, or where replace does a malloc, except to grow the
capacity.

Also, my understanding is that memmove does not ever call malloc (even though
the man page says something about "copying takes place AS THOUGH the bytes in
src are first copied into a temporary array").

In the in-place expansion path, I only expect malloc to happen in two places:
the resize() on line 799, and the subsequent replace() on line 801. But because
of the capacity check on line 768, it was my expectation that neither of these
would actually result in mallocs as written. If we were using the memmove path
unconditionally, we'd want to add a reserve(final_length) operation before the
resize(), and that would then be the only place where a malloc might happen.

Regarding tradeoffs -- here's what I see:

If existing capacity is sufficient, then avoiding a temporary has the following
advantages:
 - If there is only one match, it's optimal.
 - If there are multiple matches, but they're at the end of a long string, we
don't incur a copy for the region before the first match at all.
 - We don't incur the cost of a malloc, and don't churn memory.
 - We preserve the excess capacity; VC's string grows the buffer exponentially
(looks like a factor of 1.5, maybe). This might reduce the total number of
mallocs in usages like this:
https://cs.chromium.org/chromium/src/net/base/filename_util.cc?type=cs&q=Repl...,
where ReplaceSubstrings is called several times in a row. 

Using a temporary has tons of perf advantages, though:
 - Fewer copy operations in the worst case.
 - Since the copies are to non-overlapping regions, they'll use a memcpy()
instead of memmove(), which should be faster (though it's complicated: memmove
might get a speed boost from the fact that a copy operation on an overlapping
region touches fewer cache lines overall -- i.e. an in-place move consumes less
memory bandwidth, and one would hope that large copies are dominated by memory
bandwidth, not computation).

Basically, it comes down to "are the costs of the malloc for the temporary high
enough, that we'd be willing to incur an extra full-length memmove to avoid
it?".

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:773: str->reserve(final_length);
On 2017/07/27 01:09:57, Peter Kasting wrote:
> On 2017/07/26 23:50:14, ncarter (slow) wrote:
> > Should we worry that using a temporary here will break the 'momentum' of
str's
> > capacity?
> 
> See reply above.

Acknowledged.

FWIW this article is what was on my mind when I asked this question:
https://blog.mozilla.org/nnethercote/2014/11/04/please-grow-your-buffers-expo...

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:776: for (size_t i = 0; i < num_matches; ++i) {
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: I feel like we ought to be able to write this loop without |i|, just
using
> |pos|; basically, keep find()ing until you get npos.  Something like:
> 
>   for (size_t pos = first_match; pos != npos; pos = src.find(...)) {
> 
> This would also save a conditional in the loop body.

Writing it as you suggest means there's an extra call to find, though.

I considered three ways to structure this loop so that find() is only called
|num_matches - 1| times:
  * Unroll the first execution of the loop.
  * Add a conditional branch inside the loop that special-cases the first or
last iteration.
  * A mid-tested loop with a break statement prior to the find.

I've switched to the mid-tested loop option -- it's a little harder to read, but
it produces the fewest instructions.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:779: : src.find(find_this.data(), pos,
find_this.length());
On 2017/07/27 15:55:30, danakj wrote:
> nit: find_length?

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:781: str->append(replace_with.data(),
replace_with.length());
On 2017/07/27 15:55:30, danakj wrote:
> nit: replace_length?

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:802: str_length - shift_src);
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: We should probably be consistent about whether we use memmove both here
and
> below, or neither.

This particular replace can't be replaced with memmove, because this is in part
an append operation -- it grows the string. We could resize() the string and
then memmove, but that unnecessarily zeroes out the memory before overwriting
it.

To be consistent, I've switched all the non-length-altering calls to replace to
use either StringType::traits_type::move (essentially a typed memmove) or
StringType::traits_type::copy (essentially a typed memcpy).

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:818: do {
On 2017/07/26 23:50:14, ncarter (slow) wrote:
> Note that this is essentially the old 'string is shortening' loop (line 759 on
> the LHS); it just got shifted down here so that it could be shared with the
> in-place lengthening case.

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:830: memmove(&(*str)[write_offset],
&(*str)[read_offset],
On 2017/07/27 01:09:57, Peter Kasting wrote:
> On 2017/07/26 23:50:14, ncarter (slow) wrote:
> > Any reason we preferred memmove over string::replace()?
> 
> I don't recall having one, just "I know I'm moving bytes, so I don't need the
> string class to check whether it needs to expand/contract or the like --
memmove
> is probably the most highly-optimized way of doing this".

I've switched to StringType::traits_type::move/copy, which is basically a typed
version of memmove/memcpy, and it's what replace needs to call under the hood
anyway (per the standard). It eliminates the bounds-clamping, overlap-checking,
and length-equality checks that replace would have to do, and has three fewer
arguments to push on the stack each call.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:840: return;
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: Trailing return unnecessary

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util_unittest.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util_unittest.cc:625: for (size_t i = 0; i <
arraysize(cases); i++) {
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: Use range-based for (2 places)

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util_unittest.cc:628: str.shrink_to_fit();
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Note that shrink_to_fit() is a non-binding request.  It may not actually
reduce
> the capacity.  The comments should probably note that.  If it's critical to
the
> test, I don't know how to force this, since basic_string is intentionally
pretty
> unwilling to let people force specific capacities.

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util_unittest.cc:635: // Ample capacity case; will try to in
place
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: Will try to ____ in place?

Done.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util_unittest.cc:638: const void* original_buffer =
str.data();
On 2017/07/27 05:25:14, Peter Kasting wrote:
> Nit: Why not char*?
> 
> That said, I'm uncomfortable unittesting this.  It's not part of Replace...'s
> contract that it doesn't realloc.  I think the test should verify behavioral
> correctness per the contract and not try to verify that we're not doing a
> realloc.

Removed.

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-28 03:45:28 UTC) #45

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_android_rel_ng on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/linux_android_rel_ng/builds/349937)

3 years, 4 months ago (2017-07-28 03:45:29 UTC) #46

Peter Kasting

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/28 02:34:25, ncarter ...

3 years, 4 months ago (2017-07-28 08:29:25 UTC) #47

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/28 02:34:25, ncarter (slow) wrote:
> Using a temporary has tons of perf advantages, though:
>  - Fewer copy operations in the worst case.

By 1, right?  Because we replace "copy to back" with "swap".

>  - Since the copies are to non-overlapping regions, they'll use a memcpy()
> instead of memmove(), which should be faster (though it's complicated: memmove
> might get a speed boost from the fact that a copy operation on an overlapping
> region touches fewer cache lines overall -- i.e. an in-place move consumes
less
> memory bandwidth, and one would hope that large copies are dominated by memory
> bandwidth, not computation).

I'm not sure whether memcpy() will actually be faster than memmove() in practice
in almost any situation.  (See e.g.
https://stackoverflow.com/questions/28623895/why-is-memmove-faster-than-memcpy
or http://vgable.com/blog/2008/05/24/memcopy-memmove-and-speed-over-safety/ ,
and many similar links/arguments.)

In fact
https://stackoverflow.com/questions/4707012/is-it-better-to-use-stdmemcpy-or-...
suggests using std::copy()/std::copy_backward() depending on which direction you
need to go, instead of memcpy() or memmove().

> Basically, it comes down to "are the costs of the malloc for the temporary
high
> enough, that we'd be willing to incur an extra full-length memmove to avoid
> it?".

I would say yes; I would much rather memmove() n bytes than malloc() n bytes. 
That's just because "doing too many mallocs" and "heap fragmentation" have been
drilled into me as Terrible Evils over the last few years, especially in omnibox
code that ends up calling this function a lot (which is why I originally
optimized it from n^2 to n).

However, construct-new-and-swap is simpler.  And I'm not sure malloc is quite as
to-be-avoided as my instinct says.  So I'm not certain.

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:830: memmove(&(*str)[write_offset],
&(*str)[read_offset],
On 2017/07/28 02:34:25, ncarter (slow) wrote:
> On 2017/07/27 01:09:57, Peter Kasting wrote:
> > On 2017/07/26 23:50:14, ncarter (slow) wrote:
> > > Any reason we preferred memmove over string::replace()?
> > 
> > I don't recall having one, just "I know I'm moving bytes, so I don't need
the
> > string class to check whether it needs to expand/contract or the like --
> memmove
> > is probably the most highly-optimized way of doing this".
> 
> I've switched to StringType::traits_type::move/copy, which is basically a
typed
> version of memmove/memcpy, and it's what replace needs to call under the hood
> anyway (per the standard). It eliminates the bounds-clamping,
overlap-checking,
> and length-equality checks that replace would have to do, and has three fewer
> arguments to push on the stack each call.

It's more readable too.  +1.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:744: StringType::traits_type::copy(buffer + offset,
replace_with.data(),
Nit: Wonder if something like either "using traits = StringType::traits_type" or
"const auto& copy = StringType::traits_type::copy" could be used to reduce the
verbosity here and below.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:790: while (true) {
Nit: Or:

  for (size_t match = first_match; ;
       match = src.find(find_this.data(), pos, find_length)) {
    ...

    // Doing this check here instead of in the loop conditional saves one total
    // find() call over the whole loop.
    if (!--num_matches)
      break;
  }

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
File base/strings/string_util_unittest.cc (right):

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util_unittest.cc:630: // Insufficient capacity case; must
realloc the string if it grows.
Nit: Maybe want to hoist this (and the similar comment in the next loop) to
above the loop (and reword slightly accordingly).

danakj

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/28 08:29:24, Peter ...

3 years, 4 months ago (2017-07-28 15:53:49 UTC) #48

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/28 08:29:24, Peter Kasting wrote:
> On 2017/07/28 02:34:25, ncarter (slow) wrote:
> > Using a temporary has tons of perf advantages, though:
> >  - Fewer copy operations in the worst case.
> 
> By 1, right?  Because we replace "copy to back" with "swap".
> 
> >  - Since the copies are to non-overlapping regions, they'll use a memcpy()
> > instead of memmove(), which should be faster (though it's complicated:
memmove
> > might get a speed boost from the fact that a copy operation on an
overlapping
> > region touches fewer cache lines overall -- i.e. an in-place move consumes
> less
> > memory bandwidth, and one would hope that large copies are dominated by
memory
> > bandwidth, not computation).
> 
> I'm not sure whether memcpy() will actually be faster than memmove() in
practice
> in almost any situation.  (See e.g.
> https://stackoverflow.com/questions/28623895/why-is-memmove-faster-than-memcpy
> or http://vgable.com/blog/2008/05/24/memcopy-memmove-and-speed-over-safety/ ,
> and many similar links/arguments.)
> 
> In fact
>
https://stackoverflow.com/questions/4707012/is-it-better-to-use-stdmemcpy-or-...
> suggests using std::copy()/std::copy_backward() depending on which direction
you
> need to go, instead of memcpy() or memmove().
> 
> > Basically, it comes down to "are the costs of the malloc for the temporary
> high
> > enough, that we'd be willing to incur an extra full-length memmove to avoid
> > it?".
> 
> I would say yes; I would much rather memmove() n bytes than malloc() n bytes. 
> That's just because "doing too many mallocs" and "heap fragmentation" have
been
> drilled into me as Terrible Evils over the last few years, especially in
omnibox
> code that ends up calling this function a lot (which is why I originally
> optimized it from n^2 to n).
> 
> However, construct-new-and-swap is simpler.  And I'm not sure malloc is quite
as
> to-be-avoided as my instinct says.  So I'm not certain.

In my profiling work in chromium, code has been consistently O(# of mallocs).
They dominate pretty much everything else. If memmove doesn't cause a malloc
then that should perform better. Measuring is how to know for sure tho of
course.

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-28 20:45:36 UTC) #49

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/240001

3 years, 4 months ago (2017-07-28 20:45:56 UTC) #50

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-28 20:56:35 UTC) #51

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: android_cronet on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_cronet/builds/189559)

3 years, 4 months ago (2017-07-28 20:56:37 UTC) #52

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-28 21:05:25 UTC) #53

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/260001

3 years, 4 months ago (2017-07-28 21:05:39 UTC) #54

ncarter (slow)

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/28 08:29:24, Peter ...

3 years, 4 months ago (2017-07-28 21:46:34 UTC) #55

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/28 08:29:24, Peter Kasting wrote:
> On 2017/07/28 02:34:25, ncarter (slow) wrote:
> > Using a temporary has tons of perf advantages, though:
> >  - Fewer copy operations in the worst case.
> 
> By 1, right?  Because we replace "copy to back" with "swap".

No, by a difference of 2. In the worse case the whole string would get memcpy'ed
or memmoved nearly 3 times for the "grow in place" path, versus exactly 1 time
for the in-place append path. (there is also one extra find() in the in-place
path, since the loop doesn't know to stop after the last match), Additionally,
there may be some extra memset's if we hit the resize call on line 799 (these
are Patch Set 8 line numbers).

Imagine the following case.

  str = "xyxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyxy"
  str.capacity() == str.length() == 41
  find_this = "x"
  replace_with = "xx"

Let's say we decide trying to use the in-place copy here. We get to the line 801
replace() operation, which is:

  str->replace(4, 37, str, 1, 40);

  Internally, in the MSVC impl, this will do a reserve(42) operation:

		if (this->_Mysize() < _Newsize)
			_Grow(_Newsize);

Which then hits this case inside of Grow (Myres is the capacity):

		if (this->_Myres() < _Newsize)
			_Copy(_Newsize, this->_Mysize());	// reallocate to grow

Which always does a fresh allocation:

		pointer _Ptr;
		_TRY_BEGIN
			_Ptr = this->_Getal().allocate(_Newres + 1);

And then memcpy's from the old ptr to the new ptr:

		if (0 < _Oldlen)
			_Traits::copy(_Unfancy(_Ptr), this->_Myptr(),
				_Oldlen);	// copy existing elements

Then we're back in replace(), which will hit a path like this:

		else if (_Roff <= _Off)
			{	// hole gets larger, substring begins before hole
			_Traits::move(this->_Myptr() + _Off + _Count,
				this->_Myptr() + _Off + _N0, _Nm);	// move tail down
			_Traits::move(this->_Myptr() + _Off,
				this->_Myptr() + _Roff, _Count);	// fill hole
			}

_Nm here is zero, so really the memmove here is the second one, which is
something like:

   _Traits::move(this->_Myptr() + 4, this->_Myptr() + 1, 37);

After that operation the memory is like this:
  xyxyxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyxy

And each byte of the new memory has been written this many times:
  1112222222222222222222222222222222222222211

So this replace() operation is itself 2 whole-string memcpys (really because the
MSVC STL is missing a potential optimization here, but it is what it is). Then
we have the third almost-whole-string memcpy in our code, to shift the middle
bank of y's down. So three writes to almost every memory location:

  2223333333333333333333333333333333333333321

The old rfind code, worst-case, was 1 whole-string memcpy (for the reserve()
call implicit in resize()), then a memzero of |expansion| chars (initializing
the extended tail in the resize() call), then the mmemmove pass in our code. So
it looked like this:

reserve (memcpy):
  1111111111111111111111111111111111111111000
resize (memzero):
+ 0000000000000000000000000000000000000000111
our loop (memmove/memcpy):
+ 1111111111111111111111111111111111111111111
---------------------------------------------
= 2222222222222222222222222222222222222222222 (total cost)

Whereas the append approach looks like this, all of the time:
1111111111111111111111111111111111111111111

========================

My apples-to-apples breakdown accounting (which ignores find() cost, which is
the same for all approaches) is something like:

 - append to new memory: requires one malloc and |str_length| char copies. Is
correct.
 - rfind approach, sufficient capacity: requires |str_length - first_match| char
copies (plus |new_length - str_length| memzeros). Is incorrect.
 - rfind approach, insufficient capacity: requires |2*str_length - first_match|
char copies (plus |new_length - str_length| memzeros) and one malloc. Is
incorrect.
 - shift-down approach, sufficient capacity: requires |str_length + last_match -
2*first_match| char copies. Is correct.
 - shift-down approach, insufficient capacity: requires |2*str_length +
last_match - 2*first_match| char copies and one malloc. Is correct.
========================

Given the above table, the choice is obvious except for the sufficient-capacity
case, where we're picking between:

(A) append (one malloc and |str_length| char copies), vs
(B) shift-down (|str_length + last_match - 2*first_match| char copies, which is
2*str_length in the worst case -- as described below).

The best case for shift-down is late-string matches, where the cost is basically
zero (last_match == first_match == str_length). The worst case is when
(last_match == str_length && first_match == 0), and the cost is 2*str_length.

Also -- I think this bears repeating -- our shift-down is optimal (copy-wise)
for the single-match case.

All in all, I think really points us in the direction of including the
shift-down strategy in the sufficient-capacity case. (2*str_length) copies in
the worst case, with 1*str_length copies in the average case, and an optimally
low number of copies in an important common case -- this seems likelt to be
better than a guaranteed cost of (1 malloc + 1*str_length copies) in all cases.

If we want to support an additional block of copy logic, we can add
preprocessing to determine if rfind is safe. rfind is always safe for
single-char |find_this|, and safety is trivial (a[0] == a[len - 1]) to determine
for 2-char and 3-char |find_this|. We don't need to do an rfind determination if
num_matches == 1, or if capacity is insufficient. Maybe as a follow-on CL?

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:744: StringType::traits_type::copy(buffer + offset,
replace_with.data(),
On 2017/07/28 08:29:24, Peter Kasting wrote:
> Nit: Wonder if something like either "using traits = StringType::traits_type"
or
> "const auto& copy = StringType::traits_type::copy" could be used to reduce the
> verbosity here and below.

Great suggestion. Done, with CharTraits as the name, which is suggestive of
char_traits.h where these things are defined.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:790: while (true) {
On 2017/07/28 08:29:24, Peter Kasting wrote:
> Nit: Or:
> 
>   for (size_t match = first_match; ;
>        match = src.find(find_this.data(), pos, find_length)) {
>     ...
> 
>     // Doing this check here instead of in the loop conditional saves one
total
>     // find() call over the whole loop.
>     if (!--num_matches)
>       break;
>   }
> 

Done.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
File base/strings/string_util_unittest.cc (right):

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util_unittest.cc:630: // Insufficient capacity case; must
realloc the string if it grows.
On 2017/07/28 08:29:24, Peter Kasting wrote:
> Nit: Maybe want to hoist this (and the similar comment in the next loop) to
> above the loop (and reword slightly accordingly).

Done.

danakj

I'll be away monday so won't get to look at this again for a little ...

3 years, 4 months ago (2017-07-28 22:00:28 UTC) #56

Peter Kasting

LGTM https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/28 21:46:34, ...

3 years, 4 months ago (2017-07-29 02:18:01 UTC) #57

LGTM

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/28 21:46:34, ncarter (slow) wrote:
> No, by a difference of 2. In the worse case the whole string would get
memcpy'ed
> or memmoved nearly 3 times for the "grow in place" path, versus exactly 1 time
> for the in-place append path.

To summarize your explanation: if we have to grow the string, we copy the whole
thing at that point.  So we have copy, shift to end, expand back, versus just
writing the expanded string.  And we don't save a malloc by not using a temp in
this case since growing the string has to alloc.

So when we lack capacity, we should clearly use a temp as you have done.

> All in all, I think really points us in the direction of including the
> shift-down strategy in the sufficient-capacity case. (2*str_length) copies in
> the worst case, with 1*str_length copies in the average case, and an optimally
> low number of copies in an important common case -- this seems likelt to be
> better than a guaranteed cost of (1 malloc + 1*str_length copies) in all
cases.

I agree with this.

> If we want to support an additional block of copy logic, we can add
> preprocessing to determine if rfind is safe. rfind is always safe for
> single-char |find_this|, and safety is trivial (a[0] == a[len - 1]) to
determine
> for 2-char and 3-char |find_this|. We don't need to do an rfind determination
if
> num_matches == 1, or if capacity is insufficient. Maybe as a follow-on CL?

Indeed, I'd do it separately, if at all.  We're talking about reducing constant
factors on reads/writes at that point rather than changing the complexity or
saving mallocs, so I doubt it's worth worrying about.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:740: auto* buffer = &((*str)[0]);
On 2017/07/28 22:00:28, danakj wrote:
> does str->data() not do what you want here?

data() returns pointer-to-const until C++17, unfortunately.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:790: while (true) {
On 2017/07/28 21:46:34, ncarter (slow) wrote:
> On 2017/07/28 08:29:24, Peter Kasting wrote:
> > Nit: Or:
> > 
> >   for (size_t match = first_match; ;
> >        match = src.find(find_this.data(), pos, find_length)) {
> >     ...
> > 
> >     // Doing this check here instead of in the loop conditional saves one
> total
> >     // find() call over the whole loop.
> >     if (!--num_matches)
> >       break;
> >   }
> > 
> 
> Done.

I suggest adding that comment I proposed, or one like it, to the test-at-end
code, since otherwise it looks like a prime place to refactor into doing this in
the conditional.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:840: size_t match = std::min(
On 2017/07/28 22:00:28, danakj wrote:
> this min() call reads very dubiously, as npos is -1, and then we go and
subtract
> from it. I think some comment explaining would be helpful here.

FWIW, I've always thought of npos as SIZE_MAX, not -1, so this code makes total
sense to me.

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-29 02:22:14 UTC) #58

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: android_n5x_swarming_rel on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_swarming_rel/builds/231956)

3 years, 4 months ago (2017-07-29 02:22:16 UTC) #59

ncarter (slow)

The CQ bit was checked by nick@chromium.org to run a CQ dry run

3 years, 4 months ago (2017-07-31 18:50:26 UTC) #60

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/300001

3 years, 4 months ago (2017-07-31 18:50:41 UTC) #61

ncarter (slow)

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_util.cc#newcode768 base/strings/string_util.cc:768: if (str->capacity() < final_length) { On 2017/07/29 02:18:01, Peter ...

3 years, 4 months ago (2017-07-31 18:55:39 UTC) #62

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/140001/base/strings/string_ut...
base/strings/string_util.cc:768: if (str->capacity() < final_length) {
On 2017/07/29 02:18:01, Peter Kasting wrote:
> On 2017/07/28 21:46:34, ncarter (slow) wrote:
> > No, by a difference of 2. In the worse case the whole string would get
> memcpy'ed
> > or memmoved nearly 3 times for the "grow in place" path, versus exactly 1
time
> > for the in-place append path.
> 
> To summarize your explanation: if we have to grow the string, we copy the
whole
> thing at that point.  So we have copy, shift to end, expand back, versus just
> writing the expanded string.  And we don't save a malloc by not using a temp
in
> this case since growing the string has to alloc.
> 
> So when we lack capacity, we should clearly use a temp as you have done.
> 
> > All in all, I think really points us in the direction of including the
> > shift-down strategy in the sufficient-capacity case. (2*str_length) copies
in
> > the worst case, with 1*str_length copies in the average case, and an
optimally
> > low number of copies in an important common case -- this seems likelt to be
> > better than a guaranteed cost of (1 malloc + 1*str_length copies) in all
> cases.
> 
> I agree with this.
> 
> > If we want to support an additional block of copy logic, we can add
> > preprocessing to determine if rfind is safe. rfind is always safe for
> > single-char |find_this|, and safety is trivial (a[0] == a[len - 1]) to
> determine
> > for 2-char and 3-char |find_this|. We don't need to do an rfind
determination
> if
> > num_matches == 1, or if capacity is insufficient. Maybe as a follow-on CL?
> 
> Indeed, I'd do it separately, if at all.  We're talking about reducing
constant
> factors on reads/writes at that point rather than changing the complexity or
> saving mallocs, so I doubt it's worth worrying about.

Agree.

FWIW, I figured out the algorithm for the "determine if a string has
self-overlap that would make rfind potentially dangerous" and it's actually
really elegant: seems to require exactly |find_length/2| char comparisons in the
worst case. But it's also really subtle, and I'm not convinced it's worth the
complexity/documentation cost -- it's a slippery slope from there to full on
boyer-moore. Even so: a recommended fun exercise / interview question.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
File base/strings/string_util.cc (right):

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:740: auto* buffer = &((*str)[0]);
On 2017/07/29 02:18:01, Peter Kasting wrote:
> On 2017/07/28 22:00:28, danakj wrote:
> > does str->data() not do what you want here?
> 
> data() returns pointer-to-const until C++17, unfortunately.

Yes, unfortunately.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:790: while (true) {
On 2017/07/29 02:18:01, Peter Kasting wrote:
> On 2017/07/28 21:46:34, ncarter (slow) wrote:
> > On 2017/07/28 08:29:24, Peter Kasting wrote:
> > > Nit: Or:
> > > 
> > >   for (size_t match = first_match; ;
> > >        match = src.find(find_this.data(), pos, find_length)) {
> > >     ...
> > > 
> > >     // Doing this check here instead of in the loop conditional saves one
> > total
> > >     // find() call over the whole loop.
> > >     if (!--num_matches)
> > >       break;
> > >   }
> > > 
> > 
> > Done.
> 
> I suggest adding that comment I proposed, or one like it, to the test-at-end
> code, since otherwise it looks like a prime place to refactor into doing this
in
> the conditional.

Done.

https://codereview.chromium.org/2979393002/diff/220001/base/strings/string_ut...
base/strings/string_util.cc:840: size_t match = std::min(
On 2017/07/29 02:18:01, Peter Kasting wrote:
> On 2017/07/28 22:00:28, danakj wrote:
> > this min() call reads very dubiously, as npos is -1, and then we go and
> subtract
> > from it. I think some comment explaining would be helpful here.
> 
> FWIW, I've always thought of npos as SIZE_MAX, not -1, so this code makes
total
> sense to me.

I added a comment, since it confused me at first too.

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

3 years, 4 months ago (2017-07-31 19:56:38 UTC) #63

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: linux_chromium_chromeos_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_chromeos_rel_ng/builds/480281)

3 years, 4 months ago (2017-07-31 19:56:39 UTC) #64

danakj

LGTM https://codereview.chromium.org/2979393002/diff/300001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/300001/base/strings/string_util.cc#newcode855 base/strings/string_util.cc:855: // If we're shortening the string, truncate it ...

3 years, 4 months ago (2017-08-01 19:55:55 UTC) #65

Peter Kasting

https://codereview.chromium.org/2979393002/diff/300001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/300001/base/strings/string_util.cc#newcode855 base/strings/string_util.cc:855: // If we're shortening the string, truncate it now. ...

3 years, 4 months ago (2017-08-01 19:59:21 UTC) #66

danakj

https://codereview.chromium.org/2979393002/diff/300001/base/strings/string_util.cc File base/strings/string_util.cc (right): https://codereview.chromium.org/2979393002/diff/300001/base/strings/string_util.cc#newcode855 base/strings/string_util.cc:855: // If we're shortening the string, truncate it now. ...

3 years, 4 months ago (2017-08-01 20:00:37 UTC) #67

ncarter (slow)

The patchset sent to the CQ was uploaded after l-g-t-m from pkasting@chromium.org Link to the ...

3 years, 4 months ago (2017-08-01 22:21:40 UTC) #69

commit-bot: I haz the power

CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2979393002/300001

3 years, 4 months ago (2017-08-01 22:21:57 UTC) #70

commit-bot: I haz the power

CQ is committing da patch. Bot data: {"patchset_id": 300001, "attempt_start_ts": 1501626099946300, "parent_rev": "310fda3e3c3fb4e2c257e3962a81d44397cf7297", "commit_rev": "09d9682b0039e3b1dd16e5be60f5ebe4f698021f"}

3 years, 4 months ago (2017-08-02 00:31:21 UTC) #71

commit-bot: I haz the power

Description was changed from ========== [string_util] fix bug in ReplaceSubstringsAfterOffset() The problem with DoReplaceSubstringsAfterOffset was ...

3 years, 4 months ago (2017-08-02 00:31:36 UTC) #72

commit-bot: I haz the power

3 years, 4 months ago (2017-08-02 00:31:38 UTC) #73

Message was sent while issue was closed.

Committed patchset #16 (id:300001) as
https://chromium.googlesource.com/chromium/src/+/09d9682b0039e3b1dd16e5be60f5...

Issue 2979393002: [string_util] fix bug in ReplaceSubstringsAfterOffset() (Closed)

Description

Patch Set 1 #

Patch Set 2 : Substring replacement. #

Patch Set 3 : Change upstream #

Patch Set 4 : Support in-place expansion if capacity is available #

Patch Set 5 : StringPiece #

Patch Set 6 : typename #

Patch Set 7 : ReplaceCharsT #

Patch Set 8 : Reduce scope of change to just correctness fix. #

Patch Set 9 : Things. #

Patch Set 10 : preincrement. #

Patch Set 11 : Scratch. #

Patch Set 12 : Unittest fixes. #

Patch Set 13 : pkasting fixes #

Patch Set 14 : typename for clang #

Patch Set 15 : npos. #

Patch Set 16 : str_length #

Messages