Martin Ågren <martin.agren@xxxxxxxxx> writes: > If two threads have one freshly initialized string buffer each and call > strbuf_reset on them at roughly the same time, both threads will be > writing a '\0' to strbuf_slopbuf. That is not a problem in practice > since it doesn't matter in which order the writes happen. But > ThreadSanitizer will consider this a race. > > When compiling with GIT_THREAD_SANITIZER, avoid writing to > strbuf_slopbuf. Let's instead assert on the first byte of strbuf_slopbuf > being '\0', since it ensures the promised invariant of "buf[len] == > '\0'". (Writing to strbuf_slopbuf is normally bad, but could become even > more bad if we stop covering it up in strbuf_reset.) > > Signed-off-by: Martin Ågren <martin.agren@xxxxxxxxx> > --- > strbuf.h | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/strbuf.h b/strbuf.h > index e705b94db..295654d39 100644 > --- a/strbuf.h > +++ b/strbuf.h > @@ -153,7 +153,19 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) > /** > * Empty the buffer by setting the size of it to zero. > */ > +#ifdef GIT_THREAD_SANITIZER > +#define strbuf_reset(sb) \ > + do { \ > + struct strbuf *_sb = sb; \ > + _sb->len = 0; \ > + if (_sb->buf == strbuf_slopbuf) \ > + assert(!strbuf_slopbuf[0]); \ > + else \ > + _sb->buf[0] = '\0'; \ > + } while (0) > +#else > #define strbuf_reset(sb) strbuf_setlen(sb, 0) > +#endif > > > /** The strbuf_slopbuf[] is a shared resource that is expected by everybody to stay a holder of a NUL. Even though it is defined as "char [1]", it in spirit ought to be considered const. And from that point of view, your new definition that is conditionally used only when sanitizer is in use _is_ the more correct one than the current "we do not care if it is slopbuf, we are writing \0 so it will be no-op anyway" code. I wonder if we excessively call strbuf_reset() in the real code to make your version unacceptably expensive? If not, I somehow feel that using this version unconditionally may be a better approach. What happens when a caller calls "strbuf_setlen(&sb, 0)" on a strbuf that happens to have nothing and whose buffer still points at the slopbuf (instead of calling _reset())? Shouldn't your patch fix that function instead, i.e. something like the following without the above? Is that make things noticeably and measurably too expensive? Thanks. strbuf.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/strbuf.h b/strbuf.h index 2075384e0b..1a77fe146a 100644 --- a/strbuf.h +++ b/strbuf.h @@ -147,7 +147,10 @@ static inline void strbuf_setlen(struct strbuf *sb, size_t len) if (len > (sb->alloc ? sb->alloc - 1 : 0)) die("BUG: strbuf_setlen() beyond buffer"); sb->len = len; - sb->buf[len] = '\0'; + if (sb->buf != strbuf_slopbuf) + sb->buf[len] = '\0'; + else + assert(!strbuf_slopbuf[0]); } /**