On Tue, Jan 10, 2023 at 09:32:55AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 10, 2023 at 08:23:05AM +0100, Heiko Carstens wrote:
> > So, Alexander Gordeev reported that this code was already prior to your
> > changes potentially broken with respect to missing READ_ONCE() within the
> > cmpxchg_double() loops.
>
> Unless there's an early exit, that shouldn't matter. If you managed to
> read garbage the cmpxchg itself will simply fail and the loop retries.
>
> > @@ -1294,12 +1306,16 @@ static void hw_perf_event_update(struct perf_event *event, int flush_all)
> >  		num_sdb++;
> >
> >  		/* Reset trailer (using compare-double-and-swap) */
> > +		/* READ_ONCE() 16 byte header */
> > +		prev.val = __cdsg(&te->header.val, 0, 0);
> >  		do {
> > +			old.val = prev.val;
> > +			new.val = prev.val;
> > +			new.f = 0;
> > +			new.a = 1;
> > +			new.overflow = 0;
> > +			prev.val = __cdsg(&te->header.val, old.val, new.val);
> > +		} while (prev.val != old.val);
>
> So this, and ...
> this case are just silly and expensive. If that initial read is split
> and manages to read gibberish the cmpxchg will fail and we retry anyway.

While I do agree that there is no need to necessarily read the whole 16 bytes
atomically in advance here, there is still the problem about the missing
initial READ_ONCE() in the original code. As I tried to outline here:

For example:

	/* Reset trailer (using compare-double-and-swap) */
	do {
		te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK;
		te_flags |= SDB_TE_ALERT_REQ_MASK;
	} while (!cmpxchg_double(&te->flags, &te->overflow,
				 te->flags, te->overflow,
				 te_flags, 0ULL));

The compiler could generate code where te->flags used within the
cmpxchg_double() call may be refetched from memory and which is not
necessarily identical to the previous read version which was used to
generate te_flags. Which in turn means that an incorrect update could
happen.

Is there anything that prevents te->flags from being read several times?
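
To make the concern concrete, here is a minimal userspace sketch of the
single-snapshot pattern. It is not the kernel code: READ_ONCE() is a stand-in
macro, the mask values and reset_trailer_flags() are made up for illustration,
and it updates one 64-bit word with a GCC/Clang builtin instead of the 16-byte
pair. The point is only that the value used to compute the new flags and the
value passed as the expected "old" value come from the same load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's READ_ONCE(); an assumption of this sketch. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical mask values, chosen only for illustration. */
#define TE_BUFFER_FULL_MASK	(1ULL << 63)
#define TE_ALERT_REQ_MASK	(1ULL << 62)

/* Clear the "buffer full" bit and set the "alert request" bit atomically. */
static void reset_trailer_flags(uint64_t *flags)
{
	uint64_t old, new;

	assert(flags != NULL);
	do {
		/* Exactly one load per iteration. */
		old = READ_ONCE(*flags);
		/* The new value is derived from that snapshot ... */
		new = (old & ~TE_BUFFER_FULL_MASK) | TE_ALERT_REQ_MASK;
		/* ... and the same snapshot is the expected old value. */
	} while (!__atomic_compare_exchange_n(flags, &old, new, 0,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_SEQ_CST));
}
```

With this shape the compiler has no second te->flags expression it could
refetch, so a successful compare-and-swap always stores a value derived from
what was actually compared.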

> > +	/* READ_ONCE() 16 byte header */
> > +	prev.val = __cdsg(&te->header.val, 0, 0);
> >  	do {
> > +		old.val = prev.val;
> > +		new.val = prev.val;
> > +		*overflow = old.overflow;
> > +		if (old.f) {
> >  			/*
> >  			 * SDB is already set by hardware.
> >  			 * Abort and try to set somewhere
> > @@ -1490,10 +1509,10 @@ static bool aux_set_alert(struct aux_buffer *aux, unsigned long alert_index,
> >  			 */
> >  			return false;
> >  		}
> > +		new.a = 1;
> > +		new.overflow = 0;
> > +		prev.val = __cdsg(&te->header.val, old.val, new.val);
> > +	} while (prev.val != old.val);
>
> And while this case has an early exit, it only cares about a single bit
> (although you made it a full word) and so also shouldn't care. If
> aux_reset_buffer() returns false, @overflow isn't consumed.

Yes, except that it is anything but obvious that @overflow isn't consumed.

> So I really don't see the point of this patch.

As stated above: READ_ONCE() is missing. And while at it I wanted to have a
consistent complete previous value - also considering that cdsg is not very
expensive. And while at it, also reuse the returned values from cdsg, instead
of throwing them away and reading from memory again in a split and
potentially inconsistent way.
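
A rough userspace sketch of the "reuse the returned value" idea, under stated
assumptions: the header is shrunk to a single 64-bit word, the bit layout of
union te_header is hypothetical, cas_val() stands in for __cdsg() by way of the
GCC/Clang __sync_val_compare_and_swap() builtin (which, like cdsg, hands back
the previous memory contents), and set_alert() is an illustrative name. Unlike
the hunk above, this sketch stores to *overflow only after the swap succeeded,
so that it is obvious the value is not consumed on the early exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shrunken header layout; not the real trailer entry. */
union te_header {
	uint64_t val;
	struct {
		uint64_t overflow : 62;
		uint64_t a : 1;		/* alert request */
		uint64_t f : 1;		/* buffer full */
	};
};

/* Stand-in for __cdsg(): returns what was in memory before the attempt. */
static uint64_t cas_val(uint64_t *ptr, uint64_t old, uint64_t new)
{
	return __sync_val_compare_and_swap(ptr, old, new);
}

static bool set_alert(union te_header *te, uint64_t *overflow)
{
	union te_header prev, old, new;

	assert(te != NULL && overflow != NULL);
	/* One consistent snapshot of the whole word up front. */
	prev.val = __atomic_load_n(&te->val, __ATOMIC_RELAXED);
	do {
		old.val = prev.val;
		new.val = prev.val;
		if (old.f)
			return false;	/* already full; *overflow untouched */
		new.a = 1;
		new.overflow = 0;
		/* On failure the returned value is the next snapshot. */
		prev.val = cas_val(&te->val, old.val, new.val);
	} while (prev.val != old.val);

	*overflow = old.overflow;
	return true;
}
```

Every iteration works on one complete copy of the header, and a failed swap
already supplies the fresh copy, so memory is never read again field by field.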