On 17.12.21 22:28, Thomas Gleixner wrote:
> On Fri, Dec 17 2021 at 12:30, David Hildenbrand wrote:
>> Sometimes it is required to have a seqcount implementation that uses
>> a structure with a fixed and minimal size -- just a bare unsigned int --
>> independent of the kernel configuration. This is especially valuable when
>> the raw_ variants of the seqlock functions will be used and the additional
>> lockdep part of the seqcount_t structure remains essentially unused.
>>
>> Let's provide a lockdep-free raw_seqcount_t variant that can be used via
>> the raw functions to have a basic seqlock.
>>
>> The target use case is embedding a raw_seqcount_t in the "struct page",
>> where we really want a minimal size and cannot tolerate a sudden growth
>> of the seqcount_t structure resulting in a significant "struct page"
>> increase or even a layout change.
>

Hi Thomas,

thanks for your feedback!

> Cannot tolerate? Could you please provide a reason and not just a
> statement?

Absolutely. "struct page" is supposed to have a minimal size with a
fixed layout. Embedding something inside such a structure can change
the fixed layout in a way that completely breaks any assumptions about
the location of values. Therefore, embedding a complex structure in it
is usually avoided -- and if we have to (spin_lock), we work around
sudden size increases.

There are ways around it: allocate the lock separately and only store a
pointer to it in the struct page. But that most certainly adds
complexity, which is why I want to avoid it for now.

I'll extend that answer and add it to the patch description.

>
>> Provide raw_read_seqcount_retry(), to make it easy to match to
>> raw_read_seqcount_begin() in the code.
>>
>> Let's add a short documentation as well.
>>
>> Note: There might be other possible users for raw_seqcount_t where the
>> lockdep part might be completely unused and just wastes memory --
>> essentially any users that only use the raw_ function variants.
>
> Even when the reader side uses raw_seqcount_begin/retry(), the writer
> side can still use the non-raw variant, which validates that the
> associated lock is held on write.

Yes, that's my understanding as well.

>
> Aside of that, your proposed extra raw sequence count needs extra care
> vs. PREEMPT_RT, and this wants to be very clearly documented.

Why?

>
> The lock association has two purposes:
>
>   1) Lockdep coverage, which unearthed bugs already

Yes, that's a real shame to lose.

>
>   2) PREEMPT_RT livelock prevention
>
>      Assume the following:
>
>        spin_lock(wrlock);
>        write_seqcount_begin(seq);
>
>        ---> preemption by a high priority reader
>
>        seqcount_begin(seq); <-- live lock
>
>      The RT substitution does:
>
>        seqcount_begin(seq)
>          cnt = READ_ONCE(seq->sequence);
>
>          if (cnt & 1) {
>                  lock(s->lock);
>                  unlock(s->lock);
>          }
>
>      which prevents the deadlock because it makes the reader block on
>      the associated lock, which allows the preempted writer to make
>      progress.
>
>      This applies to raw_seqcount_begin() as well.
>
> I have no objections against the construct itself, but this has to be
> properly documented vs. the restriction this imposes.

Absolutely, any input is highly appreciated.

But to mention it again: whatever you can do with raw_seqcount_t, you
can do with seqcount_t, and there are already users relying completely
on the raw_ function variants (see my other reply). So the
documentation should most probably be extended to cover the raw_
functions and seqcount_t in general.
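To make the raw_-only usage a bit more concrete, here is a rough sketch
(the foo_* names are made up for illustration, and I'm assuming the
raw_write_* helpers accept the new type just like the raw_read_* ones;
the initializer macro is omitted):

  /* Minimal seqcount protecting a single value: no lockdep, no lock. */
  static raw_seqcount_t foo_seqcount;	/* assumed zero-initialized */
  static unsigned long foo_value;

  static unsigned long foo_read(void)
  {
          unsigned int seq;
          unsigned long val;

          do {
                  /*
                   * Raw variants: no lockdep, and on PREEMPT_RT there is
                   * no associated lock the reader could block on.
                   */
                  seq = raw_read_seqcount_begin(&foo_seqcount);
                  val = foo_value;
          } while (raw_read_seqcount_retry(&foo_seqcount, seq));

          return val;
  }

  static void foo_write(unsigned long val)
  {
          /*
           * The writer has to make sure it cannot be preempted here,
           * otherwise a higher-priority reader can livelock on RT.
           */
          raw_write_seqcount_begin(&foo_seqcount);
          foo_value = val;
          raw_write_seqcount_end(&foo_seqcount);
  }

The comment in the writer is exactly the restriction you describe:
without an associated lock, the only way to keep RT readers from
spinning forever is to not get preempted inside the write section.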
>
> As you can see above, the writer side therefore has to ensure that it
> cannot be preempted on PREEMPT_RT, which limits the possibilities of
> what you can do inside a preemption (or interrupt) disabled section on
> RT enabled kernels. See Documentation/locking/locktypes.rst for further
> information.

It's going to be used for THP, which are currently incompatible with
PREEMPT_RT (disabled in the Kconfig). But preemption is also disabled
because we're using bit_spin_lock(), which does a preempt_disable().

Certainly worth documenting! Thanks for your input!

-- 
Thanks,

David / dhildenb