> 2023年4月26日 01:50,Paul E. McKenney <paulmck@xxxxxxxxxx> 写道: > > On Wed, Apr 26, 2023 at 01:06:57AM +0800, Alan Huang wrote: >> Hi Akira, >> >>> 2023年4月25日 23:33,Akira Yokosawa <akiyks@xxxxxxxxx> 写道: >>> >>> Hi Alan, >>> >>> On Tue, 25 Apr 2023 21:52:48 +0800, Alan Huang wrote: >>>> Hi, >>>> >>>> I noticed that the modifications of seq in write_seqlock and write_sequnlock of Listing 9.10 use plain ++ operation. >>>> But as Chapter 4 (especially 4.3.4.4) says, there will be store tearing since there are concurrent readers. >>>> >>>> However, the kernel implementation of sequence lock also uses plain ++ operation. >>>> >>>> I’m somewhat confused. >>>> >>>> Why does the modification of seq in write_seqlock not require WRITE_ONCE? >>> >>> Good question... >>> >>> I don't have a straight answer, but the history of include/linux/seqlock.h >>> says those plain increments predate the introduction of ACCESS_ONCE >>> (predecessor to READ_ONCE/WRITE_ONCE) to the Linux kernel. >>> >>> The code at the time (pre v2.6.0) looked like this: >>> >>> static inline void write_seqlock(seqlock_t *sl) >>> { >>> spin_lock(&sl->lock); >>> ++sl->sequence; >>> smp_wmb(); >>> } >>> >>> static inline void write_sequnlock(seqlock_t *sl) >>> { >>> smp_wmb(); >>> sl->sequence++; >>> spin_unlock(&sl->lock); >>> } >>> >>> static inline int write_tryseqlock(seqlock_t *sl) >>> { >>> int ret = spin_trylock(&sl->lock); >>> >>> if (ret) { >>> ++sl->sequence; >>> smp_wmb(); >>> } >>> return ret; >>> } >>> >>> The purpose of WRITE_ONCE() would be to suppress compiler optimization >>> of write accesses. In the code above, smp_wmb(), spin_lock(), >>> spin_unlock(), and spin_trylock() all imply compiler barriers. >>> So, there is not much room for compilers to optimize the store part >>> of increments. >> >> But unsigned long is at least 32 bits, here is the COMPILER BARRIER section of Documentation/memory-barriers.txt: >> >> For example, given an architecture having 16-bit store instructions with 7-bit immediate fields, the compiler >> might be tempted to use two 16-bit store-immediate instructions to implement the following 32-bit store: >> p = 0x00010002; >> >> The kernel defines sequence as unsigned, which is at least 16 bits, I don’t know if there exists an architecture having 16-bit store instructions with 32-bit unsigned… > > There have been CPUs and compilers that would use a pair of 16-bit > store-immediate instructions to store a 32-bit constant. I don't know > of any situation where a compiler would use 16-bit loads or stores for > a 32-bit increment, but there is no law forbidding the compiler from > doing so. > > To answer the higher-level question, once we had the Linux-kernel memory > model in place, I focused more on explaining it and clearly not enough > on bringing the perfook code samples into line with it. > > So I would welcome a patch making sequence locking use READ_ONCE() > and WRITE_ONCE() to do the increments. The Linux kernel might not > be so excited about the corresponding patch because it could result > in slightly worse code being generated, for example, a++ might use > an x86 add-to-memory instruction while the READ_ONCE()/WRITE_ONCE() > counterpart would load, increment, and store. Though I hear rumors that > some compilers might start better optimizing this. I see. Let’s keep the sample code consistent with kernel’s. Thank you both for the explanation! Thanks, Alan > > Thanx, Paul > >> Thanks, >> Alan >> >>> >>> I think those plain-looking increments are safe as far as current >>> compilers are concerned. >>> >>> Does this explanation help you? >>> >>> Paul, please chime in if I'm missing something. >>> >>> Thanks, Akira >>> >>>> >>>> Thanks, >>>> Alan