Re: atomic operation in 32 bit but no in 64!?

On 29-02-08 01:34, Peter Teoh wrote:

On Sat, Feb 9, 2008 at 8:10 AM, Rene Herman <rene.herman@xxxxxxxxxxxx> wrote:
On 09-02-08 00:22, Diego Woitasen wrote:

 >       I was reading the code of include/linux/fs.h and saw a comment
 >       before i_size_read() that says:
 >
 >       /*
 >        * NOTE: in a 32bit arch with a preemptable kernel and an UP
 >        * compile the i_size_read/write must be atomic with respect to
 >        * the local cpu (unlike with preempt disabled), but they don't
 >        * need to be atomic with respect to other cpus like in true SMP
 >        * (so they need either to either locally disable irq around the
 >        * read or for example on x86 they can be still implemented as a
 >        * cmpxchg8b without the need of the lock prefix). For SMP
 >        * compiles and 64bit archs it makes no difference if preempt is
 >        * enabled or not.
 >        */
 >
 >        I don't understand why this function shouldn't be atomic in a 64
 >        bit arch or why it isn't locked. Where is the race condition
 >        prevented?

 In the CPU's ALU. inode->i_size is a 64-bit integer, and access to it is
 atomic on 64-bit. On a 32-bit arch though, a 64-bit load will be split into
 two 32-bit ones where you can get an incoherent value if you're interrupted
 between getting the low and the high 32 bits.

I understand what you are saying, as i_size is loff_t, and loff_t is
defined as "long long". But the fact is that, of the thousands of
assembly instructions in the kernel, execution can always be interrupted
between any two of them, so long as the interrupt handler ensures
that all the registers it modified are restored to their
original values upon returning. So I don't quite understand why it
cannot be interrupted between the loads of the upper and lower 32-bit
halves.

It's not about registers. i_size_read() is designed so that it can be called without i_sem held, which means it has to guard against i_size changing out from under it.

Say process A wants to know inode->i_size. On a 32-bit arch the read is going to be split into two 32-bit loads such as (Intel pseudo-syntax):

	mov	eax, [inode->i_size]		; low 32 bits
	mov	edx, [inode->i_size + 4]	; high 32 bits

Now imagine process A being preempted just between these two loads by process B, and process B changing inode->i_size. When process A resumes it gets the _new_ upper 32 bits while it already has the _old_ lower 32 bits, making for a combined 64-bit value which is complete nonsense.
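
To make that interleaving concrete, here is a small user-space C sketch that replays it deterministically. None of this is kernel code: struct split64, store64(), join64(), torn_read() and torn_read_example() are names made up for illustration, and the "preemption" is simply a store placed between the two loads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for inode->i_size, kept as the two 32-bit
 * halves a 32-bit CPU actually loads and stores. */
struct split64 {
	uint32_t lo;
	uint32_t hi;
};

/* Process B's update: two separate 32-bit stores. */
static void store64(struct split64 *s, uint64_t v)
{
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
}

/* Recombine two halves into the value process A believes it read. */
static uint64_t join64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Replays the bad interleaving: A loads the low half, B changes the
 * size, A loads the high half. */
static uint64_t torn_read(struct split64 *s, uint64_t new_size)
{
	uint32_t lo = s->lo;		/* process A: first load */
	store64(s, new_size);		/* process B runs here */
	uint32_t hi = s->hi;		/* process A: second load */

	return join64(lo, hi);
}

/* Worked case: the file grows from 4 GiB - 1 to 4 GiB. */
static void torn_read_example(void)
{
	struct split64 size;
	uint64_t seen;

	store64(&size, 0x00000000FFFFFFFFULL);
	seen = torn_read(&size, 0x0000000100000000ULL);

	/* Old low half plus new high half: neither the old nor the new size. */
	assert(seen == 0x00000001FFFFFFFFULL);
}
```

With the size growing by a single byte across the 4 GiB boundary, process A ends up seeing 0x1FFFFFFFF, a size the file never had.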

Now, mind you, exactly how much point there is in any specific code path checking i_size without grabbing i_sem is open for discussion. Even if the locking gets you a _coherent_ value, it may still be an _outdated_ one if you're preempted right after this sequence, but that's a higher-level issue. A bit of googling suggests stat() wants to read it without taking the lock. You'd have to ask a VFS person for a more detailed answer as to the why at that higher level.

Perhaps Andrew feels chatty...

But the core issue is just that you need to get a coherent value.
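
One way to get a coherent value without taking a lock is to let the reader notice that a write happened in the middle and simply retry. The sketch below assumes a sequence counter that the writer bumps before and after each update. It is a single-threaded illustration with made-up names (struct sized, write_size(), read_size(), grow_on_first_try(), seq_read_example()), it leaves out the memory barriers a real SMP implementation would need, and it should not be read as the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit size kept as two halves plus a sequence counter.
 * Even counter: value is stable. Odd counter: a write is in progress. */
struct sized {
	unsigned int seq;
	uint32_t lo;
	uint32_t hi;
};

static void write_size(struct sized *s, uint64_t v)
{
	s->seq++;			/* now odd: update started */
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
	s->seq++;			/* even again: update finished */
}

/* Reads both halves and retries if the counter moved in between.
 * `preempt` is an illustration-only hook that runs between the two
 * loads so a writer can be simulated; `tries` may be NULL. */
static uint64_t read_size(struct sized *s,
			  void (*preempt)(struct sized *, int),
			  int *tries)
{
	unsigned int start;
	uint32_t lo, hi;
	int attempt = 0;

	do {
		start = s->seq;
		lo = s->lo;
		if (preempt)
			preempt(s, attempt);
		hi = s->hi;
		attempt++;
	} while ((start & 1) || start != s->seq);

	if (tries)
		*tries = attempt;
	return ((uint64_t)hi << 32) | lo;
}

/* Simulated process B: changes the size during the first attempt only. */
static void grow_on_first_try(struct sized *s, int attempt)
{
	if (attempt == 0)
		write_size(s, 0x0000000100000000ULL);
}

static void seq_read_example(void)
{
	struct sized size = { 0, 0, 0 };
	int tries = 0;

	write_size(&size, 0x00000000FFFFFFFFULL);

	/* The first pass is torn and thrown away; the second is coherent. */
	assert(read_size(&size, grow_on_first_try, &tries) ==
	       0x0000000100000000ULL);
	assert(tries == 2);
}
```

The reader never blocks the writer here. It just pays for an extra pass whenever it loses the race, which matches the "coherent, though possibly outdated" guarantee discussed above.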

Rene.

--
To unsubscribe from this list: send an email with
"unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
Please read the FAQ at http://kernelnewbies.org/FAQ

