[sent again due to stupid HTML mail problems, sorry] On Thu, Nov 6, 2014 at 11:54 PM, Måns Rullgård <mans@xxxxxxxxx> wrote: > Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes: > >> On Thu, Nov 06, 2014 at 10:12:54PM +0000, Måns Rullgård wrote: >>> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes: >>> >>> > On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote: >>> >> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes: >>> >> >>> >> > On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote: >>> >> >> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes: >>> >> >> >>> >> >> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote: >>> >> >> >> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes: >>> >> >> >> >>> >> >> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote: >>> >> >> >> >> The current implementation of put_tty_queue() causes a race condition >>> >> >> >> >> when re-arranged by the compiler. >>> >> >> >> >> >>> >> >> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line >>> >> >> >> >> >>> >> >> >> >> *read_buf_addr(ldata, ldata->read_head++) = c; >>> >> >> >> >> >>> >> >> >> >> was re-arranged by the compiler to something like >>> >> >> >> >> >>> >> >> >> >> x = ldata->read_head >>> >> >> >> >> ldata->read_head++ >>> >> >> >> >> *read_buf_addr(ldata, x) = c; >>> >> >> >> >> >>> >> >> >> >> which causes a race condition. Invalid data is read if data is read >>> >> >> >> >> before it is actually written to the read buffer. >>> >> >> >> > >>> >> >> >> > Really? A compiler can rearange things like that and expect things to >>> >> >> >> > actually work? How is that valid? >>> >> >> >> >>> >> >> >> This is actually required by the C spec. There is a sequence point >>> >> >> >> before a function call, after the arguments have been evaluated. Thus >>> >> >> >> all side-effects, such as the post-increment, must be complete before >>> >> >> >> the function is called, just like in the example. >>> >> >> >> >>> >> >> >> There is no "re-arranging" here. The code is simply wrong. >>> >> >> > >>> >> >> > Ah, ok, time to dig out the C spec... >>> >> >> > >>> >> >> > Anyway, because of this, no need for the wmb() calls, just rearrange the >>> >> >> > logic and all should be good, right? Christian, can you test that >>> >> >> > instead? >>> >> >> >>> >> >> Weakly ordered SMP systems probably need some kind of barrier. I didn't >>> >> >> look at it carefully. >>> >> > >>> >> > It shouldn't need a barier, as it is a sequence point with the function >>> >> > call. Well, it's an inline function, but that "shouldn't" matter here, >>> >> > right? >>> >> >>> >> Sequence points say nothing about the order in which stores become >>> >> visible to other CPUs. That's why there are barrier instructions. >>> > >>> > Yes, but "order" matters. >>> > >>> > If I write code that does: >>> > >>> > 100 x = ldata->read_head; >>> > 101 &ldata->read_head[x & SOME_VALUE] = y; >>> > 102 ldata->read_head++; >>> > >>> > the compiler can not reorder lines 102 and 101 just because it feels >>> > like it, right? Or is it time to go spend some reading of the C spec >>> > again... >>> >>> The compiler can't. The hardware can. All the hardware promises is >>> that at some unspecified time in the future, both memory locations will >>> have the correct values. Another CPU might see 'read_head' updated >>> before it sees the corresponding data value. A wmb() between the writes >>> forces the CPU to complete preceding stores before it begins subsequent >>> ones. >> >> Yes, sorry, I'm not talking about other CPUs and what they see, I'm >> talking about the local one. I'm not assuming that this is SMP "safe" >> at all. If it is supposed to be, then yes, we do have problems, but >> there should be a lock _somewhere_ protecting this. > > Within the confines of a single CPU + memory, barriers are never needed. > The moment another CPU or master-capable peripheral enters the mix, > proper ordering must be enforced somehow. > > If the buffer is already protected by a lock of some kind, this will > provide the necessary barriers, so nothing further is necessary. If > it's a lock-less design, there will need to be barriers somewhere. It was changed to lock-less with 3.12 in commit 6d76bd2618535c581f1673047b8341fd291abc67 ("n_tty: Make N_TTY ldisc receive path lockless"). So I will try to read the memory barrier docs again. Of course my little ARM system is no SMP system, but I guess this should also be fixed for the SMP case, right? Thanks, Christian -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html