On Fri, Jun 29, 2018 at 11:55:08AM +0800, Xiao Guangrong wrote: > > > On 06/28/2018 07:55 PM, Wei Wang wrote: > >On 06/28/2018 06:02 PM, Xiao Guangrong wrote: > >> > >>CC: Paul, Peter Zijlstra, Stefani, Lai who are all good at memory barrier. > >> > >> > >>On 06/20/2018 12:52 PM, Peter Xu wrote: > >>>On Mon, Jun 04, 2018 at 05:55:17PM +0800, guangrong.xiao@xxxxxxxxx wrote: > >>>>From: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx> > >>>> > >>>>It's the simple lockless ring buffer implement which supports both > >>>>single producer vs. single consumer and multiple producers vs. > >>>>single consumer. > >>>> > >>>>Many lessons were learned from Linux Kernel's kfifo (1) and DPDK's > >>>>rte_ring (2) before i wrote this implement. It corrects some bugs of > >>>>memory barriers in kfifo and it is the simpler lockless version of > >>>>rte_ring as currently multiple access is only allowed for producer. > >>> > >>>Could you provide some more information about the kfifo bug? Any > >>>pointer would be appreciated. > >>> > >> > >>Sure, i reported one of the memory barrier issue to linux kernel: > >> https://lkml.org/lkml/2018/5/11/58 > >> > >>Actually, beside that, there is another memory barrier issue in kfifo, > >>please consider this case: > >> > >> at the beginning > >> ring->size = 4 > >> ring->out = 0 > >> ring->in = 4 > >> > >> Consumer Producer > >> --------------- -------------- > >> index = ring->out; /* index == 0 */ > >> ring->out++; /* ring->out == 1 */ > >> < Re-Order > > >> out = ring->out; > >> if (ring->in - out >= ring->mask) > >> return -EFULL; > >> /* see the ring is not full */ > >> index = ring->in & ring->mask; /* index == 0 */ > >> ring->data[index] = new_data; > >> ring->in++; > >> > >> data = ring->data[index]; > >> !!!!!! the old data is lost !!!!!! > >> > >>So we need to make sure: > >>1) for the consumer, we should read the ring->data[] out before updating ring->out > >>2) for the producer, we should read ring->out before updating ring->data[] > >> > >>as followings: > >> Producer Consumer > >> ------------------------------------ ------------------------ > >> Reading ring->out Reading ring->data[index] > >> smp_mb() smp_mb() > >> Setting ring->data[index] = data ring->out++ > >> > >>[ i used atomic_store_release() and atomic_load_acquire() instead of smp_mb() in the > >> patch. ] > >> > >>But i am not sure if we can use smp_acquire__after_ctrl_dep() in the producer? > > > > > >I wonder if this could be solved by simply tweaking the above consumer implementation: > > > >[1] index = ring->out; > >[2] data = ring->data[index]; > >[3] index++; > >[4] ring->out = index; > > > >Now [2] and [3] forms a WAR dependency, which avoids the reordering. > > It can not. [2] and [4] still do not any dependency, CPU and complainer can omit > the 'index'. One thing to try would be the Linux-kernel memory model tools in tools/memory-model in current mainline. There is a README file describing how to install and set it up, with a number of files in Documentation and litmus-tests that can help guide you. Thanx, Paul