On 2010-11-09 12:53, Bart Van Assche wrote:
> On Mon, Nov 8, 2010 at 2:03 PM, Jens Axboe <jaxboe@xxxxxxxxxxxx> wrote:
>>
>> On 2010-11-07 13:58, Bart Van Assche wrote:
>>> On Sun, Nov 7, 2010 at 12:43 PM, Jens Axboe <jaxboe@xxxxxxxxxxxx> wrote:
>>>>
>>>> On 2010-11-06 10:35, Bart Van Assche wrote:
>>>>> On multicore non-x86 CPUs fio has been observed to frequently reports false
>>>>> data verification failures with I/O engine libaio and I/O depths above one.
>>>>> This is because of a race condition in the function fill_pattern(). The code
>>>>> in that function only works correct if all CPUs of a multicore system
>>>>> observe store instructions in the order they were issued. That is the case for
>>>>> multicore x86 systems but not for all other CPU families, such as e.g. the
>>>>> POWER CPU family.
>>>>>
>>>>> [ ... ]
>>
>> Forgive me, but I'm still a little confused. This second write_barrier()
>> is now protecting against the order of the fill and the length
>> assignment. IOW, if you see the new length, you are guaranteed to also
>> see the new content. This means that the first memory barrier should be
>> a read_barrier().
>>
>> And ditto for the other case.
>>
>> Can you verify whether that works as expected and send an updated patch?
>
> Hello Jens,
>
> I'm afraid that I will have to do more testing and that I'll have to
> make sure that I understand the entire fio code base before I can
> develop and send a new patch - something I do not have the time for
> now unfortunately. I ran into this issue on 32-bit 2.6.34.7 kernel
> while running a test on a local ext3 filesystem, something I will have
> to analyze further before I can proceed:
>
> $ valgrind ./fio --ioengine=libaio --overwrite=1 --verify=md5
> --iodepth=10 --direct=1 --loops=10 --size=1MB --name=test --thread
> --numjobs=10 --group_reporting
> ==13318== Memcheck, a memory error detector
> ==13318== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
> ==13318== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
> ==13318== Command: ./fio --ioengine=libaio --overwrite=1 --verify=md5
> --iodepth=10 --direct=1 --loops=10 --size=1MB --name=test --thread
> --numjobs=10 --group_reporting
> ==13318==
> test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10
> ...
> test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10

This looks pretty straight forward - the file is created, but not filled
with a verifiable pattern. You want to run the workload with rw=write at
least once first, then you can use a read-only verify workload later if
you want.

--
Jens Axboe
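
For reference, the barrier pairing discussed in the quoted thread (fill
the buffer, write barrier, then store the length; on the other side load
the length, read barrier, then read the content) can be sketched roughly
as below. This is not fio's code. struct shared_buf, publish() and
try_consume() are hypothetical names made up for the illustration, and
C11 release/acquire fences are used as a stand-in for fio's
write_barrier() and read_barrier().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical shared buffer, not a fio data structure. */
struct shared_buf {
    unsigned char data[BUF_SIZE];
    atomic_size_t len;              /* 0 means "nothing published yet" */
};

/* Writer side: fill the content first, then publish the length. */
static void publish(struct shared_buf *b, unsigned char pattern, size_t len)
{
    memset(b->data, pattern, len);

    /*
     * Plays the role of the write barrier: the fill above must be
     * visible before the length store below becomes visible.
     */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&b->len, len, memory_order_relaxed);
}

/*
 * Reader side: load the length, then read the content.
 * Returns -1 if nothing has been published yet, otherwise the number
 * of bytes that do not match the expected pattern.
 */
static long try_consume(struct shared_buf *b, unsigned char pattern)
{
    size_t len = atomic_load_explicit(&b->len, memory_order_relaxed);
    long bad = 0;
    size_t i;

    if (len == 0)
        return -1;

    /*
     * Plays the role of the read barrier: having seen the new length,
     * the content reads below must not be satisfied with older data.
     */
    atomic_thread_fence(memory_order_acquire);

    for (i = 0; i < len; i++)
        if (b->data[i] != pattern)
            bad++;
    return bad;
}
```

In real use publish() and try_consume() would run on different threads
or CPUs; the two fences only matter there. Without the pair, a reader on
a weakly ordered CPU such as POWER could see the new length and still
read stale content, which would show up as a false verification failure.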