Hi,

On Wed, Aug 30, 2017 at 8:21 PM, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
> Hi,
>
> (apologies for the previous duff subject line - the moving process
> involved creating a fake mail that had a duff subject and I forgot to
> update it)
>
> On 30 August 2017 at 14:16, Elhamer Oussama AbdelKHalek
> <abdelkhalekdev@xxxxxxxxx> wrote:
>>
>> On Wed, Aug 30, 2017 at 7:54 AM, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>>>
>>> On 30 August 2017, Abdelkhalek Oussama Elhamer <Abdelkhalekdev@xxxxxxxxx> wrote:
>>>
>>>> I want to test how a couple of Intel devices perform with a specific
>>>> workload that contains mainly random 128k writes, but the random
>>>> offsets must be generated using a Fisher and Yates shuffle algorithm
>>>> (basically to make sure that there are zero collisions between
>>>> requested offsets).
>>>
>>> I'm going to stop you right there: do you know that fio maintains an
>>> "axmap" by default (Jens swears it wasn't named after him over in
>>> https://github.com/axboe/fio/blob/308d69b5d340577b7886696f39753b7ba5ae9e11/lib/axmap.c
>>> ) to ensure when you choose a random workload (which is uniform by
>>> default) every block is hit only once per pass with no overlaps (i.e.
>>> no collisions)? See
>>> http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-norandommap
>>> which turns this behaviour off for more details. Anyway...
>>
>> I am afraid there was a bunch of collisions between the offsets
>> and i didn't set the norandommap attribute,
>> i have printed the generated offsets (the values of *b) for different
>> offsets (on a size of 2M using bs=128k): http://prntscr.com/gex940
>> Unfortunately, the collisions are visible :/.
>
> A challenge! OK:
> $ fio --randseed=25 --debug=io --size=2M --ioengine=null
> --rw=randwrite --name=checkdup | awk 'match($0,
> /fill_io_u.*off=([0-9]+).*/, arr) { print arr[1] }' | sort | uniq -d
> | wc -l
> 0
>
> No duplicate offsets with the default "uniform" distribution.

I got the same result at my end. At first I didn't know you could dump
the generated offsets using --debug=io, so I printed the values of '*b'
explicitly for 8 different seeds (--size=2m --bs=128k --rw=randwrite
--ioengine=libaio) by adding a printf to the source code and
recompiling fio. That output contains collisions!
If I understand correctly, when a generated random value is already
in the axmap table, another one is generated?
Here is a dump of both the *b and off values (16 offsets):
https://gist.github.com/SamTheDev/29ede9f3ef2466c5b5d2e026f4bd2aec
Where exactly in the code is the *b offset updated (according to the
axmap table)? And does that mean that changing only the *b value for a
custom random generator won't do the trick?

>
> fio --random_distribution=zipf:0.8 --randseed=25 --debug=io --size=2M
> --ioengine=null --rw=randwrite --name=checkdup | awk 'match($0,
> /fill_io_u.*off=([0-9]+).*/, arr) { print arr[1] }' | sort | uniq -d
> | wc -l
> 74
>
> We see duplicate offsets with a non-uniform random distribution.
>
> $ ./fio --version
> fio-3.0-9-gb427-dirty
>
> This is on Ubuntu 16.04 x86_64. We'd really need to see the job that
> created your duplicate offsets (please don't skip providing the
> information in https://github.com/axboe/fio/blob/master/REPORTING-BUGS
> :-)
>
> Even without the randommap fio has options that can pick different
> random generators:
> http://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-random-generator
>

Thank you, lfsr suits my needs perfectly :)).

>
>>>> So, i added an implementation to that random generator to fio, and
>>>> added a “get_next_Fisher_and_Yates_offset” function to retrieve the
>>>> next offset each time.
>>>>
>>>> To quickly test that, i replaced the default TAUSWORTHE random offset
>>>> function call, more precisely I’ve changed the __get_next_rand_offset
>>>> function from the io_u.c file to something like that:
>>>>
> [...]
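
For anyone following along, here is a rough illustration of the idea
behind a Fisher-Yates offset generator. To be clear, this is NOT the
patch elided above and it is not fio code: it is a minimal Python
sketch, and the function name, the seed and the 2M / 128k sizes are
only assumptions chosen to match the small test case in this thread.

```python
import random


def fisher_yates_offsets(size, block_size, seed):
    """Yield each block-aligned offset in [0, size) exactly once, shuffled.

    Hypothetical helper for illustration; not part of fio.
    """
    nr_blocks = size // block_size
    blocks = list(range(nr_blocks))
    rng = random.Random(seed)
    # Fisher-Yates: walk from the last slot down, swapping each slot
    # with a randomly chosen slot at or before it.
    for i in range(nr_blocks - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        blocks[i], blocks[j] = blocks[j], blocks[i]
    for block in blocks:
        yield block * block_size


# size=2M with bs=128k gives 16 blocks, as in the dump mentioned earlier.
offsets = list(fisher_yates_offsets(2 * 1024 * 1024, 128 * 1024, seed=25))
print(len(offsets), len(set(offsets)))  # prints "16 16": no duplicates
```

Because the output is a permutation of the block numbers, collisions
are impossible by construction, but the table needs one entry per block.
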
>>>>
>>>> Is that enough to replace the default random generator (for a quick test)??
>>>>
>>>> PS: the reason i am asking is that after doing that, i am getting top
>>>> performances each run, even on limited fully written device range.
>>>> this doesn’t look right! :/
>
> I should have asked - what performance were you expecting to see?
>

Sorry, I forgot to mention that the drive was fully written and
partially trimmed before running the randwrite job.
On a full NVMe drive, I ran a randtrim job, and after a reasonable
delay I ran the randwrite job using the same randseed.
My drive's bandwidth can go up to 1.4 Gb/s. With the default TAUSWORTHE
random generator the bw was around 150 Mb/s, which sounds reasonable,
but with the Fisher-Yates generator I was getting top performance
(around 1.3 Gb/s!).

The job files look something like this (rw is randtrim for the trim
job and randwrite for the write job):

[global]
ioengine=libaio
iodepth=256
size=1117G
direct=1
do_verify=0
continue_on_error=all
filename=/dev/nvme0n1
randseed=23534
sync=1

[write-job]
rw=randwrite / randtrim
bs=128k

>>>> Here a sample of my test fio file:
>>>>
>>>> [global]
>>>> ioengine=libaio
>>>> iodepth=256
>>>
>>> Fairly high depth which is good for your NVMe disk.
>>>
>>>> io_size=256G
>>>
>>> You aren't going to write much data before stopping (so it's highly
>>> unlikely you will ever fall to garbage collection speeds)...
>>>
>> Actually, i did get a drop of bandwidth using randwrites (with the
>> default random generator) !
>
> Ah well I stand corrected :-).
>

Again, I forgot to mention that the drive was full :^p!

> --
> Sitsofe | http://sucs.org/~sits/

Sam.