On Sat, Feb 19, 2011 at 3:54 AM, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> On Friday 18 February 2011 23:40:16 Andrei Warkentin wrote:
>> On Fri, Feb 18, 2011 at 1:47 PM, Andrei Warkentin <andreiw@xxxxxxxxxxxx> wrote:
>>
>> Flashbench timings for both Sandisk and Toshiba cards. Attaching due to size.
>
> Very nice, thanks for the measurement!
>
> I don't think having the results inline in the mail is a problem,
> it would even make it easier to quote.
>
>> Some interesting things that I don't understand. For the align test, I
>> extended it to do a write align test (-A). I tried two partitions that
>> I could write over, and both read and writes behaved differently for
>> the two partitions on same device. Odd. They are both 4MB aligned.
>
> I never did a write align test because the results will be highly
> unreliable as soon as you get into thrashing. Your results seem
> to be meaningful still, so maybe we should have it after all, but
> I'll put a big warning on it.
>

Actually it would be a good idea to also bail/warn if you do the au
test with more open au's than the size of the passed device allows,
since it'll just wrap around and skew the results.

>> On the sandisk it was the write align that made the page size stand
>> out. The read align had pretty constant results.
>
> I've noticed on other Sandisk media that the read align test is
> sometimes useless. It may help to do a full erase of the partition,
> or to fill it with data before running the test.
>
>> On the toshiba the results varied wildly for the two partitions. For
>> partition 6, there was a clear pattern in the diff values for read
>> align. For 9, it was all over the place. For 9 with the write align,
>> 8K and 16K the crossing writes took ~115ms!! Look in attached files
>> for all the data.
>
> Partition 6 is a lot smaller, so you have the accesses less than a
> segment apart, so it shows other effects.
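
The bail/warn check suggested above (refusing an open-AU run that needs
more space than the device has) could look roughly like the sketch
below. This is not flashbench code. The function names are made up for
illustration, and it assumes the test touches open_au_nr distinct
erase-block-sized regions, a 4 MiB erase size, and that the "40mb"
partition is 40 MiB.

```python
MIB = 1024 * 1024


def max_open_aus(device_bytes, erase_bytes):
    """Largest number of whole erase blocks that fit on the device."""
    if device_bytes <= 0 or erase_bytes <= 0:
        raise ValueError("sizes must be positive")
    return device_bytes // erase_bytes


def open_au_fits(device_bytes, erase_bytes, open_au_nr):
    """True if open_au_nr distinct erase blocks fit without wrapping around."""
    if open_au_nr <= 0:
        raise ValueError("open_au_nr must be positive")
    return open_au_nr <= max_open_aus(device_bytes, erase_bytes)


if __name__ == "__main__":
    # Assumed sizes: a 40 MiB partition with 4 MiB erase blocks.
    for nr in (8, 20):
        if open_au_fits(40 * MIB, 4 * MIB, nr):
            print(f"open_au_nr={nr}: ok")
        else:
            print(f"open_au_nr={nr}: would wrap around, results skewed")
```

With these assumed sizes the small partition holds 10 erase blocks, so a
run with 20 open AUs would be rejected instead of silently wrapping.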
>
>> The AU tests were interesting too, especially how with several open
>> AUs the throughput is higher for certain smaller sizes on sandisk, but
>> if I interpret it correctly both cards have at least 4 AUs, as I
>> didn't see yet a significant drop for small sizes. The larger ones I
>> am running now on mmcblk0p9 which is sufficiently larger for these
>> tests... (mmcblk0p6 is only 40mb, p9 is 314 mb)
>
> Right, you should try larger values for --open-au-nr here. It's at
> least a good sign that the drive can do random access inside a segment
> and that it can have at least 4 segments open. This is much better
> than I expected from your descriptions at first.

Actually the Toshiba one seems to have 7 AUs if I interpret this correctly.

^C
# ./flashbench -O -0 6 -b 512 /dev/block/mmcblk0p9
4MiB    5.91M/s
2MiB    8.84M/s
1MiB    10.8M/s
512KiB  13M/s
256KiB  13.6M/s
^C
# ./flashbench -O -0 7 -b 512 /dev/block/mmcblk0p9
4MiB    6.32M/s
2MiB    8.63M/s
1MiB    10.5M/s
512KiB  13.2M/s
256KiB  13M/s
128KiB  12.3M/s
^C
# ./flashbench -O -0 8 -b 512 /dev/block/mmcblk0p9
4MiB    6.65M/s
2MiB    7.02M/s
1MiB    6.36M/s
512KiB  3.17M/s
256KiB  1.53M/s

The Sandisk one has 20 AUs.

# ./flashbench -O -0 20 -b 512 /dev/block/mmcblk0p9
4MiB    11.3M/s
2MiB    12.8M/s
1MiB    9.87M/s
512KiB  9.97M/s
256KiB  9.13M/s
128KiB  8.05M/s
^C
# ./flashbench -O -0 50 -b 512 /dev/block/mmcblk0p9
4MiB    7.19M/s
^C
# ./flashbench -O -0 2 -b 512 /dev/block/mmcblk0p9
^C
# ./flashbench -O -0 22 -b 512 /dev/block/mmcblk0p9
4MiB    11.6M/s
2MiB    12.3M/s
1MiB    5.13M/s
512KiB  2.57M/s
256KiB  1.59M/s
128KiB  1.16M/s
64KiB   776K/s
^C
# ./flashbench -O -0 21 -b 512 /dev/block/mmcblk0p9
4MiB    11.2M/s
2MiB    12.4M/s
1MiB    4.65M/s
512KiB  1.95M/s
256KiB  955K/s

>
> However, the drop from 32 KB to 16 KB in performance is horrifying
> for the Toshiba drive, it's clear that this one does not like
> to be accessed smaller than 32 KB at a time, an obvious optimization
> for FAT32 with 32 KB clusters. How does this change with your
> kernel patches?

Since the only performance-increasing patch here would be just the one
that splits unaligned accesses, I wouldn't expect any improvements for
page-aligned accesses < 32KB. As you can see here...

# cat /sys/block/mmcblk0/device/page_size
8192
# ./flashbench -O -0 1 -b 512 /dev/block/mmcblk0p9
4MiB    6.81M/s
2MiB    7.73M/s
1MiB    9.21M/s
512KiB  9.98M/s
256KiB  10.3M/s
128KiB  10.2M/s
64KiB   9.76M/s
32KiB   8.52M/s
16KiB   3.68M/s
8KiB    1.72M/s
4KiB    837K/s
^C
# echo 0 > /sys/block/mmcblk0/device/page_size
# ./flashbench -O -0 1 -b 512 /dev/block/mmcblk0p9
4MiB    6.42M/s
2MiB    7.79M/s
1MiB    9.22M/s
512KiB  10M/s
256KiB  9.94M/s
128KiB  10.1M/s
64KiB   9.68M/s
32KiB   8.5M/s
16KiB   3.65M/s
8KiB    1.73M/s
4KiB    838K/s
2KiB    417K/s
^C
#

>
> For the sandisk drive, it's funny how it is consistently faster
> doing random access than linear access. I don't think I've seen that
> before. It does seem to have some cache for linear access using
> smaller than 16 KB, and can probably combine them when it's only
> writing to a single segment.

Yes, that is pretty interesting. Smaller than 16K? Not smaller than
32K? I wonder what it is doing...
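
For reference, the way the AU counts were read off the open-AU runs
earlier in this mail can be written down as a small sketch: take the
throughput at one fixed block size for increasing open-AU counts and
report the last count before it collapses. The function name and the
0.5 drop ratio are assumptions for illustration only. The numbers are
the 256KiB rows from the runs above, in MB/s.

```python
def estimate_open_aus(throughput_by_nr, drop_ratio=0.5):
    """Return the last open-AU count before throughput collapses.

    throughput_by_nr maps an open-AU count to the throughput measured at
    one fixed block size. A collapse is assumed to be a fall below
    drop_ratio times the previous run. Returns None if no collapse is seen.
    """
    counts = sorted(throughput_by_nr)
    for prev, cur in zip(counts, counts[1:]):
        if throughput_by_nr[cur] < drop_ratio * throughput_by_nr[prev]:
            return prev
    return None


# 256KiB rows from the runs above (MB/s).
toshiba = {6: 13.6, 7: 13.0, 8: 1.53}
sandisk = {20: 9.13, 21: 0.955, 22: 1.59}

if __name__ == "__main__":
    print("toshiba:", estimate_open_aus(toshiba))
    print("sandisk:", estimate_open_aus(sandisk))
```

The result only narrows the count down to the gap between two measured
values, so the runs on either side of the drop need to be adjacent
counts, as 7/8 and 20/21 are here.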