On Wed, 2012-02-29 at 12:22 +1100, Dave Chinner wrote:
> On Tue, Feb 28, 2012 at 05:56:18PM -0500, Thomas Lynema wrote:
> > Please reply to my personal email as well as I am not subscribed to the
> > list.
> >
> > I have a PP120GS25SSDR; it does support trim
> >
> > cat /sys/block/sdc/queue/discard_max_bytes
> > 2147450880
> >
> > The entire drive is one partition that is totally used by LVM.
> >
> > I made a test vg and formatted it with mkfs.xfs. Then mounted it with
> > discard and got the following result when deleting a kernel source:
> >
> > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > (rw,noatime,nodiratime,discard)
> >
> > time rm -rf linux-3.2.6-gentoo/
> > real 5m7.139s
> > user 0m0.080s
> > sys 0m1.580s
> >
>
> I'd say your problem is that trim is extremely slow on your
> hardware. You've told XFS to execute a discard command for every
> single extent that is freed, and that can be very slow if you are
> freeing lots of small extents (like a kernel tree contains) and you
> have a device that is slow at executing discards.
>
> > There were lockups where the system would pause for about a minute
> > during the process.
>
> Yup, that's because it runs as part of the journal commit
> completion, and if your SSD is extremely slow the journal will stall
> waiting for all the discards to complete.
>
> Basically, online discard is not really a smart thing to use for
> consumer SSDs. Indeed, it's just not a smart thing to run for most
> workloads and use cases precisely because discard is a very slow
> and non-queuable operation on most hardware that supports it.
>
> If you really need to run discard, just run a background discard
> (fstrim) from a cronjob that runs when the system is mostly idle.
> You won't have any runtime overhead on every unlink but you'll still
> get the benefit of discarding unused blocks regularly.
>
> > ext4 handles this scenario fine:
> >
> > /dev/mapper/ssdvg0-testLV on /media/temp type ext4
> > (rw,noatime,nodiratime,discard)
> >
> > time rm -rf linux-3.2.6-gentoo/
> >
> > real 0m0.943s
> > user 0m0.050s
> > sys 0m0.830s
>
> I very much doubt that a single discard IO was issued during that
> workload - ext4 uses the same fine-grained discard method XFS does,
> and it does it at journal checkpoint completion just like XFS. So
> I'd say that ext4 didn't commit the journal during this workload,
> and no discards were issued, unlike XFS.
>
> So, now time how long it takes to run sync to get the discards
> issued and completed on ext4. Do the same with XFS and see what
> happens. i.e.:
>
> $ time (rm -rf linux-3.2.6-gentoo/ ; sync)
>
> is the only real way to compare performance....
>
> > xfs mounted without discard seems to handle this fine:
> >
> > /dev/mapper/ssdvg0-testLV on /media/temp type xfs
> > (rw,noatime,nodiratime)
> >
> > time rm -rf linux-3.2.6-gentoo/
> > real 0m1.634s
> > user 0m0.040s
> > sys 0m1.420s
>
> Right, that's how long XFS takes with normal journal checkpoint
> IO latency. Add to that the time it takes for all the discards to be
> run, and you've got the above number.
>
> Cheers,
>
> Dave.

Dave and Peter,

Thank you both for the replies. Dave, it is actually your article on lwn
and the presentation that you did recently that led me to use xfs on my
home computer.

Let's try this with the sync as Dave suggested and the command that
Peter used:

mount /dev/ssdvg0/testLV -t xfs -o noatime,nodiratime,discard /media/temp/

time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
vm.drop_caches = 3

real 6m35.768s
user 0m0.110s
sys 0m2.090s

vmstat samples. Not putting 6 minutes worth in the email unless it is
necessary.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
0 1 3552 6604412 0 151108 0 0 6675 5982 3109 3477 3 24 55 18
0 1 3552 6594756 0 161032 0 0 9948 0 1655 2006 1 1 74 24
0 1 3552 6587068 0 168672 0 0 7572 8 2799 3130 1 1 74 24
1 0 3552 6580744 0 174852 0 0 6288 0 2880 3215 6 2 74 19
----i/o wait here----
1 0 3552 6580496 0 174972 0 0 0 0 782 1110 22 4 74 0
1 0 3552 6580744 0 174972 0 0 0 0 830 1194 22 4 74 0
1 0 3552 6580744 0 174972 0 0 0 0 771 1117 23 3 74 0
1 0 3552 6580744 0 174972 0 0 0 4 1538 2637 30 5 66 0
1 0 3552 6580744 0 174972 0 0 0 0 1168 1946 26 3 72 0
1 0 3552 6580744 0 174976 0 0 0 0 762 1169 23 4 73 0
1 0 3552 6580528 0 175052 0 0 0 0 785 1138 25 2 73 0
2 0 3552 6580528 0 175052 0 0 0 0 868 1350 24 7 69 0
1 0 3552 6580528 0 175052 0 0 0 0 866 1259 24 5 72 0
1 0 3552 6580528 0 175052 0 0 0 8 901 1364 26 5 69 0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 3552 6586348 0 175540 0 0 728 1069 1187 2057 26 7 66 1
2 0 3552 6583344 0 176068 0 0 1812 4 1427 2350 24 8 65 2
1 0 3552 6580920 0 177116 0 0 1964 0 1220 1961 25 8 67 1
1 0 3552 6566616 0 190232 0 0 13376 0 1291 1938 24 7 62 8
1 1 3552 6561780 0 193380 0 0 3344 12 1081 1953 22 4 58 15
1 1 3552 6532148 0 200548 0 0 7236 0 10488 3630 35 11 42 13
1 0 3552 6518508 0 200748 0 0 200 0 1929 4038 35 11 52 1
2 0 3552 6516516 0 200828 0 0 57 0 1308 2019 24 6 69 0

EXT4 sample

mkfs.ext4 /dev/ssdvg0/testLV
mount /dev/ssdvg0/testLV -t ext4 -o discard,noatime,nodiratime /media/temp/

time sh -c 'sysctl vm/drop_caches=3; rm -r linux-3.2.6-gentoo; sync'
vm.drop_caches = 3

real 0m2.711s
user 0m0.030s
sys 0m1.330s

#because I didn't believe it, I ran the command a second time.
time sync

real 0m0.157s
user 0m0.000s
sys 0m0.000s
0m1.420s

vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 3548 5474268 19736 1191868 0 0 0 0 1274 2097 25 3 72 0
1 0 3548 5474268 19736 1191872 0 0 0 0 1027 1614 26 3 71 0
2 1 3548 6649292 4688 154264 0 0 9512 8 2256 3267 11 18 58 12
2 2 3548 6633188 15920 161592 0 0 18788 7732 5137 6274 5 17 49 29
0 1 3548 6623044 19624 167936 0 0 9948 10081 3233 4810 4 7 54 35
0 1 3548 6621556 19624 170068 0 0 2112 2642 1294 2135 4 1 72 23
0 2 3548 6611140 19624 179420 0 0 10260 50 1677 2930 7 2 64 27
0 1 3548 6606660 19624 183828 0 0 4181 32 2192 2707 6 2 67 26
1 0 3548 6604700 19624 185864 0 0 2080 0 961 1451 7 2 74 17
1 0 3548 6604700 19624 185864 0 0 0 0 966 1715 24 3 73 0
2 0 3548 6604700 19624 185864 0 0 8 196 1025 1582 24 4 72 0
1 0 3548 6604700 19624 185864 0 0 0 0 1133 1901 24 3 73 0

This time, I ran a sync. That should mean all of the discard operations
were completed...right?

If it makes a difference, when I get the i/o hang during the xfs
deletes, my entire system seems to hang. It doesn't just hang that
particular mounted volume's i/o.

Please let me know if there is anything obvious that I'm missing from
this equation.

~tom
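
For reference, the discard_max_bytes check quoted at the top of this thread can be wrapped in a small shell helper. This is only a sketch and not something from the original messages: the function name discard_supported is hypothetical, and it assumes that a value of 0 in discard_max_bytes means the device does not accept discards. The commented crontab line for the periodic fstrim that Dave suggests is likewise an assumption (the /sbin/fstrim path and the schedule are illustrative).

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative, not from the thread.
#
# discard_supported FILE
#   FILE is a discard_max_bytes file, e.g.
#   /sys/block/sdc/queue/discard_max_bytes
#   Exit status 0: value is greater than 0 (discard appears supported)
#   Exit status 1: value is 0
#   Exit status 2: file unreadable or contents not a number
discard_supported() {
    [ -r "$1" ] || return 2
    max_bytes=$(cat "$1")
    [ "$max_bytes" -gt 0 ] 2>/dev/null
}

# Assumed crontab entry for a background discard while the system is idle
# (path and schedule are examples only):
#   0 3 * * 0  /sbin/fstrim -v /media/temp
```

It could be used as `discard_supported /sys/block/sdc/queue/discard_max_bytes && echo "discard supported"`. Taking the file path as an argument keeps the helper independent of any particular device name.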
_______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs