On Tue, 21 May 2013 11:40:55 +1000, Toby Corkindale wrote:

>>> While it is important to let the SSD know about space that can be
>>> reclaimed, I gather the operation does not perform well. I *think*
>>> current advice is to leave 'discard' off the mount options, and
>>> instead run a nightly cron job to call 'fstrim' on the mount point.
>>> (In really high write situations, you'd be looking at calling that
>>> every hour instead, I suppose)

This is still a good idea - see below, and the PS for a concrete cron
sketch.

>> The guy who blogged about this a couple of years ago was using a
>> Sandforce controller drive.

Btw, that alone doesn't mean anything (neither in terms of performance
nor stability), since "the controller" also needs to be paired with -
often vendor-dependent - firmware, which is much more relevant. Since
LSI acquired Sandforce this situation has gotten much better (unified
upstream).

>> I'm not sure there is a similar issue with other drives.

There is (now), because..

>> Certainly we've never noticed a problematic delay in file deletes.
>> That said, our applications don't delete files too often (log file
>> purging is probably the only place it happens regularly).
>>
>> Personally, in the absence of a clear and present issue, I'd prefer
>> to go the "kernel guys and drive firmware guys will take care of
>> this" route, and just enable discard on the mount.

Nope, wrong, because.. (..getting there :)

> That is from 2011 though, so you're right that things may have
> improved by now.. Has anyone seen benchmarks supporting that though?

Unfortunately, since kernel 3.8 discards are issued as synchronous
commands, which effectively disables any scheduling, merging etc.

The result can be seen easily (concrete commands in the PS):

- mount drive without discard using kernel >= 3.8
- unpack kernel source
- time delete of entire tree
- remount with discard
- unpack kernel tree
- start delete of tree
- ...
- check it hasn't crashed
- ...
- go plant a tree or make babies while waiting for it to finish

Online discard has gotten so slow that it's now a good idea to turn it
off for anything but light write workloads. Metadata-heavy writes are
obviously the worst case.

I experienced this on Samsung, Intel & Sandforce-based drives, so "the
controller" is no longer the primary reason for the performance impact.
Extremely enterprisey drives *might* behave slightly better, but I
doubt it; flash erase cycles are what they are.

-h
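
PS: A minimal sketch of the nightly-fstrim approach quoted at the top.
The mount point /srv/pgdata and the script name are made up - adjust
to your layout:

    #!/bin/sh
    # /etc/cron.daily/fstrim-pgdata
    # Batch-trim all free space once a day instead of mounting with
    # 'discard' (which issues TRIMs synchronously on every delete).
    # -v makes fstrim report how much it trimmed.
    fstrim -v /srv/pgdata

For write-heavy boxes, drop the same script into /etc/cron.hourly
instead, as Toby says.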
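
PS2: The repro above as concrete commands, assuming ext4 on /dev/sdb1
and a kernel tarball lying around (device, mount point and tarball
name are just examples):

    # baseline: no online discard
    mount /dev/sdb1 /mnt/test
    tar xf linux-3.9.tar.xz -C /mnt/test
    time rm -rf /mnt/test/linux-3.9     # quick

    # now with online discard
    mount -o remount,discard /mnt/test
    tar xf linux-3.9.tar.xz -C /mnt/test
    time rm -rf /mnt/test/linux-3.9     # every unlink now waits on a
                                        # synchronous TRIM - go plant
                                        # that tree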
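
PS3: Firmware decides how well TRIM actually behaves, but you can at
least check what a drive advertises (lsblk is from util-linux; recent
versions have --discard):

    # DISC-GRAN/DISC-MAX columns of 0 mean no TRIM support advertised
    lsblk --discard /dev/sdb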