On Tue, 16 Jan 2018 16:50:28 +0000, Michael Loftis
<mloftis@xxxxxxxxx> wrote:

>Alignment definitely makes a difference for writes. It can also make a
>difference for random reads as well since the underlying read may not line
>up to the hardware add in a read ahead (at drive or OS Level) and you're
>reading far more data in the drive than the OS asks for.

Best performance will be when the filesystem block size matches the
SSD's writeable *data* block size. The SSD also has a separate erase
sector size which is some (large) multiple of the data block size.

<background>
Recall that an SSD doesn't overwrite existing data blocks. When you
update a file, the updates are written out to *new* "clean" data blocks,
and the file's block index is updated to reflect the new structure. The
old data blocks are marked "free+dirty". They must be erased (become
"free+clean") before reuse.

Depending on the drive size, the SSD's erase sectors may be anywhere
from 64MB..512MB in size, and so a single erase sector will hold many
individually writeable data blocks. When an erase sector is cleaned, ALL
the data blocks it contains are erased. If any still contain good data,
they must be relocated before the erase can be done.
</background>

You don't want your filesystem block to be smaller than the SSD data
block, because then you are subject to *unnecessary* write
amplification: the drive controller has to read/modify/write a whole
data block to change any part of it.

But, conversely, filesystem blocks that are larger than the SSD write
block typically are not a problem because ... unless you do something
really stupid [with really low level code] ... the large filesystem
blocks will end up being an exact multiple of data blocks.

Much of the literature re: alignment actually is related to the erase
sectors rather than the data blocks, and is targeted at embedded systems
that are not using conventional filesystems but rather are accessing the
raw SSD.
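
To put rough numbers on the read/modify/write cost described above, here
is a minimal Python sketch. It is a deliberately simplified model: it
only counts how many whole data blocks a single filesystem-block write
touches, and ignores garbage collection, caching, and anything else the
controller may do. The 4096-byte data block is an assumed value for
illustration (real drives often do not publish it), and the function
names are hypothetical.

```python
def device_bytes_touched(offset: int, length: int, data_block: int) -> int:
    """Bytes of whole SSD data blocks covered by a write of `length`
    bytes starting at byte `offset` (simplified model)."""
    if length <= 0:
        return 0
    first = offset // data_block                 # first data block touched
    last = (offset + length - 1) // data_block   # last data block touched
    return (last - first + 1) * data_block


def write_amplification(offset: int, fs_block: int, data_block: int) -> float:
    """Ratio of bytes the drive must rewrite to bytes the filesystem wrote."""
    return device_bytes_touched(offset, fs_block, data_block) / fs_block


DATA_BLOCK = 4096  # ASSUMPTION: illustrative value, not taken from any datasheet

if __name__ == "__main__":
    cases = [
        (0, 512),     # filesystem block smaller than the data block
        (0, 4096),    # filesystem block equal to the data block
        (0, 8192),    # filesystem block an exact multiple, aligned
        (512, 4096),  # matching size, but misaligned start
    ]
    for offset, fs_block in cases:
        wa = write_amplification(offset, fs_block, DATA_BLOCK)
        print(f"fs_block={fs_block:5d} offset={offset:4d} -> {wa:.1f}x")
```

Under these assumptions the 512-byte case rewrites eight times the data
it was asked to write, the aligned 4096- and 8192-byte cases rewrite
exactly what was asked, and the misaligned 4096-byte case straddles two
data blocks and so doubles the work.
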
You do want your partitions to start on erase sector boundaries, but
that usually is trivial to do.

>Stupidly a lot of this isn't published by a lot of SSD manufacturers, but
>through benchmarks it shows up.

Yes. The advice to match your filesystem to the data block size is not
often given.

>Another potential difference here with SAS vs SATA is the maximum queue
>depth supported by the protocol and drive.

Yes. The interface, and how it is configured, matters greatly.

>SSD drives also do internal housekeeping tasks for wear leveling on writing.

The biggest of which is always writing to a new location. Enterprise
grade SSDs sometimes do perform erases ahead of time during idle
periods, but cheap drives often wait until the free+dirty space is to be
reused.

>I've seen SSD drives benchmark with 80-90MB sequential read or write,
>change the alignment, and you'll get 400+ on the same drive with sequential
>reads (changing nothing else)
>
>A specific example
>https://www.servethehome.com/ssd-alignment-quickly-benchmark-ssd/

I believe you have seen it, but if the read performance changed that
drastically, then the controller/driver was doing something awfully
stupid ... e.g., re-reading the same data block for each filesystem
block it contains.

YMMV.
George