Hello,

Hopefully this is the right place for this, and apologies for the lengthy mail. I'm struggling with an issue with SCSI UNMAP/discard in newer kernels, and I'm hoping to find a resolution or at least to better understand why this has changed.

Some background info:

Our Linux boxes are primarily VMs running on VMware backed by NetApp storage. We have a fair number of systems that directly mount LUNs (due to I/O requirements, snapshot scheduling, dedupe issues, etc.). On newer LUNs, the 'space_alloc' option is enabled, which causes the LUN to report unmap support and free unused blocks on the underlying storage.

The problem:

I noticed multiple LUNs with space_alloc enabled reported 100% utilization on the NetApp but much less from the Linux side. I verified they were mounted with the discard option and also ran fstrim, which reported success but did not change the utilization reported by the NetApp. I eventually was able to isolate kernel version as the only factor in whether discard worked. Further testing showed 3.10.x handled discard correctly, but 4.4.x would never free blocks. This was verified on a single machine with the only change being the kernel.

The only notable difference I could find was in the /sys/block/sdX/discard* values: on 3.10.x the discard granularity was reported as 4096, while on 4.4.x it was 512 (logical block size is 512, physical is 4096 on the LUNs). Eventually that led me to these patches for sd.c:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/drivers/scsi/sd.c?id=397737223c59e89dca7305feb6528caef8fbef84
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/drivers/scsi/sd.c?id=f4327a95dd080ed6aecb185478a88ce1ee4fa3c4

They result in discard granularity being forced to the logical block size if the disk reports LBPRZ is enabled (which the NetApp LUNs do).
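
For anyone wanting to compare the same values across kernels, here is a minimal Python sketch that collects the relevant limits from sysfs. The function name read_limits and the sysfs_root parameter are made up for illustration; the attribute paths are the usual block-layer ones under /sys/block/<dev>/, and any that are missing on a given kernel are simply skipped.

```python
from pathlib import Path

# Attribute paths relative to /sys/block/<dev>/ as exposed by the block layer.
ATTRS = (
    "discard_alignment",
    "queue/discard_granularity",
    "queue/discard_max_bytes",
    "queue/logical_block_size",
    "queue/physical_block_size",
    "queue/minimum_io_size",
)


def read_limits(dev, sysfs_root="/sys/block"):
    """Return {attribute: int} for one block device.

    sysfs_root is a hypothetical parameter added so the function can be
    pointed at a copy of the tree; attributes that do not exist are skipped.
    """
    base = Path(sysfs_root) / dev
    limits = {}
    for attr in ATTRS:
        path = base / attr
        if path.is_file():
            limits[attr] = int(path.read_text().strip())
    return limits
```

Calling read_limits("sdX") under each kernel and comparing the two dictionaries should show the difference described above: queue/discard_granularity of 4096 on 3.10.x versus 512 on 4.4.x, with the block sizes unchanged.
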
It seems that these patches are responsible for the difference in discard granularity, and my assumption is that because WAFL is actually a 4k block filesystem, the NetApp requires 4k granularity and ignores the 512-byte discard requests.

It's not clear to me whether this is a bug in sd or an issue in the way the LUNs are presented from the NetApp side (I've opened a case with them as well and am waiting to hear back). However, minimum_io_size is 4096, so it seems a bit odd that discard_granularity would be smaller. And earlier kernel versions work as expected, which seems to indicate the problem is in sd.

As far as fixes or workarounds, it seems that there are three potential options:

1) The NetApp could change the reported logical block size to match the physical block size
2) The NetApp could report LBPRZ=0
3) The sd code could be updated to use max(logical_block_size, physical_block_size) or max(logical_block_size, minimum_io_size), or otherwise changed to ensure discard_granularity is set to a supported value

I'm not sure of the implications of either of the NetApp changes, though reporting 4k logical blocks seems feasible, as this is supported in newer OSes at least. The sd change would at least partially undo the patches referenced above. But it would seem that (assuming an aligned filesystem with 4k blocks and minimum_io_size=4096) there is no possibility of a partial block discard, nor any advantage to sending the discard requests in 512-byte blocks?

Any help is greatly appreciated.

Thanks,
-David
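
P.S. To make option 3 concrete, here is a rough Python model of the arithmetic. It is purely illustrative and not actual sd.c code; the function names are made up for this mail, and the idea that the array only frees whole 4k blocks is my assumption from above, not something confirmed.

```python
def proposed_granularity(logical_block_size, physical_block_size):
    """Option 3: never advertise a granularity below the physical block size."""
    return max(logical_block_size, physical_block_size)


def whole_blocks_covered(start, length, block_size=4096):
    """Count complete block_size blocks inside the byte range [start, start + length).

    Assumption: the storage can only free a backend block that a discard
    covers completely.
    """
    first = -(-start // block_size)        # round the start up to a block boundary
    last = (start + length) // block_size  # round the end down to a block boundary
    return max(0, last - first)


# With the values reported by the LUNs (512 logical, 4096 physical):
granularity = proposed_granularity(512, 4096)
```

Under that assumption, a discard covering a single 512-byte block contains no complete 4k block (whole_blocks_covered(0, 512) is 0), while one aligned 4k discard contains exactly one (whole_blocks_covered(0, 4096) is 1).
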