On 11/22/19 3:10 PM, Eric Sandeen wrote:
> On 11/21/19 5:18 PM, Dave Chinner wrote:
>> On Thu, Nov 21, 2019 at 10:44:44PM +0100, Pavel Reichl wrote:
>>> Signed-off-by: Pavel Reichl <preichl@xxxxxxxxxx>
>>> ---
>>
>> This is mixing an explanation about why the change is being made
>> and what was considered when making decisions about the change.
>>
>> e.g. my first questions on looking at the patch were:
>>
>> - why do we need to break up the discards into 2GB chunks?
>> - why 2GB?
>> - why not use libblkid to query the maximum discard size
>>   and use that as the step size instead?
>
> Just wondering, can we trust that to be reasonably performant?
> (the whole motivation here is for hardware that takes inordinately
> long to do discard, I wonder if we can count on such hardware to
> properly fill out this info....)

Looking at the docs in kernel/Documentation/block/queue-sysfs.rst:

discard_max_hw_bytes (RO)
-------------------------
Devices that support discard functionality may have internal limits on
the number of bytes that can be trimmed or unmapped in a single operation.
The discard_max_bytes parameter is set by the device driver to the maximum
number of bytes that can be discarded in a single operation. Discard
requests issued to the device must not exceed this limit. A discard_max_bytes
value of 0 means that the device does not support discard functionality.

discard_max_bytes (RW)
----------------------
While discard_max_hw_bytes is the hardware limit for the device, this
setting is the software limit. Some devices exhibit large latencies when
large discards are issued, setting this value lower will make Linux issue
smaller discards and potentially help reduce latencies induced by large
discard operations.

it seems like a strong suggestion that the discard_max_hw_bytes value may
still be problematic, and discard_max_bytes can be hand-tuned to something
smaller if it's a problem.
To me that indicates that discard_max_hw_bytes probably can't be trusted to
be performant, and presumably discard_max_bytes won't be either in that case
unless it's been hand-tuned by the admin?

-Eric
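
To make the trade-off above concrete, here is a rough Python sketch (not code from the patch, and not how mkfs does it) of a step-size choice that keeps a fixed cap and only lets the sysfs value lower it. The helper names and the policy itself are assumptions for illustration; the only things taken from the thread are the 2GB chunk size and the discard_max_bytes file described in queue-sysfs.rst. The caller is assumed to pass the device's queue directory, e.g. /sys/block/<dev>/queue.

```python
import os

# 2 GiB: the chunk size under discussion in this thread (used here as a cap).
DEFAULT_STEP = 2 * 1024**3


def read_queue_limit(sysfs_queue_dir, name):
    """Return the integer stored in <sysfs_queue_dir>/<name>, or None if unreadable."""
    try:
        with open(os.path.join(sysfs_queue_dir, name)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pick_discard_step(sysfs_queue_dir, default_step=DEFAULT_STEP):
    """Hypothetical policy: never exceed default_step, honour a smaller
    (possibly admin-tuned) discard_max_bytes, and return 0 when the
    reported limit is 0 (no discard support, per the quoted docs)."""
    soft_limit = read_queue_limit(sysfs_queue_dir, "discard_max_bytes")
    if soft_limit is None:
        # Nothing reported: fall back to the fixed chunk size.
        return default_step
    if soft_limit == 0:
        return 0
    return min(default_step, soft_limit)


def discard_ranges(offset, length, step):
    """Yield (start, nbytes) chunks covering [offset, offset + length)."""
    if step <= 0:
        raise ValueError("step must be positive")
    end = offset + length
    while offset < end:
        nbytes = min(step, end - offset)
        yield offset, nbytes
        offset += nbytes
```

With this policy a device that reports a huge discard_max_bytes still gets 2GB steps, so an untrustworthy hardware value can only make the chunks smaller, never larger.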