On 5/3/18 9:20 AM, Jens Axboe wrote:
> I implemented support for discards in blk-wbt, so we treat them like
> background writes. If we have competing foreground IO, then we may
> throttle discards. Otherwise they should run at full speed.

Ran a quick local test case on an NVMe device. The test case here is
doing reads from one file while other files are being deleted. The file
system is XFS. Three separate runs:

1) 'justreader' is the reader workload running by itself.
2) 'nothrottle' is the reader with the unlinks on the stock kernel.
3) 'throttled' is the reader with the unlinks with the patches applied.

The fio job logs latencies in windows of 50msec, recording only the
worst latency seen in each window:

[read]
filename=file0
rw=randread
bs=4k
direct=1
ioengine=libaio
iodepth=8
write_lat_log=read_lat
log_avg_msec=50
log_max_value=1

As you can see, with the throttling, the latencies and performance are
the same as not having the unlinks running at all. The two runs finish
within 100msec of each other, after having read 10G of data. Without
the throttling, with the unlinks running, we see latency spikes that
are almost an order of magnitude worse. The reader also takes 1 second
longer to complete: 322MB/sec with throttling vs 313MB/sec without.

It's also visible in the higher Pxx percentiles. Here's the throttled
end of the percentiles:

| 99.00th=[  145], 99.50th=[  153], 99.90th=[  180], 99.95th=[  198],
| 99.99th=[  889]

and here's the unthrottled equivalent:

| 99.00th=[  145], 99.50th=[  155], 99.90th=[  188], 99.95th=[  322],
| 99.99th=[ 8586]

--
Jens Axboe
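To make the policy in the quoted description concrete, here is a
minimal illustrative sketch of the decision logic, not the actual
blk-wbt code; all struct and function names here (wb_limits,
discard_depth_limit, may_issue_discard) are hypothetical. It shows
discards inheriting the scaled-down queue depth used for background
writes whenever foreground IO is in flight, and the full device depth
otherwise:

#include <stdbool.h>

struct wb_limits {
	unsigned int max_depth;        /* depth when nothing competes */
	unsigned int background_depth; /* scaled-down depth for bg writes */
	unsigned int inflight_fg;      /* foreground reads/writes in flight */
};

/* Hypothetical helper: pick the allowed queue depth for a discard. */
static unsigned int discard_depth_limit(const struct wb_limits *wb)
{
	if (wb->inflight_fg)
		return wb->background_depth; /* throttle like bg writes */
	return wb->max_depth;                /* otherwise run at full speed */
}

/* A discard may only be issued while under the current limit. */
static bool may_issue_discard(const struct wb_limits *wb,
			      unsigned int inflight_discards)
{
	return inflight_discards < discard_depth_limit(wb);
}

Under this scheme a discard-heavy task (like the unlinks above) only
gets squeezed when the reader is actually queueing IO, which matches
the observed result: throttled latencies track the reader-only run,
while the idle-time discard throughput is unaffected.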
Attachment: Reader latencies-lat.jpg (JPEG image)