On Wed, Dec 19, 2018 at 2:18 PM Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
>
> Il giorno 19 dic 2018, alle ore 04:45, Ming Lei <tom.leiming@xxxxxxxxx> ha scritto:
> >
> > On Wed, Dec 19, 2018 at 2:52 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>
> >> On 12/18/18 5:45 AM, Paolo Valente wrote:
> >>> Hi Jens,
> >>> sorry for the following silly question, but maybe you can solve very
> >>> quickly a doubt for which I'd spend much more time investigating.
> >>>
> >>> While doing some tests with scsi_debug, I've just seen that (at least)
> >>> with direct I/O, the maximum number of pending I/O requests (at least
> >>> in the I/O schedulers) is equal, unexpectedly, to the queue depth of
> >>> the drive and not to
> >>> /sys/block/<dev>/queue/nr_requests
> >>>
> >>> For example, after:
> >>>
> >>> sudo modprobe scsi_debug max_queue=4
> >>>
> >>> and with fio executed as follows:
> >>>
> >>> job: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=20
> >>>
> >>> I get this periodic trace, where four insertions are followed by four
> >>> completions, and so on, till the end of the I/O. This trace is taken
> >>> with none, but the result is the same with bfq.
> >>>
> >>> fio-20275   [001] d...  7560.655213:   8,48   I   R 281088 + 8 [fio]
> >>> fio-20275   [001] d...  7560.655288:   8,48   I   R 281096 + 8 [fio]
> >>> fio-20275   [001] d...  7560.655311:   8,48   I   R 281104 + 8 [fio]
> >>> fio-20275   [001] d...  7560.655331:   8,48   I   R 281112 + 8 [fio]
> >>> <idle>-0    [001] d.h.  7560.749868:   8,48   C   R 281088 + 8 [0]
> >>> <idle>-0    [001] dNh.  7560.749912:   8,48   C   R 281096 + 8 [0]
> >>> <idle>-0    [001] dNh.  7560.749928:   8,48   C   R 281104 + 8 [0]
> >>> <idle>-0    [001] dNh.  7560.749934:   8,48   C   R 281112 + 8 [0]
> >>> fio-20275   [001] d...  7560.750023:   8,48   I   R 281120 + 8 [fio]
> >>> fio-20275   [001] d...  7560.750196:   8,48   I   R 281128 + 8 [fio]
> >>> fio-20275   [001] d...  7560.750229:   8,48   I   R 281136 + 8 [fio]
> >>> fio-20275   [001] d...  7560.750250:   8,48   I   R 281144 + 8 [fio]
> >>> <idle>-0    [001] d.h.  7560.842510:   8,48   C   R 281120 + 8 [0]
> >>> <idle>-0    [001] dNh.  7560.842551:   8,48   C   R 281128 + 8 [0]
> >>> <idle>-0    [001] dNh.  7560.842556:   8,48   C   R 281136 + 8 [0]
> >>> <idle>-0    [001] dNh.  7560.842562:   8,48   C   R 281144 + 8 [0]
> >>>
> >>> Shouldn't the total number of pending requests reach
> >>> /sys/block/<dev>/queue/nr_requests ?
> >>>
> >>> The latter is of course equal to 8.
> >>
> >> With a scheduler, the depth is what the scheduler provides. You cannot
> >> exceed the hardware queue depth in any situation. You just have 8
> >> requests available for scheduling, with a max of 4 being inflight on
> >> the device side.
> >>
> >> If both were 4, for instance, then you would have nothing to schedule
> >> with, as all of them could reside on the hardware side. That's why
> >> the scheduler defaults to twice the hardware queue depth.
> >
> > The default twice to hw queue depth might not be reasonable for multi LUN.
> >
> > Maybe it should be set twice of sdev->queue_depth for SCSI or
> > hw queue depth/hctx->nr_active. But either way may become complicated
> > because both can be adjusted runtime.
> >
>
> Could you please explain why it is not working (if it is not working)
> in my example, where there should be only one LUN?

I didn't say it isn't working; I meant it isn't perfect.

The hardware queue depth is host-wide, which means it is shared by all
LUNs. Of course, many LUNs may be attached to a single HBA.
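To make that concrete, the relevant limits can be read from sysfs once
scsi_debug is loaded (host6 and sdb below are just placeholders for whatever
names the scsi_debug host and disk get on your system):

  cat /sys/class/scsi_host/host6/can_queue   # host-wide hardware tags, 4 with max_queue=4
  cat /sys/block/sdb/device/queue_depth      # depth of this single LUN as seen by the SCSI layer
  cat /sys/block/sdb/queue/nr_requests       # scheduler depth, twice the hardware depth (8) by default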
You can set up a multi-LUN case easily via
'modprobe scsi_debug max_luns=16 max_queue=4'; then all 16 LUNs share
the 4 tags.

Thanks,
Ming Lei