Re: [PATCH] zbd: Improve read randomness

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bart,

On Tue, 2018-08-28 at 20:23 -0700, Bart Van Assche wrote:
> On 08/28/18 17:57, Damien Le Moal wrote:
> > Thinking more about this, since zbd_find_zone() only guarantees that the
> > zone
> > returned has min_bs data in it, range can in fact still be <= 0 if
> > io_u->buflen is larger than min_bs. Is it assumed here that min_bs and
> > io_u->buflen are always equal ?
> 
> There is no such assumption. io_u->buflen can be larger than min_bs.
> E.g. test11 in test-zbd-support triggers that case because it uses the
> bsrange option.

OK. We are in sync. And this means that range can be negative after
zbc_find_zone() runs and finds a new zone, since the search returns once a
zone with min_bs bytes is found. That may be less than io_u->buflen.

> 
> > If yes, then range < 0 is not possible, but range == 0 is definitely still
> > possible and so the +1 is needed.
> 
> Have you considered to leave out the "+ 1" and to handle range == 0
> separately?

Sure I can do that. But again what about zbd_find_zone() ? Shouldn't it be
changed to look for a zone with at least io_u->buflen bytes rather than only
min_bs ? Otherwise, range can be negative and the test "if (io_u->offset +
io_u->buflen > zb->wp)" at the end of the DDIR_READ case could terminate reads
with eof prematuraly.

> > Looking at get_next_buflen(), io_u->buflen can be larger than min_bs. So
> > either we need to modify zbd_find_zone() to return a zone with at least
> > io_u->buflen bytes in it, or we need to handle the range < 0 case...
> 
> The range < 0 case is already handled by the following code at the end
> of the DDIR_WRITE case:

That is fine, but that is for the DDIR_WRITE case. There is no such code in
the DDIR_READ case. Reads I/Os buflen are not modified based on the zone wp
position... Do you think it is better to add a similar I/O size reduction code
for reads or change zbd_find_zone() as suggested above ? I think the latter is
better to ensure consistent sizes for reads.

> 
> 	/*
> 	 * Make sure that the buflen is a multiple of the minimal
> 	 * block size. Give up if shrinking would make the request too
> 	 * small.
> 	 */
> 	new_len = min((unsigned long long)io_u->buflen,
> 		      ((zb + 1)->start << 9) - io_u->offset);
> 	new_len = new_len / min_bs * min_bs;
> 	if (new_len == io_u->buflen)
> 		goto accept;
> 	if (new_len >= min_bs) {
> 		io_u->buflen = new_len;
> 		dprint(FD_IO, "Changed length from %u into %llu\n",
> 		       orig_len, io_u->buflen);
> 		goto accept;
> 	}
> 	log_err("Zone remainder %lld smaller than minimum block size %d\n",
> 		(((zb + 1)->start << 9) - io_u->offset),
> 		min_bs);
> 	goto eof;
> 
> Bart.

-- 
Damien Le Moal
Western Digital




[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux