Re: [PATCH v7 04/11] block: Introduce REQ_OP_ZONE_APPEND

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/19/20 5:30 PM, Damien Le Moal wrote:
On 2020/04/19 1:46, Bart Van Assche wrote:
On 2020-04-17 05:15, Johannes Thumshirn wrote:
From: Keith Busch <kbusch@xxxxxxxxxx>
+/*
+ * Check write append to a zoned block device.
+ */
+static inline blk_status_t blk_check_zone_append(struct request_queue *q,
+						 struct bio *bio)
+{
+	sector_t pos = bio->bi_iter.bi_sector;
+	int nr_sectors = bio_sectors(bio);
+
+	/* Only applicable to zoned block devices */
+	if (!blk_queue_is_zoned(q))
+		return BLK_STS_NOTSUPP;
+
+	/* The bio sector must point to the start of a sequential zone */
+	if (pos & (blk_queue_zone_sectors(q) - 1) ||
+	    !blk_queue_zone_is_seq(q, pos))
+		return BLK_STS_IOERR;
+
+	/*
+	 * Not allowed to cross zone boundaries. Otherwise, the BIO will be
+	 * split and could result in non-contiguous sectors being written in
+	 * different zones.
+	 */
+	if (blk_queue_zone_no(q, pos) != blk_queue_zone_no(q, pos + nr_sectors))
+		return BLK_STS_IOERR;

Can the above statement be simplified into the following?

	if (nr_sectors > q->limits.chunk_sectors)
		return BLK_STS_IOERR;

That would be equivalent only if the zone is empty. If the zone is not empty, we
need to check that the zone append request does not cross over to the next zone,
which would result in the BIO being split by the block layer.

At the start of blk_check_zone_append() function there is a check that 'pos' is aligned with a zone boundary. How can 'pos' at the same time represent a zone boundary and the exact offset at which the write will happen? I do not understand this.

+	/* Make sure the BIO is small enough and will not get split */
+	if (nr_sectors > q->limits.max_zone_append_sectors)
+		return BLK_STS_IOERR;

Do we really need a new request queue limit parameter? In which cases
will max_zone_append_sectors differ from the zone size?

Yes it can differ from the zone size. On real hardware, max_zone_append_sectors
will most of the time be equal to max_hw_sectors_kb. But it could be smaller
than that too. For the host, since a zone append is a write, it is also subject
to the max_sector_kb limit and cannot exceed that value.

Would it be possible to use min(max_hw_sectors, zone_size) instead of introducing a new user-visible parameter?

Thanks,

Bart.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux