Re: [PATCH] block : add larger order folio size instead of pages

On 22/04/24 01:14PM, Christoph Hellwig wrote:
> +		folio = page_folio(page);
> +
> +		if (!folio_test_large(folio) ||
> +		   (bio_op(bio) == REQ_OP_ZONE_APPEND)) {
>
> I don't understand why you need this branch.  All the arithmetics
> below should also work just fine for non-large folios

The branch skips these calculations for order-0 folios:
A) folio_offset = (folio_page_idx(folio, page) << PAGE_SHIFT) + offset;
B) folio_size(folio)
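
For illustration, here is a minimal userspace sketch of what A) and B)
compute. The model_* helpers and the 4 KiB PAGE_SHIFT are assumptions made
for this example; they are stand-ins, not the kernel functions. For an
order-0 folio the page index is 0, so A) reduces to the in-page offset and
B) to PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12                       /* assumption: 4 KiB base pages */
#define PAGE_SIZE  ((size_t)1 << PAGE_SHIFT)

/*
 * Hypothetical stand-in for calculation A):
 *   (folio_page_idx(folio, page) << PAGE_SHIFT) + offset
 * page_idx is the index of the page inside its folio.
 */
static size_t model_folio_offset(size_t page_idx, size_t offset)
{
	assert(offset < PAGE_SIZE);         /* offset is an in-page offset */
	return (page_idx << PAGE_SHIFT) + offset;
}

/* Hypothetical stand-in for calculation B): the folio size for a given order. */
static size_t model_folio_size(unsigned int order)
{
	return PAGE_SIZE << order;
}
```

With these helpers, model_folio_offset(0, 512) is 512 and
model_folio_size(0) is 4096, while page index 3 of an order-2 folio gives
an offset of 12800 within a 16384-byte folio.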

> , and there
> while the same_page logic in bio_iov_add_zone_append_page probably
> needs to be folio-ized first, it should be handled the same way here
> as well.

Regarding the same_page logic: if we add the same page twice, we release
the page on the second addition. It seemed to me that this logic will still
work if we merge large-order folios. Please let me know if I am missing
something.

If we pass a large folio to bio_iov_add_zone_append_page, we fail early
due to the queue_max_zone_append_sectors limit. This can be modified to add
fewer pages that are part of a bigger folio. Let me know whether I should
proceed this way or whether it is fine not to add the entire folio.
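
One possible reading of "adding part of a bigger folio" is to clamp the
length to whatever still fits under the zone append limit. The sketch below
is a userspace model only: model_zone_append_len() and its parameters are
hypothetical names, and the limit is assumed to be given in 512-byte
sectors.

```c
#include <stddef.h>

#define SECTOR_SHIFT 9   /* 512-byte sectors */

/*
 * Hypothetical helper: how many bytes of a folio can still be added to a
 * zone append bio that already holds bio_bytes, given a limit expressed
 * in sectors. Returns 0 when the bio is already at the limit.
 */
static size_t model_zone_append_len(size_t folio_len, size_t bio_bytes,
				    unsigned int max_append_sectors)
{
	size_t max_bytes = (size_t)max_append_sectors << SECTOR_SHIFT;
	size_t room = max_bytes > bio_bytes ? max_bytes - bio_bytes : 0;

	return folio_len < room ? folio_len : room;
}
```

For example, with a 16-sector (8192-byte) limit and an empty bio, a
16384-byte folio would be clamped to 8192 bytes instead of failing outright.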

> bio_iov_add_page should also be moved to take a folio
> before the (otherwise nice) changes here.

If we convert bio_iov_add_page() to bio_iov_add_folio()/bio_add_folio(),
we see a performance decline of about 11% for 4K I/O. When mTHP is enabled,
we may get a large-order folio even for a 4K I/O. The folio_offset may then
become larger than 4K, and we end up using the expensive mempool_alloc()
during nvme_map_data() in the NVMe driver[1].

[1]
static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
               struct nvme_command *cmnd)
{
...
...
                       if (bv.bv_offset + bv.bv_len <= NVME_CTRL_PAGE_SIZE * 2)
                               return nvme_setup_prp_simple(dev, req,
                                                            &cmnd->rw, &bv);
...
...
      iod->sgt.sgl = mempool_alloc(dev->iod_mempool, GFP_ATOMIC);
...
...
}
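
To make the effect of the check in [1] concrete, here is a small userspace
model of that condition. model_fits_prp_simple() is a hypothetical name
used only for this sketch, and a 4 KiB NVME_CTRL_PAGE_SIZE is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define NVME_CTRL_PAGE_SIZE 4096u   /* assumption: 4 KiB controller page */

/*
 * Mirrors the condition from the nvme_map_data() excerpt: the simple PRP
 * path is taken only when offset + length fits in two controller pages.
 */
static bool model_fits_prp_simple(size_t bv_offset, size_t bv_len)
{
	return bv_offset + bv_len <= NVME_CTRL_PAGE_SIZE * 2;
}
```

A 4K I/O at in-page offset 512 passes the check (512 + 4096 <= 8192). The
same I/O described relative to a folio, with the data in the fourth page
(offset 12288 + 512), does not, so it falls through to the mempool_alloc()
path.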

--
Kundan






