Re: [PATCH v2] lightnvm: pblk: fix bio leak on large sized io

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 7 Mar 2019, at 08.23, Chansol Kim <chansol.kim@xxxxxxxxxxx> wrote:
> 
> For large size io where blk_queue_split needs to be called inside
> pblk_rw_io, results in bio leak as bio_endio is not called on the
> newly allocated. One way to observe this is to mounting ext4
> filesystem on the target and issuing 1MB io with dd, e.g., dd bs=1MB
> if=/dev/null of=/mount/myvolume. kmemleak reports:
> 
> unreferenced object 0xffff88803d7d0100 (size 256):
>  comm "kworker/u16:1", pid 68, jiffies 4294899333 (age 284.120s)
>  hex dump (first 32 bytes):
>    00 00 00 00 00 00 00 00 00 60 e8 31 81 88 ff ff  .........`.1....
>    01 40 00 00 06 06 00 00 00 00 00 00 05 00 00 00  .@..............
>  backtrace:
>    [<000000001f5aa04f>] kmem_cache_alloc+0x204/0x3c0
>    [<0000000040945aab>] mempool_alloc_slab+0x1d/0x30
>    [<00000000b4959ab4>] mempool_alloc+0x83/0x220
>    [<00000000646bad9b>] bio_alloc_bioset+0x229/0x320
>    [<000000009264b251>] bio_clone_fast+0x26/0xc0
>    [<0000000008250252>] bio_split+0x41/0x110
>    [<00000000e365cad0>] blk_queue_split+0x349/0x930
>    [<00000000eb5426bc>] pblk_make_rq+0x1b5/0x1f0
>    [<00000000eea09cec>] generic_make_request+0x2f9/0x690
>    [<00000000ae6acede>] submit_bio+0x12e/0x1f0
>    [<00000000f9b8b82a>] ext4_io_submit+0x64/0x80
>    [<000000009e4f817d>] ext4_bio_write_page+0x32e/0x890
>    [<00000000cbd0d106>] mpage_submit_page+0x65/0xc0
>    [<000000000eec7359>] mpage_map_and_submit_buffers+0x171/0x330
>    [<000000009a7afcb6>] ext4_writepages+0xd5e/0x1650
>    [<000000004476b096>] do_writepages+0x39/0xc0
> 
> In case there is a need for a split, blk_queue_split returns the newly
> allocated bio to the caller by changing the value of pointer passed as
> a reference, while the original is passed to generic_make_requests.
> 
> Although pblk_rw_io's local variable bio* has changed and passed to
> pblk_submit_read and pblk_write_to_cache, work is done on this new
> bio*, and pblk_rw_io returns NVM_IO_DONE, pblk_make_rq calls bio_endio
> on the old bio* because it passed bio pointer by value to pblk_rw_io.
> 
> pblk_rw_io is unfolded into pblk_make_rq so that there is no copying
> of bio* and bio_endio is called on the correct bio*.
> 
> Signed-off-by: Chansol Kim <chansol.kim@xxxxxxxxxxx>
> ---
> v2:
> - Instead of changing pblk_rw_io's parameter, unfolded it into pblk_make_rq
> 
> drivers/lightnvm/pblk-init.c | 47 ++++++++++++++++++--------------------------
> 1 file changed, 19 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
> index 8b643d0..0c98305 100644
> --- a/drivers/lightnvm/pblk-init.c
> +++ b/drivers/lightnvm/pblk-init.c
> @@ -47,36 +47,10 @@ static struct pblk_global_caches pblk_caches = {
> 
> struct bio_set pblk_bio_set;
> 
> -static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
> -			  struct bio *bio)
> -{
> -	int ret;
> -
> -	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
> -	 * constraint. Writes can be of arbitrary size.
> -	 */
> -	if (bio_data_dir(bio) == READ) {
> -		blk_queue_split(q, &bio);
> -		ret = pblk_submit_read(pblk, bio);
> -		if (ret == NVM_IO_DONE && bio_flagged(bio, BIO_CLONED))
> -			bio_put(bio);
> -
> -		return ret;
> -	}
> -
> -	/* Prevent deadlock in the case of a modest LUN configuration and large
> -	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
> -	 * available for user I/O.
> -	 */
> -	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> -		blk_queue_split(q, &bio);
> -
> -	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> -}
> -
> static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> {
> 	struct pblk *pblk = q->queuedata;
> +	int ret;
> 
> 	if (bio_op(bio) == REQ_OP_DISCARD) {
> 		pblk_discard(pblk, bio);
> @@ -86,7 +60,24 @@ static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
> 		}
> 	}
> 
> -	switch (pblk_rw_io(q, pblk, bio)) {
> +	/* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
> +	 * constraint. Writes can be of arbitrary size.
> +	 */
> +	if (bio_data_dir(bio) == READ) {
> +		blk_queue_split(q, &bio);
> +		ret = pblk_submit_read(pblk, bio);
> +	} else {
> +		/* Prevent deadlock in the case of a modest LUN configuration
> +		 * and large user I/Os. Unless stalled, the rate limiter
> +		 * leaves at least 256KB available for user I/O.
> +		 */
> +		if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
> +			blk_queue_split(q, &bio);
> +
> +		ret = pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
> +	}
> +
> +	switch (ret) {
> 	case NVM_IO_ERR:
> 		bio_io_error(bio);
> 		break;
> --
> 2.7.4

Looks good to me.

Reviewed-by: Javier González <javier@xxxxxxxxxxx>

Attachment: signature.asc
Description: Message signed with OpenPGP


[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux