Re: [PATCH] block: fix sanity checks in blk_rq_map_user_bvec

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 23, 2024 at 03:15:19PM -0600, Uday Shankar wrote:
> From: Xinyu Zhang <xizhang@xxxxxxxxxxxxxxx>
> 
> blk_rq_map_user_bvec contains a check bytes + bv->bv_len > nr_iter which
> causes unnecessary failures in NVMe passthrough I/O, reproducible as
> follows:
> 
> - register a 2 page, page-aligned buffer against a ring
> - use that buffer to do a 1 page io_uring NVMe passthrough read
> 
> The second (i = 1) iteration of the loop in blk_rq_map_user_bvec will
> then have nr_iter == 1 page, bytes == 1 page, bv->bv_len == 1 page, so
> the check bytes + bv->bv_len > nr_iter will succeed, causing the I/O to
> fail. This failure is unnecessary, as when the check succeeds, it means
> we've checked the entire buffer that will be used by the request - i.e.
> blk_rq_map_user_bvec should complete successfully. Therefore, terminate
> the loop early and return successfully when the check bytes + bv->bv_len
> > nr_iter succeeds.

For anyone interested, here are the details on how to reproduce the
issue described above:

# cat test.c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <liburing.h>
#include <stdlib.h>
#include <assert.h>
#include <linux/nvme_ioctl.h>

int main(int argc, char *argv[]) {
    struct io_uring ring;

    assert(io_uring_queue_init(1, &ring, IORING_SETUP_SQE128 | IORING_SETUP_CQE32) == 0);

    void *buf = memalign(4096, 2 * 4096);
    printf("buf %p\n", buf);
    struct iovec iov = {
        .iov_base = buf,
        .iov_len = 2 * 4096,
    };

    assert(io_uring_register_buffers(&ring, &iov, 1) == 0);

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    assert(sqe != NULL);

    int fd = open("/dev/ng0n1", O_RDONLY);
    assert(fd > 0);
    sqe->fd = fd;
    sqe->opcode = IORING_OP_URING_CMD;
    sqe->cmd_op = NVME_URING_CMD_IO;
    sqe->buf_index = 0;
    sqe->flags = 0;
    sqe->uring_cmd_flags = IORING_URING_CMD_FIXED;

    struct nvme_passthru_cmd *cmd = &sqe->cmd;
    cmd->opcode = 2; // read
    cmd->nsid = 1;
    cmd->data_len = 1 * 4096;
    cmd->addr = buf;

    struct io_uring_cqe *cqe;
    assert(io_uring_submit(&ring) == 1);
    assert(io_uring_wait_cqe(&ring, &cqe) == 0);

    printf("res %d\n", cqe->res);

    return 0;
}
# gcc -o test -luring test.c
test.c: In function ‘main’:
test.c:15:17: warning: implicit declaration of function ‘memalign’ [-Wimplicit-function-declaration]
   15 |     void *buf = memalign(4096, 2 * 4096);
      |                 ^~~~~~~~
test.c:15:17: warning: initialization of ‘void *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
test.c:36:37: warning: initialization of ‘struct nvme_passthru_cmd *’ from incompatible pointer type ‘__u8 (*)[0]’ {aka ‘unsigned char (*)[]’} [-Wincompatible-pointer-types]
   36 |     struct nvme_passthru_cmd *cmd = &sqe->cmd;
      |                                     ^
test.c:40:15: warning: assignment to ‘__u64’ {aka ‘long long unsigned int’} from ‘void *’ makes integer from pointer without a cast [-Wint-conversion]
   40 |     cmd->addr = buf;
      |
# ./test
buf 0x406000
res -22




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux