From: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> commit 8c4e0f212398cdd1eb4310a5981d06a723cdd24f upstream. 1) If remaining ring space before the end of the ring is smaller then the next cmd to write, tcmu writes a padding entry which fills the remaining space at the end of the ring. Then tcmu calls tcmu_flush_dcache_range() with the size of struct tcmu_cmd_entry as data length to flush. If the space filled by the padding was smaller then tcmu_cmd_entry, tcmu_flush_dcache_range() is called for an address range reaching behind the end of the vmalloc'ed ring. tcmu_flush_dcache_range() in a loop calls flush_dcache_page(virt_to_page(start)); for every page being part of the range. On x86 the line is optimized out by the compiler, as flush_dcache_page() is empty on x86. But I assume the above can cause trouble on other architectures that really have a flush_dcache_page(). For paddings only the header part of an entry is relevant due to alignment rules the header always fits in the remaining space, if padding is needed. So tcmu_flush_dcache_range() can safely be called with sizeof(entry->hdr) as the length here. 2) After it has written a command to cmd ring, tcmu calls tcmu_flush_dcache_range() using the size of a struct tcmu_cmd_entry as data length to flush. But if a command needs many iovecs, the real size of the command may be bigger then tcmu_cmd_entry, so a part of the written command is not flushed then. Link: https://lore.kernel.org/r/20200528193108.9085-1-bstroesser@xxxxxxxxxxxxxx Acked-by: Mike Christie <michael.christie@xxxxxxxxxx> Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx> Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/target/target_core_user.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/target/target_core_user.c +++ b/drivers/target/target_core_user.c @@ -1018,7 +1018,7 @@ static int queue_cmd_ring(struct tcmu_cm entry->hdr.cmd_id = 0; /* not used for PAD */ entry->hdr.kflags = 0; entry->hdr.uflags = 0; - tcmu_flush_dcache_range(entry, sizeof(*entry)); + tcmu_flush_dcache_range(entry, sizeof(entry->hdr)); UPDATE_HEAD(mb->cmd_head, pad_size, udev->cmdr_size); tcmu_flush_dcache_range(mb, sizeof(*mb)); @@ -1083,7 +1083,7 @@ static int queue_cmd_ring(struct tcmu_cm cdb_off = CMDR_OFF + cmd_head + base_command_size; memcpy((void *) mb + cdb_off, se_cmd->t_task_cdb, scsi_command_size(se_cmd->t_task_cdb)); entry->req.cdb_off = cdb_off; - tcmu_flush_dcache_range(entry, sizeof(*entry)); + tcmu_flush_dcache_range(entry, command_size); UPDATE_HEAD(mb->cmd_head, command_size, udev->cmdr_size); tcmu_flush_dcache_range(mb, sizeof(*mb));