On 6/18/20 10:41 AM, Bodo Stroesser wrote:
On 2020-06-18 17:00, Mike Christie wrote:
On 6/18/20 8:16 AM, Bodo Stroesser wrote:
This patch fixes the following crash
(see
https://urldefense.com/v3/__https://bugzilla.kernel.org/show_bug.cgi?id=208045__;!!GqivPVa7Brio!O7JgZIL3VPAzIqwJfZjL48y8M90K3HXv3pUVeoCzZ-vXovCpx5g7xMg-z5aAiVXVtkfE$
)
Process iscsi_trx (pid: 7496, stack limit = 0x0000000010dd111a)
CPU: 0 PID: 7496 Comm: iscsi_trx Not tainted 4.19.118-0419118-generic
#202004230533
Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26
2019
pstate: 80400005 (Nzcv daif +PAN -UAO)
pc : flush_dcache_page+0x18/0x40
lr : is_ring_space_avail+0x68/0x2f8 [target_core_user]
sp : ffff000015123a80
x29: ffff000015123a80 x28: 0000000000000000
x27: 0000000000001000 x26: ffff000023ea5000
x25: ffffcfa25bbe08b8 x24: 0000000000000078
x23: ffff7e0000000000 x22: ffff000023ea5001
x21: ffffcfa24b79c000 x20: 0000000000000fff
x19: ffff7e00008fa940 x18: 0000000000000000
x17: 0000000000000000 x16: ffff2d047e709138
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: ffff2d047fbd0a40
x11: 0000000000000000 x10: 0000000000000030
x9 : 0000000000000000 x8 : ffffc9a254820a00
x7 : 00000000000013b0 x6 : 000000000000003f
x5 : 0000000000000040 x4 : ffffcfa25bbe08e8
x3 : 0000000000001000 x2 : 0000000000000078
x1 : ffffcfa25bbe08b8 x0 : ffff2d040bc88a18
Call trace:
flush_dcache_page+0x18/0x40
is_ring_space_avail+0x68/0x2f8 [target_core_user]
queue_cmd_ring+0x1f8/0x680 [target_core_user]
tcmu_queue_cmd+0xe4/0x158 [target_core_user]
__target_execute_cmd+0x30/0xf0 [target_core_mod]
target_execute_cmd+0x294/0x390 [target_core_mod]
transport_generic_new_cmd+0x1e8/0x358 [target_core_mod]
transport_handle_cdb_direct+0x50/0xb0 [target_core_mod]
iscsit_execute_cmd+0x2b4/0x350 [iscsi_target_mod]
iscsit_sequence_cmd+0xd8/0x1d8 [iscsi_target_mod]
iscsit_process_scsi_cmd+0xac/0xf8 [iscsi_target_mod]
iscsit_get_rx_pdu+0x404/0xd00 [iscsi_target_mod]
iscsi_target_rx_thread+0xb8/0x130 [iscsi_target_mod]
kthread+0x130/0x138
ret_from_fork+0x10/0x18
Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260)
---[ end trace 1e451c73f4266776 ]---
The solution is based on patch:
"scsi: target: tcmu: Optimize use of flush_dcache_page"
which restricts the use of tcmu_flush_dcache_range() to
addresses from vmalloc'ed areas only.
This patch now replaces the virt_to_page() call in
tcmu_flush_dcache_range() - which is wrong for vmalloced addrs -
by vmalloc_to_page().
The patch was tested on ARM with kernel 4.19.118 and 5.7.2
Signed-off-by: Bodo Stroesser <bstroesser@xxxxxxxxxxxxxx>
Tested-by: JiangYu <lnsyyj@xxxxxxxxxxx>
Tested-by: Daniel Meyerholt <dxm523@xxxxxxxxx>
---
drivers/target/target_core_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/target/target_core_user.c
b/drivers/target/target_core_user.c
index a65e9671ae7a..835d3001cb0e 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -601,7 +601,7 @@ static inline void tcmu_flush_dcache_range(void
*vaddr, size_t size)
size = round_up(size+offset, PAGE_SIZE);
while (size) {
- flush_dcache_page(virt_to_page(start));
+ flush_dcache_page(vmalloc_to_page(start));
start += PAGE_SIZE;
size -= PAGE_SIZE;
}
For this bug we only need to change the flush in is_ring_space_avail
right? It's what is accessing the mb which is vmalloc'd.
No, is_ring_space_avail was just the first caller of
tcmu_flush_dcache_range for vmalloc'ed pages, so it crashed there.
The entiere address range exposed to userspace via uio consists of
two parts:
1) mb + command ring are vmalloc'ed in a single vzalloc call
during initialization of a tcmu device.
2) the data area which is allocated page by page calling
alloc_page()
The second part is handled by (scatter|gather)_data_area. For the
calls from these routines I think usage of virt_to_page in
tcmu_flush_dcache_range was fine. But patch number 1 of this
series replaced these called with direct calls to flush_dcache_page.
That was my problem. Did not see the replacement. Looks good to me.