Re: [PATCH 1/2] spi: atmel: Do not cancel a transfer upon any signal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27.11.23 16:10, Ronald Wahl wrote:
On 27.11.23 10:58, Miquel Raynal wrote:
The intended move from wait_for_completion_*() to
wait_for_completion_interruptible_*() was to allow (very) long spi memory
transfers to be stopped upon user request instead of freezing the
machine forever as the timeout value could now be significantly bigger.

However, depending on the user logic, applications can receive many
signals for their own "internal" purpose and have nothing to do with the
requested kernel operations, hence interrupting spi transfers upon any
signal is probably not a wise choice. Instead, let's switch to
wait_for_completion_killable_*() to only catch the "important"
signals. This was likely the intended behavior anyway.

Actually this seems to work. But aborting a process that has a SPI
transfer running causes ugly messages from kernel. This is somehow
unexpected:

# dd if=/dev/urandom of=/flashdisk/testfile bs=1024 count=512
^C[  380.726760] spi-nor spi0.0: spi transfer canceled
[  380.731688] spi-nor spi0.0: SPI transfer failed: -512
[  380.737141] spi_master spi0: failed to transfer one message from queue
[  380.746495] spi-nor spi0.0: spi transfer canceled
[  380.751549] spi-nor spi0.0: SPI transfer failed: -512
[  380.756844] spi_master spi0: failed to transfer one message from queue

JFFS2 also logs an informational message which is less visible but also
may rise eyebrows:
[  380.743904] jffs2: Write of 4164 bytes at 0x0016a47c failed. returned
-512, retlen 68

Killing a process is something to expect in certain cases and it should
not cause such messages which may create some anxiety that something bad
had happened. So maybe the "kill" case should be silent (e.g. level
"debug")
but without out hiding real errors. But even when hiding the message in the
SPI framework it may cause additional messages in upper layers like JFFS2.
I'm not sure whether all of this is a good idea. This is something others
have to decide.

... and now I just got a crash when unmounting and remounting jffs2:

unmount:
[ 8245.821105] spi-nor spi0.0: spi transfer canceled
[ 8245.826288] spi-nor spi0.0: SPI transfer failed: -512
[ 8245.831508] spi_master spi0: failed to transfer one message from queue
[ 8245.838484] jffs2: Write of 1092 bytes at 0x00181458 failed. returned -512, retlen 68
[ 8245.839786] spi-nor spi0.0: spi transfer canceled
[ 8245.844759] spi-nor spi0.0: SPI transfer failed: -512
[ 8245.850145] spi_master spi0: failed to transfer one message from queue
[ 8245.856909] jffs2: Write of 1092 bytes at 0x0018189c failed. returned -512, retlen 0
[ 8245.856942] jffs2: Not marking the space at 0x0018189c as dirty because the flash driver returned retlen zero

mount:
[ 8831.213456] jffs2: error: (1142) jffs2_link_node_ref: Adding new ref 28bd9da7 at (0x000ad578-0x000ae5bc) not immediately after previous (0x000ad578-0x000ad578)
[ 8831.228212] Internal error: Oops - undefined instruction: 0 [#1] THUMB2
[ 8831.234996] CPU: 0 PID: 1142 Comm: mount Not tainted 6.6.2-sama5 #1
[ 8831.241587] Hardware name: Atmel SAMA5
[ 8831.245478] PC is at jffs2_link_node_ref+0xe/0xe2
[ 8831.250360] LR is at jffs2_link_node_ref+0xb9/0xe2
[ 8831.255473] pc : [<c020969c>]    lr : [<c0209747>]    psr: 00000033
[ 8831.261893] sp : c974dd78  ip : 00000000  fp : c09c6a5c
[ 8831.267428] r10: c1572f18  r9 : 0000e002  r8 : c2849964
[ 8831.272801] r7 : 00001a44  r6 : 000ae5bc  r5 : c14df4a8  r4 : c1620208
[ 8831.279647] r3 : 00000001  r2 : 40000000  r1 : c090ce3c  r0 : 00000093
[ 8831.286340] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment none
[ 8831.293974] Control: 50c53c7d  Table: 21470059  DAC: 00000051
[ 8831.300025] Register r0 information: non-paged memory
[ 8831.305239] Register r1 information: non-slab/vmalloc memory
[ 8831.311220] Register r2 information: non-paged memory
[ 8831.316427] Register r3 information: non-paged memory
[ 8831.321630] Register r4 information: slab kmalloc-2k start c1620000 pointer offset 520 size 2048
[ 8831.330796] Register r5 information: slab jffs2_refblock start c14df3e0 pointer offset 200 size 248
[ 8831.340219] Register r6 information: non-paged memory
[ 8831.345422] Register r7 information: non-paged memory
[ 8831.350784] Register r8 information: slab kmalloc-4k start c2849000 pointer offset 2404 size 4096
[ 8831.360032] Register r9 information: non-paged memory
[ 8831.365243] Register r10 information: slab pde_opener start c1572f18 pointer offset 0 size 24
[ 8831.374137] Register r11 information: non-slab/vmalloc memory
[ 8831.380041] Register r12 information: NULL pointer
[ 8831.385140] Process mount (pid: 1142, stack limit = 0xb9d2bb39)
[ 8831.391215] Stack: (0xc974dd78 to 0xc974e000)
[ 8831.395884] dd60:                                                       000ad578 000ae5bc
[ 8831.404242] dd80: 000ad578 000ad578 c2849dec c2805e00 c1620208 c2849dec 00000000 c021130d
[ 8831.412762] dda0: c1572d50 00000694 0000069c 00000036 00010000 c2937755 00000000 20061985
[ 8831.421280] ddc0: 0000069c 00000000 c2849964 c2805e00 c1620208 c2849964 0000069c 00000008
[ 8831.429799] dde0: 00000000 c2849ff8 c2849000 c020c40b c09c6a5c c974de34 c974de30 00000000
[ 8831.438154] de00: 00000004 0000069c 00001000 c2937bc0 00000208 c1620000 00000000 000a0000
[ 8831.446665] de20: 00000000 00000000 0000000a 00070000 c2849000 00000000 00001000 00000000
[ 8831.455184] de40: 00000000 00000000 00010000 c2805e00 00000000 00000000 00200000 c2937e00
[ 8831.463704] de60: 00000000 00008000 c0975a98 c020e02d c1001300 00000200 c020f3eb 00000dc0
[ 8831.472062] de80: c2937e00 00000000 c020f3eb c2805e00 c156fc80 c29b9400 00200000 c2937e00
[ 8831.480584] dea0: 00000000 c020f42d c29b9400 c156fc80 c09c6a64 c29b9400 c1298c00 c156fc80
[ 8831.489100] dec0: c020f5c9 c02a853d 0000003a 00000000 c156fc80 c020f5c9 c974df58 c02a85ff
[ 8831.497458] dee0: c156fc80 c020f5c9 00000000 c020f5bf c156fc80 c156fc80 00000020 c016fdc3
[ 8831.505976] df00: c156fc80 00000000 c1374780 c0115c75 00000000 00000000 c156fc80 c0182209
[ 8831.514497] df20: c2937e00 c974df58 c2937f80 00000000 c2937e00 c2937f80 00008000 00000000
[ 8831.523015] df40: 0006d71a 00008000 00068991 c0182307 00000000 c015048b c1360d90 c18b3a18
[ 8831.531370] df60: 00001000 c2937f80 c2937e00 00000000 00000015 c0182539 00000000 0006d71a
[ 8831.539881] df80: 0006d725 00000000 0006d70c 0006d71a 00000015 c010023c c1374780 00000015
[ 8831.548399] dfa0: 00068991 c0100041 00000000 0006d70c 0006d70c 0006d71a 0006d725 00008000
[ 8831.556908] dfc0: 00000000 0006d70c 0006d71a 00000015 00008000 00000000 0006d938 00068991
[ 8831.565263] dfe0: b6ea606c beb82b9c 00039818 b6ea607c a0000010 0006d70c 00000000 00000000
[ 8831.573779]  jffs2_link_node_ref from jffs2_sum_scan_sumnode+0x1a5/0x320
[ 8831.580860]  jffs2_sum_scan_sumnode from jffs2_scan_medium+0x2d7/0xab0
[ 8831.587595]  jffs2_scan_medium from jffs2_do_mount_fs+0xeb/0x356
[ 8831.593958]  jffs2_do_mount_fs from jffs2_do_fill_super+0xf7/0x182
[ 8831.600332]  jffs2_do_fill_super from mtd_get_sb+0x61/0x98
[ 8831.606176]  mtd_get_sb from get_tree_mtd+0x4f/0xe8
[ 8831.611241]  get_tree_mtd from vfs_get_tree+0x13/0x7c
[ 8831.616648]  vfs_get_tree from path_mount+0x409/0x4d4
[ 8831.621891]  path_mount from do_mount+0x33/0x40
[ 8831.626593]  do_mount from sys_mount+0xeb/0xfe
[ 8831.631367]  sys_mount from ret_fast_syscall+0x1/0x5c
[ 8831.636591] Exception stack(0xc974dfa8 to 0xc974dff0)
[ 8831.641961] dfa0:                   00000000 0006d70c 0006d70c 0006d71a 0006d725 00008000
[ 8831.650314] dfc0: 00000000 0006d70c 0006d71a 00000015 00008000 00000000 0006d938 00068991
[ 8831.658818] dfe0: b6ea606c beb82b9c 00039818 b6ea607c
[ 8831.664195] Code: 6a63 b085 990a b903 (de02) 6ae5
[ 8831.669135] ---[ end trace 0000000000000000 ]---
[ 8831.673902] Kernel panic - not syncing: Fatal exception

The previous aborted I/Os from a dd may also play a role here but the crash is
clearly a cause of the interrupted transfers. It's a bit odd that it is an
undefined instruction but probably it was a stack overflow.

Looking back in the SPI driver history I see some occasions where interruptible
transfers were changed to non-interruptible transfers because filesystems like
jffs2 used SIGKILL during unmount or maybe other reasons:

775c4c0032c4 "spi: stm32-qspi: remove signal sensitive on completion"
26cfc0dbe43a "spi-zynq-qspi: use wait_for_completion_timeout to make zynq_qspi_exec_mem_op not interruptible"
7f3ac71ac3b0 "spi: davinci: fix spurious i/o error"

Having them in spi-atmel now seems a bit off and even a source of misbehaviour
and even crashes.

- ron


________________________________

Ce message, ainsi que tous les fichiers joints à ce message, peuvent contenir des informations sensibles et/ ou confidentielles ne devant pas être divulguées. Si vous n'êtes pas le destinataire de ce message (ou que vous recevez ce message par erreur), nous vous remercions de le notifier immédiatement à son expéditeur, et de détruire ce message. Toute copie, divulgation, modification, utilisation ou diffusion, non autorisée, directe ou indirecte, de tout ou partie de ce message, est strictement interdite.


This e-mail, and any document attached hereby, may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized, direct or indirect, copying, disclosure, distribution or other use of the material or parts thereof is strictly forbidden.





[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux