On Tue, 2021-03-23 at 16:29 +0000, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 05:22:03PM +0100, mwilck@xxxxxxxx wrote: > > Avoid this by simply not resetting nr_pages to 0 after allocating the > > bio. This way, the client receives an IO error when it tries to send > > requests exceeding the devices max_sectors_kb, and eventually gets > > it right. The client must still limit max_sectors_kb e.g. by an udev > > rule if (like in my case) the driver doesn't report valid block > > limits, otherwise it encounters I/O errors. > > FYI, I think the what you did here is correct, but not enough. > When pscsi_get_bio (that is bio_kmalloc) fails, this function needs > to unwind and return an error insted of blindly retrying the > allocation, > else we can't recover from a memory shortage. I can try to do that, but in the tests I ran, I never observed bio_kmalloc() failing. I saw it eat all memory, and various processes being killed by the OOM killer, before the system eventually panicked with OOM. It takes only fractions of a second until this happens: [ 57.356238] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=lightdm-gtk-gre,pid=4586,uid=481 [ 57.369635] Out of memory: Killed process 4586 (lightdm-gtk-gre) total-vm:1035752kB, anon-rss:0kB, file-rss:2936kB, shmem-rss:0kB ... [ 57.698656] Node 0 Normal free:54828kB min:55008kB low:68760kB high:82512kB active_anon:4kB inactive_anon:0kB active_file:936kB inactive_file:1164kB unevictable:58036kB writepending:720kB present:13631488kB managed:13310252kB mlocked:58036kB kernel_stack:5808kB pagetables:8972kB bounce:0kB free_pcp:564kB local_pcp:0kB free_cma:0kB ... [ 57.818978] Unreclaimable slab info: [ 58.254160] kmalloc-192 15683546KB 15683546KB (system has 16GiB memory). Martin