On Mon 20-03-23 12:18:38, Fabio M. De Francesco wrote: > On giovedì 16 marzo 2023 11:30:21 CET Fabio M. De Francesco wrote: > > On giovedì 16 marzo 2023 10:00:35 CET Jan Kara wrote: > > > On Wed 15-03-23 19:08:57, Fabio M. De Francesco wrote: > > > > On mercoledì 1 marzo 2023 15:14:16 CET Al Viro wrote: > > > > > On Wed, Mar 01, 2023 at 02:00:18PM +0100, Jan Kara wrote: > > > > > > On Wed 01-03-23 12:20:56, Fabio M. De Francesco wrote: > > > > > > > On venerdì 24 febbraio 2023 04:26:57 CET Al Viro wrote: > > > > > > > > Fabio's "switch to kmap_local_page()" patchset (originally > > > > after > > > > > > > > > > the > > > > > > > > > > > > > > > > ext2 counterpart, with a lot of cleaning up done to it; as the > > > > > > > > matter > > > > > > > > of > > > > > > > > > > > > fact, ext2 side is in need of similar cleanups - calling > > > > conventions > > > > > > > > > > there > > > > > > > > are bloody awful). > > > > > > > > [snip] > > > > > > > > > I think I've pushed a demo patchset to vfs.git at some point back in > > > > > January... Yep - see #work.ext2 in there; completely untested, though. > > > > > > > > The following commits from the VFS tree, #work.ext2 look good to me. > > > > > > > > f5b399373756 ("ext2: use offset_in_page() instead of open-coding it as > > > > subtraction") > > > > c7248e221fb5 ("ext2_get_page(): saner type") > > > > 470e54a09898 ("ext2_put_page(): accept any pointer within the page") > > > > 15abcc147cf7 ("ext2_{set_link,delete_entry}(): don't bother with > > > > page_addr") > > > > > > 16a5ee2027b7 ("ext2_find_entry()/ext2_dotdot(): callers don't need > > > > page_addr > > > > > > anymore") > > > > > > > > Reviewed-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx> > > > > > > Thanks! > > > > > > > I could only read the code but I could not test it in the same QEMU/KVM > > > > x86_32 VM where I test all my HIGHMEM related work. > > > > > > > > Btrfs as well as all the other filesystems I converted to > > > > kmap_local_page() > > > > > > don't make the processes in the VM to crash, whereas the xfstests on > ext2 > > > > trigger the OOM killer at random tests (only sometimes they exit > > > > gracefully). > > > > > > > > FYI, I tried to run the tests with 6GB of RAM, booting a kernel with > > > > HIGHMEM64GB enabled. I cannot add my "Tested-by" tag. > > > > > > Hum, interesting. Reading your previous emails this didn't seem to happen > > > before applying this series, did it? > > > > I wrote too many messages but was probably not able to explain the facts > > properly. Please let me summarize... > > > > 1) When testing ext2 with "./check -g quick" in a QEMU/KVM x86_32 VM, 6GB > RAM, > > booting a Vanilla kernel 6.3.0-rc1 with HIGHMEM64GB enabled, the OOM Killer > > kicks in at random tests _with_ and _without_ Al's patches. > > > > 2) The only case which does never trigger the OOM Killer is running the > tests > > on ext2 formatted filesystems in loop disks with the stock openSUSE kernel > > which is the 6.2.1-1-pae. > > > > 3) The same "./check -g quick" on 6.3.0-rc1 runs always to completion with > > other filesystems. I ran xfstests several times on Btrfs and I had no > > problems. > > > > 4) I cannot git-bisect this issue with ext2 because I cannot trust the > results > > on any particular Kernel version. I mean that I cannot mark any specific > > version neither "good" or "bad" because it happens that the same "good" > > version instead make xfstests crash at the next run. > > > > My conclusion is that we probably have some kind of race that makes the > random > > tests crash at random runs of random Kernel versions between (at least) SUSE > > 6.2.1 and Vanilla current. > > > > But it may be very well the case that I'm doing something stupid (e.g., with > > QEMU configuration or setup_disks or I can't imagine whatever else) and that > > I'm unable to see where I make mistakes. After all, I'm still a newcomer > with > > little experience :-) > > > > Therefore, I'd suggest that someone else try to test ext2 in an x86_32 VM. > > However, I'm 99.5% sure that Al's patches are good by the mere inspection of > > his code. > > > > I hope that this summary contains everything that may help. > > > > However, I remain available to provide any further information and to give > my > > contribution if you ask me for specific tasks. > > > > For my part I have no idea how to investigate what is happening. In these > > months I have run the VM hundreds of times on the most disparate filesystems > > to test my conversions to kmap_local_page() and I have never seen anything > > like this happen. > > > > Thanks, > > > > Fabio > > > > > > Honza > > > > > -- > > > Jan Kara <jack@xxxxxxxx> > > > SUSE Labs, CR > > I can't yet figure out which conditions lead to trigger the OOM Killer to kill > the XFCE Desktop Environment, and the xfstests (which I usually run into the > latter). After all, reserving 6GB of main memory to a QEMU/KVM x86_32 VM had > always been more than adequate. > > So, I thought I'd better ignore that 6GB for a 32 bit architecture are a > notable amount of RAM and squeezed some more from the host until I went to > reserve 8GB. I know that this is not what who is able to find out what > consumes so much main memory would do, but wanted to get the output from the > tests, one way or the other... :-( > > OK, I could finally run my tests to completion and had no crashes at all. I > ran "./check -g quick" on one "test" + three "scratch" loop devices formatted > with "mkfs.ext2 -c". I ran three times _with_ and then three times _without_ > Al's following patches cloned from his vfs tree, #work.ext2 branch: > > f5b399373756 ("ext2: use offset_in_page() instead of open-coding it as > subtraction") > c7248e221fb5 ("ext2_get_page(): saner type") > 470e54a09898 ("ext2_put_page(): accept any pointer within the page") > 15abcc147cf7 ("ext2_{set_link,delete_entry}(): don't bother with page_addr") > 16a5ee2027b7 ("ext2_find_entry()/ext2_dotdot(): callers don't need > > All the six tests were no longer killed by the Kernel :-) > > I got 144 failures on 597 tests, regardless of the above listed patches. > > My final conclusion is that these patches don't introduce regressions. I see > several tests that produce memory leaks but, I want to stress it again, the > failing tests are always the same with and without the patches. > > therefore, I think that now I can safely add my tag to all five patches listed > above... > > Tested-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx> Thanks for the effort! Al, will you submit these patches or should I just pull your branch into my tree? Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR