On Wed, Aug 21, 2024 at 8:31 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> Currently, running the charge_reserved_hugetlb.sh selftest we can
> sometimes observe something like:
>
> $ ./charge_reserved_hugetlb.sh -cgroup-v2
> ...
> write_result is 0
> After write:
> hugetlb_usage=0
> reserved_usage=10485760
> killing write_to_hugetlbfs
> Received 2.
> Deleting the memory
> Detach failure: Invalid argument
> umount: /mnt/huge: target is busy.
>
> Both cases are issues in the test.
>
> While the unmount error seems to be racy, it will make the test fail:
> $ ./run_vmtests.sh -t hugetlb
> ...
> # [FAIL]
> not ok 10 charge_reserved_hugetlb.sh -cgroup-v2 # exit=32
>
> The issue is that we are not waiting for the write_to_hugetlbfs process
> to quit. So it might still have a hugetlbfs file open, about which
> umount is not happy. Fix that by making "killall" wait for the process
> to quit.
>
> The other error ("Detach failure: Invalid argument") does not seem to
> result in a test error, but is misleading. Turns out write_to_hugetlbfs.c
> unconditionally tries to cleanup using shmdt(), even when we only
> mmap()'ed a hugetlb file. Even worse, shmaddr is never even set for the
> SHM case. Fix that as well.
>
> With this change it seems to work as expected.
>
> Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
> Reported-by: Mario Casquero <mcasquer@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Shuah Khan <shuah@xxxxxxxxxx>
> Cc: Muchun Song <muchun.song@xxxxxxxxx>
> Cc: Mina Almasry <almasrymina@xxxxxxxxxx>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

Initially I thought it could be nice to split fixes for the 2 issues in
separate patches in case one of them ends up needing a revert or
something, but probably not worth a respin. Fixes look good to me.

Reviewed-by: Mina Almasry <almasrymina@xxxxxxxxxx>

> ---
>  .../selftests/mm/charge_reserved_hugetlb.sh   |  2 +-
>  .../testing/selftests/mm/write_to_hugetlbfs.c | 21 +++++++++++--------
>  2 files changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
> index d680c00d2853a..67df7b47087f0 100755
> --- a/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
> +++ b/tools/testing/selftests/mm/charge_reserved_hugetlb.sh
> @@ -254,7 +254,7 @@ function cleanup_hugetlb_memory() {
>  	local cgroup="$1"
>  	if [[ "$(pgrep -f write_to_hugetlbfs)" != "" ]]; then
>  		echo killing write_to_hugetlbfs
> -		killall -2 write_to_hugetlbfs
> +		killall -2 --wait write_to_hugetlbfs

This looks correct. I don't think I expected killall not to wait.

>  		wait_for_hugetlb_memory_to_get_depleted $cgroup
>  	fi
>  	set -e
> diff --git a/tools/testing/selftests/mm/write_to_hugetlbfs.c b/tools/testing/selftests/mm/write_to_hugetlbfs.c
> index 6a2caba19ee1d..1289d311efd70 100644
> --- a/tools/testing/selftests/mm/write_to_hugetlbfs.c
> +++ b/tools/testing/selftests/mm/write_to_hugetlbfs.c
> @@ -28,7 +28,7 @@ enum method {
>
>  /* Global variables. */
>  static const char *self;
> -static char *shmaddr;
> +static int *shmaddr;
>  static int shmid;
>
>  /*
> @@ -47,15 +47,17 @@ void sig_handler(int signo)
>  {
>  	printf("Received %d.\n", signo);
>  	if (signo == SIGINT) {
> -		printf("Deleting the memory\n");
> -		if (shmdt((const void *)shmaddr) != 0) {
> -			perror("Detach failure");
> +		if (shmaddr) {
> +			printf("Deleting the memory\n");
> +			if (shmdt((const void *)shmaddr) != 0) {
> +				perror("Detach failure");
> +				shmctl(shmid, IPC_RMID, NULL);
> +				exit(4);
> +			}
> +
>  			shmctl(shmid, IPC_RMID, NULL);
> -			exit(4);
> +			printf("Done deleting the memory\n");
>  		}
> -
> -		shmctl(shmid, IPC_RMID, NULL);
> -		printf("Done deleting the memory\n");

This seems like a simple refactor to only delete when shmaddr is set,
looks fine to me.

>  	}
>  	exit(2);
>  }
> @@ -211,7 +213,8 @@ int main(int argc, char **argv)
>  			shmctl(shmid, IPC_RMID, NULL);
>  			exit(2);
>  		}
> -		printf("shmaddr: %p\n", ptr);
> +		shmaddr = ptr;
> +		printf("shmaddr: %p\n", shmaddr);
>

Setting shmaddr seems correct and an oversight. I don't see shmaddr
set anywhere in the current code.

>  		break;
>  	default:
> --
> 2.46.0
>

--
Thanks,
Mina
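
For illustration only, the shmaddr guard from the patch can be reduced to
the standalone sketch below. The helper names attach_segment() and
detach_if_attached() are hypothetical and do not exist in
write_to_hugetlbfs.c, and unlike the test this sketch does no cleanup from
a signal handler. It only shows the idea: record the address when shmat()
succeeds, and skip shmdt() entirely when nothing was attached.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical stand-ins for the test's globals; NULL means "never attached". */
static int *shmaddr;
static int shmid = -1;

/*
 * Create a private SysV segment, attach it, and remember the address.
 * Returns 0 on success, -1 on failure (errno is set by shmget/shmat).
 */
static int attach_segment(size_t size)
{
	void *ptr;

	shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
	if (shmid < 0)
		return -1;

	ptr = shmat(shmid, NULL, 0);
	if (ptr == (void *)-1) {
		/* Attach failed: remove the segment so it does not leak. */
		shmctl(shmid, IPC_RMID, NULL);
		shmid = -1;
		return -1;
	}

	/* This assignment is the step the patch adds for the SHM case. */
	shmaddr = ptr;
	return 0;
}

/*
 * Detach and remove the segment only if one was attached.
 * Returns 0 if nothing was attached or cleanup succeeded,
 * -1 if shmdt() failed (the segment is still marked for removal).
 */
static int detach_if_attached(void)
{
	int ret = 0;

	if (!shmaddr)
		return 0;	/* mmap-only path: nothing to detach */

	if (shmdt(shmaddr) != 0)
		ret = -1;

	shmctl(shmid, IPC_RMID, NULL);
	shmaddr = NULL;
	shmid = -1;
	return ret;
}
```

Calling detach_if_attached() without a prior attach_segment() returns 0
without touching shmdt(), which is the case that previously produced the
misleading "Detach failure" message.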