On Sat, 3 Aug 2024 at 15:50, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > + unsigned int n = open_files / BITS_PER_LONG; > + if (n % BITS_PER_LONG) { > + unsigned long mask = BITMAP_LAST_WORD_MASK(n); > + new_fdt->full_fds_bits[n / BITS_PER_LONG] &= mask; > + } Ouch. I hate this. And yes, clearly this fixes a bug, and clearly we don't maintain the full_fds_bits[] array properly, but I still hate it. The fact that we had this bug in the first place shows that our full_fds_bits code is too subtle, and this makes it worse. I really would prefer to do this in copy_fd_bitmaps() instead. Also, your comment says ... nothing of that sort happens to open_fds and * close_on_exec, since there the part that needs to be copied * will always fall on a word boundary. but it doesn't really explain *why* those things are on a word boundary. So I think part of the commentary should be that we always make sure that the normal bitmaps are full words (pointing out the ALIGN(min(count, max_fds), BITS_PER_LONG) in sane_fdtable_size), and then explaining that the reason the 'full_fds_bits' bitmap is special is that it is not a bitmap of file descriptors, it's a bitmap of 'words of file descriptor bitmaps'. So yes, the full_fds_bits bitmap is very special indeed, and yes, we got this all wrong, but I feel like the bug is in copy_fd_bitmaps(), and the fixup should be there too... In fact, I think the full_fds_bits copy in copy_fd_bitmaps() should just be entirely rewritten. Even the initial copy shouldn't be done with some byte-wise memcpy/memset, since those are all 'unsigned long' arrays, and the size is aligned. So it should be done on words, but whatever. And the full_fds_bits case should use our actual bitmap functions. So instead of cpy = BITBIT_SIZE(count); set = BITBIT_SIZE(nfdt->max_fds) - cpy; memcpy(nfdt->full_fds_bits, ofdt->full_fds_bits, cpy); memset((char *)nfdt->full_fds_bits + cpy, 0, set); I think the code could (and should) do just cpy = count / BITS_PER_LONG; max = nfdt->max_fds / BITS_PER_LONG; bitmap_copy(nfdt->full_fds_bits, ofdt->full_fds_bits, cpy); bitmap_clear(nfdt->full_fds_bits, count, max - cpy); or something like that. Doesn't that seem more straightforward? Of course, it's also entirely possible that I have just confused myself, and the above is buggy garbage. I tried to think this through, but I really tried to think more along the lines of "how can we make this legible so that the code is actually understandable". Because I think your patch almost certainly fixes the bug, but it sure isn't easy to understand. Linus