Re: [PATCH 4/6] pack-bitmap.c: factor out manual `map_pos` manipulation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 20, 2023 at 04:02:49PM -0400, Taylor Blau wrote:

> The pack-bitmap internals use a combination of a `map` and `map_pos`
> variable to keep a pointer to a memory mapped region of the `.bitmap`
> file, as well as the position within that file, respectively.
> 
> Reads within the memory mapped region are meant to mimic file reads,
> where each read adjusts the offset of the file descriptor accordingly.
> This functionality is implemented by adjusting the `map_pos` variable
> after each read.
> 
> Factor out a function similar to seek() which adjusts the `map_pos`
> variable for us. Since the bitmap code only needs to adjust relative to
> the beginning of the file as well as the current position, we do not
> need to support any "whence" values outside of SEEK_SET and SEEK_CUR.

I like the idea of centralizing the bounds checks here.

I do think copying lseek() is a little awkward, though. As you note, we
only care about SEEK_SET and SEEK_CUR, and we have to BUG() on anything
else. Which means we have a run-time check, rather than a compile time
check. If we had two functions like:

  void bitmap_index_seek_to(struct bitmap_index *bitmap_git, size_t pos)
  {
	bitmap_git->map_pos = pos;
	if (bitmap_git->map_pos >= bitmap_git->map_size)
	   ...etc...
  }

  void bitmap_index_incr(struct bitmap_index *bitmap_git, size_t offset)
  {
	bitmap_index_seek_to(bitmap_git, st_add(bitmap_git->map_pos, offset));
  }

then the compiler can see all cases directly. I dunno how much it
matters.

The other thing that's interesting from a bounds-checking perspective is
that you're checking the seek _after_ you've read an item. Which has two
implications:

  - I don't think we could ever overflow size_t here. We are working
    with a buffer that we got from mmap(), so it's already probably
    bounded to some much smaller subset of size_t. And even if we
    imagine that you really could get a buffer that stretches for the
    whole of the memory space, and that incrementing it by 4 bytes would
    overflow, we'd by definition have just overflowed the memory space
    itself by reading 4 bytes (and presumably segfaulted). So I really
    doubt this st_add() is doing anything.

  - The more interesting case is that we are not overflowing size_t, but
    simply walking past bitmap_git->map_size. But again, we are reading
    first and then seeking. So if our seek does go past, then by
    definition we just read garbage bytes, which is undefined behavior.

    For a bounds-check, wouldn't we want it the other way around? To
    make sure we have 4 bytes available, and then if so read them and
    increment the offset?

> +	if (bitmap_git->map_pos >= bitmap_git->map_size)
> +		BUG("bitmap position exceeds size (%"PRIuMAX" >= %"PRIuMAX")",
> +		    (uintmax_t)bitmap_git->map_pos,
> +		    (uintmax_t)bitmap_git->map_size);

This uses ">=", which is good, because it is valid to walk the pointer
to one past the end of the map, which is effectively EOF. But as above,
in that condition the callers should be checking for this EOF state
before reading the bytes.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux