Re: [PATCH] read-cache: avoid misaligned reads in index v4

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King wrote:
> On Fri, Sep 23, 2022 at 07:43:55PM +0000, Victoria Dye via GitGitGadget wrote:
>> @@ -1883,7 +1883,7 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
>>  	size_t len;
>>  	const char *name;
>>  	const unsigned hashsz = the_hash_algo->rawsz;
>> -	const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);
>> +	const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
> 
> Now we use the "const char *" pointer instead of the cast to the
> ondisk_cache_entry struct, which is good, and is what fixes the
> alignment question.
> 
> But we also convert flagsp from being a uint16_t into a byte pointer.
> I'm not sure if that's strictly necessary from an alignment perspective,
> as we'd dereference it only via get_be16(), which handles alignment and
> type conversion itself.
> 
> I'd imagine the standard probably says that even forming such a pointer
> is illegal, so in that sense, it probably is undefined behavior. But I
> think it's one of those things that's OK in practice.

Yep, per the C standard §6.3.2.3 #7 [1]:

  A pointer to an object or incomplete type may be converted to a pointer to
  a different object or incomplete type. If the resulting pointer is not
  correctly aligned for the pointed-to type, the behavior is undefined.

To your point, it is probably fine in practice, but I'd lean towards
sticking with a 'char *' to play it safe.

[1] https://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

>> @@ -1935,20 +1935,24 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
>>  
>>  	ce = mem_pool__ce_alloc(ce_mem_pool, len);
>>  
>> -	ce->ce_stat_data.sd_ctime.sec = get_be32(&ondisk->ctime.sec);
>> [...]
>> +	ce->ce_stat_data.sd_ctime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ctime)
>> +							+ offsetof(struct cache_time, sec));
> 
> I had figured we'd be able to drop ondisk_cache_entry entirely. But here
> you're using it essentially as a template for a set of constants
> retrieved via offsetof().
> 
> That's OK from an alignment perspective. It does mean we'd be in trouble
> if a compiler ever decided to introduce padding into the struct. That's
> probably unlikely. We don't use __attribute__((packed)) because it's not
> portable, and our existing uses have generally been OK, because our
> data structures are organized around 8-byte alignment. We might have
> problems on a theoretical 128-bit processor or something.

In addition to portability, using '__attribute__((packed))' could hurt
performance (and, in a large index, that might have a noticeable effect).

As for dropping 'ondisk_cache_entry()', I didn't want to drop it only from
the "read" operation (and use something like the "parse left-to-right"
strategy below) while leaving it in "write." And, as you mentioned later,
changing 'ce_write_entry()' is a lot more invasive than what's already in
this patch and possibly out-of-scope.

> Another strategy is to just parse left-to-right, advancing the byte
> pointer. Like:
> 
>   ce->ce_state_data.sd_ctime.sec = get_be32(ondisk);
>   ondisk += sizeof(uint32_t);
>   ce->ce_state_data.sd_mtime.sec = get_be32(ondisk);
>   ondisk += sizeof(uint32_t);
>   ...etc...
> 
> You can even stick that in a helper function that does the get_b32() and
> advances, so you know they're always done in sync. See pack-bitmap.c's
> read_be32(), etc. IMHO this produces a nice result because the reading
> code itself becomes the source of truth for the format.
> 

...

> One final note, though:
> 
>> +	ce->ce_stat_data.sd_mtime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime)
>> +							+ offsetof(struct cache_time, sec));
> 
> Here (and elsewhere), you can assume that the offsetof() "sec" in
> cache_time is 0, for two reasons:
> 
>   - I didn't look up chapter and verse, but I'm pretty sure the standard
>     does guarantee that the first field of a struct is at the beginning.
> 
>   - If there's any padding, this whole scheme is hosed anyway, because
>     it means sizeof(cache_time) is bigger than we expect, which messes
>     up the offsetof() the entry after us (in this case sd_dev).
> 
> So this can just be:
> 
>   ce->ce_stat_data.sd_mtime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime));
> 
> which is mercifully shorter.
> 
> Assuming we dismiss the rest of what I said as not worth it for a
> minimal fix, I do think that simplification is worth rolling a v2.

That makes sense from a technical perspective, but I included the starting
entry offset for readability reasons. It might be confusing to someone
unfamiliar with C struct memory alignment to see every other 'get_be32'
refer to the exact entry it's reading via the 'offsetof()', but have that
information absent only for a few entries. And, the double 'offsetof()'
would still be used by the 'mtime.nsec'/'ctime.nsec' fields anyway.

In any case, if this patch is intended to be a short-lived change on the way
to a more complete refactor and/or I'm being overzealous on the readability,
I'd be happy to change it. :) 

Thanks!

> 
> -Peff
> 
> PS BTW, I mentioned earlier "can we just get rid of ondisk_cache_entry".
>    We also use it for the writing side, of course. That doesn't have
>    alignment issues, but it does have the same "I hope there's never any
>    padding" question. In an ideal world, it would be using the
>    equivalent put_be32(), but again, that's getting out of the "minimal
>    fix" territory.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux