On Fri, Jun 05, 2015 at 01:41:21AM +0000, brian m. carlson wrote: > However, with the object_id conversion, we run into a problem: casting > those unsigned char * values into struct object_id * values is not > allowed by the C standard. There are two possible solutions: copying; > and the just-do-it solution, where we cast and hope for the best. I'm not sure if it does violate the standard. The address of the first element of a struct is guaranteed to match the address of the struct itself. The object_id.hash member is an array of unsigned char, so there are no alignment issues. It might run afoul of rules about casting between pointer types (i.e., pointers for all types are not guaranteed to be the same size). The standard dictates that "char *" and "void *" are the same size, and big enough to hold a pointer to anything, so I think it might even be OK. But I'm not even sure that line of thinking is all that interesting. Even if we are violating some dark corner of the standard, this definitely falls into the "it's useful and works on all sane machines" category. We also do much worse things with struct-casting mmap'd data elsewhere (e.g., see the use of "struct pack_header"). It works fine in practice as long as you are careful about alignment and padding issues. So my vote would be to retain the cast. This is very low-level, performance-sensitive code. I did some very naive timings and didn't see any measurable change from your patch, but I also don't think we are seeing a real portability benefit to moving to the copy, so I'd prefer to keep the status quo. > It looks like we use the latter in nth_packed_object_offset, where we > cast an unsigned char * value to uint32_t *, which is clearly not > allowed. Yes, this one is definitely dubious by the standard. However, it works in practice because the index format is designed to be 4-byte aligned. By contrast, the .bitmap format is not, and we have to use get_be32, etc (which is really not the end of the world, but I do not think there is any real reason to change the .idx code at this point). > I'm to the point where I need to make a decision on how to > proceed, and I've included a patch with the copying conversion of > nth_packed_object_sha1 below for comparison. The casting solution is > obviously more straightforward. I haven't tested either implementation > for performance. The test I did was just running "git rev-list --use-bitmap-index --count HEAD" on a bitmapped linux.git repo. That's where I'd expect it to show the most, because we are not doing much other work. But I think even still, the timing is dominated by loading the bitmap file. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html