On Fri, Jul 15, 2022 at 2:09 PM Nathan Chancellor <nathan@xxxxxxxxxx> wrote:
>
> On Fri, Jul 15, 2022 at 01:36:17PM +1000, Dave Airlie wrote:
> > Matthew Auld (1):
> >       drm/i915/ttm: fix sg_table construction
>
> This patch breaks i386_defconfig with both GCC and clang:
>
> ld: drivers/gpu/drm/i915/i915_scatterlist.o: in function `i915_rsgt_from_mm_node':
> i915_scatterlist.c:(.text+0x1a7): undefined reference to `__udivdi3'

Yeah, we definitely don't want arbitrary 64x64 divides in the kernel,
and the fact that we don't include libgcc.a once again caught this
horrid issue.

The offending code is

        if (sg_alloc_table(st, DIV_ROUND_UP(node->size, segment_pages),
                           GFP_KERNEL)) {

and I have to say that all of those games with "u64 page_alignment"
that commit aff1e0b09b54 ("drm/i915/ttm: fix sg_table construction")
introduces are absolutely disgusting.

And they are just *pointlessly* disgusting.

Why is that "page_alignment" a "u64"? And why is it a "size", instead
of being a "number of bits"?

The code literally does things like

        const u64 max_segment = round_down(UINT_MAX, page_alignment);

which means that

 (a) page_alignment must be a power-of-two for this to work
     (round_down() only works on powers of two)

 (b) the result obviously must fit in an "unsigned int", since it's
     rounding down UINT_MAX!

So (a) makes it doubtful that "page_alignment" should have been a
value (as opposed to a mask), and (b) then questions why that was made
a "u64" value when it cannot have a u64 range.

And if max_segment cannot have a 64-bit range, then segment_pages here:

        u64 segment_pages = max_segment >> PAGE_SHIFT;

sure cannot either.

Fixing those then uncovers other things:

        len = min(block_size, max_segment - sg->length);

now complains about mixing types ("max_segment - sg->length" being
u32), because 'block_size' is 64-bit. That does seem to make some
amount of sense:

        block_size = node->size << PAGE_SHIFT;

with 'node->size' coming from drm_mm_node, and that size is a 'u64'.
That I *could* see being more than 32 bits on a 64-bit architecture.

Ok. But then that means that 'len' cannot be a 64-bit value either,
and it should probably have been

        u32 len;
        ..
        len = min_t(u64, block_size, max_segment - sg->length);

and that would just have been a lot nicer on 32-bit x86, avoiding a
lot of pointlessly 64-bit things.

That said, even those type simplifications do not fix the fundamental
issue: that "DIV_ROUND_UP()" still ends up being a 64-bit divide,
although now it's at least a "64-by-32" bit divide. It needs to be
handled by "do_div()" rather than the generic DIV_ROUND_UP() helper,
because sadly, at least gcc still generates a full __udivdi3() even
for 64-by-32 divides.

Can Intel GPU people please take a look?

                Linus
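
[Editor's note: for readers unfamiliar with the do_div() pattern Linus
refers to, below is a minimal sketch of how the offending call could be
written to avoid the libgcc __udivdi3 helper on 32-bit. It is only an
illustration under the assumption that 'segment_pages' becomes a 32-bit
value after the type fixes discussed above; it is not the actual fix
that went into the driver.]

        /* needs <asm/div64.h> for do_div() */
        u64 nents = node->size;         /* page count from drm_mm_node, u64 */

        /*
         * do_div() divides a 64-bit dividend by a 32-bit divisor in
         * place (the quotient ends up in 'nents', the remainder is the
         * return value), so the round-up is open-coded instead of
         * using DIV_ROUND_UP() on two 64-bit operands.
         */
        nents += segment_pages - 1;
        do_div(nents, segment_pages);

        if (sg_alloc_table(st, nents, GFP_KERNEL)) {
                ...
        }

As in the original code, the quotient is then implicitly narrowed to the
'unsigned int' nents argument that sg_alloc_table() takes.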