With the big caveat that I'm completely unfamiliar with this code, I think the problem is here: https://github.com/torvalds/linux/blame/ccb98ccef0e543c2bd4ef1a72270461957f3d8d0/mm/filemap.c#L2989 "bsz" is a 32-bit type on 32-bit kernels, so when it is used later in that same function to mask the 64-bit "start" value with "~(bsz - 1)", the mask itself is only 32 bits wide. Assuming the type is unsigned, the mask is zero-extended to 64 bits before the AND, which clears the upper 32 bits of "start" and effectively truncates it to 32 bits. This is more or less confirmed by the actual values of "start_byte" and "punch_start_byte" when that WARN_ON_ONCE in buffered-io.c triggers: one is (close to) a 32-bit truncated version of the other. Changing "bsz" to a 64-bit type fixes the problem for me.
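To illustrate the truncation outside the kernel, here is a minimal userspace sketch. It is not the kernel code: the function names are made up for this example, and uint32_t / uint64_t stand in for whatever types "bsz" and "start" have on a 32-bit build (I'm assuming "bsz" is unsigned there).

```c
#include <stdint.h>

/* Hypothetical helper: masks a 64-bit offset using a 32-bit block size,
 * mimicking what I believe happens on 32-bit kernels. */
uint64_t mask_with_32bit_bsz(uint64_t start, uint32_t bsz)
{
    /* ~(bsz - 1) is computed in 32 bits, e.g. 0xFFFFF000 for bsz = 4096.
     * It is then zero-extended to 0x00000000FFFFF000 for the AND,
     * so the upper 32 bits of start are cleared. */
    return start & ~(bsz - 1);
}

/* Hypothetical helper: same masking, but with a 64-bit block size. */
uint64_t mask_with_64bit_bsz(uint64_t start, uint64_t bsz)
{
    /* ~(bsz - 1) is computed in 64 bits, e.g. 0xFFFFFFFFFFFFF000,
     * so the upper 32 bits of start survive. */
    return start & ~(bsz - 1);
}
```

With start = 0x100001234 (just past 4 GiB) and bsz = 4096, the first version returns 0x1000 and the second returns 0x100001000, which is the kind of "32-bit truncated" pair I see in the warning.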