On Mon, Jan 21, 2019 at 07:48:33AM -0800, Christoph Hellwig wrote: > On Fri, Jan 18, 2019 at 01:39:15PM -0500, Brian Foster wrote: > > This doesn't really seem all that different to me. Rather than firm up > > the range in the caller, we turn XFS_BMAPI_DELALLOC into something that > > seemingly behaves a bit more like CONVERT_ONLY. > > The differences are: > > (a) we save one lookup in the extent tree > (b) we have a less fragile API > > > A few notes/thoughts: > > > > 1. What's the purpose of the nimaps == 0 check in > > xfs_iomap_write_allocate()? If we got to this point with the page lock > > held, shouldn't we always expect to find something backing the current > > page? > > It protects against the case where we migrate a COW fork mapping to the > data fork, which is not protected by the page lock. But I guess the > check warrants a comment and an assert. > Yeah, probably. It's not really clear to me what that means. > > 2. If so, then it also seems that the whole "eof:" thing in > > xfs_map_blocks() should never happen for data forks. If that's the case, > > the use of the eof label there seems gratuitous. > > Let me try with asserts enabled. > Ok, thanks. > > > > 3. If the starting bno param to xfs_bmapi_write() lands in a hole (due > > to racing with a hole punch for example) and it finds some subsequent > > delalloc extent to convert in the requested range, the arithmetic in > > xfs_bmapi_trim_map() actually fabricates physical blocks for the hole > > portion of the range relative to the startblock of the converted > > delalloc extent. I don't think that causes a problem with the writeback > > code because the fabricated blocks are before the page offset, but those > > semantics are insane. I think you need to do something like fix up > > xfs_bmapi_write() to return the actual converted mapping and make sure > > xfs_iomap_write_allocate() handles that it might start beyond > > map_start_fsb. > > Sounds good. To be honest I think the whole idea of converting the > mapping before the requested offset is a rather bad idea to start > with, just increasin our window that exposes stale data. But to fix > this properly we need to create everything as unwritten first, and > I didn't have time to get back to that series. Eh, I think that's a separate problem that re: your other series, we already know how to fix properly. I don't think we should mess around with something as fairly fundamental as delayed block allocation behavior for an approach that doesn't address the underlying problem. Brian