On Fri, Apr 12, 2024 at 12:13:42AM +0800, Gao Xiang wrote: > Hi Al, > > On 2024/4/7 12:05, Al Viro wrote: > > On Sat, Apr 06, 2024 at 05:09:12PM +0800, Yu Kuai wrote: > > > From: Yu Kuai <yukuai3@xxxxxxxxxx> > > > > > > Now that all filesystems stash the bdev file, it's ok to get inode > > > for the file. > > > > Looking at the only user of erofs_buf->inode (erofs_bread())... We > > use the inode for two things there - block size calculation (to get > > from block number to position in bytes) and access to page cache. > > We read in full pages anyway. And frankly, looking at the callers, > > we really would be better off if we passed position in bytes instead > > of block number. IOW, it smells like erofs_bread() having wrong type. > > > > Look at the callers. With 3 exceptions it's > > fs/erofs/super.c:135: ptr = erofs_bread(buf, erofs_blknr(sb, *offset), EROFS_KMAP); > > fs/erofs/super.c:151: ptr = erofs_bread(buf, erofs_blknr(sb, *offset), EROFS_KMAP); > > fs/erofs/xattr.c:84: it.kaddr = erofs_bread(&it.buf, erofs_blknr(sb, it.pos), EROFS_KMAP); > > fs/erofs/xattr.c:105: it.kaddr = erofs_bread(&it.buf, erofs_blknr(sb, it.pos), > > fs/erofs/xattr.c:188: it->kaddr = erofs_bread(&it->buf, erofs_blknr(sb, it->pos), > > fs/erofs/xattr.c:294: it->kaddr = erofs_bread(&it->buf, erofs_blknr(sb, it->pos), > > fs/erofs/xattr.c:339: it->kaddr = erofs_bread(&it->buf, erofs_blknr(it->sb, it->pos), > > fs/erofs/xattr.c:378: it->kaddr = erofs_bread(&it->buf, erofs_blknr(sb, it->pos), > > fs/erofs/zdata.c:943: src = erofs_bread(&buf, erofs_blknr(sb, pos), EROFS_KMAP); > > > > and all of them actually want the return value + erofs_offset(...). IOW, > > we take a linear position (in bytes). Divide it by block size (from sb). > > Pass the factor to erofs_bread(), where we multiply that by block size > > (from inode), see which page will that be in, get that page and return a > > pointer *into* that page. Then we again divide the same position > > by block size (from sb) and add the remainder to the pointer returned > > by erofs_bread(). > > > > IOW, it would be much easier to pass the position directly and to hell > > with block size logics. Three exceptions to that pattern: > > > > fs/erofs/data.c:80: return erofs_bread(buf, blkaddr, type); > > fs/erofs/dir.c:66: de = erofs_bread(&buf, i, EROFS_KMAP); > > fs/erofs/namei.c:103: de = erofs_bread(&buf, mid, EROFS_KMAP); > > > > Those could bloody well multiply the argument by block size; > > the first one (erofs_read_metabuf()) is also interesting - its > > callers themselves follow the similar pattern. So it might be > > worth passing it a position in bytes as well... > > > > In any case, all 3 have superblock reference, so they can convert > > from blocks to bytes conveniently. Which means that erofs_bread() > > doesn't need to mess with block size considerations at all. > > > > IOW, it might make sense to replace erofs_buf->inode with > > pointer to address space. And use file_mapping() instead of > > file_inode() in that patch... > > Just saw this again by chance, which is unexpected. > > Yeah, I think that is a good idea. The story is that erofs_bread() > was derived from a page-based interface: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/erofs/data.c?h=v5.10#n35 > > so it was once a page index number. I think a byte offset will be > a better interface to clean up these, thanks for your time and work > on this! FWIW, see #misc.erofs and #more.erofs in my tree; the former is the minimal conversion of erofs_read_buf() and switch from buf->inode to buf->mapping, the latter follows that up with massage for erofs_read_metabuf(). Completely untested; it builds, but that's all I can promise. Individual patches in followups.