Hi Al,
On 2024/4/7 12:05, Al Viro wrote:
On Sat, Apr 06, 2024 at 05:09:12PM +0800, Yu Kuai wrote:
From: Yu Kuai <yukuai3@xxxxxxxxxx>
Now that all filesystems stash the bdev file, it's ok to get inode
for the file.
Looking at the only user of erofs_buf->inode (erofs_bread())... We
use the inode for two things there - block size calculation (to get
from block number to position in bytes) and access to page cache.
We read in full pages anyway. And frankly, looking at the callers,
we really would be better off if we passed position in bytes instead
of block number. IOW, it smells like erofs_bread() having wrong type.
Look at the callers. With 3 exceptions it's
fs/erofs/super.c:135: ptr = erofs_bread(buf, erofs_blknr(sb, *offset), EROFS_KMAP);
fs/erofs/super.c:151: ptr = erofs_bread(buf, erofs_blknr(sb, *offset), EROFS_KMAP);
fs/erofs/xattr.c:84: it.kaddr = erofs_bread(&it.buf, erofs_blknr(sb, it.pos), EROFS_KMAP);
fs/erofs/xattr.c:105: it.kaddr = erofs_bread(&it.buf, erofs_blknr(sb, it.pos),
fs/erofs/xattr.c:188: it->kaddr = erofs_bread(&it->buf, erofs_blknr(sb, it->pos),
fs/erofs/xattr.c:294: it->kaddr = erofs_bread(&it->buf, erofs_blknr(sb, it->pos),
fs/erofs/xattr.c:339: it->kaddr = erofs_bread(&it->buf, erofs_blknr(it->sb, it->pos),
fs/erofs/xattr.c:378: it->kaddr = erofs_bread(&it->buf, erofs_blknr(sb, it->pos),
fs/erofs/zdata.c:943: src = erofs_bread(&buf, erofs_blknr(sb, pos), EROFS_KMAP);
and all of them actually want the return value + erofs_offset(...). IOW,
we take a linear position (in bytes). Divide it by block size (from sb).
Pass the factor to erofs_bread(), where we multiply that by block size
(from inode), see which page will that be in, get that page and return a
pointer *into* that page. Then we again divide the same position
by block size (from sb) and add the remainder to the pointer returned
by erofs_bread().
IOW, it would be much easier to pass the position directly and to hell
with block size logics. Three exceptions to that pattern:
fs/erofs/data.c:80: return erofs_bread(buf, blkaddr, type);
fs/erofs/dir.c:66: de = erofs_bread(&buf, i, EROFS_KMAP);
fs/erofs/namei.c:103: de = erofs_bread(&buf, mid, EROFS_KMAP);
Those could bloody well multiply the argument by block size;
the first one (erofs_read_metabuf()) is also interesting - its
callers themselves follow the similar pattern. So it might be
worth passing it a position in bytes as well...
In any case, all 3 have superblock reference, so they can convert
from blocks to bytes conveniently. Which means that erofs_bread()
doesn't need to mess with block size considerations at all.
IOW, it might make sense to replace erofs_buf->inode with
pointer to address space. And use file_mapping() instead of
file_inode() in that patch...
Just saw this again by chance, which is unexpected.
Yeah, I think that is a good idea. The story is that erofs_bread()
was derived from a page-based interface:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/erofs/data.c?h=v5.10#n35
so it was once a page index number. I think a byte offset will be
a better interface to clean up these, thanks for your time and work
on this!
BTW, sightly off the topic:
I'm little confused why I'm not be looped for this version this time
even:
1) I explicitly asked to Cc the mailing list so that I could find
the latest discussion and respond in time:
https://lore.kernel.org/r/5e04a86d-8bbd-41da-95f6-cf1562ed04f9@xxxxxxxxxxxxxxxxx
2) I sent my r-v-b tag on RFC v4 (and the tag was added on this
version) but I didn't receive this new version.
Thanks,
Gao Xiang