On 2019-8-18 10:53, Matthew Wilcox wrote: > On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote: >> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote: >>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote: >>>> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx) >>>> unsigned int nameoff, maxsize; >>>> >>>> dentry_page = read_mapping_page(mapping, i, NULL); >>>> - if (IS_ERR(dentry_page)) >>>> - continue; >>>> + if (IS_ERR(dentry_page)) { >>>> + errln("fail to readdir of logical block %u of nid %llu", >>>> + i, EROFS_V(dir)->nid); >>>> + err = PTR_ERR(dentry_page); >>>> + break; >>> >>> I don't think you want to use the errno that came back from >>> read_mapping_page() (which is, I think, always going to be -EIO). >>> Rather you want -EFSCORRUPTED, at least if I understand the recent >>> patches to ext2/ext4/f2fs/xfs/... >> >> Thanks for your reply and noticing this. :) >> >> Yes, as I talked with you about read_mapping_page() in a xfs related >> topic earlier, I think I fully understand what returns here. >> >> I actually had some concern about that before sending out this patch. >> You know the status is >> PG_uptodate is not set and PG_error is set here. >> >> But we cannot know it is actually a disk read error or due to >> corrupted images (due to lack of page flags or some status, and >> I think it could be a waste of page structure space for such >> corrupted image or disk error)... >> >> And some people also like propagate errors from insiders... >> (and they could argue about err = -EFSCORRUPTED as well..) >> >> I'd like hear your suggestion about this after my words above? >> still return -EFSCORRUPTED? > > I don't think it matters whether it's due to a disk error or a corrupted > image. We can't read the directory entry, so we should probably return > -EFSCORRUPTED. Thinking about it some more, read_mapping_page() can > also return -ENOMEM, so it should probably look something like this: > > err = 0; > if (dentry_page == ERR_PTR(-ENOMEM)) > err = -ENOMEM; > else if (IS_ERR(dentry_page)) { > errln("fail to readdir of logical block %u of nid %llu", > i, EROFS_V(dir)->nid); > err = -EFSCORRUPTED; Well, if there is real IO error happen under filesystem, we should return -EIO instead of EFSCORRUPTED? The right fix may be that doing sanity check on on-disk blkaddr, and return -EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller erofs_readdir(), IIUC below error info. > [36297.354090] attempt to access beyond end of device > [36297.354098] loop17: rw=0, want=29887428984, limit=1953128 > [36297.354107] attempt to access beyond end of device > [36297.354109] loop17: rw=0, want=29887428480, limit=1953128 > [36301.827234] attempt to access beyond end of device > [36301.827243] loop17: rw=0, want=29887428480, limit=1953128 Thanks, > } > > if (err) > break; >