On Tue, Oct 24, 2017 at 01:45:27PM -0400, Brian Foster wrote: > The XFS ->writepage() cached mapping is racy in that the mapping can > change as a result of external factors after it has been looked up > and cached. If the current write_cache_pages() instance has to > handle subsequent pages over the affected mapping, writeback can > submit I/O to the wrong place, causing data loss and possibly > corruption in the process. > > To support the ability to manufacture this problem and effectively > regression test it from userspace, introduce an error injection tag > that triggers a fixed delay during ->writepage(). The delay occurs > immediately after a mapping is cached. Once userspace triggers > writeback, the delay provides userspace with a five second window to > perform other operations on the file to attempt to invalidate the > mapping. > > Note that this tag is intended to be used by xfstests rather than > for generic error handling testing. The lifetime of this tag should > be tethered to the existence of targeted regression tests for the > writepage mapping validity problem. Won't that effectively be 'forever' since we'll want to make sure the problem doesn't come back, even if Dave's RFC rework eventually makes it upstream? I was also wondering if maybe some day we might want an error log whose numeric setting is something other than "frequency of occurrence"... like this one, which could become a writepage_delay_ms to inject arbitrary amounts of delay into writeback(?) But I might just be babbling here. :) --D > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > --- > > Hi all, > > I'm posting this as an RFC for now because I'd like to try and see if > there's a reproducer for this that doesn't rely on error injection. I > need to think about that a bit more. In the meantime, I wanted to post > this as a POC for the associated problem. This does indeed confirm that > it's still possible to send I/O off to the wrong place outside the > eofblocks variant of the problem that was previously fixed (and more > easily reproduced). > > I'll post the assocated xfstests test that demonstrates this problem in > reply to this patch. The test reproduces about 50% of the time on my > test vm. > > Thoughts, reviews, flames appreciated. > > Brian > > fs/xfs/xfs_aops.c | 9 +++++++++ > fs/xfs/xfs_error.c | 3 +++ > fs/xfs/xfs_error.h | 4 +++- > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c > index a3eeaba..802e030 100644 > --- a/fs/xfs/xfs_aops.c > +++ b/fs/xfs/xfs_aops.c > @@ -967,6 +967,15 @@ xfs_writepage_map( > goto out; > wpc->imap_valid = xfs_imap_valid(inode, &wpc->imap, > offset); > + /* > + * The writepage delay injection tag is for userspace > + * imap validiation testing purposes. The delay allows > + * userspace some time to try and invalidate wpc->imap > + * immediately after it is cached. > + */ > + if (XFS_TEST_ERROR(false, XFS_I(inode)->i_mount, > + XFS_ERRTAG_WRITEPAGE_DELAY)) > + ssleep(5); > } > if (wpc->imap_valid) { > lock_buffer(bh); > diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c > index eaf86f5..36072e6 100644 > --- a/fs/xfs/xfs_error.c > +++ b/fs/xfs/xfs_error.c > @@ -58,6 +58,7 @@ static unsigned int xfs_errortag_random_default[] = { > XFS_RANDOM_DROP_WRITES, > XFS_RANDOM_LOG_BAD_CRC, > XFS_RANDOM_LOG_ITEM_PIN, > + XFS_RANDOM_WRITEPAGE_DELAY, > }; > > struct xfs_errortag_attr { > @@ -163,6 +164,7 @@ XFS_ERRORTAG_ATTR_RW(ag_resv_critical, XFS_ERRTAG_AG_RESV_CRITICAL); > XFS_ERRORTAG_ATTR_RW(drop_writes, XFS_ERRTAG_DROP_WRITES); > XFS_ERRORTAG_ATTR_RW(log_bad_crc, XFS_ERRTAG_LOG_BAD_CRC); > XFS_ERRORTAG_ATTR_RW(log_item_pin, XFS_ERRTAG_LOG_ITEM_PIN); > +XFS_ERRORTAG_ATTR_RW(writepage_delay, XFS_ERRTAG_WRITEPAGE_DELAY); > > static struct attribute *xfs_errortag_attrs[] = { > XFS_ERRORTAG_ATTR_LIST(noerror), > @@ -196,6 +198,7 @@ static struct attribute *xfs_errortag_attrs[] = { > XFS_ERRORTAG_ATTR_LIST(drop_writes), > XFS_ERRORTAG_ATTR_LIST(log_bad_crc), > XFS_ERRORTAG_ATTR_LIST(log_item_pin), > + XFS_ERRORTAG_ATTR_LIST(writepage_delay), > NULL, > }; > > diff --git a/fs/xfs/xfs_error.h b/fs/xfs/xfs_error.h > index 7c4bef3..17556b3 100644 > --- a/fs/xfs/xfs_error.h > +++ b/fs/xfs/xfs_error.h > @@ -107,7 +107,8 @@ extern void xfs_verifier_error(struct xfs_buf *bp); > #define XFS_ERRTAG_DROP_WRITES 28 > #define XFS_ERRTAG_LOG_BAD_CRC 29 > #define XFS_ERRTAG_LOG_ITEM_PIN 30 > -#define XFS_ERRTAG_MAX 31 > +#define XFS_ERRTAG_WRITEPAGE_DELAY 31 > +#define XFS_ERRTAG_MAX 32 > > /* > * Random factors for above tags, 1 means always, 2 means 1/2 time, etc. > @@ -143,6 +144,7 @@ extern void xfs_verifier_error(struct xfs_buf *bp); > #define XFS_RANDOM_DROP_WRITES 1 > #define XFS_RANDOM_LOG_BAD_CRC 1 > #define XFS_RANDOM_LOG_ITEM_PIN 1 > +#define XFS_RANDOM_WRITEPAGE_DELAY 1 > > #ifdef DEBUG > extern int xfs_errortag_init(struct xfs_mount *mp); > -- > 2.9.5 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html