On Tue, May 22, 2018 at 09:59:07AM +0300, Artem Blagodarenko wrote: > From: Vitaly Fertman <vitaly_fertman@xxxxxxxxxxx> > > make sleep between reads twice in ext4_multi_mount_protect twice > longer than between write and read, make the later one equal to the > system check_interval This commit message could be improved a bit, something like: Make the sleep between initial reads in ext4_mmp_start() equal to twice the maximum of the superblock and MMP block interval, to ensure that the initial check waits long enough to detect if the previous owner is still updating the MMP block. If no activity is detected on the superblock, the sleep between the MMP write and subsequent re-read is reduced to a single interval to ensure that the final write completes at the stated interval so that another node doesn't consider the filesystem idle. > Xyratex-Bug-Id: MRP-390 > > Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xxxxxxxxxxx> > --- > lib/ext2fs/mmp.c | 19 +++++++++++-------- > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/lib/ext2fs/mmp.c b/lib/ext2fs/mmp.c > index 9a771de7..cb968adf 100644 > --- a/lib/ext2fs/mmp.c > +++ b/lib/ext2fs/mmp.c > @@ -296,6 +296,13 @@ errcode_t ext2fs_mmp_start(ext2_filsys fs) > if (mmp_check_interval < EXT4_MMP_MIN_CHECK_INTERVAL) > mmp_check_interval = EXT4_MMP_MIN_CHECK_INTERVAL; > > + /* > + * If check_interval in MMP block is larger, use that instead of > + * check_interval from the superblock. > + */ > + if (mmp_s->mmp_check_interval > mmp_check_interval) > + mmp_check_interval = mmp_s->mmp_check_interval; Moving this check up only really affects the "EXT4_MMP_SEQ_CLEAN" case here, as all other code paths result in an error or they would have hit the same code below. In the "clean_seq" case, do we really want to wait the full MMP interval, if the writing node was busy just before it unmounted the filesystem? I don't think this is bad, just potentially slower than strictly necessary. > seq = mmp_s->mmp_seq; > if (seq == EXT4_MMP_SEQ_CLEAN) > goto clean_seq; > @@ -309,13 +316,6 @@ errcode_t ext2fs_mmp_start(ext2_filsys fs) > goto mmp_error; > } > > - /* > - * If check_interval in MMP block is larger, use that instead of > - * check_interval from the superblock. > - */ > - if (mmp_s->mmp_check_interval > mmp_check_interval) > - mmp_check_interval = mmp_s->mmp_check_interval; > - > sleep(2 * mmp_check_interval + 1); > > retval = ext2fs_mmp_read(fs, fs->super->s_mmp_block, fs->mmp_buf); > @@ -344,7 +344,10 @@ clean_seq: > if (retval) > goto mmp_error; > > - sleep(2 * mmp_check_interval + 1); > + /* This sleep between write & read must be shorter than the previous > + * sleep between 2 reads, so that the check above of a racing thread > + * would never succeed between this write & read. */ > + sleep(mmp_check_interval + 1); I think this change makes sense, as we want to update the MMP block a second time after mmp_s->mmp_check_interval in order for other nodes to detect that libext2fs is in control of the filesystem. That said, I wonder if it also makes sense to add: if (mmp_s->mmp_check_interval < mmp_check_interval) mmp_s->mmp_check_interval = mmp_check_interval; before the first ext2fs_mmp_write() to properly reflect the longer sleep interval? That would only happen if the superblock s_mmp_update_interval is longer than mmp_s->mmp_check_interval, which shouldn't really be possible, but is a bit safer. In summary, while I think there might be some small improvements possible, I think the current patch is OK, and could add my: Signed-off-by: Andreas Dilger <adilger@xxxxxxxxx> Cheers, Andreas
Attachment:
signature.asc
Description: Message signed with OpenPGP