On Mon, May 12, 2014 at 06:42:37PM +0900, Namjae Jeon wrote: > > > > +xfs_bmap_split_extent( > > > + struct xfs_inode *ip, > > > + xfs_fileoff_t split_fsb, > > > + xfs_extnum_t *split_ext) > > > +{ > > > + struct xfs_mount *mp = ip->i_mount; > > > + struct xfs_trans *tp; > > > + struct xfs_bmap_free free_list; > > > + xfs_fsblock_t firstfsb; > > > + int committed; > > > + int error; > > > + > > > + tp = xfs_trans_alloc(mp, XFS_TRANS_DIOSTRAT); > > > + error = xfs_trans_reserve(tp, &M_RES(mp)->tr_write, > > > + XFS_DIOSTRAT_SPACE_RES(mp, 0), 0); > > > + > > > + if (error) { > > > + /* > > > + * Free the transaction structure. > > > + */ > > > + ASSERT(XFS_FORCED_SHUTDOWN(mp)); > > > Hi, Brian. > > As in the other patch, we're attempting to reserve fs blocks for the > > transaction, so ENOSPC is a possibility that I think the assert should > > accommodate. > How about removing the ASSERT completely as suggessted by Dave > in other thread? > Yeah, that works too. If Dave prefers to just remove these asserts that's fine with me. I just wanted to make sure we aren't adding spurious asserts for valid failures. > > ... > > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > > > index 97855c5..392b029 100644 > > > --- a/fs/xfs/xfs_file.c > > > +++ b/fs/xfs/xfs_file.c > > > @@ -760,7 +760,8 @@ xfs_file_fallocate( > > > if (!S_ISREG(inode->i_mode)) > > > return -EINVAL; > > > if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | > > > - FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE)) > > > + FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE | > > > + FALLOC_FL_INSERT_RANGE)) > > > return -EOPNOTSUPP; > > > > > > xfs_ilock(ip, XFS_IOLOCK_EXCL); > > > @@ -790,6 +791,40 @@ xfs_file_fallocate( > > > error = xfs_collapse_file_space(ip, offset, len); > > > if (error) > > > goto out_unlock; > > > + } else if (mode & FALLOC_FL_INSERT_RANGE) { > > > + unsigned blksize_mask = (1 << inode->i_blkbits) - 1; > > > + struct iattr iattr; > > > + > > > + if (offset & blksize_mask || len & blksize_mask) { > > > + error = -EINVAL; > > > + goto out_unlock; > > > + } > > > + > > > + /* Check for wrap through zero */ > > > + if (inode->i_size + len > inode->i_sb->s_maxbytes) { > > > + error = -EFBIG; > > > + goto out_unlock; > > > + } > > > + > > > + /* Offset should be less than i_size */ > > > + if (offset >= i_size_read(inode)) { > > > + error = -EINVAL; > > > + goto out_unlock; > > > + } > > > + > > > + /* > > > + * The first thing we do is to expand file to > > > + * avoid data loss if there is error while shifting > > > + */ > > > + iattr.ia_valid = ATTR_SIZE; > > > + iattr.ia_size = i_size_read(inode) + len; > > > + error = xfs_setattr_size(ip, &iattr); > > > + if (error) > > > + goto out_unlock; > > > > I don't necessarily know that it's problematic to do the setattr before > > the bmap fixup. We'll have a chance for partial completion of this > > operation either way. But I'm not a fan of the code duplication here. > > This also still skips the time update in the event of insert space > > failure, though perhaps that's not such a big deal if we're returning an > > error. > > > > I think it would be better to leave things organized as before and > > introduce an error2 variable and a &nrshifts or some such parameter to > > xfs_insert_file_space() that initializes to 0 and returns the number of > > record shifts. The caller can then decide whether it's appropriate to > > break out immediately or do the inode size update and return the error. > > > > Perhaps not the cleanest thing in the world, but also not the first > > place we would use 'error2' to manage error priorities (grep around for > > it)... > Yes, Right. I also thought such sequence at first. But we should consider > sudden power off and unplug device case during shifting extent. > While we are in the middle of shifitng extents and if there is sudden > power failure user can still think that data is lost as we won't get any > chance to update the file size in these cases. > Updating file size before the shifitng operation can start will prevent this. > > Thanks. Hmm, fair point. That seems less critical to me than the general error sequence, but if we want to handle that case I think we could still fix the duplication in xfs_file_fallocate(). We could possibly factor out the common bits (update time and set size) into a helper, or what seems a bit cleaner on first thought, move the bulk of the (mode & FALLOC_FL_INSERT_RANGE) block to after the common part. Then the function looks something like this: ... xfs_ilock(); /* pre-inode fixup ops */ if (mode & ...) { ... } else if (mode & FALLOC_FL_INSERT_RANGE) { /* comment as to what's going on here :) */ /* error checks */ new_size = ...; do_file_insert = 1; } ... xfs_trans_ichgtime(); xfs_setattr_size(); ... /* * Some operations are performed after the inode size is updated. For * example, insert range expands the address space of the file, shifts * all subsequent extents over and allocates space into the hole. * Updating the size first ensures that shifted extents aren't left * hanging past EOF in the event of a crash or failure. */ if (do_file_insert) { /* alloc space */ ... } ... That seems a bit cleaner to me, but I'm not wedded to it. Thoughts? It might be worth soliciting some other thoughts/ideas before reworking it. Thanks. Brian > > > > Brian > > > > > + > > > + error = xfs_insert_file_space(ip, offset, len); > > > + if (error) > > > + goto out_unlock; > > > } else { > > > if (!(mode & FALLOC_FL_KEEP_SIZE) && > > > offset + len > i_size_read(inode)) { > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html