On 03/21/2013 07:06 PM, Peng Tao wrote:
> On Thu, Mar 21, 2013 at 10:13:00AM -0400, Trond Myklebust wrote:
> <snip>
>> @@ -1458,7 +1479,6 @@ static void pnfs_ld_handle_write_error(struct nfs_write_data *data)
>>      dprintk("pnfs write error = %d\n", hdr->pnfs_error);
>>      if (NFS_SERVER(hdr->inode)->pnfs_curr_ld->flags &
>>          PNFS_LAYOUTRET_ON_ERROR) {
>> -        clear_bit(NFS_INO_LAYOUTCOMMIT, &NFS_I(hdr->inode)->flags);
> Hi Trond and Boaz,
>
> If object layout requires layout being committed before returned (as fixed in
> the 3/3 patch), is it a potential problem to directly return layout here as
> well? e.g., if one lseg is successfully written and pending layoutcommit,
> then another lseg of the same file failed read/write, then layout will be
> returned w/o layoutcommit. For blocklayout, it is a potential data corruption
> and that's why block layout doesn't set PNFS_LAYOUTRET_ON_ERROR bit. So
> I'm wondering if object will suffer from the same issue?
>

Hi Tao

No, not at all. The objects layout has errors reported as part of the
layout_return OPT, with the exact devices that failed and why. In fact the
data should not be "committed" per se; instead a recovery process must be
performed, because we know that not all data of a stripe was committed,
including parity, and the raid5 checksum is surely wrong.

This is why there is an error bit in the layout_commit OPT: it denotes that
this is not a true commit and that there is an Error report on the way, for
those clients that must always lo_commit before lo_return even on Error.

(I know that 4.2 has plans for error-report RETURNs for other layout types
as well; this is part of why.)

> Thanks,
> Tao
>

Cheers
Boaz
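
For illustration only, the ordering described in the reply can be sketched as a small decision function. This is not kernel code: every identifier below (the `sketch_` names, the struct fields, the enum values) is hypothetical and appears in neither the Linux pNFS client nor the protocol. It only models the rule that, on an I/O error, an objects-layout client returns the layout with an error report, and a client that must always commit before returning first sends a commit carrying the error bit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation labels; not NFSv4.1 or kernel identifiers. */
enum sketch_op {
    SKETCH_OP_COMMIT,                   /* ordinary layoutcommit */
    SKETCH_OP_COMMIT_WITH_ERROR_FLAG,   /* commit marked "not a true commit" */
    SKETCH_OP_RETURN,                   /* ordinary layoutreturn */
    SKETCH_OP_RETURN_WITH_ERROR_REPORT  /* return carrying failed devices and reasons */
};

/* Hypothetical, simplified view of one file's layout state. */
struct sketch_layout_state {
    bool commit_pending;            /* some lseg was written, commit not yet sent */
    bool io_error;                  /* a read/write to some device failed */
    bool must_commit_before_return; /* client policy: always commit first */
};

/*
 * Fill out[] with the operations to send, in order, when the layout is
 * about to be returned. Returns how many entries were written (1 or 2).
 */
size_t sketch_plan_layoutreturn(const struct sketch_layout_state *st,
                                enum sketch_op out[2])
{
    size_t n = 0;

    if (st->io_error) {
        /* The stripe may be inconsistent, so no true commit is sent.
         * Clients that must commit first flag the commit as erroneous. */
        if (st->commit_pending && st->must_commit_before_return)
            out[n++] = SKETCH_OP_COMMIT_WITH_ERROR_FLAG;
        out[n++] = SKETCH_OP_RETURN_WITH_ERROR_REPORT;
        return n;
    }

    /* No error: commit what is pending, then return normally. */
    if (st->commit_pending)
        out[n++] = SKETCH_OP_COMMIT;
    out[n++] = SKETCH_OP_RETURN;
    return n;
}
```

With `commit_pending`, `io_error` and `must_commit_before_return` all set, the sketch plans a flagged commit followed by a return with the error report; with `must_commit_before_return` cleared it plans only the return with the error report.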