On Tue, Mar 04, 2014 at 03:32:25PM +1100, Dave Chinner wrote: > On Mon, Mar 03, 2014 at 08:20:57PM -0500, Brian Foster wrote: > > On Tue, Mar 04, 2014 at 09:29:46AM +1100, Dave Chinner wrote: > > > On Mon, Mar 03, 2014 at 12:44:26PM -0500, Brian Foster wrote: > > > > On Mon, Mar 03, 2014 at 04:39:53PM +1100, Dave Chinner wrote: > > > > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > > > > > While the verifier reoutines may return EFSBADCRC when a buffer ahs > > > > > a bad CRC, we need to translate that to EFSCORRUPTED so that the > > > > > higher layers treat the error appropriately and so we return a > > > > > consistent error to userspace. This fixes a xfs/005 regression. > > > > > > > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > --- > > > > > > > > This change looks Ok to me, but when I start looking through the users > > > > of bp->b_error, I see examples like xfs_dir3_data_read() being called in > > > > xfs_dir2_leaf_addname() where it looks like an error could bubble all > > > > the way up to xfs_vn_mknod() and its callers. > > > > > > Sure, but: > > > > > > xfs_dir3_data_read > > > xfs_da_read_buf > > > xfs_trans_read_buf_map > > > > > > Which means the patch prevents the EFSBADCRC leaking back out > > > through that path because it converts it in xfs_trans_read_buf_map. > > > > > > > Ok, I see. FWIW, I was also trolling through some of the log recovery > > code and it appears that code uses b_ops for write verification purposes > > only (i.e., crc generation I suspect), correct? > > Yes. > > > > > If the intent is to use EFSBADCRC as an internal-only error to > > > > differentiate corruption from crc failure, why not push this more > > > > closely to the boundaries that we have already defined? For example, we > > > > already convert positive errnos to negative at the internal/external > > > > boundaries. Could we convert those to use some kind of > > > > XFS_USERSPACE_ERROR(error) macro/helper that converts errors > > > > appropriately? > > > > > > That doesn't solve the problem needing an error conversion layer in > > > the first place. The long term goal is to remove the error > > > conversions in XFS by converting the core code to the same error > > > passing conventions as the rest of the kernel code. We manage to > > > screw the negation up fairly regularly because it is convoluted and > > > we cal into generic code that returns negative errors from the core > > > that returns positive errors in lots of places. The conversion > > > surface is just too large to manage sanely. > > > > > > > Well it wasn't clear that the error conversion layer was a problem that > > itself needed solving, but fair enough. ;) If the objective is to move > > away from the positive/negative business to something more "kernel > > native," then building more infrastructure around that is probably not > > the way to go, unless that was actually part of the changeover strategy > > (e.g., if we happened to use something that conditionally negated error > > values to support incremental codepath conversions... as a semi-random > > thought). > > I'm not sure there's any real benefit to adding infrastructure when > doing the conversion. It's a lot of little changes involving pushing > the conversion down from the top layers, combined with converting > interfaces (e.g. bmapi) as they get exposed. > > Some functions will be able to lose variables as the return becomes > a tri-state value (e.g. the btree functions for looking up records); > others will be able to use IS_ERR() for pointer returns, and so on. > This will have a flow on effect on the code in the callers, and so > it really comes down to spending the time to peel back each layer > carefully. > > It's a lot of work, hence the "long term goal" aspect of it. We've > been talking about doing this cleanup for several years now, but it > hasn't yet been done because it's going to take hundreds of patches > to do... ;) > Indeed. I just wasn't aware of that, so I was thinking aloud a bit (along the same lines of pushing the conversion from top-to-bottom). In any event, it's good to have this in mind going forward. > > > > Another thought could be to reconsider whether we still need some of > > > > these extra warnings, as in the xfs_mount.c hunk below, now that we have > > > > the generic xfs_verifier_error() messaging. E.g., if we could remove > > > > those, perhaps we could snub out EFSBADCRC in or around the verifier > > > > after it makes a distinction. > > > > > > Redundant errors aren't s significant problem. It's the lack of > > > meaningful error messages that are much more of an issue. We get > > > more meaningful error messages as a result of the EFSBADCRC changes > > > that have been made, but for the moment that error simply means > > > EFSCORRUPTED to the higher layers. Hence the translation back to > > > EFSCORRUPTED at the (low) layers where the error no longer has > > > a distinct meaning. > > > > > > > Right, the purpose is not to solve the problem of redundant errors. > > Rather, if the only thing preventing the consolidation of the EFSBADCRC > > trap to a single hunk are redundant errors, then it wouldn't be a > > problem to remove those errors and push the EFSBADCRC trap into the > > generic code for the purpose of clarity. > > I don't see much point in trying to have a single trap for EFSBADCRC > because the moment we want to handle one of those cases specially we > are going to have to - as a first step - remove the single trap and > push it outwards to all the places that need it like this patch > does.... > Agreed. I'm less concerned about it with the understanding that EFSBADCRC is not to be forever isolated to the world of verifiers. I just wanted to make sure my point was clear. ;) Brian > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs