Re: [v2] ext4: fix possible non-initialized variable

Eric Sandeen <sandeen@xxxxxxxxxx> · Wed, 19 Sep 2012 21:59:04 -0500

On 9/19/12 3:41 PM, Theodore Ts'o wrote:
> On Wed, Sep 19, 2012 at 05:10:57PM -0300, Carlos Maiolino wrote:
>> Ted,
>>
>> In case of ext4_add_entry() I'm supposing to make the function call ext4_error()
>> and return -EIO in the case where ext4_bread() returns NULL and err is 0'ed,
>> does that matches with your thoughts or is there a better way to handle with
>> this?
> 
> Yeah, that's about it.  The real issue is EIO isn't really the right
> errno code, but we don't have a better one.  I've considered trying to
> propose adding a new EFSCORRUPT errno value, just so that the error
> message that eventually gets displayed back to the user via an
> application error message is more obvious, but I haven't had the
> energy to see if we can get it past the fsdevel shed-painting
> process.

FWIW, xfs used to have a custom "EFSCORRUPTED" errno, but we ditched in
favor of a standard but seldom-used "EUCLEAN" (structure needs cleaning).
Doesn't seem like too bad a choice.

-Eric

>> I'm talking about ext4_add_entry() behavious mainly as an example to
>> better understand how we should handle these situations. In case of
>> ext4_add_entry(), based on our discussions ext4_bread() should not
>> fail once dir entries should not have HOLES, so, a NULL return
>> should indicate a on-disk corruption or an I/O error.
> 
> Well, a NULL return indicates that there is a hole in that particular
> inode.  If err=0, then it's a hole.  Whether or not a hole is an
> actual file system corruption is up to the caller of ext4_bread() to
> determine.  In the case of directories, holes are an example file
> system corruption, so for the code in fs/ext4/namei.c, the appropriate
> thing to do would be to call ext4_error() and then return an error to
> abort the operation.
> 
> There is an argument to be made that if the file system is mounted
> with errors=continue, in this case we could just try to ignore the
> hole, and try to recover as best we can.  But that's going to be quite
> tricky to implement, and I'm not at all convinced it's worth the extra
> complexity to implement.  I could imagine someone who had a very high
> requirement for "zero downtime" who might argue for this, but then we
> would need to make sure the fallback code really was bulletproof.  If
> we pursued this option, then the EIO or EFSCORRUPT error code would
> only be returned in the errors=remount-ro case.
> 
> (This is the sort of thing which is what customers pay $$$ for IBM's
> mainframe in zOS.  Whether or not this is really needed for ext4 and
> Linux, even for an enterprise product, is a different sort of
> question.  I wouldn't object to someone who tried to make the
> errors=continue behaviour to recover in a clean and safe way, but the
> code would have to be very clearly documented, tested, and carefully
> reviewed.)
> 
> 						- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html