On Wed, Sep 19, 2012 at 05:10:57PM -0300, Carlos Maiolino wrote:
> Ted,
>
> In case of ext4_add_entry() I'm supposing to make the function call ext4_error()
> and return -EIO in the case where ext4_bread() returns NULL and err is 0'ed,
> does that match with your thoughts or is there a better way to handle
> this?

Yeah, that's about it. The real issue is that EIO isn't really the
right errno code, but we don't have a better one. I've considered
proposing a new EFSCORRUPT errno value, just so that the error message
that eventually gets displayed back to the user via an application
error message is more obvious, but I haven't had the energy to see if
we can get it past the fsdevel shed-painting process.

> I'm talking about ext4_add_entry() behaviour mainly as an example to
> better understand how we should handle these situations. In case of
> ext4_add_entry(), based on our discussions ext4_bread() should not
> fail once dir entries should not have HOLES, so, a NULL return
> should indicate an on-disk corruption or an I/O error.

Well, a NULL return with err=0 indicates that there is a hole in that
particular inode. Whether or not a hole is an actual file system
corruption is up to the caller of ext4_bread() to determine. In the
case of directories, holes are an example of file system corruption,
so for the code in fs/ext4/namei.c, the appropriate thing to do would
be to call ext4_error() and then return an error to abort the
operation.

There is an argument to be made that if the file system is mounted
with errors=continue, in this case we could just try to ignore the
hole, and try to recover as best we can. But that's going to be quite
tricky to implement, and I'm not at all convinced it's worth the extra
complexity. I could imagine someone who had a very high requirement
for "zero downtime" who might argue for this, but then we would need
to make sure the fallback code really was bulletproof.
If we pursued this option, then the EIO or EFSCORRUPT error code would
only be returned in the errors=remount-ro case.

(This is the sort of thing customers pay $$$ for with z/OS on IBM's
mainframes. Whether or not this is really needed for ext4 and Linux,
even for an enterprise product, is a different sort of question. I
wouldn't object to someone who tried to make the errors=continue
behaviour recover in a clean and safe way, but the code would have to
be very clearly documented, tested, and carefully reviewed.)

						- Ted
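
P.S. To make the convention concrete, here is a rough userspace sketch
of the caller-side handling described above. It is not the kernel code:
mock_dir, mock_bread(), mock_fs_error() and dir_read_block() are all
hypothetical stand-ins invented for illustration, and the only things
taken from the discussion are the "NULL with err=0 means a hole" rule
and the "directory callers report the hole and return -EIO" policy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define NBLOCKS 4

/* Hypothetical stand-in for a directory inode: a NULL slot is a hole,
 * and io_fail simulates a failed read from the device. */
struct mock_dir {
	const char *blocks[NBLOCKS];
	int io_fail;
};

/* Mimics the convention discussed above, NOT the real ext4_bread():
 *   non-NULL         -> block was read
 *   NULL, *err <  0  -> I/O (or other) error
 *   NULL, *err == 0  -> hole; the caller decides what that means */
static const char *mock_bread(const struct mock_dir *dir, size_t blk, int *err)
{
	assert(dir != NULL && err != NULL);
	*err = 0;
	if (dir->io_fail) {
		*err = -EIO;
		return NULL;
	}
	if (blk >= NBLOCKS)
		return NULL;	/* nothing mapped here: report a hole */
	return dir->blocks[blk];
}

/* Stand-in for ext4_error(): here it only logs the problem. */
static void mock_fs_error(const char *msg, size_t blk)
{
	fprintf(stderr, "mock fs error: %s (block %zu)\n", msg, blk);
}

/* Directory caller: a hole counts as corruption, so report it and
 * return -EIO to abort the operation. Real errors pass through. */
static int dir_read_block(const struct mock_dir *dir, size_t blk,
			  const char **out)
{
	int err;
	const char *bh = mock_bread(dir, blk, &err);

	*out = NULL;
	if (bh == NULL) {
		if (err == 0) {
			mock_fs_error("directory contains a hole", blk);
			err = -EIO;
		}
		return err;
	}
	*out = bh;
	return 0;
}
```

A caller for regular file data could use the same mock_bread() and
treat the err=0 case as "read zeros" instead, which is the point of
leaving the decision to the caller.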