Again, thanks for sending these results to the list. It is unfortunate that nobody on the list was able to look through the results until after your paper submission. On Nov 04, 2008 12:14 -0600, Cindy Rubio Gonz�lez wrote: > OVERWRITING ERRORS > ================== > fs/ext4/extents.c:2078: overwriting potential non-tentative unchecked error in "err" > fs/ext4/extents.c:2082: overwriting potential non-tentative unchecked error in "err" > fs/ext4/extents.c:2087: overwriting potential non-tentative unchecked error in "err" I think these are all bugs in the analysis tool as previously mentioned. That said, it appears that the error could be overwritten at line 2118 in the same function because the "while (i >= 0 && err == 0)" loop exits with "err != 0" and then the check of "if (path->p_hdr->eh_entries == 0)" is true "err" is re-used. > OUT OF SCOPE ERRORS: > =================== > fs/ext4/extents.c:2864: potential non-tentative unchecked error in out of scope variable "err" > > UNSAVED ERRORS: > ============== > ext4_block_truncate_page > fs/ext4/extents.c:2819: potential non-tentative unchecked error is not saved Unfortunately, the "->truncate" VFS method does not allow an error to be returned to the caller. I suspect what we need to do here is check the error and call ext4_warning() in this case, and mark the filesystem in error. I don't think calling ext4_error() is the best course of action here, because at worst this will result in some blocks being orphaned until e2fsck is run. I don't think it is reasonable to force the filesystem read-only or panic the node in such a situation, as it doesn't affect the correctness of the data or metadata. > mpage_da_submit_io > fs/ext4/inode.c:1903: potential non-tentative unchecked error is not saved > fs/ext4/inode.c:1946: potential non-tentative unchecked error is not saved > fs/ext4/inode.c:2050: potential non-tentative unchecked error is not saved Aneesh, could you have a look at these? I suspect they are possibly data-losing bugs. > filemap_write_and_wait > fs/ext4/inode.c:2562: potential non-tentative unchecked error is not saved The error should just be returned to the caller. > ext4_get_block > {FP} fs/ext4/inode.c:3084: potential non-tentative unchecked error is not saved The error should be returned to the caller. > ext4_journal_test_restart > fs/ext4/inode.c:3246: potential non-tentative unchecked error is not saved > fs/ext4/inode.c:3456: potential non-tentative unchecked error is not saved Actually, digging into this one, the only possible source of error is from start_this_handle(). There are a number of places where errors could be returned from this function, but I don't know if any of them will actually be "new" failures in the filesystem. 96 ret = -ENOSPC; This should only happen as a result of a code bug, where the caller is asking for more journal credits than can possibly fit into the journal. That isn't to say it is impossible, however. It might be possible to track this via static code analysis by monitoring the maximum number of blocks requested by all of the callers. The minimum number of journal transaction blocks is 256 (1/4 of the minimum journal size of JBD2_MIN_JOURNAL_BLOCKS = 1024). 102 new_transaction = kzalloc(sizeof(*new_transaction), 103 GFP_NOFS|__GFP_NOFAIL); 104 if (!new_transaction) { 105 ret = -ENOMEM; This one can't ever be hit, because kzalloc() is called with __GFP_NOFAIL, so the kernel should just spin here forever until memory is available. I suspect that "if (!new_transaction)" check was added because of previous static analysis showing that the result of the allocation wasn't checked. Your analysis needs to be fixed to not tag allocations with __GFP_NOFAIL as TENTATIVE_ENOMEM. > ext4_mb_try_best_found > {FP} fs/ext4/mballoc.c:1843: potential non-tentative unchecked error is not saved Does {FP} mean "false positive"? It does appear that this is the case. I don't necessarily like the current code construction, however. If the allocation status is returned in ->ac_status then the function should just return void. > ext4_mb_init_per_dev_proc > fs/ext4/mballoc.c:2560: potential non-tentative unchecked error is not saved This is a non-fatal error, and we should just make it a void function. What it does mean is that all of the code should check if s_mb_proc != NULL before accessing it. > ext4_mb_destroy_per_dev_proc > fs/ext4/mballoc.c:2653: potential non-tentative unchecked error is not saved This shouldn't return an error either, because it is legal for s_mb_proc to be NULL if we don't abort the mount if ext4_mb_init_per_dev_proc() fails at mount time. This should be a void function, and just return here. > ext4_mb_free_metadata > fs/ext4/mballoc.c:4697: potential non-tentative unchecked error is not saved Ted re-wrote this code recently, so it isn't worth investigating. It might have been a data corruptor though... > extend_credit_for_blkde > fs/ext4/migrate.c:263: potential non-tentative unchecked error is not saved > fs/ext4/migrate.c:298: potential non-tentative unchecked error is not saved > fs/ext4/migrate.c:269: potential non-tentative unchecked error is not saved > fs/ext4/migrate.c:309: potential non-tentative unchecked error is not saved > fs/ext4/migrate.c:421: potential non-tentative unchecked error is not saved > ext4_journal_restart > fs/ext4/migrate.c:598: potential non-tentative unchecked error is not saved I think this code is in flux, and I don't know many details of it, so I won't review these cases. > vfs_quota_off > fs/ext4/super.c:1746: potential non-tentative unchecked error is not saved Probably Jan Kara can investigate this one further. > ext4_mb_init > fs/ext4/super.c:2474: potential non-tentative unchecked error is not saved In this case, if the function has an error it clears the MBALLOC option, and the failure is harmless because that functionality is not used. That said, in newer kernels the MBALLOC functionality is required and this error needs to be checked and mounting failed if there is an error. > set_blocksize > fs/ext4/super.c:2608: potential non-tentative unchecked error is not saved This is a false positive, in the sense that bdev_hardsect_size(bdev) was checked a few lines earlier to be at most the filesystem blocksize, so the only failure condition in set_blocksize() cannot be hit. That said, the extra bdev_hardsect_size() check in ext4_fill_super() could be removed and we just check the return value from set_blocksize() directly. That removes some logic that ext4 doesn't really need to know about. > sync_dirty_buffer > fs/ext4/super.c:2808: potential non-tentative unchecked error is not saved Most of the callers of ext4_commit_super() are void, so there isn't really any place to report the error to. Since this is an error trying to write out the superblock, I suspect that we should return the error. In some places we can call ext4_error(), but not in ext4_handle_error() or a few others (which would cause recursion). In others like ext4_create_journal() we can just return the error and fail the mount. > jbd2_journal_clear_err > fs/ext4/super.c:2868: potential non-tentative unchecked error is not saved I don't know why ext4_clear_journal_err() is void. The error should just be returned to the caller and the mount failed. > jbd2_log_wait_commit > fs/ext4/super.c:2913: potential non-tentative unchecked error is not saved > > sync_blockdev > fs/ext4/super.c:544: potential non-tentative unchecked error is not saved Not sure what can be done here. The filesystem is just finished being unmounted, so there isn't anywhere that an error can be returned, and the filesystem can't even be marked in error. I suppose it wouldn't hurt to check the return value and print a warning this case like ext4_warning(sb, "write error at unmount, may be inconsistent\n"); > Additional: > ========== > ext4_mark_inode_dirty > fs/ext4/acl.c:245: potential non-tentative unchecked error is not saved This looks like a simple case where the error should be checked and returned, but I don't know much about the ACL code, so it may need some extra cleanup. > ext4_journal_stop > fs/ext4/acl.c:405: potential non-tentative unchecked error is not saved > fs/ext4/acl.c:512: potential non-tentative unchecked error is not saved > fs/ext4/extents.c:2027: potential non-tentative unchecked error is not saved > fs/ext4/extents.c:2130: potential non-tentative unchecked error is not saved > fs/ext4/extents.c:2863: potential non-tentative unchecked error is not saved > : > : In ext4_journal_stop() it always checks and calls __ext4_std_error() and ext4_handle_error() to handle an error. I suspect there should be error handling in the callers for the case where the filesystem is mounted with errors=continue, but that is no the default behaviour of ext4 anymore. > ext4_forget > ext4_ext_dirty > ext4_mark_inode_dirty > ext4_orphan_del > ext4_journal_get_write_access > ext4_orphan_add > ext4_mark_iloc_dirty Sorry, getting to the end of my flight here, and can't continue to look at the rest of these error cases. I think in most of the cases the checker has produced valid errors. If the {FP} cases where found automatically, then they probably shouldn't be listed at all. I didn't look at any but the first one. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html