On 2010-03-19, at 08:17, jing zhang wrote:
ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
@@ -3811,6 +3813,12 @@ repeat:
 		list_del(&pa->u.pa_tmp_list);
 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 	}
+	if (! list_empty(&list)) {
+		if (occurs++ < 2)
+			goto best_efforts;
+		else
+			BUG();
+	}
 	if (ac)
 		kmem_cache_free(ext4_ac_cachep, ac);
 }
Hmm, I'm not sure that BUG() is appropriate here. If there is an
I/O error reading the block bitmap, #1, retrying isn't going to help,
and #2, bringing down the entire system just because of an I/O error
in reading the block bitmap doesn't seem right.
But disk hardware errors are not rare,
Exactly, which is the reason why it should not cause the system to
hang. The filesystem should handle such errors gracefully where
possible: return an error to the application and/or mark the
filesystem as having errors so that it is checked on next boot, or similar.
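To make the suggestion above concrete, here is a minimal user-space sketch of the pattern, not a patch against mballoc.c. Every name in it (struct fs_state, struct prealloc, release_one(), discard_all()) is hypothetical and invented for illustration. It assumes a failed bitmap read is persistent, so it neither retries nor aborts: it flags the filesystem for a later check, skips the entry, and hands the first error back to the caller.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a filesystem's in-memory state. */
struct fs_state {
	bool needs_fsck;	/* set once an error has been recorded */
};

/* Hypothetical stand-in for one preallocation left on the list. */
struct prealloc {
	bool bitmap_readable;	/* false simulates a persistent I/O error */
	bool released;		/* true once the entry has been released */
};

/* Release one entry; fails with -EIO if its bitmap cannot be read. */
static int release_one(struct prealloc *pa)
{
	if (!pa->bitmap_readable)
		return -EIO;
	pa->released = true;
	return 0;
}

/*
 * Release every entry. On an I/O error, do not retry and do not abort:
 * mark the filesystem as needing a check, skip the failed entry, carry
 * on with the rest, and return the first error seen (0 if none).
 */
static int discard_all(struct fs_state *fs, struct prealloc *pas, size_t n)
{
	int first_err = 0;

	for (size_t i = 0; i < n; i++) {
		int err = release_one(&pas[i]);

		if (err) {
			fs->needs_fsck = true;
			if (!first_err)
				first_err = err;
		}
	}
	return first_err;
}
```

A caller would check the return value of discard_all() and pass the error up instead of halting. Whether the real code path can propagate an error this way, and what happens to the entries that get skipped, is exactly the open question in this thread.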
Right now, if there is a problem, we just end up leaving the
preallocated list on the inode. Does that cause problems later on
down the line which you have observed?
- Ted
and is there still a chance to call the
call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
function again later on? (I am not yet sure that the chance exists.)
If there is no chance, what about the kmem_cache subsystem then?
After a reboot, is the file system still reliable, or does it just have
a few lost blocks?
So it is necessary, at least for me, to find out whether the
chance exists.
- zj
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.