Sorry, I do not agree. We have a bug, so we cannot ignore it. Fixing it
now can save a lot of time later if the same problem causes a side effect
that is much harder to catch.

Now let's consider the current problem:

1. It is in libxfs in xfsprogs, so it is no longer only a mkfs issue.

2. If we come across a critical problem in libxfs, we can cross-check the
   kernel XFS implementation to see whether it has the same logical issue.
   What we learn in one place can be used in another.

3. Yes, I agree that if mkfs.xfs fails we have to re-run it anyway, but
   then what is the difference between novice code and a professional
   product? If you cscope libxfs_trans_read_buf() in xfsprogs, its callers
   always check the return value, and it is used extensively in xfsprogs.
   But this function always returns 0. In fact there is no error handling
   at all, so let's not consider only the EIO error.

4. We are here in an open community out of need, and at the same time to
   make it better.

I was wondering why I was not getting any reply. I think the mail subject
was wrong ... mkfs ;)

I will release the patch; please take the time to review it.

On Thu, Feb 3, 2011 at 1:10 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
> On 2/1/11 5:06 AM, Ajeet Yadav wrote:
>> We are testing mkfs.xfs and xfs_repair stability to look for crashes
>> and other issues specially with removable devices.
>> And unfortunately crashes does occur.
>> Code inspection shows in most cases the caller does not handle
>> libxfs_readbuf() for error cases i.e when return value = NULL.
>>
>> Now I need your suggestion.
>> We should fix all such cases or the simplest way is to exit... if
>> read() or write() fails with EIO errorno in libxfs_readbufr() and
>> libxfs_writebufr().
>
> I see very little reason to gracefully handle all error cases
> during mkfs. It would be prettier, yes, but if mkfs fails, with
> or without an error, with or without a segfault, you have to
> just start it over anyway, right?
>
> I think there are better places to focus effort.
>
> -Eric
>
>> Fortunately these function already support exit, if we use flag
>> LIBXFS_EXIT_ON_FAILURE, LIBXFS_B_EXIT but they are used selectively.
>>
>> The current problem is related to function libxfs_trans_read_buf()
>>
>>         bp = libxfs_readbuf(dev, blkno, len, flags);
>> #ifdef XACT_DEBUG
>>         fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp);
>> #endif
>>         xfs_buf_item_init(bp, tp->t_mountp);
>>         bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *);
>>         bip->bli_recur = 0;
>>         xfs_trans_add_item(tp, (xfs_log_item_t *)bip);
>>
>>         /* initialise b_fsprivate2 so we can find it incore */
>>         XFS_BUF_SET_FSPRIVATE2(bp, tp);
>>         *bpp = bp;
>>         return 0;
>>
>> if libxfs_readbuf() fails due to device removal or other error, bp = NULL.
>> In function xfs_buf_item_init(bp, tp->t_mountp) as soon as bp is
>> dereferenced occurs
>>
>> mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017
>>
>> _______________________________________________
>> xfs mailing list
>> xfs@xxxxxxxxxxx
>> http://oss.sgi.com/mailman/listinfo/xfs
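For illustration only, here is a minimal sketch of the kind of NULL check
discussed in this thread. It is not the patch mentioned in this mail and it
does not use the real xfsprogs headers: demo_buf, demo_trans,
demo_readbuf(), demo_trans_read_buf() and demo_device_present are
hypothetical stand-ins for the libxfs types and for libxfs_readbuf() and
libxfs_trans_read_buf(). Returning a positive EIO is also an assumption
made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types live in xfsprogs. */
struct demo_buf   { int blkno; };
struct demo_trans { int unused; };

/* Simulates whether the (removable) device is still present. */
int demo_device_present = 1;

static struct demo_buf demo_storage;

/* Stand-in for libxfs_readbuf(): returns NULL when the read fails. */
struct demo_buf *demo_readbuf(int blkno)
{
        if (!demo_device_present)
                return NULL;
        demo_storage.blkno = blkno;
        return &demo_storage;
}

/*
 * Stand-in for libxfs_trans_read_buf(). The only point shown here is
 * that bp is checked before anything dereferences it, and that the
 * failure is reported to the caller instead of always returning 0.
 */
int demo_trans_read_buf(struct demo_trans *tp, int blkno,
                        struct demo_buf **bpp)
{
        struct demo_buf *bp;

        assert(bpp != NULL);
        *bpp = NULL;

        bp = demo_readbuf(blkno);
        if (bp == NULL)
                return EIO;     /* assumption: positive errno on failure */

        /* The real function would attach bp to the transaction here. */
        (void)tp;

        *bpp = bp;
        return 0;
}
```

With a check like this, callers that already test the return value see a
non-zero result when the device goes away, instead of a page fault inside
the function.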