On 2/3/11 12:03 AM, Ajeet Yadav wrote: > Sorry I do not agree, we have a bug so we cannot ignore it. > Solving at first place can save a lot of time if same problem create a > side effect that may sometime be very hard to catch. > > Now lets consider the current problem > 1. Its related to libxfs in xfsprogs, so its not mkfs issue anymore > 2. If we come across any critical problem in libxfs we can cross > verify kernel xfs implementation to find if there also a logical > issue. > One learning and be used in other part. > 3. Yes I agree that if mkfs.xfs fails we have to re-run it anyways, > but then what is the difference between a novice code and professional > product. > If you cscope libxfs_trans_read_buf() in xfsprogs, its caller > always checks the return value, and its used extensively in xfsprogs. > But this function always return 0. Infact there is no error handding > at all, lets not consider EIO error only. > 4. We are here in open community out of need, at the same time to make > it better. > > I was wondering why I am not getting any reply, I think mail subject > was wrong......mkfs ;) I may have jumped at that too quickly, yes ;) > I will release the patch, please take out time to review it. Well, that's fair enough, that's how it works - if it's important to you, and you want to fix it, then you can! And if properly done it gets merged. -Eric > On Thu, Feb 3, 2011 at 1:10 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote: >> On 2/1/11 5:06 AM, Ajeet Yadav wrote: >>> We are testing mkfs.xfs and xfs_repair stability to look for crashes >>> and other issues specially with removable devices. >>> And unfortunately crashes does occur. >>> Code inspection shows in most cases the caller does not handle >>> libxfs_readbuf() for error cases i.e when return value = NULL. >>> >>> Now I need your suggestion. >>> We should fix all such cases or the simplest way is to exit... if >>> read() or write() fails with EIO errorno in libxfs_readbufr() and >>> libxfs_writebufr(). >> >> I see very little reason to gracefully handle all error cases >> during mkfs. It would be prettier, yes, but if mkfs fails, with >> or without an error, with or without a segfault, you have to >> just start it over anyway, right? >> >> I think there are better places to focus effort. >> >> -Eric >> >>> Fortunately these function already support exit, if we use flag >>> LIBXFS_EXIT_ON_FAILURE, LIBXFS_B_EXIT but they are used selectively. >>> >>> The current problem is related to function libxfs_trans_read_buf() >>> >>> bp = libxfs_readbuf(dev, blkno, len, flags); >>> #ifdef XACT_DEBUG >>> fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp); >>> #endif >>> xfs_buf_item_init(bp, tp->t_mountp); >>> bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *); >>> bip->bli_recur = 0; >>> xfs_trans_add_item(tp, (xfs_log_item_t *)bip); >>> >>> /* initialise b_fsprivate2 so we can find it incore */ >>> XFS_BUF_SET_FSPRIVATE2(bp, tp); >>> *bpp = bp; >>> return 0; >>> >>> if libxfs_readbuf() fails due to device removal or other error, bp = NULL. >>> In function xfs_buf_item_init(bp, tp->t_mountp) as soon as bp is >>> dereferenced occurs >>> >>> mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017 >>> >>> _______________________________________________ >>> xfs mailing list >>> xfs@xxxxxxxxxxx >>> http://oss.sgi.com/mailman/listinfo/xfs >>> >> >> > _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs