On 2/1/11 5:06 AM, Ajeet Yadav wrote:
> We are testing mkfs.xfs and xfs_repair stability to look for crashes
> and other issues, especially with removable devices.
> And unfortunately crashes do occur.
> Code inspection shows that in most cases the caller does not handle
> libxfs_readbuf() error cases, i.e. when the return value is NULL.
>
> Now I need your suggestion.
> Should we fix all such cases, or is the simplest way to exit if
> read() or write() fails with EIO errno in libxfs_readbufr() and
> libxfs_writebufr()?

I see very little reason to gracefully handle all error cases during mkfs.

It would be prettier, yes, but if mkfs fails, with or without an error,
with or without a segfault, you have to just start it over anyway, right?

I think there are better places to focus effort.

-Eric

> Fortunately these functions already support exit if we use the flags
> LIBXFS_EXIT_ON_FAILURE, LIBXFS_B_EXIT, but they are used selectively.
>
> The current problem is related to function libxfs_trans_read_buf()
>
>     bp = libxfs_readbuf(dev, blkno, len, flags);
> #ifdef XACT_DEBUG
>     fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp);
> #endif
>     xfs_buf_item_init(bp, tp->t_mountp);
>     bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *);
>     bip->bli_recur = 0;
>     xfs_trans_add_item(tp, (xfs_log_item_t *)bip);
>
>     /* initialise b_fsprivate2 so we can find it incore */
>     XFS_BUF_SET_FSPRIVATE2(bp, tp);
>     *bpp = bp;
>     return 0;
>
> If libxfs_readbuf() fails due to device removal or another error, bp = NULL.
> In xfs_buf_item_init(bp, tp->t_mountp), as soon as bp is dereferenced,
> this occurs:
>
> mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
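
For anyone following the thread, the missing NULL check described for
libxfs_trans_read_buf() could look roughly like the sketch below. This is a
stand-alone mock, not a patch against libxfs: sketch_buf, sketch_readbuf()
and sketch_trans_read_buf() are hypothetical stand-ins invented for
illustration, and returning a positive EIO to the caller is an assumption
about the calling convention, not something confirmed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libxfs buffer; not the real xfs_buf_t. */
struct sketch_buf {
    long  blkno;   /* block number the buffer was read from */
    void *owner;   /* plays the role of b_fsprivate2 (the transaction) */
};

/*
 * Hypothetical stand-in for libxfs_readbuf(): returns NULL on failure.
 * simulate_eio forces the failure path, as if the device had been removed.
 */
static struct sketch_buf *sketch_readbuf(long blkno, int simulate_eio)
{
    struct sketch_buf *bp;

    if (simulate_eio)
        return NULL;
    bp = calloc(1, sizeof(*bp));
    if (bp != NULL)
        bp->blkno = blkno;
    return bp;
}

/*
 * Same shape as the quoted libxfs_trans_read_buf() body, with the NULL
 * check added before the buffer is touched. An allocation failure in the
 * mock reader is reported as EIO too, to keep the sketch short.
 */
static int sketch_trans_read_buf(void *tp, long blkno, int simulate_eio,
                                 struct sketch_buf **bpp)
{
    struct sketch_buf *bp;

    assert(bpp != NULL);
    *bpp = NULL;

    bp = sketch_readbuf(blkno, simulate_eio);
    if (bp == NULL)
        return EIO;    /* report the failure instead of dereferencing NULL */

    bp->owner = tp;    /* analogous to XFS_BUF_SET_FSPRIVATE2(bp, tp) */
    *bpp = bp;
    return 0;
}
```

A caller would then test the return value and bail out (or call exit(), as
the LIBXFS_EXIT_ON_FAILURE path does) rather than faulting on a NULL bp.
Whether that is worth doing for mkfs is exactly the question raised above.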