Re: mkfs.xfs pagefault when removed storage during operation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sorry I do not agree, we have a bug so we cannot ignore it.
Solving at first place can save a lot of time if same problem create a
side effect that may sometime be very hard to catch.

Now lets consider the current problem
1. Its related to libxfs in xfsprogs, so its not mkfs issue anymore
2. If we come across any critical problem in libxfs we can cross
verify kernel xfs implementation to find if there also a logical
issue.
    One learning and be used in other part.
3. Yes I agree that if mkfs.xfs fails we have to re-run it anyways,
but then what is the difference between a novice code and professional
product.
     If you cscope libxfs_trans_read_buf() in xfsprogs, its caller
always checks the return value, and its used extensively in xfsprogs.
But this function always return 0. Infact there is no error handding
at all, lets not consider EIO error only.
4. We are here in open community out of need, at the same time to make
it better.

I was wondering why I am not getting any reply, I think mail subject
was wrong......mkfs ;)
I will release the patch, please take out time to review it.

On Thu, Feb 3, 2011 at 1:10 PM, Eric Sandeen <sandeen@xxxxxxxxxxx> wrote:
> On 2/1/11 5:06 AM, Ajeet Yadav wrote:
>> We are testing mkfs.xfs and xfs_repair stability to look for crashes
>> and other issues specially with removable devices.
>> And unfortunately crashes does occur.
>> Code inspection shows in most cases the caller does not handle
>> libxfs_readbuf() for error cases i.e when return value = NULL.
>>
>> Now I need your suggestion.
>> We should fix all such cases or the simplest way is to exit... if
>> read() or write() fails with EIO errorno in libxfs_readbufr() and
>> libxfs_writebufr().
>
> I see very little reason to gracefully handle all error cases
> during mkfs. It would be prettier, yes, but if mkfs fails, with
> or without an error, with or without a segfault, you have to
> just start it over anyway, right?
>
> I think there are better places to focus effort.
>
> -Eric
>
>> Fortunately these function already support exit, if we use flag
>> LIBXFS_EXIT_ON_FAILURE, LIBXFS_B_EXIT but they are used selectively.
>>
>> The current problem is related to function libxfs_trans_read_buf()
>>
>> Â Â Â Âbp = libxfs_readbuf(dev, blkno, len, flags);
>> #ifdef XACT_DEBUG
>> Â Â Â Â fprintf(stderr, "trans_read_buf buffer %p, transaction %p\n", bp, tp);
>> #endif
>> Â Â Â Â xfs_buf_item_init(bp, tp->t_mountp);
>> Â Â Â Â bip = XFS_BUF_FSPRIVATE(bp, xfs_buf_log_item_t *);
>> Â Â Â Â bip->bli_recur = 0;
>> Â Â Â Â xfs_trans_add_item(tp, (xfs_log_item_t *)bip);
>>
>> Â Â Â Â /* initialise b_fsprivate2 so we can find it incore */
>> Â Â Â Â XFS_BUF_SET_FSPRIVATE2(bp, tp);
>> Â Â Â Â *bpp = bp;
>> Â Â Â Â return 0;
>>
>> if Âlibxfs_readbuf() fails due to device removal or other error, bp = NULL.
>> In function xfs_buf_item_init(bp, tp->t_mountp) as soon as bp is
>> dereferenced occurs
>>
>> mkfs.xfs: unhandled page fault (11) at 0x00000070, code 0x017
>>
>> _______________________________________________
>> xfs mailing list
>> xfs@xxxxxxxxxxx
>> http://oss.sgi.com/mailman/listinfo/xfs
>>
>
>

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs



[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux