Hi Mykola,

On Tue, Apr 13, 2021 at 8:49 PM Mykola Golub <to.my.trociny@xxxxxxxxx> wrote:
>
> Hi,
>
> We had a case reported by our customer, in which a faulty disk was
> returning an ENODATA error on directory split. This created some mess:
> a transaction operation that encountered the directory split error was
> aborted, but the whole transaction was not, and the remaining
> operations were still executed.
>
> The kernel log:
>
> 2020-12-16T07:02:36.736166+09:00 node5 kernel: [10270806.635341] sd 0:2:10:0: [sdk] tag#1 BRCM Debug mfi stat 0x2d, data len requested/completed 0x4000/0x0
> 2020-12-16T07:02:36.736180+09:00 node5 kernel: [10270806.635349] sd 0:2:10:0: [sdk] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> 2020-12-16T07:02:36.736181+09:00 node5 kernel: [10270806.635351] sd 0:2:10:0: [sdk] tag#1 Sense Key : Medium Error [current]
> 2020-12-16T07:02:36.736184+09:00 node5 kernel: [10270806.635353] sd 0:2:10:0: [sdk] tag#1 Add. Sense: Unrecovered read error
> 2020-12-16T07:02:36.736203+09:00 node5 kernel: [10270806.635355] sd 0:2:10:0: [sdk] tag#1 CDB: Read(16) 88 00 00 00 00 00 02 67 ec 00 00 00 00 20 00 00
> 2020-12-16T07:02:36.736234+09:00 node5 kernel: [10270806.635357] blk_update_request: critical medium error, dev sdk, sector 40365056
> 2020-12-16T07:02:36.736240+09:00 node5 kernel: [10270806.635379] XFS (sdk2): metadata I/O error: block 0x1edda00 ("xfs_trans_read_buf_map") error 61 numblks 32
> 2020-12-16T07:02:36.736241+09:00 node5 kernel: [10270806.635384] XFS (sdk2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
>
> The osd log:
>
> 2020-12-16 07:02:36.419810 7f283a43d700 1 _created [2,5,C,A,6] has 447 objects, starting split in pg 5.452s0_head.
> 2020-12-16 07:02:36.736125 7f283a43d700 1 _created [2,5,C,A,6] split completed in pg 5.452s0_head.
> 2020-12-16 07:02:36.736150 7f283a43d700 -1 filestore(/var/lib/ceph/osd/ceph-57) error creating 0#5:4a3568ed:::<CENSORED>:head# (/var/lib/ceph/osd/ceph-57/current/5.452s0_head/DIR_2/DIR_5/DIR_C/DIR_A/DIR_6/<CENSORED>__head_B716AC52__5_ffffffffffffffff_0) in index: (61) No data available
>
> So a transaction operation created a new object file, detected that
> the directory needed splitting, tried to split, failed, aborted the
> operation in the middle, and returned the ENODATA error to
> FileStore::_do_transaction, where it was ignored and the transaction
> continued.
>
> We do not know where exactly the split was failing. It did not seem to
> cause data loss, but it aborted a transaction operation in the middle,
> which could make some mess.
>
> We were seeing at least two types of such "messy" transactions:
>
> 1) When rados writes a new object, one of the first transaction
> operations is OP_TOUCH. It creates the object file, tries to split the
> directory, aborts, and because of this skips creating the object's
> spill_out attribute.
>
> 2) When rados deletes an object, one of the transaction operations is
> OP_COLL_MOVE_RENAME, which creates a temporary link. The link triggers
> the directory split and the error, and the op is aborted in the
> middle, leaving the original object file not removed.
>
> So it looks like a bug that could be fixed, but the question is
> whether upstream is still interested in improving the filestore in
> this area? Should I report it to the tracker?

Please feel free to create a tracker for it. Though we are not actively
developing against filestore, if the fix for this issue isn't too
invasive, I don't see any issues in merging it.

Thanks,
Neha

> --
> Mykola Golub
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx