Hi,

We had a case reported by our customer where a faulty disk was returning an ENODATA error on directory split. This created some mess: the transaction operation that hit the directory split error was aborted, but the whole transaction was not, and the remaining operations were still executed.

The kernel log:

2020-12-16T07:02:36.736166+09:00 node5 kernel: [10270806.635341] sd 0:2:10:0: [sdk] tag#1 BRCM Debug mfi stat 0x2d, data len requested/completed 0x4000/0x0
2020-12-16T07:02:36.736180+09:00 node5 kernel: [10270806.635349] sd 0:2:10:0: [sdk] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2020-12-16T07:02:36.736181+09:00 node5 kernel: [10270806.635351] sd 0:2:10:0: [sdk] tag#1 Sense Key : Medium Error [current]
2020-12-16T07:02:36.736184+09:00 node5 kernel: [10270806.635353] sd 0:2:10:0: [sdk] tag#1 Add. Sense: Unrecovered read error
2020-12-16T07:02:36.736203+09:00 node5 kernel: [10270806.635355] sd 0:2:10:0: [sdk] tag#1 CDB: Read(16) 88 00 00 00 00 00 02 67 ec 00 00 00 00 20 00 00
2020-12-16T07:02:36.736234+09:00 node5 kernel: [10270806.635357] blk_update_request: critical medium error, dev sdk, sector 40365056
2020-12-16T07:02:36.736240+09:00 node5 kernel: [10270806.635379] XFS (sdk2): metadata I/O error: block 0x1edda00 ("xfs_trans_read_buf_map") error 61 numblks 32
2020-12-16T07:02:36.736241+09:00 node5 kernel: [10270806.635384] XFS (sdk2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.

The osd log:

2020-12-16 07:02:36.419810 7f283a43d700 1 _created [2,5,C,A,6] has 447 objects, starting split in pg 5.452s0_head.
2020-12-16 07:02:36.736125 7f283a43d700 1 _created [2,5,C,A,6] split completed in pg 5.452s0_head.
2020-12-16 07:02:36.736150 7f283a43d700 -1 filestore(/var/lib/ceph/osd/ceph-57) error creating 0#5:4a3568ed:::<CENSORED>:head# (/var/lib/ceph/osd/ceph-57/current/5.452s0_head/DIR_2/DIR_5/DIR_C/DIR_A/DIR_6/<CENSORED>__head_B716AC52__5_ffffffffffffffff_0) in index: (61) No data available

So a transaction operation created a new object file, detected that the directory needed splitting, tried to split, failed, aborted the operation in the middle, and returned the ENODATA error to FileStore::_do_transaction, where it was ignored and the transaction continued.

We do not know where exactly the split was failing, and it does not seem to have caused data loss, but aborting a transaction operation in the middle could make some mess. We were seeing at least two types of such "messy" transactions:

1) On a rados write of a new object, one of the first transaction operations is OP_TOUCH. It creates the object file, tries to split the directory, aborts, and because of this skips creating the object's spill_out attribute.

2) On a rados delete of an object, one of the transaction operations is OP_COLL_MOVE_RENAME, which creates a temporary link. Creating the link triggers the directory split and the error, and the op is aborted in the middle, leaving the original object file not removed.

So it looks like a bug and could be improved, but the question is whether upstream is still interested in improving the filestore in this area. Should I report it to the tracker?
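To make the failure pattern concrete, here is a minimal, self-contained sketch of the shape of the problem. It is not the actual FileStore code; all names (split_directory, op_touch, do_transaction) are invented for illustration. The point is the same as described above: the op handler bails out half way when the split fails, and the transaction executor drops the per-op error and keeps executing.

// Hypothetical illustration only -- NOT the real FileStore code.
#include <cerrno>
#include <cstdio>
#include <vector>

// Pretend helper: a directory split that can fail when the disk
// returns ENODATA (error 61, "No data available").
static int split_directory(bool disk_is_faulty) {
  if (disk_is_faulty)
    return -ENODATA;  // mimics the XFS medium error seen in the logs
  return 0;
}

// Pretend OP_TOUCH handler: creates the object file, then tries to
// split the directory.  On split failure it returns early, so the
// step that records the object's spill_out attribute never runs --
// the op is aborted in the middle.
static int op_touch(bool disk_is_faulty) {
  std::puts("created object file");
  int r = split_directory(disk_is_faulty);
  if (r < 0)
    return r;  // aborted mid-op: no spill_out attribute written
  std::puts("recorded spill_out attribute");
  return 0;
}

// Pretend transaction executor: the per-op error is silently dropped,
// so the remaining ops still run against a half-finished object --
// the "mess" described above.  A fix would check r and abort or fail
// the whole transaction instead.
static void do_transaction(const std::vector<int (*)(bool)>& ops,
                           bool disk_is_faulty) {
  for (auto op : ops) {
    int r = op(disk_is_faulty);
    (void)r;  // <-- error ignored
    std::puts("continuing with next op");
  }
}

int main() {
  do_transaction({op_touch}, /*disk_is_faulty=*/true);
  return 0;
}

--
Mykola Golub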