Re: [EXTERNAL] Re: fsck error: found stray omap data on omap_head


 



You are correct: even though the repair reports an error, I was able to join the disk back into the cluster, and it stopped reporting the legacy omap warning. I had assumed an "error" was something that needed to be rectified before anything could proceed, but apparently it's more like "warning: there was an error on this one non-critical task" :)


We'll probably just destroy and rebuild that OSD once we're back to HEALTH_OK.
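
Roughly the usual sequence once we get there; the device path and the exact redeploy step will depend on our setup:

ceph osd out 9
# wait for data to drain off osd.9, then:
ceph osd purge 9 --yes-i-really-mean-it
ceph-volume lvm zap --destroy /dev/<device>
# redeploy afterwards, e.g. with ceph-volume lvm create or via the orchestrator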


Thank you!


________________________________
From: Igor Fedotov <ifedotov@xxxxxxx>
Sent: Thursday, May 20, 2021 05:15
To: Pickett, Neale T; ceph-users@xxxxxxx
Subject: [EXTERNAL]  Re: fsck error: found stray omap data on omap_head

I think there is no way to fix that at the moment other than manually
identifying and removing the relevant record(s) in RocksDB with
ceph-kvstore-tool, which might be pretty tricky...
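
Roughly along these lines, with the OSD stopped first (untested, and
the "M" prefix for legacy omap keys plus the big-endian omap_head
encoding in the key are from memory, so please verify against the
BlueStore source before removing anything):

ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 list M > omap_keys.txt
# pick out the keys whose leading bytes decode to omap_head 12256434, then:
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 rm M <key>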

Looks like we should implement removal of these stray records when
repairing BlueStore...


On 5/19/2021 11:12 PM, Pickett, Neale T wrote:
> We just upgraded to pacific, and I'm trying to clear warnings about legacy bluestore omap usage stats by running 'ceph-bluestore-tool repair', as instructed by the warning message. It's been going fine, but we are now getting this error:
>
>
> [root@vanilla bin]# ceph-bluestore-tool repair --path $osd_path
> 2021-05-19T19:25:26.485+0000 7f67ca3593c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) fsck error: found stray omap data on omap_head 12256434 0 0
> repair status: remaining 1 error(s) and warning(s)
> [root@vanilla bin]# ceph-bluestore-tool fsck --path $osd_path -deep
>
> 2021-05-19T20:03:17.002+0000 7f4d1d6603c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) fsck error: found stray omap data on omap_head 12256434 0 0
>
> fsck status: remaining 1 error(s) and warning(s)
>
>
> We're only 10% of the way through our OSDs, so I'd like to find some way to fix this other than destroying and rebuilding the OSD, in case it happens again. Fixing this error is especially attractive since we can't get out of HEALTH_WARN until we've run the repair on all OSDs.

One can silence the 'legacy omap' warning via the
"bluestore_warn_on_no_per_pool_omap" and
"bluestore_warn_on_no_per_pg_omap" config parameters.

And I'm not sure I understand why the above fsck error prevents you
from proceeding with the upgrade. Indeed, the repair leaves this stray
omap record as-is, but all the other omaps should be properly converted
at this point. I presume this should eliminate the "legacy omap"
warning for this specific OSD. Isn't this the case?
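
If you want to double-check which OSDs are still being flagged,
"ceph health detail" should list them, e.g.:

ceph health detail | grep -i omap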


>
> Any suggestions?
>
>
>
> Neale Pickett <neale@xxxxxxxx>
> A-4: Advanced Research in Cyber Systems
> Los Alamos National Laboratory
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


