Hello Jason,

it seems to be related to fstrim and discard. I cannot reproduce it for
images where we don't use trim - but it's still the case that it works
fine for images created with jewel and not for images created pre-jewel.
The only difference I can find is that the images created with jewel also
support deep-flatten.

Greets,
Stefan

Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
> Assuming the only log messages you are seeing are the following:
>
> 2017-05-06 03:20:50.830626 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map in-memory
> 2017-05-06 03:20:50.830634 7f7876a64700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
> object map on-disk
> 2017-05-06 03:20:50.831250 7f7877265700 -1
> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>
> It looks like that can only occur if somehow the object map on disk is
> larger than the actual image size. If that's the case, how the image
> got into that state is unknown to me at this point.
>
> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>> Hi Jason,
>>
>> it seems I can at least circumvent the crashes. Since I restarted ALL
>> OSDs after enabling exclusive lock and rebuilding the object maps,
>> there have been no new crashes.
>>
>> What still makes me wonder are those
>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>> messages.
>>
>> Greets,
>> Stefan
>>
>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>>> Hi,
>>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>>> and client libraries where exclusive lock, object map, etc. were never
>>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>>> point enabled exclusive lock (and I'd assume object map) on these
>>>> images
>>>
>>> Yes, I did:
>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>>> do rbd -p cephstor5 feature enable $img
>>> exclusive-lock,object-map,fast-diff || echo $img; done
>>>
>>>> -- or were the exclusive lock and object map features already
>>>> enabled under Hammer?
>>>
>>> No, as they were not the rbd defaults.
>>>
>>>> The fact that you encountered an object map error on an export
>>>> operation is surprising to me. Does that error re-occur if you
>>>> perform the export again? If you can repeat it, it would be very
>>>> helpful if you could run the export with "--debug-rbd=20" and capture
>>>> the generated logs.
>>>
>>> No, I can't repeat it. It happens every night but for different images,
>>> and I have never seen it for the same VM twice. If I do the export
>>> again, it works fine.
>>>
>>> I'm doing an rbd export or an rbd export-diff --from-snap; it depends
>>> on the VM and the days since the last snapshot.
>>>
>>> Greets,
>>> Stefan
>>>
>>>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>> Hi,
>>>>>
>>>>> also I'm getting these errors only for pre-jewel images:
>>>>>
>>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>> object map in-memory
>>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>> object map on-disk
>>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>
>>>>> while running export-diff.
>>>>>
>>>>> Stefan
>>>>>
>>>>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>>>>>> Hello Jason,
>>>>>>
>>>>>> while doing further testing, it happens only with images that were
>>>>>> created with hammer, got upgraded to jewel AND got exclusive lock
>>>>>> enabled.
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>
>>>>>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>>>>>>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>>>>>>> command and post the resulting log to a new ticket at [1]? I'd also
>>>>>>> be interested if you could re-create that
>>>>>>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>>>>>>
>>>>>>> [1] http://tracker.ceph.com/projects/rbd/issues
>>>>>>>
>>>>>>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>> Example:
>>>>>>>> # rbd rm cephstor2/vm-136-disk-1
>>>>>>>> Removing image: 99% complete...
>>>>>>>>
>>>>>>>> Stuck at 99% and never completes. This is an image which got
>>>>>>>> corrupted for an unknown reason.
>>>>>>>>
>>>>>>>> Greets,
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>> I'm not sure whether this is related, but our backup system uses
>>>>>>>>> rbd snapshots and sometimes reports messages like these:
>>>>>>>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>>>>>>>
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> since we upgraded from hammer to jewel 10.2.7 and enabled
>>>>>>>>>> exclusive-lock,object-map,fast-diff we've had problems with
>>>>>>>>>> corrupted VM filesystems.
>>>>>>>>>>
>>>>>>>>>> Sometimes the VMs just crash with FS errors and a restart can
>>>>>>>>>> solve the problem. Sometimes the whole VM is not even bootable
>>>>>>>>>> and we need to import a backup.
>>>>>>>>>>
>>>>>>>>>> All of them have the same problem that you can't revert to an
>>>>>>>>>> older snapshot. The rbd command just hangs at 99% forever.
>>>>>>>>>>
>>>>>>>>>> Is this a known issue - anything we can check?
>>>>>>>>>>
>>>>>>>>>> Greets,
>>>>>>>>>> Stefan
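[Editor's note: for reference, a minimal shell sketch of the sequence discussed
in this thread (enable the missing features, rebuild the object map so it is
regenerated against the current image size, and re-run the nightly export with
the debug logging Jason asked for). The pool name cephstor5 and the feature
list come from the thread; the snapshot names, image name pattern and log path
are illustrative placeholders, not commands taken from the thread.]

  #!/bin/bash
  # Iterate over all non-snapshot images in the pool (same filter as the
  # loop Stefan used above).
  POOL=cephstor5
  for img in $(rbd -p "$POOL" ls -l | grep -v "@" | awk '{ print $1 }'); do
      # Enable the features that were not the rbd defaults under hammer.
      rbd -p "$POOL" feature enable "$img" exclusive-lock,object-map,fast-diff \
          || echo "feature enable failed: $img"
      # Rebuild the object map so it matches the actual image size.
      rbd object-map rebuild "$POOL/$img" || echo "object-map rebuild failed: $img"
  done

  # If the nightly export-diff error shows up again, repeat that export with
  # debug logging enabled (snapshot names and paths are placeholders):
  rbd export-diff --from-snap snap-2017-05-05 \
      "$POOL/vm-XXX-disk-1@snap-2017-05-06" /backup/vm-XXX-disk-1.diff \
      --debug-rbd=20 --log-file=/tmp/rbd-export-debug.log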