Re: corrupted rbd filesystems since jewel

Hello Jason,

Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
> It appears as though there is client.27994090 at 10.255.0.13 that
> currently owns the exclusive lock on that image. I am assuming the log
> is from "rbd feature disable"?
Yes.

> If so, I can see that it attempts to
> acquire the lock and the other side is not appropriately responding to
> the request.
> 
> Assuming your system is still in this state, is there any chance to
> get debug rbd=20 logs from that client by using the client's asok file
> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
> and re-run the attempt to disable exclusive lock?

It's a VM running qemu with librbd. It seems there is no admin socket by
default, and I don't think there is a way to activate it after the fact.
I can try to enable it in ceph.conf and migrate the VM to another node,
but I'm not sure whether the problem will persist after migration or
whether librbd gets reinitialized by the migration.
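
For reference, qemu's librbd client only creates an admin socket when one
is configured. A minimal ceph.conf sketch for the client side (the socket
and log paths are assumptions; adjust them for your deployment):

```
[client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/ceph/qemu-client-$pid.log
```

After a live migration the destination qemu opens a fresh librbd instance,
so the socket should appear on the target node and can then be used with
"ceph --admin-daemon /path/to/asok config set debug_rbd 20".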

> Also, what version of Ceph is that client running?
Client and server are both running Ceph 10.2.7.

Greets,
Stefan

> Jason
> 
> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>> Hello Jason,
>>
>> as it still happens and VMs keep crashing, I wanted to disable
>> exclusive-lock and fast-diff again. But I noticed that there are images
>> for which the rbd command runs in an endless loop.
>>
>> I cancelled the command after 60s and used --debug-rbd=20; I'll send
>> the log off-list.
>>
>> Thanks!
>>
>> Greets,
>> Stefan
>>
>> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
>>> Hello Jason,
>>>
>>> it seems to be related to fstrim and discard. I cannot reproduce it
>>> for images where we don't use trim. It remains the case that images
>>> created with jewel work fine while images created pre-jewel do not.
>>> The only difference I can find is that the images created with jewel
>>> also support deep-flatten.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
>>>> Assuming the only log messages you are seeing are the following:
>>>>
>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>> object map in-memory
>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>> object map on-disk
>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>
>>>> It looks like that can only occur if somehow the object-map on disk is
>>>> larger than the actual image size. If that's the case, how the image
>>>> got into that state is unknown to me at this point.
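
For context, the on-disk object map should have exactly one entry per RADOS
object backing the image, so its expected length follows directly from the
image size and object order. A rough sketch of that invariant (function
names are mine, not librbd's; order 22, i.e. 4 MiB objects, is the rbd
default):

```python
import math

def expected_object_map_entries(image_bytes: int, order: int = 22) -> int:
    """One object-map entry per RADOS object backing the image."""
    object_size = 1 << order  # order 22 -> 4 MiB objects
    return math.ceil(image_bytes / object_size)

def object_map_oversized(map_entries: int, image_bytes: int,
                         order: int = 22) -> bool:
    """The inconsistency described above: a map with more entries
    than the current image size requires."""
    return map_entries > expected_object_map_entries(image_bytes, order)
```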
>>>>
>>>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>> Hi Jason,
>>>>>
>>>>> it seems I can at least work around the crashes. Since I restarted
>>>>> ALL OSDs after enabling exclusive-lock and rebuilding the object
>>>>> maps, there have been no new crashes.
>>>>>
>>>>> What still puzzles me are those
>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>
>>>>> messages.
>>>>>
>>>>> Greets,
>>>>> Stefan
>>>>>
>>>>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>>>>>> Hi,
>>>>>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>>>>>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>>>>>> and client libraries where exclusive lock, object map, etc were never
>>>>>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>>>>>> point enabled exclusive lock (and I'd assume object map) on these
>>>>>>> images
>>>>>>
>>>>>> Yes, I did:
>>>>>>
>>>>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }'); do
>>>>>>   rbd -p cephstor5 feature enable $img exclusive-lock,object-map,fast-diff \
>>>>>>     || echo $img
>>>>>> done
>>>>>>
>>>>>>> -- or were the exclusive lock and object map features already
>>>>>>> enabled under Hammer?
>>>>>>
>>>>>> No, as they were not the rbd defaults under Hammer.
>>>>>>
>>>>>>> The fact that you encountered an object map error on an export
>>>>>>> operation is surprising to me.  Does that error re-occur if you
>>>>>>> perform the export again? If you can repeat it, it would be very
>>>>>>> helpful if you could run the export with "--debug-rbd=20" and capture
>>>>>>> the generated logs.
>>>>>>
>>>>>> No, I can't reproduce it. It happens every night, but for different
>>>>>> images, and I have never seen it for the same VM twice. If I run
>>>>>> the export again it works fine.
>>>>>>
>>>>>> I'm doing either an rbd export or an rbd export-diff --from-snap,
>>>>>> depending on the VM and the days since the last snapshot.
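
The nightly choice between a full export and an incremental diff can be
sketched as follows (a hypothetical reconstruction of the logic described
above; the pool, image, and snapshot names are placeholders, not taken
from the thread):

```python
from typing import List, Optional

def backup_command(pool: str, image: str, today_snap: str,
                   last_snap: Optional[str]) -> List[str]:
    """Build the rbd command line: a full export when no prior snapshot
    exists, otherwise an incremental export-diff from the last one."""
    spec = f"{pool}/{image}@{today_snap}"
    if last_snap is None:
        return ["rbd", "export", spec, f"{image}-{today_snap}.full"]
    return ["rbd", "export-diff", "--from-snap", last_snap,
            spec, f"{image}-{today_snap}.diff"]
```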
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>
>>>>>>>
>>>>>>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> also i'm getting these errors only for pre jewel images:
>>>>>>>>
>>>>>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>> object map in-memory
>>>>>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>> object map on-disk
>>>>>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>>>>
>>>>>>>> while running export-diff.
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>> Hello Jason,
>>>>>>>>>
>>>>>>>>> further testing shows it happens only with images that were
>>>>>>>>> created under hammer, got upgraded to jewel, AND had
>>>>>>>>> exclusive-lock enabled.
>>>>>>>>>
>>>>>>>>> Greets,
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>>>>>>>>>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>>>>>>>>>> command and post the resulting log to a new ticket at [1]? I'd also be
>>>>>>>>>> interested if you could re-create that
>>>>>>>>>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>>>>>>>>> [1] http://tracker.ceph.com/projects/rbd/issues
>>>>>>>>>>
>>>>>>>>>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>>> Example:
>>>>>>>>>>> # rbd rm cephstor2/vm-136-disk-1
>>>>>>>>>>> Removing image: 99% complete...
>>>>>>>>>>>
>>>>>>>>>>> It is stuck at 99% and never completes. This is an image that
>>>>>>>>>>> got corrupted for an unknown reason.
>>>>>>>>>>>
>>>>>>>>>>> Greets,
>>>>>>>>>>> Stefan
>>>>>>>>>>>
>>>>>>>>>>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>>>> I'm not sure whether this is related, but our backup system
>>>>>>>>>>>> uses rbd snapshots and sometimes reports messages like these:
>>>>>>>>>>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>>>>>>>>>>
>>>>>>>>>>>> Stefan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> since we upgraded from hammer to jewel 10.2.7 and enabled
>>>>>>>>>>>>> exclusive-lock, object-map, and fast-diff, we have had
>>>>>>>>>>>>> problems with corrupted VM filesystems.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sometimes the VMs just crash with FS errors and a restart
>>>>>>>>>>>>> solves the problem. Sometimes the whole VM is not even
>>>>>>>>>>>>> bootable and we need to restore from a backup.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All of them share the same symptom: you can't revert to an
>>>>>>>>>>>>> older snapshot. The rbd command just hangs at 99% forever.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is this a known issue? Is there anything we can check?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>>
>>>>
> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


