Re: corrupted rbd filesystems since jewel

I was able to re-create the issue where "rbd feature disable" hangs if
the client experienced a long comms failure with the OSDs, and I have
a proposed fix posted [1]. Unfortunately, I haven't been successful in
reproducing any stalled IO, discard issues, or export-diff logged
errors. I'll keep trying to reproduce, but if you can generate
debug-level logging from one of these events it would be greatly
appreciated.
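In case it helps triage once such a log exists: the object-map invalidation
events discussed later in this thread can be pulled out with a simple grep. A
minimal sketch, using the log lines quoted below in the thread (the log path
is hypothetical):

```shell
# Hypothetical excerpt of a debug-level librbd client log; the three
# lines are the InvalidateRequest messages quoted later in this thread.
cat > /tmp/librbd-sample.log <<'EOF'
2017-05-06 03:20:50.830626 7f7876a64700 -1 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating object map in-memory
2017-05-06 03:20:50.830634 7f7876a64700 -1 librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating object map on-disk
2017-05-06 03:20:50.831250 7f7877265700 -1 librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
EOF

# Count the object-map invalidation events in the log.
grep -c 'InvalidateRequest' /tmp/librbd-sample.log
```

The same filter applied to a full debug_rbd=20 log would show whether the
invalidations cluster around snapshot create/delete or discard activity.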

[1] https://github.com/ceph/ceph/pull/15093

On Mon, May 15, 2017 at 1:29 PM, Stefan Priebe - Profihost AG
<s.priebe@xxxxxxxxxxxx> wrote:
> Hello Jason,
>> Just so I can attempt to repeat this:
>
> Thanks.
>
>> (1) you had an image that was built using Hammer clients and OSDs with
>> exclusive lock disabled
> Yes. It was created with the hammer rbd defaults.
>
>> (2) you updated your clients and OSDs to Jewel
>> (3) you restarted your OSDs and live-migrated your VMs to pick up the
>> Jewel changes
>
> No. I updated the clients only and did a live migration of all VMs to
> load the jewel librbd.
>
> After that I updated the mons + restart and then updated the OSDs + restart.
>
>> (4) you enabled exclusive-lock, object-map, and fast-diff on a running VM
> Yes.
>
>> (5) you rebuilt the image's object map (while the VM was running?)
> Yes.
>
>> (6) things started breaking at this point
> Yes, but not on all VMs, and only while creating and deleting snapshots.
>
> Greets,
> Stefan
>
>
>>
>> On Sun, May 14, 2017 at 1:42 PM, Stefan Priebe - Profihost AG
>> <s.priebe@xxxxxxxxxxxx> wrote:
>>> I verified it. After a live migration of the VM I'm able to successfully
>>> disable fast-diff, exclusive-lock and object-map.
>>>
>>> The problem only seems to occur at all if a client connected under
>>> hammer without exclusive lock, then got upgraded to jewel, and
>>> exclusive lock was enabled.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 14.05.2017 um 19:33 schrieb Stefan Priebe - Profihost AG:
>>>> Hello Jason,
>>>>
>>>> Am 14.05.2017 um 14:04 schrieb Jason Dillaman:
>>>>> It appears as though there is client.27994090 at 10.255.0.13 that
>>>>> currently owns the exclusive lock on that image. I am assuming the log
>>>>> is from "rbd feature disable"?
>>>> Yes.
>>>>
>>>>> If so, I can see that it attempts to
>>>>> acquire the lock and the other side is not appropriately responding to
>>>>> the request.
>>>>>
>>>>> Assuming your system is still in this state, is there any chance to
>>>>> get debug rbd=20 logs from that client by using the client's asok file
>>>>> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
>>>>> and re-run the attempt to disable exclusive lock?
>>>>
>>>> It's a VM running qemu with librbd. It seems there is no default socket.
>>>> If there is no way to activate it later - and I don't think there is - I
>>>> can try to activate it in ceph.conf and migrate the VM to another node.
>>>> But I'm not sure whether the problem persists after migration or if
>>>> librbd gets more or less reinitialized.
>>>>
>>>>> Also, what version of Ceph is that client running?
>>>> Client and Server are on ceph 10.2.7.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>>> Jason
>>>>>
>>>>> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>> Hello Jason,
>>>>>>
>>>>>> as it still happens and VMs are crashing, I wanted to disable
>>>>>> exclusive-lock and fast-diff again. But I noticed that there are images
>>>>>> where the rbd command runs in an endless loop.
>>>>>>
>>>>>> I canceled the command after 60s and re-ran it with --debug-rbd=20. I
>>>>>> will send the log off-list.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>
>>>>>> Am 13.05.2017 um 19:19 schrieb Stefan Priebe - Profihost AG:
>>>>>>> Hello Jason,
>>>>>>>
>>>>>>> it seems to be related to fstrim and discard. I cannot reproduce it for
>>>>>>> images where we don't use trim - but it's still the case that it works
>>>>>>> fine for images created with jewel and not for images from before jewel.
>>>>>>> The only difference I can find is that the images created with jewel
>>>>>>> also support deep-flatten.
>>>>>>>
>>>>>>> Greets,
>>>>>>> Stefan
>>>>>>>
>>>>>>> Am 11.05.2017 um 22:28 schrieb Jason Dillaman:
>>>>>>>> Assuming the only log messages you are seeing are the following:
>>>>>>>>
>>>>>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>> object map in-memory
>>>>>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>> object map on-disk
>>>>>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>>>>
>>>>>>>> It looks like that can only occur if somehow the object-map on disk is
>>>>>>>> larger than the actual image size. If that's the case, how the image
>>>>>>>> got into that state is unknown to me at this point.
>>>>>>>>
>>>>>>>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>> Hi Jason,
>>>>>>>>>
>>>>>>>>> it seems I can at least circumvent the crashes. Since I restarted ALL
>>>>>>>>> OSDs after enabling exclusive lock and rebuilding the object maps,
>>>>>>>>> there have been no new crashes.
>>>>>>>>>
>>>>>>>>> What still makes me wonder are those
>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>>>>>
>>>>>>>>> messages.
>>>>>>>>>
>>>>>>>>> Greets,
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> Am 08.05.2017 um 14:50 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>> Hi,
>>>>>>>>>> Am 08.05.2017 um 14:40 schrieb Jason Dillaman:
>>>>>>>>>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>>>>>>>>>> and client libraries where exclusive lock, object map, etc were never
>>>>>>>>>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>>>>>>>>>> point enabled exclusive lock (and I'd assume object map) on these
>>>>>>>>>>> images
>>>>>>>>>>
>>>>>>>>>> Yes, I did:
>>>>>>>>>>
>>>>>>>>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }');
>>>>>>>>>> do rbd -p cephstor5 feature enable $img \
>>>>>>>>>>    exclusive-lock,object-map,fast-diff || echo $img; done
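One detail worth noting about that loop: `rbd ls -l` prints a header row, so
the word "NAME" also gets fed to "rbd feature enable". A runnable sketch of
the filter against hypothetical sample output (image names invented) showing
how skipping the first record avoids that:

```shell
# Hypothetical sample of `rbd -p cephstor5 ls -l` output: a header row,
# two images, and one snapshot row.
sample='NAME                SIZE PARENT FMT PROT LOCK
vm-100-disk-1     32768M          2
vm-100-disk-1@s1  32768M          2
vm-136-disk-1     65536M          2      excl'

# grep -v "@" drops the snapshot row; NR>1 in awk skips the header
# that would otherwise be passed to "rbd feature enable" as an image.
printf '%s\n' "$sample" | grep -v "@" | awk 'NR>1 { print $1 }'
```

Plain `rbd ls` (without -l) would also avoid both the header and the
snapshot rows in one go.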
>>>>>>>>>>
>>>>>>>>>>> -- or were the exclusive lock and object map features already
>>>>>>>>>>> enabled under Hammer?
>>>>>>>>>>
>>>>>>>>>> No, as they were not the rbd defaults.
>>>>>>>>>>
>>>>>>>>>>> The fact that you encountered an object map error on an export
>>>>>>>>>>> operation is surprising to me.  Does that error re-occur if you
>>>>>>>>>>> perform the export again? If you can repeat it, it would be very
>>>>>>>>>>> helpful if you could run the export with "--debug-rbd=20" and capture
>>>>>>>>>>> the generated logs.
>>>>>>>>>>
>>>>>>>>>> No, I can't repeat it. It happens every night, but for different images,
>>>>>>>>>> and I never saw it for a VM twice. If I do the export again it works fine.
>>>>>>>>>>
>>>>>>>>>> I'm doing an rbd export or an rbd export-diff --from-snap; it depends on
>>>>>>>>>> the VM and the days since the last snapshot.
>>>>>>>>>>
>>>>>>>>>> Greets,
>>>>>>>>>> Stefan
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> also i'm getting these errors only for pre jewel images:
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>>>>>> object map in-memory
>>>>>>>>>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>>>>>> object map on-disk
>>>>>>>>>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>>>>>>>>
>>>>>>>>>>>> while running export-diff.
>>>>>>>>>>>>
>>>>>>>>>>>> Stefan
>>>>>>>>>>>>
>>>>>>>>>>>> Am 06.05.2017 um 07:37 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>>>>> Hello Jason,
>>>>>>>>>>>>>
>>>>>>>>>>>>> while doing further testing it happens only with images that were created
>>>>>>>>>>>>> with hammer, got upgraded to jewel AND had exclusive lock enabled.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am 04.05.2017 um 14:20 schrieb Jason Dillaman:
>>>>>>>>>>>>>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>>>>>>>>>>>>>> command and post the resulting log to a new ticket at [1]? I'd also be
>>>>>>>>>>>>>> interested if you could re-create that
>>>>>>>>>>>>>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] http://tracker.ceph.com/projects/rbd/issues
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>>>>>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>>> Example:
>>>>>>>>>>>>>>> # rbd rm cephstor2/vm-136-disk-1
>>>>>>>>>>>>>>> Removing image: 99% complete...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stuck at 99% and never completes. This is an image which got corrupted
>>>>>>>>>>>>>>> for an unknown reason.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am 04.05.2017 um 08:32 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>>>>>>>> I'm not sure whether this is related, but our backup system uses rbd
>>>>>>>>>>>>>>>> snapshots and sometimes reports messages like these:
>>>>>>>>>>>>>>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>>>>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>>>>>>>>>>>>>>>>> exclusive-lock,object-map,fast-diff we've had problems with corrupted VM
>>>>>>>>>>>>>>>>> filesystems.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sometimes the VMs are just crashing with FS errors and a restart can
>>>>>>>>>>>>>>>>> solve the problem. Sometimes the whole VM is not even bootable and we
>>>>>>>>>>>>>>>>> need to import a backup.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> All of them have the same problem that you can't revert to an older
>>>>>>>>>>>>>>>>> snapshot. The rbd command just hangs at 99% forever.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is this a known issue - anything we can check?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>
>>>>>
>>>>>
>>
>>
>>



-- 
Jason


