Re: corrupted rbd filesystems since jewel

Hi,

great thanks.

I'm still trying, but it's difficult for me as well. Since it only happens
sometimes, there must be an unknown additional factor. For the future
I've enabled client admin sockets for all VMs as well, but that does not
help in this case - as it seems to be fixed after a migration.
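
For reference, this is roughly what I mean by enabling the client sockets - a
minimal ceph.conf sketch; the socket path and log file below are just
examples, not necessarily the exact values we use:

[client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    log file = /var/log/ceph/qemu-client.$pid.log

A socket only appears for librbd clients started after this change, so a
running VM needs a live migration (i.e. a fresh qemu/librbd process) before
it can be queried this way.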

Could the problem be the same one you fixed? Could it be that the export
process holds the lock while the client tries to reacquire it for
writing?

Greets,
Stefan
On 15.05.2017 at 21:19, Jason Dillaman wrote:
> I was able to re-create the issue where "rbd feature disable" hangs if
> the client experienced a long comms failure with the OSDs, and I have
> a proposed fix posted [1]. Unfortunately, I haven't been successful in
> repeating any stalled IO, discard issues, nor export-diff logged
> errors. I'll keep trying to reproduce, but if you can generate
> debug-level logging from one of these events it would be greatly
> appreciated.
> 
> [1] https://github.com/ceph/ceph/pull/15093
> 
> On Mon, May 15, 2017 at 1:29 PM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
>> Hello Jason,
>>> Just so I can attempt to repeat this:
>>
>> Thanks.
>>
>>> (1) you had an image that was built using Hammer clients and OSDs with
>>> exclusive lock disabled
>> Yes. It was created with the hammer rbd defaults.
>>
>>> (2) you updated your clients and OSDs to Jewel
>>> (3) you restarted your OSDs and live-migrated your VMs to pick up the
>>> Jewel changes
>>
>> No. I only updated the clients and did a live migration of all VMs to
>> load the Jewel librbd.
>>
>> After that I updated the mons and restarted them, then updated the OSDs and restarted them.
>>
>>> (4) you enabled exclusive-lock, object-map, and fast-diff on a running VM
>> Yes.
>>
>>> (5) you rebuilt the image's object map (while the VM was running?)
>> Yes.
>>
>>> (6) things started breaking at this point
>> Yes, but not on all VMs, and only while creating and deleting snapshots.
>>
>> Greets,
>> Stefan
>>
>>
>>>
>>> On Sun, May 14, 2017 at 1:42 PM, Stefan Priebe - Profihost AG
>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>> I verified it. After a live migration of the VM I'm able to successfully
>>>> disable fast-diff, exclusive-lock and object-map.
>>>>
>>>> The problem only seems to occur at all if a client connected under
>>>> Hammer without exclusive lock, then got upgraded to Jewel and exclusive
>>>> lock was enabled.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>> On 14.05.2017 at 19:33, Stefan Priebe - Profihost AG wrote:
>>>>> Hello Jason,
>>>>>
>>>>> On 14.05.2017 at 14:04, Jason Dillaman wrote:
>>>>>> It appears as though there is client.27994090 at 10.255.0.13 that
>>>>>> currently owns the exclusive lock on that image. I am assuming the log
>>>>>> is from "rbd feature disable"?
>>>>> Yes.
>>>>>
>>>>>> If so, I can see that it attempts to
>>>>>> acquire the lock and the other side is not appropriately responding to
>>>>>> the request.
>>>>>>
>>>>>> Assuming your system is still in this state, is there any chance to
>>>>>> get debug rbd=20 logs from that client by using the client's asok file
>>>>>> and "ceph --admin-daemon /path/to/client/asok config set debug_rbd 20"
>>>>>> and re-run the attempt to disable exclusive lock?
>>>>>
>>>>> It's a VM running qemu with librbd. It seems there is no default socket,
>>>>> and I don't think there is a way to activate it on a running client. I can
>>>>> try to activate it in ceph.conf and migrate the VM to another node, but I'm
>>>>> not sure whether the problem persists after the migration or whether librbd
>>>>> is effectively reinitialized.
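>>>>>
>>>>> Roughly what I have in mind, assuming the socket then shows up under
>>>>> /var/run/ceph/ after the migration (the socket path is just an example):
>>>>>
>>>>> ASOK=/var/run/ceph/ceph-client.admin.12345.asok   # example path
>>>>> ceph --admin-daemon $ASOK config set debug_rbd 20
>>>>> # ...re-run the "rbd feature disable" attempt...
>>>>> ceph --admin-daemon $ASOK config set debug_rbd 0/5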
>>>>>
>>>>>> Also, what version of Ceph is that client running?
>>>>> Client and server are both on Ceph 10.2.7.
>>>>>
>>>>> Greets,
>>>>> Stefan
>>>>>
>>>>>> Jason
>>>>>>
>>>>>> On Sun, May 14, 2017 at 1:55 AM, Stefan Priebe - Profihost AG
>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>> Hello Jason,
>>>>>>>
>>>>>>> as it still happens and VMs are crashing, I wanted to disable
>>>>>>> exclusive-lock and fast-diff again. But I noticed that there are images
>>>>>>> where the rbd command runs in an endless loop.
>>>>>>>
>>>>>>> I cancelled the command after 60s and re-ran it with --debug-rbd=20. I
>>>>>>> will send the log off-list.
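>>>>>>>
>>>>>>> For reference, roughly how I captured it (a sketch - the pool/image name
>>>>>>> is just a placeholder, and the debug output ends up on stderr here):
>>>>>>>
>>>>>>> # hangs for the affected images; cancelled with ^C after ~60s
>>>>>>> rbd feature disable cephstor5/<image> fast-diff,object-map,exclusive-lock \
>>>>>>>     --debug-rbd=20 2> feature-disable-debug.log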
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Greets,
>>>>>>> Stefan
>>>>>>>
>>>>>>> On 13.05.2017 at 19:19, Stefan Priebe - Profihost AG wrote:
>>>>>>>> Hello Jason,
>>>>>>>>
>>>>>>>> it seems to be related to fstrim and discard. I cannot reproduce it for
>>>>>>>> images where we don't use trim. It is still the case that images created
>>>>>>>> with Jewel work fine while pre-Jewel images do not. The only difference I
>>>>>>>> can find is that the images created with Jewel also support deep-flatten.
>>>>>>>>
>>>>>>>> Greets,
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>> On 11.05.2017 at 22:28, Jason Dillaman wrote:
>>>>>>>>> Assuming the only log messages you are seeing are the following:
>>>>>>>>>
>>>>>>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>>> object map in-memory
>>>>>>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>>> object map on-disk
>>>>>>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>>>>>
>>>>>>>>> It looks like that can only occur if somehow the object-map on disk is
>>>>>>>>> larger than the actual image size. If that's the case, how the image
>>>>>>>>> got into that state is unknown to me at this point.
>>>>>>>>>
>>>>>>>>> On Thu, May 11, 2017 at 3:23 PM, Stefan Priebe - Profihost AG
>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>> Hi Jason,
>>>>>>>>>>
>>>>>>>>>> it seems I can at least circumvent the crashes. Since I restarted ALL
>>>>>>>>>> OSDs after enabling exclusive lock and rebuilding the object maps, there
>>>>>>>>>> have been no new crashes.
>>>>>>>>>>
>>>>>>>>>> What still makes me wonder are these messages:
>>>>>>>>>>
>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
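>>>>>>>>>>
>>>>>>>>>> A quick way to check whether an image's on-disk object map got flagged
>>>>>>>>>> invalid, and to rebuild it, would be something like this (a sketch, the
>>>>>>>>>> image name is a placeholder):
>>>>>>>>>>
>>>>>>>>>> rbd info cephstor5/<image> | grep flags
>>>>>>>>>> rbd object-map rebuild cephstor5/<image>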
>>>>>>>>>>
>>>>>>>>>> Greets,
>>>>>>>>>> Stefan
>>>>>>>>>>
>>>>>>>>>> On 08.05.2017 at 14:50, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> On 08.05.2017 at 14:40, Jason Dillaman wrote:
>>>>>>>>>>>> You are saying that you had v2 RBD images created against Hammer OSDs
>>>>>>>>>>>> and client libraries where exclusive lock, object map, etc were never
>>>>>>>>>>>> enabled. You then upgraded the OSDs and clients to Jewel and at some
>>>>>>>>>>>> point enabled exclusive lock (and I'd assume object map) on these
>>>>>>>>>>>> images
>>>>>>>>>>>
>>>>>>>>>>> Yes, I did:
>>>>>>>>>>>
>>>>>>>>>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }'); do
>>>>>>>>>>>   rbd -p cephstor5 feature enable $img exclusive-lock,object-map,fast-diff || echo $img
>>>>>>>>>>> done
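>>>>>>>>>>>
>>>>>>>>>>> The object map rebuild afterwards was done along the same lines - roughly
>>>>>>>>>>> (a sketch, not necessarily the exact invocation):
>>>>>>>>>>>
>>>>>>>>>>> for img in $(rbd -p cephstor5 ls -l | grep -v "@" | awk '{ print $1 }'); do
>>>>>>>>>>>   rbd -p cephstor5 object-map rebuild $img || echo $img
>>>>>>>>>>> done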
>>>>>>>>>>>
>>>>>>>>>>>> -- or were the exclusive lock and object map features already
>>>>>>>>>>>> enabled under Hammer?
>>>>>>>>>>>
>>>>>>>>>>> No as they were not the rbd defaults.
>>>>>>>>>>>
>>>>>>>>>>>> The fact that you encountered an object map error on an export
>>>>>>>>>>>> operation is surprising to me.  Does that error re-occur if you
>>>>>>>>>>>> perform the export again? If you can repeat it, it would be very
>>>>>>>>>>>> helpful if you could run the export with "--debug-rbd=20" and capture
>>>>>>>>>>>> the generated logs.
>>>>>>>>>>>
>>>>>>>>>>> No, I can't repeat it. It happens every night, but for different images,
>>>>>>>>>>> and I've never seen it for the same VM twice. If I do the export again it
>>>>>>>>>>> works fine.
>>>>>>>>>>>
>>>>>>>>>>> I'm doing either an rbd export or an rbd export-diff --from-snap, depending
>>>>>>>>>>> on the VM and the days since its last snapshot.
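>>>>>>>>>>>
>>>>>>>>>>> Roughly, the nightly job runs one of these per image (image and snapshot
>>>>>>>>>>> names are just placeholders):
>>>>>>>>>>>
>>>>>>>>>>> # full export
>>>>>>>>>>> rbd export cephstor5/<image>@<snap> <image>-full.raw
>>>>>>>>>>> # incremental export since the previous snapshot
>>>>>>>>>>> rbd export-diff --from-snap <previous-snap> cephstor5/<image>@<snap> <image>-inc.diff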
>>>>>>>>>>>
>>>>>>>>>>> Greets,
>>>>>>>>>>> Stefan
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, May 6, 2017 at 2:38 PM, Stefan Priebe - Profihost AG
>>>>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I'm getting these errors only for pre-Jewel images:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-05-06 03:20:50.830626 7f7876a64700 -1
>>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>>>>>>> object map in-memory
>>>>>>>>>>>>> 2017-05-06 03:20:50.830634 7f7876a64700 -1
>>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 invalidating
>>>>>>>>>>>>> object map on-disk
>>>>>>>>>>>>> 2017-05-06 03:20:50.831250 7f7877265700 -1
>>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f7860004410 should_complete: r=0
>>>>>>>>>>>>>
>>>>>>>>>>>>> while running export-diff.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 06.05.2017 at 07:37, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>> Hello Jason,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> while doing further testing, it happens only with images that were created
>>>>>>>>>>>>>> with Hammer, got upgraded to Jewel AND had exclusive lock enabled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 04.05.2017 at 14:20, Jason Dillaman wrote:
>>>>>>>>>>>>>>> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
>>>>>>>>>>>>>>> command and post the resulting log to a new ticket at [1]? I'd also be
>>>>>>>>>>>>>>> interested if you could re-create that
>>>>>>>>>>>>>>> "librbd::object_map::InvalidateRequest" issue repeatably.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] http://tracker.ceph.com/projects/rbd/issues
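>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Something along these lines should capture it (the log file name is
>>>>>>>>>>>>>>> arbitrary; the debug output should end up on stderr since no client log
>>>>>>>>>>>>>>> file is configured):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> rbd rm cephstor2/vm-136-disk-1 --debug-rbd=20 2> rbd-rm-debug.log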
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>>>>>>>>>>>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>>>>>>>>>>>> Example:
>>>>>>>>>>>>>>>> # rbd rm cephstor2/vm-136-disk-1
>>>>>>>>>>>>>>>> Removing image: 99% complete...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's stuck at 99% and never completes. This is an image that got corrupted
>>>>>>>>>>>>>>>> for an unknown reason.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 04.05.2017 at 08:32, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>>>> I'm not sure whether this is related, but our backup system uses rbd
>>>>>>>>>>>>>>>>> snapshots and sometimes reports messages like these:
>>>>>>>>>>>>>>>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>>>>>>>>>>>>>>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 04.05.2017 at 07:49, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> since we upgraded from Hammer to Jewel 10.2.7 and enabled
>>>>>>>>>>>>>>>>>> exclusive-lock, object-map and fast-diff, we've had problems with
>>>>>>>>>>>>>>>>>> corrupted VM filesystems.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sometimes the VMs just crash with FS errors and a restart solves the
>>>>>>>>>>>>>>>>>> problem. Sometimes the whole VM is not even bootable and we need to
>>>>>>>>>>>>>>>>>> import a backup.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> All of them share the same problem: we can't revert to an older
>>>>>>>>>>>>>>>>>> snapshot. The rbd command just hangs at 99% forever.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is this a known issue - anything we can check?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Greets,
>>>>>>>>>>>>>>>>>> Stefan
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>
>>>
>>>
> 
> 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


