Re: ONE pg deep-scrub blocks cluster

Good news, Jean-Charles :)

I have now deleted the object

[...]
-rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 vm-101-disk-2__head_383C3223__0
[...]

root@:~# rados -p rbd rm vm-101-disk-2

and ran a deep-scrub on 0.223 again.
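For reference, the deep-scrub can be triggered manually for a single PG; a minimal sketch, assuming the PG id 0.223 from above, with progress then visible via "ceph -w" or "ceph -s":

root@:~# ceph pg deep-scrub 0.223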

root@gengalos:~# ceph pg 0.223 query

No blocked requests anymore :)

To be sure, I checked:

root@:~# ceph pg 0.223 query | less
[...]
                "stat_sum": {
                    "num_bytes": 110264905728,
                    "num_objects": 703,
[...]

But this has not changed yet. I guess I have to wait a while.
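If the numbers still do not move, one option (a sketch, not verified here; a scrub should recalculate the PG statistics) would be:

root@:~# ceph pg scrub 0.223
root@:~# ceph pg 0.223 query | grep -A 3 stat_sum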

Thank you very much for your patience and great help!

Now let's play a bit with Ceph ^^

Best regards,

- Mehmet

On 2016-08-30 00:02, Jean-Charles Lopez wrote:
Hi Mehmet

OK so it does come from a rados put.

As you were able to check, the VM device object size is 4 MB.

So we'll see after you have removed the object with rados -p rbd rm.
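A quick check after the removal (a sketch; the object names are the ones from the listing further down in this thread) would be to list again - only the RBD-internal rbd_id.vm-101-disk-2 should still match:

root@:~# rados -p rbd ls | grep vm-101-disk-2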

I'll wait for an update.

JC

While moving. Excuse unintended typos.

On Aug 29, 2016, at 14:34, Mehmet <ceph@xxxxxxxxxx> wrote:

Hey JC,

after setting up the Ceph cluster I tried to migrate an image from one of our production VMs into Ceph via

# rados -p rbd put ...

but I always got "file too large". I guess this file

# -rw-r--r-- 1 ceph ceph 100G Jul 31 01:04 vm-101-disk-2__head_383C3223__0

is the result of this :) - I did not think that anything would stay in Ceph after the error mentioned above.
Seems I was wrong...
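For the record, "rados put" stores the whole file as one single RADOS object, which is why it runs into the object size limit; the usual way to bring a VM image into a pool is "rbd import", which stripes it into 4 MB objects. A minimal sketch, assuming a hypothetical source file vm-101-disk-2.raw:

root@:~# rbd import vm-101-disk-2.raw rbd/vm-101-disk-2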

This could match the time when the issue first happened:

1. I tried to put the image via "rados -p rbd put ..."; this did not work (I tried to put a ~400G file...).
2. After ~1 week I saw the blocked requests once the first "deep-scrub" ran (at the default point where Ceph starts deep-scrubbing).

I guess deleting this file should solve the issue.
Did you see my mail where I wrote up the test results for this?

# osd_scrub_chunk_max = 5
# osd_deep_scrub_stride = 1048576

Just a side note.
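Those two values can also be injected at runtime without restarting the OSDs; a sketch, assuming they should be applied to all OSDs:

root@:~# ceph tell osd.* injectargs '--osd-scrub-chunk-max 5 --osd-deep-scrub-stride 1048576'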

This seems to me more like a pure rados object of 100GB that was
uploaded to the cluster. From the name it could be a VM disk image
that was uploaded as an object. If it were an RBD object, its size
would be within the boundaries of an RBD object (order 12 = 4K,
order 25 = 32MB).
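For readers of the archive: an RBD data object is at most 2^order bytes, so the boundaries above follow directly from the order; a quick check:

root@:~# echo $((2**12)) $((2**22)) $((2**25))
4096 4194304 33554432

i.e. 4K for order 12, 4 MB for the default order 22, and 32 MB for order 25 - a single 100 GB object therefore cannot be a regular RBD data object.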

Verify that when you do a "rados -p rbd ls | grep vm-101-disk-2"
command, you can see an object named vm-101-disk-2.

root@:~# rados -p rbd ls | grep vm-101-disk-2
rbd_id.vm-101-disk-2
vm-101-disk-2
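To tell the two apart, stat-ing them should help (a sketch; the stray object uploaded via "rados put" should report roughly 100 GB, while rbd_id.vm-101-disk-2 only holds a few bytes of metadata):

root@:~# rados -p rbd stat vm-101-disk-2
root@:~# rados -p rbd stat rbd_id.vm-101-disk-2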

Verify if you have an RBD named this way: "rbd -p rbd ls | grep vm-101-disk-2"

root@:~# rbd -p rbd ls | grep vm-101-disk-2
vm-101-disk-2

As I'm not familiar with Proxmox, I'd suggest the following:
If yes to 1, for security, copy this file somewhere else and then do a
rados -p rbd rm vm-101-disk-2.

root@:~# rbd -p rbd info vm-101-disk-2
rbd image 'vm-101-disk-2':
       size 400 GB in 102400 objects
       order 22 (4096 kB objects)
       block_name_prefix: rbd_data.5e7d1238e1f29
       format: 2
       features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
       flags:
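The object count matches the image size: with order 22 every object covers 4 MB, so 400 GB works out to exactly the 102400 objects shown above:

root@:~# echo $((400 * 1024 / 4))
102400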

The VM with the id "101" is up and running. It is using "vm-101-disk-2" as its disk - I moved the disk successfully another way :) (same name :/) after "rados put" did not work. And as we can see here, the objects for this image also exist within Ceph:

root@:~# rados -p rbd ls | grep "rbd_data.5e7d1238e1f29" | wc -l
53011

I expected to get 102400 objects here, but since Ceph does thin provisioning this should be OK.
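Rough check: 53011 objects x 4 MB is about 207 GB of the 400 GB image actually populated, which is plausible for a thin-provisioned disk:

root@:~# echo $((53011 * 4 / 1024))
207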

If no to 1, for security, copy this file somewhere else and then do a
rm -rf vm-101-disk-2__head_383C3223__0

I should be able to delete the mentioned "100G file".
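Before touching files on an OSD it is worth knowing which OSDs hold copies of that PG; a sketch, assuming PG 0.223 from above:

root@:~# ceph pg map 0.223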

Make sure all your PG copies show the same content and wait for the
next scrub to see what is happening.

Tomorrow I will make a backup of this file on all involved OSDs, and in addition a backup of the VM within Proxmox, then start a deep-scrub and of course keep you informed.
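After the deep-scrub, a hedged way to double-check that the copies agree would be something like (list-inconsistent-obj is available on recent releases):

root@:~# ceph health detail | grep 0.223
root@:~# rados list-inconsistent-obj 0.223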

If anything goes wrong you will be able to upload an object with the
exact same content from the file you copied.
Is Proxmox using such huge objects for something, to your knowledge (a VM boot image or something else)? Can you search the Proxmox mailing list
and open tickets to verify?

As I already wrote in this e-mail, I guess that I am the cause of this :*( through the wrong usage of "rados put". Proxmox uses librbd to talk to Ceph, so it should not be able to create such a single large file.

And is this the cause of the long deep scrub? I do think so but I’m
not in front of the cluster.

Let's see :) - I hope that my next e-mail will close this issue.

Thank you very much for your help!

Best regards,
- Mehmet
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



