OK, last suggestion just to narrow the issue down: ensure you have a
functional admin socket and librbd log file, as documented here [1].
With the VM running, before you execute "fstrim", run "ceph
--admin-daemon /path/to/the/asok/file config set debug_rbd 20" on the
hypervisor host, execute "fstrim" within the VM, and then restore the
log settings via "ceph --admin-daemon /path/to/the/asok/file config set
debug_rbd 0/5". Grep the log file for "aio_discard" to verify whether
QEMU is passing the discards down to librbd.
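
As a concrete sketch, the whole check on the hypervisor host might look
something like the sequence below. The .asok and log file paths are
illustrative placeholders; use whatever "admin socket" and "log file"
are set to in the [client] section of your ceph.conf.

  # raise librbd logging on the running QEMU process
  ceph --admin-daemon /var/run/ceph/guests/client.cinder.asok config set debug_rbd 20

  # ... now run "fstrim /" inside the VM ...

  # drop librbd logging back to the default level
  ceph --admin-daemon /var/run/ceph/guests/client.cinder.asok config set debug_rbd 0/5

  # check whether QEMU handed the discards down to librbd
  grep aio_discard /var/log/qemu/qemu-guest.log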

[1] http://docs.ceph.com/docs/master/rbd/rbd-openstack/

On Thu, Mar 15, 2018 at 6:53 AM, Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx> wrote:
> Hallo Jason, I am really thankful for your time!
>
> Changed the volume features:
>
>   rbd image 'volume-80838a69-e544-47eb-b981-a4786be89736':
>   .....
>           features: layering, exclusive-lock, deep-flatten
>
> I had to create several dummy files before seeing an increase with
> "rbd du": to me, this is some indication that dirty blocks are at
> least reused, if not properly released.
>
> Then I did "rm * ; sync ; fstrim / ; sync" but the size did not go
> down. Is there a way to instruct Ceph to perform what is not
> currently happening automatically (namely, scan the object map of a
> volume and force cleanup of released blocks)? Or is the problem
> exactly that such blocks are not seen by Ceph as reusable?
>
> By the way, I think I forgot to mention that the underlying OSD disks
> are taken from a Fibre Channel storage array (a Dell MD3860, which
> cannot present JBOD, so I present single disks as RAID0) and are XFS
> formatted.
>
> Thanks!
>
>                 Fulvio
>
> -------- Original Message --------
> Subject: Re: Issue with fstrim and Nova hw_disk_discard=unmap
> From: Jason Dillaman <jdillama@xxxxxxxxxx>
> To: Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx>
> CC: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> Date: 03/14/2018 02:10 PM
>
>> Hmm -- perhaps as an experiment, can you disable the object-map and
>> fast-diff features to see if they are incorrectly reporting the
>> object as in-use after a discard?
>>
>> $ rbd --cluster cephpa1 -p cinder-ceph feature disable
>> volume-80838a69-e544-47eb-b981-a4786be89736 object-map,fast-diff
>>
>> On Wed, Mar 14, 2018 at 3:29 AM, Fulvio Galeazzi
>> <fulvio.galeazzi@xxxxxxx> wrote:
>>>
>>> Hallo Jason, sure, here it is!
>>>
>>>   rbd --cluster cephpa1 -p cinder-ceph info
>>> volume-80838a69-e544-47eb-b981-a4786be89736
>>>   rbd image 'volume-80838a69-e544-47eb-b981-a4786be89736':
>>>           size 15360 MB in 3840 objects
>>>           order 22 (4096 kB objects)
>>>           block_name_prefix: rbd_data.9e7ffe238e1f29
>>>           format: 2
>>>           features: layering, exclusive-lock, object-map, fast-diff,
>>> deep-flatten
>>>           flags:
>>>
>>> Thanks
>>>
>>>                 Fulvio
>>>
>>> -------- Original Message --------
>>> Subject: Re: Issue with fstrim and Nova hw_disk_discard=unmap
>>> From: Jason Dillaman <jdillama@xxxxxxxxxx>
>>> To: Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx>
>>> CC: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
>>> Date: 03/13/2018 06:33 PM
>>>
>>>> Can you provide the output from "rbd info <pool
>>>> name>/volume-80838a69-e544-47eb-b981-a4786be89736"?
>>>>
>>>> On Tue, Mar 13, 2018 at 12:30 PM, Fulvio Galeazzi
>>>> <fulvio.galeazzi@xxxxxxx> wrote:
>>>>>
>>>>> Hallo!
>>>>>
>>>>>> Discards appear to be getting sent to the device. How big of a
>>>>>> temporary file did you create and then delete? Did you sync the
>>>>>> file to disk before deleting it? What version of qemu-kvm are
>>>>>> you running?
>>>>>
>>>>> I made several tests with commands like the following (issuing
>>>>> sync after each operation):
>>>>>
>>>>>   dd if=/dev/zero of=/tmp/fileTest bs=1M count=200 oflag=direct
>>>>>
>>>>> What I see is that if I repeat the command with count<=200, the
>>>>> size does not increase.
>>>>>
>>>>> Now let's try with count>200:
>>>>>
>>>>>   NAME                                        PROVISIONED   USED
>>>>>   volume-80838a69-e544-47eb-b981-a4786be89736      15360M  2284M
>>>>>
>>>>>   dd if=/dev/zero of=/tmp/fileTest bs=1M count=750 oflag=direct
>>>>>   dd if=/dev/zero of=/tmp/fileTest2 bs=1M count=750 oflag=direct
>>>>>   sync
>>>>>
>>>>>   NAME                                        PROVISIONED   USED
>>>>>   volume-80838a69-e544-47eb-b981-a4786be89736      15360M  2528M
>>>>>
>>>>>   rm /tmp/fileTest*
>>>>>   sync
>>>>>   sudo fstrim -v /
>>>>>   /: 14.1 GiB (15145271296 bytes) trimmed
>>>>>
>>>>>   NAME                                        PROVISIONED   USED
>>>>>   volume-80838a69-e544-47eb-b981-a4786be89736      15360M  2528M
>>>>>
>>>>> As for qemu-kvm, the guest OS is CentOS 7, with:
>>>>>
>>>>>   [centos@testcentos-deco3 tmp]$ rpm -qa | grep qemu
>>>>>   qemu-guest-agent-2.8.0-2.el7.x86_64
>>>>>
>>>>> while the host is Ubuntu 16.04 with:
>>>>>
>>>>>   root@pa1-r2-s10:/home/ubuntu# dpkg -l | grep qemu
>>>>>   ii  qemu-block-extra:amd64  1:2.8+dfsg-3ubuntu2.9~cloud1  amd64  extra block backend modules for qemu-system and qemu-utils
>>>>>   ii  qemu-kvm                1:2.8+dfsg-3ubuntu2.9~cloud1  amd64  QEMU Full virtualization
>>>>>   ii  qemu-system-common      1:2.8+dfsg-3ubuntu2.9~cloud1  amd64  QEMU full system emulation binaries (common files)
>>>>>   ii  qemu-system-x86         1:2.8+dfsg-3ubuntu2.9~cloud1  amd64  QEMU full system emulation binaries (x86)
>>>>>   ii  qemu-utils              1:2.8+dfsg-3ubuntu2.9~cloud1  amd64  QEMU utilities
>>>>>
>>>>> Thanks!
>>>>>
>>>>>                 Fulvio

--
Jason