>>Can this timeout be increased in some way? I've searched around and found
>>the /sys/block/sdx/device/timeout knob, which in my case is set to 30s.

Yes, sure:

    echo 60 > /sys/block/sdx/device/timeout

for 60 s, for example.

----- Original Message -----
From: "Krzysztof Nowicki" <krzysztof.a.nowicki@xxxxxxxxx>
To: "Andrey Korolyov" <andrey@xxxxxxx>, "aderumier" <aderumier@xxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, 13 February 2015 08:18:26
Subject: Re: OSD slow requests causing disk aborts in KVM

On Thu, Feb 12, 2015 at 16:23:38, Andrey Korolyov <andrey@xxxxxxx> wrote:

On Fri, Feb 6, 2015 at 12:16 PM, Krzysztof Nowicki
<krzysztof.a.nowicki@xxxxxxxxx> wrote:
> Hi all,
>
> I'm running a small Ceph cluster with 4 OSD nodes, which serves as a
> storage backend for a set of KVM virtual machines. The VMs use RBD for
> disk storage. On the VM side I'm using virtio-scsi instead of virtio-blk
> in order to gain DISCARD support.
>
> Each OSD node runs on a separate machine, using a 3TB WD Black drive plus
> a Samsung SSD for the journal. The machines used for the OSD nodes are not
> equal in spec: three of them are small servers, while one is a desktop PC.
> The last node is the one causing trouble. During high load caused by
> remapping after one of the other nodes went down, I experienced some slow
> requests. To my surprise, these slow requests caused aborts from the block
> device on the VM side, which ended up corrupting files.
>
> What I wonder is whether such behaviour (aborts) is normal when slow
> requests pile up. I always thought that these requests would be delayed
> but eventually handled. Are there any tunables that would help me avoid
> such situations? I would really like to avoid VM outages caused by such
> corruption issues.
>
> I can attach some logs if needed.
>
> Best regards
> Chris

Hi, this is an inevitable payoff for using the SCSI backend on storage that
can get slow enough. There were some Argonaut/Bobtail-era discussions on the
Ceph mailing list; those threads may be interesting reading for you. AFAIR
the SCSI disk will abort after about 70s of not receiving an ack for a
pending operation.

Can this timeout be increased in some way? I've searched around and found
the /sys/block/sdx/device/timeout knob, which in my case is set to 30s.

As for the versions, I'm running all Ceph nodes on Gentoo with Ceph version
0.80.5. The VM guest in question is running Ubuntu 12.04 LTS with kernel
3.13. The guest filesystem is BTRFS. I'm thinking that the corruption may be
some BTRFS bug.
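
For reference, a rough sketch (not from the thread) of how the suggestion
above could be applied to every disk in the guest at once, assuming sd*
device naming and a 60 s value:

    #!/bin/sh
    # Sketch: raise the SCSI command timeout to 60 s on all sd* block devices.
    # The sysfs value resets on reboot, so re-run this from a startup script.
    for t in /sys/block/sd*/device/timeout; do
        echo 60 > "$t"
    done

Checking afterwards with cat /sys/block/sd*/device/timeout shows whether the
new value took effect; the setting is per-device and not persistent across
reboots.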