Re: which kernel version can help avoid kernel client deadlock

On Thu, Jul 30, 2015 at 12:46 PM, Z Zhang <zhangz.david@xxxxxxxxxxx> wrote:
>
>> Date: Thu, 30 Jul 2015 11:37:37 +0300
>> Subject: Re:  which kernel version can help avoid kernel
>> client deadlock
>> From: idryomov@xxxxxxxxx
>> To: zhangz.david@xxxxxxxxxxx
>> CC: chaofanyu@xxxxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
>>
>> On Thu, Jul 30, 2015 at 10:29 AM, Z Zhang <zhangz.david@xxxxxxxxxxx>
>> wrote:
>> >
>> > ________________________________
>> > Subject: Re:  which kernel version can help avoid kernel
>> > client
>> > deadlock
>> > From: chaofanyu@xxxxxxxxxxx
>> > Date: Thu, 30 Jul 2015 13:16:16 +0800
>> > CC: idryomov@xxxxxxxxx; ceph-users@xxxxxxxxxxxxxx
>> > To: zhangz.david@xxxxxxxxxxx
>> >
>> >
>> > On Jul 30, 2015, at 12:48 PM, Z Zhang <zhangz.david@xxxxxxxxxxx> wrote:
>> >
>> > We also hit a similar issue from time to time on CentOS with a 3.10.x
>> > kernel. iostat shows the kernel rbd client's util at 100% but no r/w
>> > I/O, and we can't umount/unmap this rbd client. After restarting the
>> > OSDs, it returns to normal.
>>
>> 3.10.x is rather vague. What is the exact version you saw this on? Can you
>> provide syslog logs (I'm interested in dmesg)?
>
> The kernel version should be 3.10.0.
>
> I don't have the syslogs at hand. It is not easily reproduced, and it happened
> in a very low memory situation. We are running DB instances with rbd as
> storage. The DB instances use a lot of memory under highly concurrent
> r/w, and after running for a long time rbd might hit this problem, but not
> always. Enabling rbd logging made our system behave strangely during our test.
>
> I back-ported one of your fixes:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/block/rbd.c?id=5a60e87603c4c533492c515b7f62578189b03c9c
>
> So far the test has looked fine for a few days, but it is still under
> observation. Are there any other fixes we should know about?

I'd suggest following 3.10 stable series (currently at 3.10.84).  The
fix you backported is crucial in low memory situations, so I wouldn't
be surprised if it alone fixed your problem.  (It is not in 3.10.84,
I assume it'll show up in 3.10.85 - for now just apply your backport.)
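
As a rough way to check where a machine stands relative to the stable
release mentioned above, the running kernel version can be compared
against a target. This is only a sketch: version_ge is a hypothetical
helper, GNU sort -V is assumed to be available, and 3.10.85 is just the
release the fix is assumed to land in, not a confirmed one. Distribution
kernels carry their own backports, so the comparison is only meaningful
for upstream stable kernels.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 >= version $2.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Assumed target: the stable release expected (not confirmed) to carry the fix.
target="3.10.85"

# Strip any local/distro suffix, e.g. "3.10.84-custom" becomes "3.10.84".
running="$(uname -r | cut -d- -f1)"

if version_ge "$running" "$target"; then
    echo "running $running: at or past $target"
else
    echo "running $running: older than $target, keep the backport applied"
fi
```

If the running kernel reports as older than the target, the manually
applied backport is still needed.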

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


