Re: reproducible rbd-nbd crashes

Hi Jason,

On 23.07.19 at 14:41, Jason Dillaman wrote:
> Can you please test a consistent Ceph release w/ a known working
> kernel release? It sounds like you have changed two variables, so it's
> hard to know which one is broken. We need *you* to isolate what
> specific Ceph or kernel release causes the break.
Sure, let's find the origin of this problem. :-)
>
> We really haven't made many changes to rbd-nbd, but the kernel has had
> major changes to the nbd driver. As Mike pointed out on the tracker
> ticket, one of those major changes effectively capped the number of
> devices at 256. Can you repeat this with a single device? 


Definitely. The problematic rbd-nbd runs on a virtual system that uses only a single nbd device and a single krbd device.

To be clear:

# lsb_release -d
Description:    Ubuntu 16.04.5 LTS

# uname -a
Linux archiv-001 4.15.0-45-generic #48~16.04.1-Ubuntu SMP Tue Jan 29 18:03:48 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# rbd nbd ls
pid    pool    image             snap device   
626931 rbd_hdd archiv-001-srv_ec -    /dev/nbd0 

# rbd showmapped
id pool    image          snap device   
0  rbd_hdd archiv-001_srv -    /dev/rbd0 

# df -h|grep -P "File|nbd|rbd"
Filesystem                Size  Used Avail Use% Mounted on
/dev/rbd0                  32T   31T  1.8T  95% /srv
/dev/nbd0                 3.0T  1.3T  1.8T  42% /srv_ec

#  mount|grep -P "nbd|rbd"
/dev/rbd0 on /srv type xfs (rw,relatime,attr2,largeio,inode64,allocsize=4096k,logbufs=8,logbsize=256k,sunit=8192,swidth=8192,noquota,_netdev)
/dev/nbd0 on /srv_ec type xfs (rw,relatime,attr2,discard,largeio,inode64,allocsize=4096k,logbufs=8,logbsize=256k,noquota,_netdev)

# dpkg -l|grep -P "rbd|ceph"
ii  ceph-common                           12.2.11-1xenial                            amd64        common utilities to mount and interact with a ceph storage cluster
ii  libcephfs2                            12.2.11-1xenial                            amd64        Ceph distributed file system client library
ii  librbd1                               12.2.11-1xenial                            amd64        RADOS block device client library
ii  python-cephfs                         12.2.11-1xenial                            amd64        Python 2 libraries for the Ceph libcephfs library
ii  python-rbd                            12.2.11-1xenial                            amd64        Python 2 libraries for the Ceph librbd library
ii  rbd-nbd                               12.2.12-1xenial                            amd64        NBD-based rbd client for the Ceph distributed file system

More details regarding the problem environment can be found in my initial mail below the heading "Environment".
> Can you
> repeat this on Ceph rbd-nbd 12.2.11 with an older kernel?

Sure, which kernel do you prefer?

I can test with the following kernel releases (a sketch of how I would switch to one of them follows the list):

# apt-cache search linux-image-4.*.*.*-*-generic 2>&1|sed '~s,\.[0-9]*-[0-9]*-*-generic - .*,,;~s,linux-image-,,'|sort -u
4.10
4.11
4.13
4.15
4.4
4.8
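
For illustration, going back to the 4.4 GA kernel would look roughly like this; the exact package revision (4.4.0-142 here) is just an example of what apt currently offers, not something I have verified on this box yet:

# apt-get install linux-image-4.4.0-142-generic linux-image-extra-4.4.0-142-generic
# reboot
# uname -r
4.4.0-142-generic

I would install the matching linux-image-extra package as well, since the bulk of the additional modules (including nbd.ko, as far as I remember) lives there on these kernels, and select the corresponding entry in the GRUB "Advanced options" menu on reboot.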

We can also perform tests using another filesystem (e.g. ext4) - a rough sketch of how I would do that follows.
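
I would run such a test against a throwaway image so that the production /srv_ec filesystem stays untouched, roughly like this (pool and image name are placeholders, and the nbd device number depends on what is free):

# rbd create rbd_hdd/nbd-crashtest --size 1T
# rbd-nbd map rbd_hdd/nbd-crashtest
/dev/nbd1
# mkfs.ext4 /dev/nbd1
# mount /dev/nbd1 /mnt

and then run the same workload that triggers the crash against /mnt.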

From my point of view I suspect that there is something wrong in nbd.ko or in rbd-nbd itself (the rbd cache functionality can be excluded - see the sketch below for how I would switch it off to verify that), therefore I do not think that this is very promising...
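
A rough sketch, assuming the default /etc/ceph/ceph.conf and the existing fstab entry for /srv_ec are used on this host: add "rbd cache = false" to the [client] section and remap the device.

# grep -A1 "\[client\]" /etc/ceph/ceph.conf
[client]
rbd cache = false
# umount /srv_ec
# rbd-nbd unmap /dev/nbd0
# rbd-nbd map rbd_hdd/archiv-001-srv_ec && mount /srv_ec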

Regards
Marc
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



