On Ubuntu 13.04, ceph 0.61.4.
I was running the fio read test below when it hung:
root@ceph-node2:/mnt# fio -filename=/dev/rbd1 -direct=1 -iodepth 1 -thread -rw=read -ioengine=psync -bs=4k -size=50G -numjobs=16 -group_reporting -name=mytest
mytest: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=psync, iodepth=1
...
mytest: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=psync, iodepth=1
2.0.8
Starting 16 threads
^Cbs: 16 (f=16): [RRRRRRRRRRRRRRRR] [0.1% done] [0K/0K /s] [0 /0 iops] [eta 02d:01h:34m:39s]
fio: terminating on signal 2
^Cbs: 16 (f=16): [RRRRRRRRRRRRRRRR] [0.1% done] [0K/0K /s] [0 /0 iops] [eta 02d:18h:36m:23s]
fio: terminating on signal 2
Jobs: 16 (f=16): [RRRRRRRRRRRRRRRR] [0.1% done] [0K/0K /s] [0 /0 iops] [eta 04d:07h:40m:55s]
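If it helps, I can also grab the kernel client state on that node while the test is stuck. A rough sketch of what I would run (assuming debugfs is mounted at /sys/kernel/debug):

dmesg | grep -Ei 'libceph|rbd|hung task'     # any client-side kernel errors or hung-task warnings
mount -t debugfs none /sys/kernel/debug 2>/dev/null
cat /sys/kernel/debug/ceph/*/osdc            # in-flight requests from the rbd kernel client to the OSDs

If the same requests stay listed in osdc, that should point at the OSD they are stuck on.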
The top command showed that one CPU was stuck waiting for disk I/O while the other was idle:
top - 20:28:30 up 1 day, 6:02, 3 users, load average: 16.00, 13.91, 8.55
Tasks: 141 total, 1 running, 139 sleeping, 0 stopped, 1 zombie
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.3 us, 0.3 sy, 0.0 ni, 0.0 id, 99.3 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 4013924 total, 702112 used, 3311812 free, 3124 buffers
KiB Swap: 3903484 total, 184520 used, 3718964 free, 74156 cached
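A load average of 16 with almost no CPU usage suggests the 16 fio threads are all blocked in uninterruptible sleep. I could confirm that and dump their kernel stacks roughly like this (<fio-pid> is just a placeholder for one of the blocked PIDs):

ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /D/'   # list tasks in uninterruptible (D) sleep and where they wait
cat /proc/<fio-pid>/stack                        # kernel stack of one blocked fio thread
echo w > /proc/sysrq-trigger                     # dump all blocked tasks to the kernel log (dmesg)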
root@ceph-node4:~# ceph -s
health HEALTH_OK
monmap e5: 3 mons at {ceph-node0=172.18.11.30:6789/0,ceph-node2=172.18.11.32:6789/0,ceph-node4=172.18.11.34:6789/0}, election epoch 714, quorum 0,1,2 ceph-node0,ceph-node2,ceph-node4
osdmap e4043: 11 osds: 11 up, 11 in
pgmap v92429: 1192 pgs: 1192 active+clean; 530 GB data, 1090 GB used, 9041 GB / 10131 GB avail
mdsmap e1: 0/0/1 up
No errors were found in ceph.log.
Is there anything else I can collect for investigation?
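For example, I could pull per-OSD information like this (the osd id 0 and the admin socket path are just examples, and I am not sure dump_ops_in_flight is available in 0.61):

ceph health detail
ceph osd tree
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight   # slow/blocked ops on that OSD, if supported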