Hi Bryan,

Have you checked your MTUs? I was recently bitten by large packets not
getting through where small packets would. (This list, Dec 14, "All pgs
stuck peering".)

Small files working but big files not working smells like it could be a
similar problem.

Cheers,
Chris

On Thu, Dec 17, 2015 at 07:43:54PM +0000, Bryan Wright wrote:
> Hi folks,
>
> This is driving me crazy. I have a ceph filesystem that behaves normally
> when I "ls" files, and behaves normally when I copy smallish files on or off
> of the filesystem, but large files (~ GB size) hang after copying a few
> megabytes.
>
> This is ceph 0.94.5 under Centos 6.7 under kernel 4.3.3-1.el6.elrepo.x86_64.
> I've tried 64-bit and 32-bit clients with several different kernels, but
> all behave the same.
>
> After copying the first few bytes I get a stream of "slow request" messages
> for the osds, like this:
>
> 2015-12-17 14:20:40.458306 osd.208 [WRN] slow request 1922.166564 seconds
> old, received at 2015-12-17 13:48:38.291683: osd_op(mds.0.14956:851
> 100010a7b92.0000000d [stat] 0.5d427a9a RETRY=5
> ack+retry+read+rwordered+known_if_redirected e193868) currently reached_pg
>
> It's not a single OSD misbehaving. It seems to be any OSD. The OSDs have
> plenty of disk space, and there's nothing in the osd logs that points to a
> problem.
>
> How can I find out what's blocking these requests?
>
> Bryan
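
For the MTU check suggested above, here is a minimal sketch of one way to test whether full-size packets actually get through between hosts. It is an illustration under assumptions, not something taken from the thread: it assumes IPv4, a Linux client with the iputils `ping` (whose `-M do` option sets the don't-fragment bit), and the function names are made up for this example. The payload size is the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header, so a 9000-byte MTU is probed with an 8972-byte payload.

```python
import subprocess

IP_HEADER = 20    # IPv4 header without options, in bytes
ICMP_HEADER = 8   # ICMP echo header, in bytes


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def build_ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with the don't-fragment bit set."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def path_carries_mtu(host, mtu):
    """Return True if unfragmented packets of the given MTU reach the host."""
    result = subprocess.run(build_ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

Calling `path_carries_mtu(host, 1500)` and `path_carries_mtu(host, 9000)` for each OSD host, from each client and from the other OSD hosts, would show whether small packets pass while jumbo frames are dropped somewhere on the path. If the 9000 probe fails where the 1500 probe succeeds, a switch or interface along the way is probably configured with a smaller MTU than the endpoints.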