On 12/18/2014 11:13 AM, Wido den Hollander wrote:
> On 12/17/2014 07:42 PM, Gregory Farnum wrote:
>> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>> Hi,
>>>
>>> Today I've been playing with CephFS and the morning started great with
>>> CephFS playing along just fine.
>>>
>>> Some information first:
>>> - Ceph 0.89
>>> - Linux kernel 3.18
>>> - Ceph fuse 0.89
>>> - One Active MDS, one Standby
>>>
>>> This morning I could write a 10GB file like this using the kclient:
>>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>>
>>> That gave me 850MB/sec (all 10G network) and I could read the same file
>>> again with 610MB/sec.
>>>
>>> After writing to it multiple times it suddenly started to hang.
>>>
>>> No real evidence on the MDS (debug mds set to 20) or anything on the
>>> client. That specific operation just blocked, but I could still 'ls' the
>>> filesystem in a second terminal.
>>>
>>> The MDS was showing in its log that it was checking active sessions of
>>> clients. It showed the active session of my single client.
>>>
>>> The client renewed its caps and proceeded.
>>
>> Can you clarify this? I'm not quite sure what you mean.
>>
>
> I currently don't have the logs available. That was my problem when
> typing the original e-mail.
>
>>> I currently don't have any logs, but I'm just looking for a direction to
>>> be pointed towards.
>>>
>>> Any ideas?
>>
>> Well, now that you're on v0.89 you should explore the admin
>> socket...there are commands on the MDS to dump ops in flight (and
>> maybe to look at session states? I don't remember when that merged).
>
> Sage's pointer towards kernel debugging and the new admin socket
> showed me that it was RADOS calls that were hanging.
>
> I investigated further and it seems that this is not a CephFS
> problem, but a local TCP issue which is only triggered when using CephFS.
>
> At some point, which is still unclear to me, data transfer becomes very
> slow. The MDS doesn't seem to be able to update the journal and the
> client can't write to the OSDs anymore.
>
> It happened after I did some very basic TCP tuning (timestamp, rmem,
> wmem, sack, fastopen).
>

So it was tcp_sack. With tcp_sack=0 the MDS has problems talking to the
OSDs. Other clients still work fine, but the MDS couldn't replay its
journal and such. Enabling tcp_sack again resolved the problem.

The new admin socket really helped there!

> Reverting to the Ubuntu 14.04 defaults resolved it all and CephFS
> is running happily now.
>
> I'll dig a bit deeper to see why this system was affected by those
> changes. I applied these settings earlier on an RBD-only cluster without
> any problems.
>
>> -Greg
>>
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
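
As a quick illustration of the admin socket usage Greg suggests above, a
sketch only: the socket path and the MDS name "a" are assumptions for a
default deployment, and the exact command set varies by release, so check
"help" first on 0.89:

$ # list the commands this daemon's admin socket supports
$ ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok help

$ # dump the operations currently in flight on the MDS
$ ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok dump_ops_in_flight

$ # show outstanding RADOS requests from the MDS's Objecter; plausibly the
$ # view that exposed the hanging OSD traffic described in this thread
$ ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok objecter_requests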
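
And for the TCP tuning that turned out to be the culprit, the SACK toggle
can be inspected and restored like this (standard Linux sysctls; 1 is the
Ubuntu 14.04 default):

$ # check whether TCP selective acknowledgements are enabled
$ sysctl net.ipv4.tcp_sack
net.ipv4.tcp_sack = 1

$ # re-enable SACK if it was switched off during tuning
$ sysctl -w net.ipv4.tcp_sack=1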