On Dec 18, 2014, at 10:54 AM, Wido den Hollander <wido@xxxxxxxx> wrote:

> On 12/18/2014 11:13 AM, Wido den Hollander wrote:
>> On 12/17/2014 07:42 PM, Gregory Farnum wrote:
>>> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> Today I've been playing with CephFS, and the morning started great
>>>> with CephFS playing along just fine.
>>>>
>>>> Some information first:
>>>> - Ceph 0.89
>>>> - Linux kernel 3.18
>>>> - Ceph fuse 0.89
>>>> - One active MDS, one standby
>>>>
>>>> This morning I could write a 10GB file like this using the kclient:
>>>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>>>
>>>> That gave me 850MB/sec (all 10G network), and I could read the same
>>>> file back at 610MB/sec.
>>>>
>>>> After writing to it multiple times it suddenly started to hang.
>>>>
>>>> There was no real evidence on the MDS (debug mds set to 20) or
>>>> anything on the client. That specific operation just blocked, but I
>>>> could still 'ls' the filesystem in a second terminal.
>>>>
>>>> The MDS was showing in its log that it was checking the active
>>>> sessions of clients. It showed the active session of my single client.
>>>>
>>>> The client renewed its caps and proceeded.
>>>
>>> Can you clarify this? I'm not quite sure what you mean.
>>>
>>
>> I currently don't have the logs available. That was my problem when
>> typing the original e-mail.
>>
>>>> I currently don't have any logs, but I'm just looking for a
>>>> direction to be pointed in.
>>>>
>>>> Any ideas?
>>>
>>> Well, now that you're on v0.89 you should explore the admin
>>> socket... there are commands on the MDS to dump ops in flight (and
>>> maybe to look at session states? I don't remember when that merged).
>>
>> Sage's pointer towards the kernel debugging and the new admin socket
>> showed me that it was RADOS calls that were hanging.
>>
>> I investigated further, and it seems that this is not a CephFS
>> problem but a local TCP issue which is only triggered when using CephFS.
>>
>> At some point, which is still unclear to me, data transfer becomes
>> very slow. The MDS doesn't seem to be able to update the journal, and
>> the client can't write to the OSDs anymore.
>>
>> It happened after I did some very basic TCP tuning (timestamps, rmem,
>> wmem, SACK, fastopen).
>>
>
> So it was tcp_sack. With tcp_sack=0 the MDS had problems talking to
> the OSDs. Other clients still worked fine, but the MDS couldn't replay
> its journal and such.
>
> Enabling tcp_sack again resolved the problem. The new admin socket
> really helped there!

What was the reasoning behind disabling SACK to begin with? Without it,
any drops or reordering can require resending a potentially large
amount of data.

>
>> Reverting back to the Ubuntu 14.04 defaults resolved it all, and
>> CephFS is running happily now.
>>
>> I'll dig a bit deeper to see why this system was affected by those
>> changes. I applied these settings earlier on an RBD-only cluster
>> without any problems.
>>
>>> -Greg
>
> --
> Wido den Hollander
> 42on B.V.
> Ceph trainer and consultant
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on
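
For anyone else chasing a similar hang, a rough sketch of the admin
socket and kclient checks referred to above. It assumes an MDS named
mds.a and a kernel client with debugfs mounted; I haven't re-verified
the exact command names against 0.89, so treat it as a pointer rather
than a recipe:

  # In-flight MDS operations and outstanding RADOS requests from the
  # MDS, via the admin socket (substitute your own MDS id for "a")
  $ ceph daemon mds.a dump_ops_in_flight
  $ ceph daemon mds.a objecter_requests

  # On the kernel client, pending OSD and MDS requests are visible in
  # debugfs
  $ cat /sys/kernel/debug/ceph/*/osdc
  $ cat /sys/kernel/debug/ceph/*/mdsc

If the outstanding RADOS requests sit on healthy OSDs without making
progress, that points at the network rather than at CephFS itself,
which matches what Wido saw.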
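
On the TCP side, the tuning mentioned maps to sysctls such as
net.ipv4.tcp_sack, net.ipv4.tcp_timestamps and net.ipv4.tcp_fastopen.
A quick way to check SACK and put it back to the Ubuntu 14.04 default
of 1 (runtime only; persistent overrides live in /etc/sysctl.conf or
/etc/sysctl.d/ and would need to be removed as well):

  # show the current value
  $ sysctl net.ipv4.tcp_sack
  # re-enable SACK at runtime
  $ sudo sysctl -w net.ipv4.tcp_sack=1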