On 12/18/2014 05:32 PM, Atchley, Scott wrote:
> On Dec 18, 2014, at 10:54 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>
>> On 12/18/2014 11:13 AM, Wido den Hollander wrote:
>>> On 12/17/2014 07:42 PM, Gregory Farnum wrote:
>>>> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>>>> Hi,
>>>>>
>>>>> Today I've been playing with CephFS, and the morning started great with
>>>>> CephFS playing along just fine.
>>>>>
>>>>> Some information first:
>>>>> - Ceph 0.89
>>>>> - Linux kernel 3.18
>>>>> - Ceph fuse 0.89
>>>>> - One active MDS, one standby
>>>>>
>>>>> This morning I could write a 10GB file like this using the kclient:
>>>>>
>>>>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>>>>
>>>>> That gave me 850MB/sec (all 10G network) and I could read the same file
>>>>> back at 610MB/sec.
>>>>>
>>>>> After writing to it multiple times it suddenly started to hang.
>>>>>
>>>>> There was no real evidence on the MDS (debug mds set to 20) or anything
>>>>> on the client. That specific operation just blocked, but I could still
>>>>> 'ls' the filesystem in a second terminal.
>>>>>
>>>>> The MDS was showing in its log that it was checking the active sessions
>>>>> of clients. It showed the active session of my single client.
>>>>>
>>>>> The client renewed its caps and proceeded.
>>>>
>>>> Can you clarify this? I'm not quite sure what you mean.
>>>>
>>>
>>> I currently don't have the logs available. That was my problem when
>>> typing the original e-mail.
>>>
>>>>> I currently don't have any logs, but I'm just looking for a direction
>>>>> to be pointed in.
>>>>>
>>>>> Any ideas?
>>>>
>>>> Well, now that you're on v0.89 you should explore the admin
>>>> socket... there are commands on the MDS to dump ops in flight (and
>>>> maybe to look at session states? I don't remember when that merged).
>>>
>>> Sage's pointer towards the kernel debugging and the new admin socket
>>> showed me that it was RADOS calls that were hanging.
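(For readers following along: the admin socket queries mentioned above can be run roughly as sketched below. The MDS id "a" and the socket path are assumptions based on the default /var/run/ceph layout; adjust for your cluster.)

```shell
# Sketch, assuming a default socket path and an MDS named "a".
ASOK=/var/run/ceph/ceph-mds.a.asok

if [ -S "$ASOK" ]; then
    # MDS operations currently in flight:
    ceph --admin-daemon "$ASOK" dump_ops_in_flight
    # Outstanding RADOS (Objecter) requests -- this is the view that
    # revealed the hanging OSD operations in this case:
    ceph --admin-daemon "$ASOK" objecter_requests
else
    echo "no MDS admin socket at $ASOK"
fi
```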
>>>
>>> I investigated even further and it seems that this is not a CephFS
>>> problem, but a local TCP issue which is only triggered when using CephFS.
>>>
>>> At some point, which is still unclear to me, data transfer becomes very
>>> slow. The MDS doesn't seem to be able to update the journal and the
>>> client can't write to the OSDs anymore.
>>>
>>> It happened after I did some very basic TCP tuning (timestamps, rmem,
>>> wmem, sack, fastopen).
>>>
>>
>> So it was tcp_sack. With tcp_sack=0 the MDS has problems talking to the
>> OSDs. Other clients still work fine, but the MDS couldn't replay its
>> journal and such.
>>
>> Enabling tcp_sack again resolved the problem. The new admin socket
>> really helped there!
>
> What was the reasoning behind disabling SACK to begin with? Without it,
> any drops or reordering might require resending a lot of data.
>

I was testing with various TCP settings and sack was one of those. I
didn't think of it earlier as a possible cause of the problem.

>>
>>> Reverting back to the Ubuntu 14.04 defaults resolved it all and CephFS
>>> is running happily now.
>>>
>>> I'll dig some deeper to see why this system was affected by those
>>> changes. I applied these settings earlier on an RBD-only cluster
>>> without any problems.
>>>
>>>> -Greg
>>>
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>> Ceph trainer and consultant
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html

-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
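(For reference, the tcp_sack setting discussed in this thread can be inspected and restored as sketched below. This is a generic Linux procfs/sysctl sketch, not something from the thread itself.)

```shell
# Read the current SACK setting (1 = enabled, the Ubuntu 14.04 default;
# "unknown" if /proc/sys is unavailable, e.g. on non-Linux systems):
SACK=$(cat /proc/sys/net/ipv4/tcp_sack 2>/dev/null || echo unknown)
echo "tcp_sack=$SACK"

# Re-enabling SACK after tuning would be (root required, shown commented out):
#   sysctl -w net.ipv4.tcp_sack=1
# or equivalently:
#   echo 1 > /proc/sys/net/ipv4/tcp_sack
```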