On Tue, Jan 30, 2018 at 04:31:58PM -0500, J. Bruce Fields wrote:
> On Tue, Jan 30, 2018 at 07:03:17PM +0000, Terry Barnaby wrote:
> > It looks like each RPC call takes about 0.5ms. Why do there need to be some
> > many RPC calls for this ? The OPEN call could set the attribs, no need for
> > the later GETATTR or SETATTR calls.
>
> The first SETATTR (which sets ctime and mtime to server's time) seems
> unnecessary, maybe there's a client bug.
>
> The second looks like tar's fault, strace shows it doing a utimensat()
> on each file. I don't know why or if that's optional.
>
> > Even the CLOSE could be integrated with the WRITE and taking this
> > further OPEN could do OPEN, SETATTR, and some WRITE all in one.
>
> We'd probably need some new protocol to make it safe to return from the
> open systemcall before we've gotten the OPEN reply from the server.
>
> Write delegations might save us from having to wait for the other
> operations.
>
> Taking a look at my own setup, I see the same calls taking about 1ms.
> The drives can't do that, so I've got a problem somewhere too....

Whoops, I totally forgot it was still set up with an external journal on
SSD:

    # tune2fs -l /dev/mapper/export-export |grep '^Journal'
    Journal UUID:             dc356049-6e2f-4e74-b185-5357bee73a32
    Journal device:           0x0803
    Journal backup:           inode blocks
    # blkid --uuid dc356049-6e2f-4e74-b185-5357bee73a32
    /dev/sda3
    # cat /sys/block/sda/device/model
    INTEL SSDSA2M080

So, most of the data is striped across a couple big hard drives, but the
journal is actually on a small partition on an SSD.

If I remember correctly, I initially tried this with an older intel SSD
and didn't get a performance improvement.  Then I replaced it with this
model which has the "Enhanced Power Loss Data Protection" feature, which
I believe means the write cache is durable, so it should be able to
safely acknowledge writes as soon as they reach the SSD's cache.
And weirdly I think I never actually got around to rerunning these tests
after I installed the new SSD.

Anyway, so that might explain the difference we're seeing.

I'm not sure how to find new SSDs with that feature, but it may be worth
considering as a cheap way to accelerate this kind of workload.  It can
be a very small SSD as it only needs to hold the journal.  Adding an
external journal is a quick operation (you don't have to recreate the
filesystem or anything).

--b.
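
For anyone wanting to try the external journal mentioned above, the steps
can be sketched roughly as follows.  This is a dry-run helper that only
prints the commands and runs nothing.  The function name journal_cmds, the
journal partition /dev/sdX3 and the 4096-byte default block size are
assumptions for illustration, not details of the setup described above.
The filesystem has to be unmounted (or mounted read-only) before its
internal journal is removed, and the journal device has to be created with
the same block size as the filesystem.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would move an ext4 journal onto
# an external device.  Nothing is executed; device names are placeholders.
journal_cmds() {
    fs_dev=$1              # filesystem to convert (unmount it first)
    journal_dev=$2         # small SSD partition that will hold the journal
    block_size=${3:-4096}  # assumed default; must match the filesystem

    # 1. Format the SSD partition as a dedicated journal device.
    echo "mke2fs -O journal_dev -b $block_size $journal_dev"
    # 2. Remove the existing internal journal from the filesystem.
    echo "tune2fs -O ^has_journal $fs_dev"
    # 3. Attach the external journal to the filesystem.
    echo "tune2fs -J device=$journal_dev $fs_dev"
}

# Example: print the plan for the export volume and a hypothetical partition.
journal_cmds /dev/mapper/export-export /dev/sdX3
```

The filesystem's actual block size shows up in the "Block size" line of
tune2fs -l, and the result can be checked afterwards with the same
tune2fs -l | grep '^Journal' command shown earlier.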