Hello,

On Thu, 6 Jul 2017 14:34:41 -0700 Su, Zhan wrote:

> Hi,
>
> We are running a Ceph cluster serving both batch workload (e.g. data import
> / export, offline processing) and latency-sensitive workload. Currently
> batch traffic causes a huge slow down in serving latency-sensitive requests
> (e.g. streaming). When that happens, network is not the bottleneck (50%~60%
> usage of the 10Gib link) and cpu looks to be fairly idle as well. Our
> hypothesis is that requests hit the same drive and caused this slowdown. We
> use spinning disks and they are bad at serving two sequential I/O at the
> same time.
>
Don't hypothesize, verify it with atop, iostat, etc.
But if you're using plain disks w/o any SSDs for journals or otherwise,
that is most likely what happens, yes.

> We would like to know whether there is a way to set Ceph or Ceph client so
> operations for different workload are properly prioritized. Thanks.
>
Not that I'm aware of. There isn't even a way to tell which client is
causing which activity at this time, a request that has been put forward
multiple times here.

You could have different pools with different CRUSH rules to separate the
distinct users, so that the latency-sensitive reads go to OSDs that aren't
used by the batch stuff.

Beyond that: journal SSDs (future WAL SSDs for Bluestore), SSDs for bcache
or similar to cache reads, SSD pools, cache-tiering, etc.

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Rakuten Communications
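
To make the "verify it with atop, iostat, etc." step concrete, here is a minimal sketch, assuming Linux OSD hosts that expose /proc/diskstats. It estimates per-device busy time from two samples, which is roughly what iostat reports as %util. The function names, the 5-second interval and the 90% threshold are illustrative assumptions, not anything Ceph provides.

```python
import time


def parse_diskstats(text):
    """Parse /proc/diskstats content.

    Returns {device: (reads_completed, writes_completed, io_ticks_ms)}.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        # fields[2] is the device name, fields[12] is the time spent doing I/O (ms)
        stats[fields[2]] = (int(fields[3]), int(fields[7]), int(fields[12]))
    return stats


def utilization(before, after, interval_s):
    """Percent of the interval each device was busy, capped at 100."""
    result = {}
    for name, (_, _, ticks_after) in after.items():
        if name not in before:
            continue  # device appeared between the two samples
        delta_ms = ticks_after - before[name][2]
        result[name] = min(100.0, 100.0 * delta_ms / (interval_s * 1000.0))
    return result


def report(interval_s=5.0, threshold=90.0, path="/proc/diskstats"):
    """Sample twice and print devices busier than the (assumed) threshold."""
    with open(path) as f:
        first = parse_diskstats(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_diskstats(f.read())
    for dev, pct in sorted(utilization(first, second, interval_s).items()):
        if pct >= threshold:
            print("%s busy %.1f%% of the time" % (dev, pct))
```

Calling report() on an OSD host while the batch job is running should show whether the spinning disks behind the affected OSDs are pegged near 100% while network and CPU stay idle. That pattern would support the contention hypothesis.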