OSD omap disk write bursts

Hello!


We are observing a somewhat strange IO pattern on our OSDs.


The cluster is running Hammer 0.94.1, 48 OSDs, 4 TB spinners, xfs, colocated journals.


Over periods of days on end we see groups of 3 OSDs being busy with lots and lots of small writes for several minutes at a time. Once one group calms down, another group begins. This might be easier to understand in a graph:


https://public.centerdevice.de/3e62a18d-dd01-477e-b52b-f65d181e2920


(this shows a limited time range to make the individual lines discernible)


Initial attempts to correlate this with client activity involving small writes turned out to be wrong -- not really surprising, because both VM RBD activity and RGW object storage should produce much more evenly spread patterns across all OSDs.


Using sysdig, I figured out that it seems to be LevelDB activity:


[16:58:42 B|daniel.schneller@node02] ~
  sudo sysdig -p "%12user.name %6proc.pid %12proc.name %3fd.num %fd.typechar %fd.name" "evt.type=write and proc.pid=8215"
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763308.log
... (*lots and lots* more writes to 763308.log ) ...
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763308.log
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763308.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     15  f /var/lib/ceph/osd/ceph-14/current/omap/LOG
root         8215   ceph-osd     15  f /var/lib/ceph/osd/ceph-14/current/omap/LOG
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
root         8215   ceph-osd     153 f /var/lib/ceph/osd/ceph-14/current/omap/763311.ldb
... (*lots and lots* more writes to 763311.ldb ) ...
root         8215   ceph-osd     15  f /var/lib/ceph/osd/ceph-14/current/omap/LOG
root         8215   ceph-osd     15  f /var/lib/ceph/osd/ceph-14/current/omap/LOG
root         8215   ceph-osd     18  f /var/lib/ceph/osd/ceph-14/current/omap/MANIFEST-171304
root         8215   ceph-osd     18  f /var/lib/ceph/osd/ceph-14/current/omap/MANIFEST-171304
root         8215   ceph-osd     15  f /var/lib/ceph/osd/ceph-14/current/omap/LOG
root         8215   ceph-osd     15  f /var/lib/ceph/osd/ceph-14/current/omap/LOG
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
root         8215   ceph-osd     103 f /var/lib/ceph/osd/ceph-14/current/omap/763310.log
... (*lots and lots* more writes to 763310.log ) ...
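
To quantify this instead of eyeballing the raw event stream, the same sysdig filter can be aggregated per file, roughly like this (PID and duration are of course specific to this run):

  # count write events per file for ~60 seconds, then list the busiest files
  sudo timeout 60 sysdig -p "%fd.name" "evt.type=write and proc.pid=8215" \
    | sort | uniq -c | sort -rn | head

The busy files end up being almost exclusively the omap/*.log and omap/*.ldb files shown above.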



This correlates with the patterns in the graph for the given OSDs. If I understand this correctly, it looks like LevelDB compaction -- however, if that is the case, why would it happen on groups of only three OSDs at a time, and why would it hit a single OSD in such short succession? See this single-OSD graph of the same time range as before:


https://public.centerdevice.de/ab5f417d-43af-435d-aad0-7becff2b9acb
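
In case LevelDB tuning is relevant here, the LevelDB-related settings of an OSD can be inspected via the admin socket, e.g. (osd.14 as the example from the sysdig output above; the exact option names may differ between releases):

  # dump the OSD's LevelDB-related options via the admin socket
  sudo ceph daemon osd.14 config show | grep -i leveldb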


Are there any regular or event-based maintenance tasks that are guaranteed to run on only n (=3) OSDs at a time?
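
One way to check whether the three busy OSDs simply share PGs (e.g. form the acting set of the same PGs, so the bursts would just be the replicated omap workload) might be something along these lines -- a rough sketch, osd.14 as the example again, and the column position of the acting set may vary by release:

  # list the acting sets of all PGs that include osd.14 and count how often
  # the same combination of peers recurs (ACTING is the 5th column of
  # "pg dump pgs_brief" here -- adjust if the layout differs)
  ceph pg dump pgs_brief 2>/dev/null \
    | grep -E '[[,]14[],]' \
    | awk '{ print $5 }' \
    | sort | uniq -c | sort -rn | head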


Can I do anything to smooth this out or reduce it somehow?


Thanks,

Daniel




-- 

Daniel Schneller

Principal Cloud Engineer

 

CenterDevice GmbH

https://www.centerdevice.de

