That's better :D
Thanks a lot, now I will be able to troubleshoot my problem :)

Thanks Dan,
Andrija


On 11 August 2014 13:21, Dan Van Der Ster <daniel.vanderster at cern.ch> wrote:

> Hi,
> I changed the script to be a bit more flexible with the osd path. Give
> this a try again:
> https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl
> Cheers, Dan
>
> -- Dan van der Ster || Data & Storage Services || CERN IT Department --
>
>
> On 11 Aug 2014, at 12:48, Andrija Panic <andrija.panic at gmail.com> wrote:
>
> I apologize, I clicked the Send button too fast...
>
> Anyway, I can see there are lines like this in the log file:
> 2014-08-11 12:43:25.477693 7f022d257700 10
> filestore(/var/lib/ceph/osd/ceph-0) write
> 3.48_head/14b1ca48/rbd_data.41e16619f5eb6.0000000000001bd1/head//3
> 3641344~4608 = 4608
> Not sure if I can do anything to fix this... ?
>
> Thanks,
> Andrija
>
>
> On 11 August 2014 12:46, Andrija Panic <andrija.panic at gmail.com> wrote:
>
>> Hi Dan,
>>
>> the script provided does not seem to work on my ceph cluster :(
>> This is ceph version 0.80.3.
>>
>> I get empty results, on both debug level 10 and the maximum level of
>> 20...
>>
>> [root at cs1 ~]# ./rbd-io-stats.pl /var/log/ceph/ceph-osd.0.log-20140811.gz
>> Writes per OSD:
>> Writes per pool:
>> Writes per PG:
>> Writes per RBD:
>> Writes per object:
>> Writes per length:
>> .
>> .
>> .
>>
>>
>> On 8 August 2014 16:01, Dan Van Der Ster <daniel.vanderster at cern.ch>
>> wrote:
>>
>>> Hi,
>>>
>>> On 08 Aug 2014, at 15:55, Andrija Panic <andrija.panic at gmail.com>
>>> wrote:
>>>
>>> Hi Dan,
>>>
>>> thank you very much for the script, I will check it out... no throttling
>>> so far, but I guess it will have to be done...
>>>
>>> This seems to read only gzipped logs?
>>>
>>>
>>> Well, it's pretty simple, and it zcats each input file. So yes, only
>>> gz files in the current script. But you can change that pretty trivially ;)
>>>
>>> so since it is read-only, I guess it is safe to run it on a production
>>> cluster now?
>>>
>>>
>>> I personally don't do anything new on a Friday just before leaving ;)
>>>
>>> But it's just grepping the log files, so start with one, then two,
>>> then...
>>>
>>> The script will also check multiple OSDs as far as I can
>>> understand, not just the osd.0 given in the script comment?
>>>
>>>
>>> Yup, what I do is gather all of the OSD logs for a single day in a
>>> single directory (in CephFS ;), then run that script on all of the OSDs. It
>>> takes a while, but it will give you the overall daily totals for the whole
>>> cluster.
>>>
>>> If you are only trying to find the top users, then it is sufficient to
>>> check a subset of OSDs, since by their nature the client IOs are spread
>>> across most/all OSDs.
>>>
>>> Cheers, Dan
>>>
>>> Thanks a lot.
>>> Andrija
>>>
>>>
>>> On 8 August 2014 15:44, Dan Van Der Ster <daniel.vanderster at cern.ch>
>>> wrote:
>>>
>>>> Hi,
>>>> Here's what we do to identify our top RBD users.
>>>>
>>>> First, enable log level 10 for the filestore so you can see all the
>>>> IOs coming from the VMs. Then use a script like this (used on a dumpling
>>>> cluster):
>>>>
>>>> https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl
>>>>
>>>> to summarize the osd logs and identify the top clients.
>>>>
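>>>> (For example, a minimal sketch of that first step, assuming a
>>>> dumpling/firefly-era ceph CLI, the default log locations and rotated
>>>> log names shown above; adjust flags, globs and dates for your cluster:
>>>>
>>>>   # raise the filestore debug level on all OSDs at runtime
>>>>   ceph tell osd.* injectargs '--debug-filestore 10'
>>>>
>>>>   # ...let it collect IOs for a while, then summarize the gzipped logs
>>>>   ./rbd-io-stats.pl /var/log/ceph/ceph-osd.*.log-*.gz
>>>>
>>>>   # drop the debug level back down when you are done
>>>>   ceph tell osd.* injectargs '--debug-filestore 0'
>>>>
>>>> or set "debug filestore = 10" under [osd] in ceph.conf and restart the
>>>> OSDs if you want it to persist.)
>>>>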
>>>> Then it's just a matter of scripting to figure out the ops/sec per
>>>> volume, but for us at least the main use-case has been to identify who is
>>>> responsible for a new peak in overall ops, and daily-granular statistics
>>>> from the above script tend to suffice.
>>>>
>>>> BTW, do you throttle your clients? We found that it's absolutely
>>>> necessary, since without a throttle just a few active VMs can eat up the
>>>> entire iops capacity of the cluster.
>>>>
>>>> Cheers, Dan
>>>>
>>>> -- Dan van der Ster || Data & Storage Services || CERN IT Department --
>>>>
>>>>
>>>> On 08 Aug 2014, at 13:51, Andrija Panic <andrija.panic at gmail.com>
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> we just got some new clients, and we have suffered a very big degradation
>>>> in CEPH performance for some reason (we are using CloudStack).
>>>>
>>>> I'm wondering if there is a way to monitor op/s or similar usage per
>>>> connected client, so we can isolate the heavy client?
>>>>
>>>> Also, what is the general best practice to monitor these kinds of
>>>> changes in CEPH? I'm talking about R/W or op/s changes or similar...
>>>>
>>>> Thanks,
>>>> --
>>>>
>>>> Andrija Panić
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users at lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Andrija Panić
>>> --------------------------------------
>>> http://admintweets.com
>>> --------------------------------------
>>>
>>
>>
>> --
>>
>> Andrija Panić
>> --------------------------------------
>> http://admintweets.com
>> --------------------------------------
>>
>
>
> --
>
> Andrija Panić
> --------------------------------------
> http://admintweets.com
> --------------------------------------
>
>

--

Andrija Panić
--------------------------------------
http://admintweets.com
--------------------------------------
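
P.S. On the client-side throttling Dan mentions above: since we run
CloudStack, one option (a sketch only, assuming KVM/libvirt guests; the
domain name, device and limits below are made-up placeholders, and older
virsh spells these options with underscores instead of dashes) is a
per-disk cap at the hypervisor:

  # cap one VM's virtual disk at ~200 iops and ~50 MB/s, applied live
  virsh blkdeviotune i-2-34-VM vda --total-iops-sec 200 \
        --total-bytes-sec 52428800 --live

The same limits can also be set persistently via an <iotune> block in the
guest's libvirt XML, and newer CloudStack releases expose them directly in
compute/disk offerings.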