Just adding a note in case you hadn't noticed that updatedb itself has a CLI for managing the .conf: --add-prune{fs,names,paths}. Sadly, there is no "--remove", but at least it lets the conf file format be abstract.

+1 on "everything has a .d/ dir" though.

On 02/20/2014 10:47 AM, Sage Weil wrote:
> On Tue, 18 Feb 2014, bernhard glomm wrote:
>> Well IMO the cleanest solution would be to convince the
>> mlocate devels to introduce an /etc/updatedb.d/ directory
>> so every package could drop their snippet there.
>
> I agree. Does somebody want to take the lead on that one? It seems like
> the right long-term solution, even if it doesn't help us much right now.
>
> sage
>
>> Since that might be a rather long shot
>> I agree that it is a job for general management
>>
>> In cfengine it would just read something like:
>> ...
>> edit_line => regex_replace("(^\s*PRUNEFS=\")(?!ceph)(.*$)","$(match.1)ceph $(match.2)");
>> ...
>> With a proper self-contained bundle once called from postinst that would survive any
>> upgrade of mlocate with minimal impact.
>>
>> Regards,
>> Bernhard
>>
>>> On Feb 18, 2014, at 7:20 PM, Dan van der Ster <daniel.vanderster@xxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> On Tue, Feb 18, 2014 at 6:52 PM, Gaudenz Steinlin <gaudenz@xxxxxxxxxx> wrote:
>>>>
>>>> Hi
>>>>
>>>> Sage Weil <sage@xxxxxxxxxxx> writes:
>>>>
>>>>> Dan at CERN noticed that his performance was tanking because updatedb was
>>>>> running against /var/lib/ceph. updatedb has a PRUNE line in
>>>>> /etc/updatedb.conf that we should presumably be adding ourselves to. One
>>>>> user pointed out a package that uses sed to rewrite this line in the init
>>>>> script on start.
>>>>>
>>>>> I have two questions:
>>>>>
>>>>> - is there no better way than sed to add ourselves to this list?
>>>>> - should we do it in the init script, or postinst, or both?
>>>>>
>>>>> Presumably this is a problem others have solved with other packages.
>>>>
>>>> At least for Debian neither solution is appropriate.
>>>> Changing other packages' conffiles in postinst scripts is forbidden by
>>>> policy. There is also no way to preserve this reliably over upgrades
>>>> without user interaction. I'm not sure if there is an explicit policy
>>>> for init scripts, but this seems equally wrong. Also it's unclear how
>>>> one would handle the case where an administrator does NOT want to
>>>> exclude this directory.
>>>>
>>>> The only solution I see if you really want to completely exclude the
>>>> directory is to convince the mlocate maintainers to either add the
>>>> directory to the default configuration or to add something like an
>>>> /etc/updatedb.d directory where packages can drop configuration file
>>>> snippets. But the latter seems like overkill to me.
>>>>
>>>> The real question to me is why an updatedb run can drastically impact
>>>> ceph performance. At least in Debian updatedb is run with ionice -c3,
>>>> in the "Idle" scheduling class. According to the man page this
>>>> means: "A program running with idle I/O priority will only get disk time
>>>> when no other program has asked for disk I/O for a defined grace period.
>>>> The impact of an idle I/O process on normal system activity should be
>>>> zero. This scheduling class does not take a priority argument.
>>>> Presently, this scheduling class is permitted for an ordinary user
>>>> (since kernel 2.6.25)." So it should not have any negative effect.
>>>>
>>>> Maybe CERN (or the distribution they use) should also run updatedb under
>>>> ionice.
>>>
>>> updatedb _is_ run under ionice on our systems (RHEL), but IO
>>> scheduling classes are only implemented by the cfq scheduler.
>>> We use deadline, which is recommended for an enterprise disk server,
>>> and indeed measured more stable IO latencies with deadline than with
>>> cfq. And when you run updatedb on a deadline-scheduled drive, there
>>> are so many reads queued up that writes can be starved for tens of
>>> seconds.
>>>
>>> In our case, we've already added /var/lib/ceph to the PRUNEPATHS via
>>> their puppet configuration. Though an upstream solution is probably a
>>> good idea since I would assume that most ceph deployments use deadline
>>> and this would hit most of them eventually once they have enough
>>> files. In addition, as someone else mentioned, you'd better add ceph
>>> to the pruned fs types as well, just like /afs is pruned at the
>>> moment, lest every client spend all day stat'ing their big cephfs
>>> namespace.
>>>
>>> Cheers, Dan
>>>
>>> -- Dan van der Ster || Data & Storage Services || CERN IT Department --
>>>
>>>>
>>>> Gaudenz
>>>>
>>>> --
>>>> Ever tried. Ever failed. No matter.
>>>> Try again. Fail again. Fail better.
>>>> ~ Samuel Beckett ~
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
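
For anyone who wants to experiment with the idempotent edit that the cfengine regex_replace in the thread is aiming at, here is a minimal Python sketch. It is an illustration only and not something proposed in the thread: add_prune_entry is a hypothetical helper, it assumes the KEY="space separated values" layout of /etc/updatedb.conf, and it works on a string rather than touching the real file (as noted in the thread, Debian policy forbids a package from editing another package's conffile, so this is more of a site-local config management job).

```python
import re


def add_prune_entry(conf_text, key, entry):
    """Return conf_text with `entry` present in the quoted list for `key`.

    Hypothetical helper. Assumes lines shaped like PRUNEFS="nfs afs proc"
    (optionally with spaces around '='). Commented-out lines are ignored.
    Calling it twice with the same arguments changes nothing the second time.
    """
    pattern = re.compile(r'^(\s*%s\s*=\s*")([^"]*)(".*)$' % re.escape(key))
    # split("\n") keeps a trailing empty element if the text ends in a newline,
    # so joining the list back restores the original line endings.
    lines = conf_text.split("\n")
    found = False
    for i, line in enumerate(lines):
        m = pattern.match(line)
        if m is None:
            continue
        found = True
        values = m.group(2).split()
        if entry not in values:
            values.append(entry)
            lines[i] = m.group(1) + " ".join(values) + m.group(3)
    if not found:
        # No line for this key yet: add one, keeping the file newline-terminated.
        new_line = '%s="%s"' % (key, entry)
        if lines and lines[-1] == "":
            lines.insert(len(lines) - 1, new_line)
        else:
            lines.append(new_line)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample content, not a real distribution default.
    sample = 'PRUNEFS="nfs afs"\nPRUNEPATHS="/tmp /var/spool"\n'
    updated = add_prune_entry(sample, "PRUNEFS", "ceph")
    updated = add_prune_entry(updated, "PRUNEPATHS", "/var/lib/ceph")
    print(updated, end="")
```

Unlike the (?!ceph) lookahead, which only notices "ceph" when it is the first value in the list, this checks the whole list before appending, so repeated runs should be harmless. There is still no removal path, which is the same gap as the missing "--remove" mentioned at the top.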