On Wed, 13 Nov 2019, Dimitri Savineau wrote:
> Hi,
>
> I discovered this back in June and reported the issue in [1].
>
> The solution (for me it's only a workaround) was to set
> "--ulimit nofile=1024:4096" on the docker run calls in ceph-ansible [2][3].
> This is also implemented in the ceph/daemon container images because we're
> using ceph-volume/ceph-disk commands before running the ceph-osd process
> [4].

It looks like the underlying "problem" is that the close_fds flag in
python makes python loop over every possible fd >2 up to the ulimit.
This is a pretty brute-force approach and super-lame on python's part:

gnit:~ (master) 09:43 AM $ cat t.py
import subprocess
subprocess.call(['true'], close_fds=True)
gnit:~ (master) 09:43 AM $ ulimit -n 100000 ; rm -f xx ; time strace -f -oxx python t.py ; grep -c close xx

real    0m2.672s
user    0m0.901s
sys     0m2.327s
100106
gnit:~ (master) 09:43 AM $ ulimit -n 100 ; rm -f xx ; time strace -f -oxx python t.py ; grep -c close xx

real    0m0.079s
user    0m0.029s
sys     0m0.060s
207

But, GOOD NEWS!  Python3 is not so dumb!

gnit:~ (master) 09:44 AM $ ulimit -n 100000 ; rm -f xx ; time strace -f -oxx python3 t.py ; grep -c close xx

real    0m0.093s
user    0m0.060s
sys     0m0.040s
87
gnit:~ (master) 09:44 AM $ ulimit -n 1000 ; rm -f xx ; time strace -f -oxx python3 t.py ; grep -c close xx

real    0m0.086s
user    0m0.047s
sys     0m0.045s
86
gnit:~ (master) 09:44 AM $ python3 --version
Python 3.7.4

Same result with:

[root@smithi060 ~]# python3 --version
Python 3.6.8
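For context, here is a rough sketch (not the actual CPython code) of why the
fd limit matters so much: with close_fds=True, the Python 2 subprocess child
attempts a close() for every possible descriptor up to the soft
RLIMIT_NOFILE, open or not, while Python 3 only walks the descriptors that
actually exist (via /proc/self/fd, so this assumes Linux). The helper names
below are made up purely for illustration:

import os
import resource

def py2_style_close_attempts():
    # Python 2's subprocess effectively does os.closerange(3, MAXFD) in the
    # forked child, i.e. one close() attempt per possible descriptor up to
    # the soft RLIMIT_NOFILE, whether that descriptor is open or not.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return max(soft - 3, 0)

def py3_style_close_attempts():
    # Python 3's _posixsubprocess only touches descriptors that actually
    # exist, which it discovers by reading /proc/self/fd (Linux-specific).
    return len([fd for fd in os.listdir('/proc/self/fd') if int(fd) > 2])

if __name__ == '__main__':
    print('py2-style close_fds: ~%d close() attempts' % py2_style_close_attempts())
    print('py3-style close_fds: ~%d close() attempts' % py3_style_close_attempts())

Run that under different ulimit -n values and the py2-style count roughly
tracks the close counts grep'd out of strace above, while the py3-style
count stays small regardless of the limit.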
Bottom line: I think the ulimit hack for ceph-ansible is fine for nautilus
and older, and octopus won't have this problem at all once we make the py3
transition.

sage

> Note that the issue is also present on non-container deployments, but the
> default max open files values are already set to 1024:4096, whereas in a
> container it's set to 1048576.
> If you increase this value on a non-containerized deployment then you will
> see the same behaviour.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1722562
> [2] https://github.com/ceph/ceph-ansible/blob/master/library/ceph_volume.py#L192
> [3] https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/tasks/start_osds.yml#L24
> [4] https://github.com/ceph/ceph-container/blob/master/src/daemon/osd_scenarios/osd_volume_activate.sh#L66-L67
>
> Regards,
>
> Dimitri
>
> On Wed, Nov 13, 2019 at 8:52 AM Sebastien Han <shan@xxxxxxxxxx> wrote:
> > I think Dimitri found this weeks ago and made some changes in
> > ceph-ansible to speed that up (along the same lines IIRC).
> > Dim, can you share what you did?
> >
> > Thanks!
> > –––––––––
> > Sébastien Han
> > Principal Software Engineer, Storage Architect
> >
> > "Always give 100%. Unless you're giving blood."
> >
> > On Wed, Nov 13, 2019 at 2:46 PM Sage Weil <sweil@xxxxxxxxxx> wrote:
> > > On Wed, 13 Nov 2019, Paul Cuzner wrote:
> > > > Hi Sage,
> > > >
> > > > So I tried switching out the udev calls to pyudev, and shaved a
> > > > whopping 1sec from the timings. Looking deeper, I found that the
> > > > issue is related to *ALL* subprocess.Popen calls (of which there
> > > > are many!) - they all use close_fds=True.
> > > >
> > > > My suspicion is that when running in a container, close_fds sees
> > > > fds from the host too - so it tries to tidy up more than it should.
> > > > If you set ulimit -n 1024 or something and then try a ceph-volume
> > > > inventory, it should just fly through! (at least it did for me)
> > > >
> > > > Let me know if this works for you.
> > >
> > > Yes.. that speeds things up significantly! 1.5s -> .2s in my case.
> > > I can't say that I understand why, though... it seems like ulimit -n
> > > will make file open attempts fail, but I don't see any failures.
> > >
> > > Can we drop the close_fds arg?
> > >
> > > sage
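All of the workarounds in this thread (ulimit -n in the shell, the
--ulimit nofile=1024:4096 flag on the docker run calls in ceph-ansible
[2][3]) amount to shrinking RLIMIT_NOFILE before the Python 2 interpreter
starts. As a rough in-process illustration only (not what ceph-ansible
actually does), the same effect can be sketched like this, assuming the
hard limit is already at least 1024:

# Shrink the soft fd limit *before* subprocess is imported: Python 2
# captures MAXFD (the upper bound of its close_fds loop) from
# SC_OPEN_MAX at import time, so the order matters here.
import resource
_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (1024, hard))

import subprocess
# With the lower soft limit in place, close_fds=True only has ~1024
# descriptors to walk instead of 1048576.
subprocess.call(['true'], close_fds=True)

Doing it from the shell or via docker's --ulimit flag is more robust,
since it doesn't depend on the limit being lowered before subprocess is
imported.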