On Mon, Aug 7, 2017 at 7:58 AM, Ken Dreyer <kdreyer@xxxxxxxxxx> wrote:
> On Wed, Aug 2, 2017 at 7:39 AM, Alfredo Deza <adeza@xxxxxxxxxx> wrote:
>> The ceph-debuginfo package has continued to increase in size on almost
>> every release, reaching 1.5GB for the latest luminous RC (12.1.2).
>>
>> To contrast that, the latest ceph-debuginfo in Hammer was about 0.73GB.
>>
>> Having packages that large is problematic on a few fronts:
>
> I agree Alfredo. Here's a similar issue I am experiencing with the source sizes:
>
> Jewel sizes:
> 14M ceph-10.2.7.tar.gz
> 82M ceph-10.2.7 uncompressed
>
> Luminous sizes:
> 142M ceph-12.1.2.tar.gz
> 709M ceph-12.1.2 uncompressed
>
> This adds minutes onto the build times when we must shuffle these
> large artifacts around:
>
> - Upstream we're transferring the artifacts between Jenkins slaves and chacra
> and download.ceph.com.
>
> - Downstream in Fedora/RHEL land we're uploading these source tars to
> dist-git's lookaside cache, and it takes a while just to upload/download.
>
> - Downstream in Debian and Ubuntu (AFAICT) they upload the source tars to Git
> with git-buildpackage, and this increases the time it takes to even "git
> clone" these repos.
>
> The bundled Boost alone is 474MB unpacked in 12.1.2. If we could
> build Boost as a separate package (and not bundle it into ceph) it
> would make it easier to manage builds upstream and downstream.
>
> We could build a boost package in the jenkins.ceph.com infrastructure,
> or the CentOS Storage SIG (for RHEL-based distros), and then start
> depending on that system instead of EPEL. For Debian/Ubuntu, we could
> use jenkins.ceph.com/chacra or something else - any suggestions from
> Debian/Ubuntu folks?

I spent some time talking to Ken and Alfredo today to try and work
their concerns into something understandable by happily
package-building-unaware developers like myself.
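
To put rough numbers on the source sizes Ken quotes above, here is a
quick Python sketch. The figures are copied straight from his message;
the names (JEWEL, LUMINOUS, growth, summarize) are made up for
illustration, and I'm assuming his "M" and "MB" are the same unit.

```python
# Sizes in MB, copied from the quoted message (assumption: "M" == "MB").
JEWEL = {"tarball": 14, "unpacked": 82}        # ceph-10.2.7
LUMINOUS = {"tarball": 142, "unpacked": 709}   # ceph-12.1.2
BOOST_UNPACKED = 474                           # bundled Boost in 12.1.2


def growth(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


def summarize():
    """Compute growth factors and Boost's share of the unpacked tree."""
    return {
        "tarball_growth": growth(JEWEL["tarball"], LUMINOUS["tarball"]),
        "unpacked_growth": growth(JEWEL["unpacked"], LUMINOUS["unpacked"]),
        "boost_share": BOOST_UNPACKED / LUMINOUS["unpacked"],
        "unpacked_without_boost": LUMINOUS["unpacked"] - BOOST_UNPACKED,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

By this arithmetic the tarball grew about 10x between Jewel and
Luminous, and Boost accounts for roughly two thirds of the unpacked
Luminous tree (about 235 MB would remain without it).
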
I've tried to distill that conversation into the points below:

1) They would *love* it if we started relying more on "external"
packages and less on in-tree source, even if our packaging team is
responsible for maintaining them.

2) The size of a full source checkout is a real problem when building
600 packages a day (which our systems are doing). If we can cut it
down, we can get dev packages built more quickly! The biggest
contributors anybody isolated are boost and inclusions like the web
dev stuff for ceph-mgr. (I'm making no promises for him, but it
sounded like Ken was going to investigate/push against the boost wall
a bit more.)

3) ceph-debuginfo (and the .deb equivalents) are ginormous, so much so
that they require special configuration of our package serving
infrastructure.

Don't have much to say about (1) in isolation.

As far as (2) goes, it's really convenient from a dev perspective to
have one git checkout and its submodules to deal with, instead of
needing to install a bunch of packages. But we already have our
install-deps and we don't seem to update many of the dependencies that
often. How much would it hurt to split stuff out into separate
ceph-dev-* repos and packages that we rely on? (We could probably even
do separate ones for each Ceph release stream?) We do sometimes update
a submodule and make an interface jump concurrent with that, but I
don't think it happens very often. Is it feasible from both sides to
instead change what package version we depend on, and to start
building a new package?

On (3), there are a few causes. One is that we just have a lot of
code. But a far bigger impact seems to come from all the ceph_test_*
binaries and other things which we have statically linked with
ceph-common et al.
There are two approaches we can take there. We can figure out how to
dynamically link them (which I haven't been involved in but recall
being difficult, though the static linking has also caused us other
issues over the years that it would be good to resolve). Separately,
we can be more picky about what debug info we actually put into
ceph-debuginfo. We have a giant ceph-tests package that mixes up both
the test binaries and very disaster-recovery-helpful stuff like
ceph-objectstore-tool. If we could better segregate those, we could at
least avoid distributing them to users. (We would probably still want
debuginfo for the ceph-tests packages because we run them in
teuthology. But I assume just splitting it would still do some good.)

Hopefully that helps other people understand some of what we're all
dealing with. :)
-Greg