Re: ceph/hadoop benchmarking

Comments below

On 10/26/11 2:34 PM, Sage Weil wrote:
[adding ceph-devel CC]

On Wed, 26 Oct 2011, Noah Watkins wrote:
----- Original Message -----
From: "Sage Weil"<sage@xxxxxxxxxxxx>


There was some packaging/cleanup work mentioned a while back on the
mailing list. Is anyone working on this?

Nope.

I'm not really sure what the "right" way to build/expose/package Java
bindings is.. but presumably it's less annoying than what we have now
:)

I'm revisiting this now to prepare packages and instructions for the
students running benchmarks this quarter, but I wanted to get your
input on future goals to not waste too much effort doing short-term
stuff. The possible approaches I see:

1) Everything (JNI wrappers + Java) lives in the Ceph tree and build
    instructions include applying a patch to Hadoop.

    This is the current solution; it isn't too bad, but it needs
    documentation. It can be simplified to avoid patching Hadoop by
    adding the Ceph-specific Java code via the CLASSPATH environment
    variable instead.
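The CLASSPATH route works because Hadoop instantiates its FileSystem implementations by class name at runtime, so any jar visible on the classpath can supply the class without patching Hadoop itself. A minimal sketch of that lookup mechanism in plain Java (the nested CephFileSystem here is a stand-in for whatever the bindings jar would provide, not the real class):

```java
// Sketch: Hadoop-style loading of a filesystem class by name.
// "CephFileSystem" below is a stand-in defined in this file; in the real
// setup the class would come from a jar added to the CLASSPATH.
public class ClasspathDemo {
    // Stand-in for a FileSystem implementation shipped in a separate jar.
    public static class CephFileSystem {
        public String scheme() { return "ceph"; }
    }

    public static void main(String[] args) throws Exception {
        // Hadoop maps a config property to a class name and loads it
        // reflectively; being on the classpath is all that is required.
        String impl = "ClasspathDemo$CephFileSystem";
        Object fs = Class.forName(impl).getDeclaredConstructor().newInstance();
        System.out.println(((CephFileSystem) fs).scheme());
    }
}
```

The same mechanism is why no Hadoop patch is strictly needed for option 1: the jar only has to be visible when Hadoop starts.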

2) Everything is sent to Hadoop upstream.

    This is convenient because the Hadoop infrastructure already has
    the facilities for building and linking native code into Hadoop,
    and the only dependency then becomes a standard ceph-devel
    installation.

    This was the approach taken with the kernel client version which
    also included JNI code (for ioctl).
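For reference, the Java half of such a JNI wrapper is just a class that declares native methods and loads the shared library; the library name "cephfs_jni" and the method signatures below are illustrative assumptions, not the actual binding:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the Java side of a JNI wrapper; names are illustrative.
public class CephFS {
    static {
        try {
            // Resolved against java.library.path; this is where a
            // packaged libcephfs-jni .so would be picked up.
            System.loadLibrary("cephfs_jni");
        } catch (UnsatisfiedLinkError e) {
            // Sketch only: tolerate a missing native library.
            System.err.println("native library not found (sketch only)");
        }
    }

    // Implemented in C, calling into libcephfs.
    public native int mount(String root);
    public native void unmount();

    public static void main(String[] args) {
        // Without the .so the native methods cannot be invoked, but we
        // can confirm they are declared native.
        List<String> natives = new ArrayList<>();
        for (Method m : CephFS.class.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                natives.add(m.getName());
            }
        }
        Collections.sort(natives);
        System.out.println(natives);
    }
}
```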

3) Only the JNI wrappers live in the Ceph tree; the Java patch is pushed upstream.

    This could be better if libcephfs is anticipated to see a lot of
    churn in the future, since it avoids repeatedly pushing changes
    upstream.

#3 strikes me as the right approach, since there are potentially other
Java users of libcephfs, and the Hadoop CephFileSystem class will be best
maintained (and most usable) if it is upstream.  There will just be the
initial pain of getting the packaging right for libcephfs so that it will
work with Hadoop out of the box.  (I'm assuming that is possible.. e.g.
apt-get install ceph, then fire up Hadoop with the proper config?)
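Assuming the packaging works out, the "proper config" would amount to pointing Hadoop at the filesystem implementation in core-site.xml. The property names below follow the usual Hadoop pattern, but treat the exact names, class path, and monitor address as a sketch rather than a confirmed interface:

```xml
<!-- core-site.xml sketch; names and values are illustrative -->
<property>
  <name>fs.ceph.impl</name>
  <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>ceph://mon-host:6789/</value>
</property>
```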

A couple of comments about libcephfs-java: after looking through a bunch of Debian Java packages, it seems a common approach to packaging JNI/Java code is this scheme:

  libcephfs-jni  --> .so
  libcephfs-java --> .jar

The debhelper tool and friends seem to do a good job of packaging up the Java this way, but integrating it into Ceph's default Debian scripts means that anyone would then need a JDK to build Ceph with dpkg-buildpackage.

Is there a way to parametrize the Debian build process so that people who don't care about the Java bindings can skip them? An alternative approach would be to keep a separate set of Debian build scripts in src/client/java/debian.
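One hedged possibility for the parametrization: gate the Java bits on DEB_BUILD_OPTIONS in debian/rules, so the default build needs no JDK. The configure flag name below is an assumption about what the autoconf side would expose:

```make
# debian/rules sketch: build Java bindings only when requested with
#   DEB_BUILD_OPTIONS="java" dpkg-buildpackage
ifneq (,$(findstring java,$(DEB_BUILD_OPTIONS)))
  extraopts += --enable-cephfs-java
endif
```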

Thanks,
Noah

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

