I pulled the Java lib from https://github.com/noahdesu/ceph/tree/wip-java-cephfs
However, I use Ceph 0.47.1 installed directly from Ubuntu's repository with apt-get, not the one that I built with the Java library. I assumed that would be fine, since the Java lib is just a wrapper.

>> There are only two segfaults that I've ever encountered, one in which the C wrappers are used with an unmounted client, and the error Nam is seeing (although they
>> could be related). I will re-submit an updated patch for the former, which should rule that out as the culprit.

No, this occurs when I call mount(null) while the monitor is down. The library should throw an Exception instead, but since the SIGSEGV originates from libcephfs.so, I guess it's more related to Ceph's internal code.

Best regards,
Nam Dang
Tokyo Institute of Technology
Tokyo, Japan

On Fri, Jun 1, 2012 at 8:58 AM, Noah Watkins <jayhawk@xxxxxxxxxxx> wrote:
>
> On May 31, 2012, at 3:39 PM, Greg Farnum wrote:
>>>
>>> Nevermind to my last comment. Hmm, I've seen this, but very rarely.
>> Noah, do you have any leads on this? Do you think it's a bug in your Java code or in the C/C++ libraries?
>
> I _think_ this is because the JVM uses its own threading library, and Ceph assumes pthreads and pthread-compatible mutexes--is that assumption about Ceph correct? Hence the error that looks like Mutex::lock(bool) being referenced for context during the segfault. To verify this, all that is needed is some synchronization added to the Java.
>
> There are only two segfaults that I've ever encountered, one in which the C wrappers are used with an unmounted client, and the error Nam is seeing (although they could be related). I will re-submit an updated patch for the former, which should rule that out as the culprit.
>
> Nam: where are you grabbing the Java patches from? I'll push some updates.
>
>
> The only other scenario that comes to mind is related to signaling:
>
> The RADOS Java wrappers suffered from an interaction between the JVM and RADOS client signal handlers, in which either the JVM or RADOS would replace the handlers for the other (not sure which order). Anyway, the solution was to link in the JVM libjsig.so signal chaining library. This might be the same thing we are seeing here, but I'm betting it is the first theory I mentioned.
>
> - Noah
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
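
For reference, the "add some synchronization to the Java" experiment suggested in the quoted message could look roughly like the sketch below. This is only an illustration and is not taken from the wip-java-cephfs branch: the class name SerializedNative and its methods are hypothetical, and the idea is simply to funnel every call into the native wrapper through one JVM-level lock so that no two Java threads are inside the native library at the same time. If the segfault goes away under this guard, that would support the threading theory; it is not known to be a fix.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of the Ceph Java bindings): serializes
 * all calls into a native wrapper through a single process-wide lock.
 */
public final class SerializedNative {
    // One lock for the whole JVM, so only one thread at a time
    // executes the wrapped native call.
    private static final ReentrantLock LOCK = new ReentrantLock();

    private SerializedNative() {}

    /** Runs nativeCall while holding the global lock and returns its result. */
    public static <T> T call(Supplier<T> nativeCall) {
        LOCK.lock();
        try {
            return nativeCall.get();
        } finally {
            // Always release, even if the wrapped call throws.
            LOCK.unlock();
        }
    }

    /** True while the current thread is inside call(); useful for sanity checks. */
    public static boolean heldByCurrentThread() {
        return LOCK.isHeldByCurrentThread();
    }
}
```

A call such as mount(null) on some client object would then be wrapped as SerializedNative.call(() -> { client.mount(null); return null; }), where client stands for whatever wrapper instance is in use (an assumption here). Supplier does not allow checked exceptions, so a wrapper method that declares them would need a variant taking java.util.concurrent.Callable instead.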