Re: SIGSEGV in cephfs-java, but probably in Ceph

I pulled the Java lib from https://github.com/noahdesu/ceph/tree/wip-java-cephfs
However, I am running Ceph 0.47.1 installed directly from Ubuntu's
repository with apt-get, not the Ceph that I built together with the
Java library. I assumed that would be fine, since the Java lib is just
a wrapper.

>>There are only two segfaults that I've ever encountered, one in which the C wrappers are used with an unmounted client, and the error Nam is seeing (although they
>> could be related). I will re-submit an updated patch for the former, which should rule that out as the culprit.

No, this occurs when I call mount(null) while the monitor is down.
The library should throw an exception instead, but since the SIGSEGV
originates in libcephfs.so, I guess it is more related to Ceph's
internal code.
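
One way to check the threading theory quoted below would be to funnel every call into the native library through a single lock on the Java side. The following is only a sketch under stated assumptions: NativeClient is a hypothetical stand-in interface, not the real cephfs-java API, and the "native" call is faked so that the example runs on its own. It cannot make a crash inside libcephfs.so catchable; it only shows how calls could be serialized to see whether the segfault still occurs.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class SerializedMountSketch {

    /** Hypothetical stand-in for the native-backed client (assumption, not the real API). */
    public interface NativeClient {
        int mount(String root);
    }

    /** Wrapper that lets only one Java thread at a time enter the delegate. */
    public static final class SerializedClient implements NativeClient {
        private final ReentrantLock lock = new ReentrantLock();
        private final NativeClient delegate;

        public SerializedClient(NativeClient delegate) {
            this.delegate = delegate;
        }

        @Override
        public int mount(String root) {
            lock.lock();
            try {
                return delegate.mount(root);
            } finally {
                lock.unlock();
            }
        }
    }

    /** Runs several threads against a fake client and reports the peak number of overlapping calls. */
    public static int maxConcurrentCalls(int threads) throws InterruptedException {
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();

        // Fake "native" call: records how many threads are inside it at once.
        NativeClient fake = root -> {
            int now = inside.incrementAndGet();
            max.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            inside.decrementAndGet();
            return 0;
        };

        NativeClient client = new SerializedClient(fake);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> client.mount(null));
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("max concurrent calls: " + maxConcurrentCalls(8));
    }
}
```

If the segfault still shows up with every call serialized like this, the threading explanation becomes less likely and the monitor-down path in mount() looks like the better suspect.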

Best regards,

Nam Dang
Tokyo Institute of Technology
Tokyo, Japan


On Fri, Jun 1, 2012 at 8:58 AM, Noah Watkins <jayhawk@xxxxxxxxxxx> wrote:
>
> On May 31, 2012, at 3:39 PM, Greg Farnum wrote:
>>>
>>> Nevermind to my last comment. Hmm, I've seen this, but very rarely.
>> Noah, do you have any leads on this? Do you think it's a bug in your Java code or in the C/++ libraries?
>
> I _think_ this is because the JVM uses its own threading library, and Ceph assumes pthreads and pthread-compatible mutexes--is that assumption about Ceph correct? Hence the error that looks like Mutex::lock(bool) being referenced for context during the segfault. To verify this, all that is needed is some synchronization added to the Java.
>
> There are only two segfaults that I've ever encountered, one in which the C wrappers are used with an unmounted client, and the error Nam is seeing (although they could be related). I will re-submit an updated patch for the former, which should rule that out as the culprit.
>
> Nam: where are you grabbing the Java patches from? I'll push some updates.
>
>
> The only other scenario that comes to mind is related to signaling:
>
> The RADOS Java wrappers suffered from an interaction between the JVM and RADOS client signal handlers, in which either the JVM or RADOS would replace the handlers for the other (not sure which order). Anyway, the solution was to link in the JVM libjsig.so signal chaining library. This might be the same thing we are seeing here, but I'm betting it is the first theory I mentioned.
>
> - Noah