Re: Issue with THIS and libgfapi

On 05/12/2015 07:36 PM, Poornima Gurusiddaiah wrote:
Hi,

We recently uncovered an issue with THIS and libgfapi; it generalizes to any process that has multiple glusterfs_ctxs.

Before the master xlator (fuse/libgfapi) is created, all code that accesses THIS uses the global_xlator object,
which is defined globally for the whole process.
The problem arises when multiple threads start modifying THIS and overwrite the global_xlator's ctx, e.g. in glfs_new:
glfs_new () {
...
ctx = glusterfs_ctx_new();
glusterfs_globals_init();
THIS = NULL;  /* implies THIS = &global_xlator */
THIS->ctx = ctx;
...
}
The issue is more severe than it appears: other threads such as epoll, timer and sigwaiter, when not executing in
fop context, always refer to global_xlator and global_xlator->ctx. Because of the race condition described above,
they may refer to a stale ctx, which could lead to crashes.
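
The overwrite can be illustrated with a minimal sketch. It is not the libglusterfs code: ctx_t, xlator_t, this_xl, get_this, init_instance and current_ctx_id are hypothetical stand-ins, and the example runs on a single thread so that the outcome is deterministic. With real threads the same overwrite simply happens at an unpredictable time.

```c
#include <stddef.h>

/* Hypothetical stand-ins for glusterfs_ctx_t and xlator_t. */
typedef struct ctx { int id; } ctx_t;
typedef struct xlator { ctx_t *ctx; } xlator_t;

/* One object for the whole process, playing the role of global_xlator. */
static xlator_t global_xlator;

/* Per-thread pointer, playing the role of THIS. */
static _Thread_local xlator_t *this_xl;

/* Mirrors "THIS = NULL implies THIS = &global_xlator". */
xlator_t *get_this(void)
{
    if (this_xl == NULL)
        this_xl = &global_xlator;
    return this_xl;
}

/* What every glfs_new()-style initializer does in the pseudocode above. */
void init_instance(ctx_t *ctx)
{
    this_xl = NULL;          /* falls back to the shared global_xlator */
    get_this()->ctx = ctx;   /* overwrites the ctx seen by everyone    */
}

/* What a helper thread outside fop context effectively reads. */
int current_ctx_id(void)
{
    return get_this()->ctx->id;
}
```

Calling init_instance() for a second ctx changes what current_ctx_id() returns for code that belongs to the first ctx, because the pointer is per thread but the object it points to is not.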

Probable solution:
Currently THIS is thread-specific, but the global xlator object it points to is shared by all threads!
The obvious fix would be to have one global_xlator per ctx instead of one per process.
The changes would be as follows:
- Have a new global_xlator object in glusterfs_ctx.
- After creating each new ctx, assign:
  <store THIS>
  THIS = new_ctx->global_xlator
  <restore THIS>
- But how do we set THIS in every thread (epoll, timer, etc.) that gets created as part of that ctx?
  Replace every pthread_create for the ctx threads with gf_pthread_create:
 
  gf_pthread_create (fn, ..., ctx) {
      ...
      thr_id = pthread_create (global_thread_init, fn, ctx...);
      ...
  }

  global_thread_init (fn, ctx, args) {
      THIS = ctx->global_xlator;
      fn (args);
  }
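
As a rough illustration of the wrapper proposed above, here is a minimal sketch in plain C with POSIX threads. The names gf_pthread_create and global_thread_init come from the pseudocode; the signature, the struct thread_start carrier, the simplified ctx_t/xlator_t types, this_xl (standing in for THIS), ctx_new_sketch and report_ctx_id are assumptions made only for this example, not the actual libglusterfs API.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real GlusterFS types. */
typedef struct ctx ctx_t;
typedef struct xlator { ctx_t *ctx; } xlator_t;
struct ctx {
    xlator_t global_xlator;  /* one global xlator per ctx, as proposed */
    int id;
};

/* Per-thread current xlator, standing in for THIS. */
static _Thread_local xlator_t *this_xl;

/* pthread_create passes a single void *, so bundle fn, its arg and the ctx. */
struct thread_start {
    void *(*fn)(void *);
    void *arg;
    ctx_t *ctx;
};

/* Trampoline: set THIS for the new thread, then run the real thread body. */
static void *global_thread_init(void *data)
{
    struct thread_start start = *(struct thread_start *)data;
    free(data);
    this_xl = &start.ctx->global_xlator;
    return start.fn(start.arg);
}

/* Assumed signature: pthread_create plus the owning ctx. */
int gf_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                      void *(*fn)(void *), void *arg, ctx_t *ctx)
{
    struct thread_start *start = malloc(sizeof(*start));
    if (start == NULL)
        return ENOMEM;
    start->fn = fn;
    start->arg = arg;
    start->ctx = ctx;

    int ret = pthread_create(thread, attr, global_thread_init, start);
    if (ret != 0)
        free(start);  /* the trampoline never ran, so release it here */
    return ret;
}

/* Helper for the example: allocate a ctx whose global xlator points back. */
ctx_t *ctx_new_sketch(int id)
{
    ctx_t *ctx = calloc(1, sizeof(*ctx));
    if (ctx != NULL) {
        ctx->id = id;
        ctx->global_xlator.ctx = ctx;
    }
    return ctx;
}

/* Example thread body: records which ctx this thread sees through THIS. */
void *report_ctx_id(void *arg)
{
    *(int *)arg = this_xl->ctx->id;
    return NULL;
}
```

Two threads started through gf_pthread_create with different ctxs each see their own ctx, since neither one touches a process-wide object. The heap-allocated carrier is used because the creating function may return before the new thread starts running.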

 The other solution would be to not associate threads with a ctx at all, and instead share them among ctxs.

Please let me know your thoughts on the same.

Regards,
Poornima

Hi Poornima,

Recently, with the glusterfs-3.7 beta1 RPMs, I see the following errors while creating a VM image using qemu-img:

[2015-05-08 09:04:14.358896] E [rpc-transport.c:512:rpc_transport_unref] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f51f6bb6516] (--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7f51f965e493] (--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7f51f96617dc] (--> /lib64/libglusterfs.so.0(+0x1edc1)[0x7f51f6bb2dc1] (--> /lib64/libglusterfs.so.0(+0x1ed55)[0x7f51f6bb2d55] ))))) 0-rpc_transport: invalid argument: this
[2015-05-08 09:04:14.359085] E [rpc-transport.c:512:rpc_transport_unref] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f51f6bb6516] (--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7f51f965e493] (--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7f51f96617dc] (--> /lib64/libglusterfs.so.0(+0x1edc1)[0x7f51f6bb2dc1] (--> /lib64/libglusterfs.so.0(+0x1ed55)[0x7f51f6bb2d55] ))))) 0-rpc_transport: invalid argument: this
[2015-05-08 09:04:14.359241] E [rpc-transport.c:512:rpc_transport_unref] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f51f6bb6516] (--> /lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7f51f965e493] (--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7f51f96617dc] (--> /lib64/libglusterfs.so.0(+0x1edc1)[0x7f51f6bb2dc1] (--> /lib64/libglusterfs.so.0(+0x1ed55)[0x7f51f6bb2d55] ))))) 0-rpc_transport: invalid argument: this

Is this a consequence of the issue you are describing?


-- Satheesaran
