On Thu, Sep 10, 2009 at 5:37 PM, Stephan von Krawczynski <skraw at ithnet.com> wrote:
>
>> > Only if backed up. Has the trace been shown to the linux developers?
>> > What do they think?
>
> Maybe we should just ask questions about the source before bothering others...
>
> From 2.0.6 /transport/socket/src/socket.c line 867 ff:
>
>                 new_trans = CALLOC (1, sizeof (*new_trans));
>                 new_trans->xl = this->xl;
>                 new_trans->fini = this->fini;
>
>                 memcpy (&new_trans->peerinfo.sockaddr, &new_sockaddr,
>                         addrlen);
>                 new_trans->peerinfo.sockaddr_len = addrlen;
>
>                 new_trans->myinfo.sockaddr_len =
>                         sizeof (new_trans->myinfo.sockaddr);
>
>                 ret = getsockname (new_sock,
>                                    SA (&new_trans->myinfo.sockaddr),
>                                    &new_trans->myinfo.sockaddr_len);
>
> CALLOC from libglusterfs/src/mem-pool.h:
> #define CALLOC(cnt,size) calloc(cnt,size)
>
> man calloc:
> RETURN VALUE
>        For calloc() and malloc(), the value returned is a pointer to the
>        allocated memory, which is suitably aligned for any kind of variable,
>        or NULL if the request fails.
>
>
> Did I understand the source? What about calloc returning NULL?

Failing to check for a NULL pointer here is a bug, and we will fix it in a
future release (blame it on our laziness for not doing the check already!).
Thanks for pointing it out.

As you can see, if a bug is related to glusterfs we gracefully accept it and
fix it! Refusing to accept a problem in glusterfs would be counterproductive
for us. If you report a bug in glusterfs, we thank you.

A server kernel lockup is not a glusterfs-related problem, and we do not have
any control over it :-) Anand and Mark have clearly and patiently explained
why. As Mark suggested, you can post it on the Linux Kernel Mailing List.
Please get back to us even if one of the kernel developers replies that the
kernel lockup you saw is not a kernel bug.

As an analogy, think of a car in which glusterfs is the engine and the kernel
is the tyres. If you get flat tyres and the car doesn't move, you can't blame
the engine!

Thanks
Krishna
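
For reference, the missing NULL check discussed above might look roughly like the following. This is only a minimal sketch and not the actual glusterfs fix: `struct transport`, `struct peer_info`, `transport_new_checked`, `alloc_fn` and `failing_alloc` are hypothetical stand-ins invented for illustration, and the `getsockname` step from the quoted code is left out so the sketch does not need a live socket. The allocator is passed in as a parameter purely so that the failure path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical, trimmed-down stand-ins for the real glusterfs types. */
struct peer_info {
        struct sockaddr_storage sockaddr;
        socklen_t               sockaddr_len;
};

struct transport {
        void             *xl;
        struct peer_info  peerinfo;
        struct peer_info  myinfo;
};

/* Same shape as calloc(), so calloc itself can be passed in. */
typedef void *(*alloc_fn) (size_t cnt, size_t size);

/* Allocator that always fails; only used to exercise the error path. */
void *
failing_alloc (size_t cnt, size_t size)
{
        (void) cnt;
        (void) size;
        return NULL;
}

/*
 * Allocate and fill a new transport. Returns NULL with errno set
 * instead of dereferencing a failed allocation.
 */
struct transport *
transport_new_checked (alloc_fn alloc, void *xl,
                       const struct sockaddr_storage *peer,
                       socklen_t addrlen)
{
        struct transport *new_trans = NULL;

        /* Refuse lengths that would overflow the destination buffer. */
        if ((size_t) addrlen > sizeof (new_trans->peerinfo.sockaddr)) {
                errno = EINVAL;
                return NULL;
        }

        new_trans = alloc (1, sizeof (*new_trans));
        if (new_trans == NULL) {
                /* The check that the quoted code is missing. */
                errno = ENOMEM;
                return NULL;
        }

        new_trans->xl = xl;
        memcpy (&new_trans->peerinfo.sockaddr, peer, addrlen);
        new_trans->peerinfo.sockaddr_len = addrlen;
        new_trans->myinfo.sockaddr_len =
                sizeof (new_trans->myinfo.sockaddr);

        return new_trans;
}
```

A caller would pass `calloc` in normal operation and treat a NULL return as "drop this connection and log the error", rather than letting the next line write through a NULL pointer.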