On Tue, Mar 24, 2015 at 05:18:44PM +0000, Emmanuel Dreyfus wrote:
> Hi
>
> The merge of http://review.gluster.org/9953/ removed a few crashes from
> NetBSD regression tests, but the thing remains utterly broken since the
> merge of http://review.gluster.org/9708/ though I cannot tell if I have
> bugs left over from this commit or if I face new problems.
>
> Here are the known problems so far:

...snip! I'll only give some info on your 2nd point.

> 2) I still experience memory corruption, which usually crashes glusterfsd
> because some pointer was replaced by the value 0x3. This strikes on iobref
> most of the time, but it can happen elsewhere.
>
> I would be glad if someone could help here. On nbslave70:/autobuild I
> added code to check for iobref/iobuf sanity at random places (by calling
> iobref_sanity()). I do this in synctask_wrap and in STACK_WIND/UNWIND,
> but I have not been able to spot the source of the problem yet.
>
> The weird thing is that memory seems to always be overwritten by the
> same values, and the magic 0xcafebabe number before the buffer is preserved.
> Here is an example, where iobref->iobrefs = 0xbb11a458:
>
> 0xbb11a44c: 0xcafebabe 0x00000000 0x00000000 0x00000003
> 0xbb11a45c: 0x00000003 0x00000008 0x00000003 0x0000000c
> 0xbb11a46c: 0x00000003 0x0000000e 0x00000003 0x00000010
> 0xbb11a47c: 0x00000003 0x00000009 0x00000003 0x0000000d
> 0xbb11a48c: 0x00000003 0x00000015 0x00000003 0x00000016
> 0xbb11a49c: 0x00000003 0x00000032 0x00000034 0xbb1e2018
> 0xbb11a4ac: 0xcafebabe 0x00000000 0x00000000 0xbb11a5d8

Recently I was looking into something that required a better
understanding of GF_MALLOC(). I did not really continue with it because
other things got a higher priority. But maybe this layout helps you a
little:

          :                      :
          :                      :
          +----------------------+
          | GF_MEM_TRAILER_MAGIC |
          +----------------------+
          |                      |
          |         ...          |
          |                      |
          +----------------------+
          |       8 bytes        |
          +----------------------+
          | GF_MEM_HEADER_MAGIC  |
          +----------------------+
          |      *xlator_t       |
          +----------------------+
          |         size         |
          +----------------------+
          |         type         |
          +----------------------+
          :                      :
          :                      :

    #define GF_MEM_HEADER_MAGIC  0xCAFEBABE
    #define GF_MEM_TRAILER_MAGIC 0xBAADF00D

Because there is no 0xbaadf00d in your memory dump, I would assume that
the memory has just been allocated, and that the 0xcafebabe at 0xbb11a4ac
is a leftover from a previous allocation.

You could try to run a test with stricter memory enforcement. All the
GF_ASSERT() calls will actually call abort() in that case, which may
make things a little easier to debug. To enable it, pass --enable-debug
on the configure command line:

    $ ./configure --enable-debug

I hope that we will be able to set up scheduled automated regression
tests with binaries built with --enable-debug. It may help to catch
unintended NULL usage a little earlier.

HTH,
Niels