Re: scrubber crash

On 06/01/2015 02:23 PM, Venky Shankar wrote:


On 06/01/2015 01:09 PM, Anand Nekkunti wrote:
Hi Venky,
In one of the regression tests for my patch, I found a core dump from the scrubber. Please have a look.

Link: http://build.gluster.org/job/rackspace-regression-2GB-triggered/9925/consoleFull

bt for core ...

(gdb) bt
#0 0x00007f89d6224731 in gf_tw_mod_timer_pending (base=0xf2fbc0, timer=0x0, expires=233889) at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/contrib/timer-wheel/timer-wheel.c:239
#1 0x00007f89c82ce7e8 in br_fsscan_reschedule (this=0x7f89c4008980, child=0x7f89c4011238, fsscan=0x7f89c4012290, fsscrub=0x7f89c4010010, pendingcheck=_gf_true)

The crash happens when the scrubber is paused, as reconfigure() blindly accesses scrubber-specific data which is not available _after_ pause.

Thanks for reporting. I'll send a fix for this.
OK. This is not a straightforward crash. It is caused by a race between CHILD_UP (which marks the subvolume as "up" and initializes essential structures _later_) and reconfigure(), which tries to access structures that are yet to be initialized.

For now we can induce a delay before invoking reconfigure() (the "pause" in the test case) and work on a proper fix for this.
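
As a rough sketch of what a proper fix could look like (this is not the actual GlusterFS code; every name below is a hypothetical stand-in), the reconfigure() path could take the same lock as the CHILD_UP path and skip rescheduling while the timer is still unset, which is the timer=0x0 case in the backtrace above:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scrubber's per-child state;
 * these are NOT the real GlusterFS structures. */
struct sketch_timer {
    unsigned long expires;
};

struct sketch_child {
    pthread_mutex_t lock;
    bool up;                    /* set when CHILD_UP is seen */
    struct sketch_timer *timer; /* stays NULL until init completes */
};

void sketch_child_init(struct sketch_child *child)
{
    assert(child != NULL);
    pthread_mutex_init(&child->lock, NULL);
    child->up = false;
    child->timer = NULL;
}

/* CHILD_UP path, step 1: the subvolume is marked "up" first. */
void sketch_child_mark_up(struct sketch_child *child)
{
    assert(child != NULL);
    pthread_mutex_lock(&child->lock);
    child->up = true;
    pthread_mutex_unlock(&child->lock);
}

/* CHILD_UP path, step 2: essential structures are set up later. */
void sketch_child_finish_init(struct sketch_child *child,
                              struct sketch_timer *timer)
{
    assert(child != NULL && timer != NULL);
    pthread_mutex_lock(&child->lock);
    timer->expires = 0;
    child->timer = timer;
    pthread_mutex_unlock(&child->lock);
}

/* reconfigure() path: reschedule only once initialization has finished.
 * Returns 0 when the timer was rescheduled, -1 when the call was skipped. */
int sketch_reschedule(struct sketch_child *child, unsigned long expires)
{
    int ret = -1;

    assert(child != NULL);
    pthread_mutex_lock(&child->lock);
    if (child->up && child->timer != NULL) {
        child->timer->expires = expires;
        ret = 0;
    }
    pthread_mutex_unlock(&child->lock);
    return ret;
}
```

With a guard like this, a reconfigure() that arrives between the two CHILD_UP steps becomes a no-op instead of dereferencing a NULL timer. Whether a skipped reschedule should be retried after initialization is a separate design question that this sketch does not address.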

Thoughts?

-Venky
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



