I have not used valgrind before, so I may be wrong here.
I think the VALGRIND_STACK_DEREGISTER call should have been after GF_FREE (task->stack). From http://valgrind.org/docs/ it seems unlikely to me, since we are allocating the stack from the heap.
Did you try a run with the valgrind instrumentation, without changing the stack size?
None of this explains the crash though. We had seen a memory-overrun crash in the same code path on NetBSD earlier, but did not follow up then.
Will look further into it.
On Thu, Jun 8, 2017 at 4:51 PM, Kinglong Mee <kinglongmee@xxxxxxxxx> wrote:
Maybe it's my fault; I found valgrind can't follow context switches (makecontext/swapcontext) by default.
So, I tested with the following patch (it tells valgrind about the new stack via VALGRIND_STACK_REGISTER).
With it, there are only some "Invalid read/write" errors from __gf_mem_invalidate. Is that right?
So, there is only one problem: without io-threads, the stack size is too small for marker.
Am I right?
PS:
valgrind-before.log is the log without the following patch; valgrind-after.log is with the patch.
==35656== Invalid write of size 8
==35656== at 0x4E8FFD4: __gf_mem_invalidate (mem-pool.c:278)
==35656== by 0x4E90313: __gf_free (mem-pool.c:334)
==35656== by 0x4EA4E5B: synctask_destroy (syncop.c:394)
==35656== by 0x4EA4EDF: synctask_done (syncop.c:412)
==35656== by 0x4EA58B3: synctask_switchto (syncop.c:673)
==35656== by 0x4EA596B: syncenv_processor (syncop.c:704)
==35656== by 0x60B2DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
==35656== by 0x67A873C: clone (in /usr/lib64/libc-2.17.so)
==35656== Address 0x1b104931 is 2,068,017 bytes inside a block of size 2,097,224 alloc'd
==35656== at 0x4C29975: calloc (vg_replace_malloc.c:711)
==35656== by 0x4E8FA5E: __gf_calloc (mem-pool.c:117)
==35656== by 0x4EA52F5: synctask_create (syncop.c:500)
==35656== by 0x4EA55AE: synctask_new1 (syncop.c:576)
==35656== by 0x143AE0D7: mq_synctask1 (marker-quota.c:1078)
==35656== by 0x143AE199: mq_synctask (marker-quota.c:1097)
==35656== by 0x143AE6F6: _mq_create_xattrs_txn (marker-quota.c:1236)
==35656== by 0x143AE82D: mq_create_xattrs_txn (marker-quota.c:1253)
==35656== by 0x143B0DCB: mq_inspect_directory_xattr (marker-quota.c:2027)
==35656== by 0x143B13A8: mq_xattr_state (marker-quota.c:2117)
==35656== by 0x143A6E80: marker_lookup_cbk (marker.c:2961)
==35656== by 0x141811E0: up_lookup_cbk (upcall.c:753)
----------------------- valgrind ------------------------------------------
Don't forget to install valgrind-devel.
diff --git a/libglusterfs/src/syncop.c b/libglusterfs/src/syncop.c
index 00a9b57..97b1de1 100644
--- a/libglusterfs/src/syncop.c
+++ b/libglusterfs/src/syncop.c
@@ -10,6 +10,7 @@
#include "syncop.h"
#include "libglusterfs-messages.h"
+#include <valgrind/valgrind.h>
int
syncopctx_setfsuid (void *uid)
@@ -388,6 +389,8 @@ synctask_destroy (struct synctask *task)
if (!task)
return;
+VALGRIND_STACK_DEREGISTER(task->valgrind_ret);
+
GF_FREE (task->stack);
if (task->opframe)
@@ -509,6 +512,8 @@ synctask_create (struct syncenv *env, size_t stacksize, sync
newtask->ctx.uc_stack.ss_sp = newtask->stack;
+ newtask->valgrind_ret = VALGRIND_STACK_REGISTER(newtask->stack, newtask-
+
makecontext (&newtask->ctx, (void (*)(void)) synctask_wrap, 2, newtask)
newtask->state = SYNCTASK_INIT;
diff --git a/libglusterfs/src/syncop.h b/libglusterfs/src/syncop.h
index c2387e6..247325b 100644
--- a/libglusterfs/src/syncop.h
+++ b/libglusterfs/src/syncop.h
@@ -63,6 +63,7 @@ struct synctask {
int woken;
int slept;
int ret;
+ int valgrind_ret;
uid_t uid;
gid_t gid;
diff --git a/xlators/features/marker/src/marker-quota.c b/xlators/features/marker/src/marker-quota.c
index 902b8e5..f3d2507 100644
--- a/xlators/features/marker/src/marker-quota.c
+++ b/xlators/features/marker/src/marker-quota.c
@@ -1075,7 +1075,7 @@ mq_synctask1 (xlator_t *this, synctask_fn_t task, gf_boolean_t spawn,
        }
        if (spawn) {
-               ret = synctask_new1 (this->ctx->env, 1024 * 16, task,
+               ret = synctask_new1 (this->ctx->env, 0, task,
                                     mq_synctask_cleanup, NULL, args);
                if (ret) {
                        gf_log (this->name, GF_LOG_ERROR, "Failed to spawn "

On 6/8/2017 19:02, Sanoj Unnikrishnan wrote:
> I would still be worried about the Invalid read/write. IMO whether an illegal access causes a crash depends on whether the page is currently mapped.
> So, it could so happen that there is a use after free / use outside of bounds happening in the code and it turns out that this location gets mapped in a different (unmapped) page when IO threads is not loaded.
>
> Could you please share the valgrind logs as well.
>
> Reverting http://review.gluster.org/11499 <http://review.gluster.org/11499> seems better than the diff.
>
> On Wed, Jun 7, 2017 at 8:22 PM, Kinglong Mee <kinglongmee@xxxxxxxxx <mailto:kinglongmee@xxxxxxxxx>> wrote:
>
> After deleting io-threads from the vols, quota operations (list/set/modify) make glusterfsd crash.
> I use CentOS 7 (CentOS Linux release 7.3.1611) with glusterfs 3.8.12.
> It seems the stack is corrupted; when testing with the following diff, glusterfsd runs correctly.
>
> There are two questions:
> 1. When using valgrind, it shows many "Invalid read/write" errors with io-threads.
>    Why does glusterfsd run correctly with io-threads, but crash without io-threads?
>
> 2. With the following diff, valgrind also shows many "Invalid read/write" errors without io-threads,
>    but no crash.
>
> Any comments are welcome.
>
> diff --git a/xlators/features/marker/src/marker-quota.c b/xlators/features/marker/src/marker-quota.c
> index 902b8e5..f3d2507 100644
> --- a/xlators/features/marker/src/marker-quota.c
> +++ b/xlators/features/marker/src/marker-quota.c
> @@ -1075,7 +1075,7 @@ mq_synctask1 (xlator_t *this, synctask_fn_t task, gf_boolean_t spawn,
> }
>
> if (spawn) {
> - ret = synctask_new1 (this->ctx->env, 1024 * 16, task,
> + ret = synctask_new1 (this->ctx->env, 0, task,
> mq_synctask_cleanup, NULL, args);
> if (ret) {
> gf_log (this->name, GF_LOG_ERROR, "Failed to spawn "
>
> ----------------------------------- test steps -----------------------------------
> 1. gluster volume create gvtest node1:/test/ node2:/test/
> 2. gluster volume start gvtest
> 3. gluster volume quota enable gvtest
>
> 4. "deletes io-threads from all vols"
> 5. reboot node1 and node2.
> 6. sh quota-set.sh
>
> # cat quota-set.sh
> gluster volume quota gvtest list
> gluster volume quota gvtest limit-usage / 10GB
> gluster volume quota gvtest limit-usage /1234 1GB
> gluster volume quota gvtest limit-usage /hello 1GB
> gluster volume quota gvtest limit-usage /test 1GB
> gluster volume quota gvtest limit-usage /xyz 1GB
> gluster volume quota gvtest list
> gluster volume quota gvtest remove /hello
> gluster volume quota gvtest remove /test
> gluster volume quota gvtest list
> gluster volume quota gvtest limit-usage /test 1GB
> gluster volume quota gvtest remove /xyz
> gluster volume quota gvtest list
>
> ----------------------- glusterfsd crash without the diff -----------------------
>
> /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf5)[0x7f6e1e950af1]
> /usr/local/lib/libglusterfs.so.0(gf_print_trace+0x21f)[0x7f6e1e956943]
> /usr/local/sbin/glusterfsd(glusterfsd_print_trace+0x1f)[0x409c83]
> /lib64/libc.so.6(+0x35250)[0x7f6e1d025250]
> /lib64/libc.so.6(gsignal+0x37)[0x7f6e1d0251d7]
> /lib64/libc.so.6(abort+0x148)[0x7f6e1d0268c8]
> /lib64/libc.so.6(+0x74f07)[0x7f6e1d064f07]
> /lib64/libc.so.6(+0x7baf5)[0x7f6e1d06baf5]
> /lib64/libc.so.6(+0x7c3e6)[0x7f6e1d06c3e6]
> /usr/local/lib/libglusterfs.so.0(__gf_free+0x311)[0x7f6e1e981327]
> /usr/local/lib/libglusterfs.so.0(synctask_destroy+0x82)[0x7f6e1e995c20]
> /usr/local/lib/libglusterfs.so.0(synctask_done+0x25)[0x7f6e1e995c47]
> /usr/local/lib/libglusterfs.so.0(synctask_switchto+0xcf)[0x7f6e1e996585]
> /usr/local/lib/libglusterfs.so.0(syncenv_processor+0x60)[0x7f6e1e99663d]
> /lib64/libpthread.so.0(+0x7dc5)[0x7f6e1d7a2dc5]
> /lib64/libc.so.6(clone+0x6d)[0x7f6e1d0e773d]
>
> or
>
> package-string: glusterfs 3.8.12
> /usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xf5)[0x7fa15e623af1]
> /usr/local/lib/libglusterfs.so.0(gf_print_trace+0x21f)[0x7fa15e629943]
> /usr/local/sbin/glusterfsd(glusterfsd_print_trace+0x1f)[0x409c83]
> /lib64/libc.so.6(+0x35250)[0x7fa15ccf8250]
> /lib64/libc.so.6(gsignal+0x37)[0x7fa15ccf81d7]
> /lib64/libc.so.6(abort+0x148)[0x7fa15ccf98c8]
> /lib64/libc.so.6(+0x74f07)[0x7fa15cd37f07]
> /lib64/libc.so.6(+0x7dd4d)[0x7fa15cd40d4d]
> /lib64/libc.so.6(__libc_calloc+0xb4)[0x7fa15cd43a14]
> /usr/local/lib/libglusterfs.so.0(__gf_calloc+0xa7)[0x7fa15e653a5f]
> /usr/local/lib/libglusterfs.so.0(iobref_new+0x2b)[0x7fa15e65875a]
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xa98c)[0x7fa153a8398c]
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xacbc)[0x7fa153a83cbc]
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xad10)[0x7fa153a83d10]
> /usr/local/lib/glusterfs/3.8.12/rpc-transport/socket.so(+0xb2a7)[0x7fa153a842a7]
> /usr/local/lib/libglusterfs.so.0(+0x97ea9)[0x7fa15e68eea9]
> /usr/local/lib/libglusterfs.so.0(+0x982c6)[0x7fa15e68f2c6]
> /lib64/libpthread.so.0(+0x7dc5)[0x7fa15d475dc5]
> /lib64/libc.so.6(clone+0x6d)[0x7fa15cdba73d]
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>