----- On Jul 23, 2016, at 5:26 PM, Dave Watson davejwatson@xxxxxx wrote:

> Hi Mathieu,
>
> > Implements two basic tests of RSEQ functionality, and one more
> > exhaustive parameterizable test.
>
> Thanks for beefing up the tests. I ran this set through our jemalloc
> tests using rseq, and everything looks good so far.
>
> +static inline __attribute__((always_inline))
> +bool rseq_finish(struct rseq_lock *rlock,
> +		intptr_t *p, intptr_t to_write,
> +		struct rseq_state start_value)
> +{
> +	RSEQ_INJECT_C(9)
> +
> +	if (unlikely(start_value.lock_state != RSEQ_LOCK_STATE_RESTART)) {
> +		if (start_value.lock_state == RSEQ_LOCK_STATE_LOCK)
> +			rseq_fallback_wait(rlock);
> +		return false;
> +	}
> +
> +#ifdef __x86_64__
> +	/*
> +	 * The __rseq_table section can be used by debuggers to better
> +	 * handle single-stepping through the restartable critical
> +	 * sections.
> +	 */
> +	__asm__ __volatile__ goto (
> +		".pushsection __rseq_table, \"aw\"\n\t"
> +		".balign 8\n\t"
> +		"4:\n\t"
> +		".quad 1f, 2f, 3f\n\t"
> +		".popsection\n\t"
>
> Is there a reason we're also passing the start ip? It looks unused.
> I see the "for debuggers" comment, but it looks like all the debugger
> support is done in userspace.
>
> +		"1:\n\t"
> +		RSEQ_INJECT_ASM(1)
> +		"movq $4b, (%[rseq_cs])\n\t"
> +		RSEQ_INJECT_ASM(2)
> +		"cmpl %[start_event_counter], %[current_event_counter]\n\t"
> +		"jnz 3f\n\t"
> +		RSEQ_INJECT_ASM(3)
> +		"movq %[to_write], (%[target])\n\t"
> +		"2:\n\t"
> +		RSEQ_INJECT_ASM(4)
> +		"movq $0, (%[rseq_cs])\n\t"
> +		"jmp %l[succeed]\n\t"
> +		"3: movq $0, (%[rseq_cs])\n\t"
> +		: /* no outputs */
> +		: [start_event_counter]"r"(start_value.event_counter),
> +		  [current_event_counter]"m"(start_value.rseqp->abi.u.e.event_counter),
> +		  [to_write]"r"(to_write),
> +		  [target]"r"(p),
> +		  [rseq_cs]"r"(&start_value.rseqp->abi.rseq_cs)
> +		  RSEQ_INJECT_INPUT
> +		: "memory", "cc"
> +		  RSEQ_INJECT_CLOBBER
> +		: succeed
> +	);
>
> This ABI looks like it will work fine for our use case.
> I don't think it has been mentioned yet, but we may still need
> multiple asm blocks for differing numbers of writes. For example,
> an array-based freelist push:
>
> void push(void *obj) {
> 	if (index < maxlen) {
> 		freelist[index++] = obj;
> 	}
> }
>
> would be more efficiently implemented with a two-write rseq_finish:
>
> rseq_finish2(&freelist[index], obj, // first write
> 	&index, index + 1, // second write
> 	...);
>
> where it is ok to abort between the two writes, but both need to happen
> on the same cpu.

(re-send without html formatting for the mailing lists)

Would pairing one rseq_start with two rseq_finish do the trick there?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
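
For reference, the write ordering in the freelist push quoted above can be sketched in plain C. This is only an illustration, not the rseq implementation: `struct freelist`, `finish2_sim()` and `push_sim()` are hypothetical names, the abort is simulated with an `abort_after` parameter, and nothing here models the requirement that both stores run on the same cpu. It only shows why an abort between the two writes is harmless: the slot store is not visible to readers until the index store publishes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAXLEN 4

/* Hypothetical freelist layout; names are illustrative only. */
struct freelist {
	void *slot[MAXLEN];
	intptr_t index;		/* number of live entries */
};

/*
 * Plain-C stand-in for a two-write commit (assumption: not a real
 * rseq primitive). abort_after is the number of stores that complete
 * before a simulated abort; 2 means no abort.
 * Returns true only if the second, committing store was performed.
 */
static bool finish2_sim(void **p1, void *v1, intptr_t *p2, intptr_t v2,
			int abort_after)
{
	if (abort_after < 1)
		return false;
	*p1 = v1;		/* first write: slot is still past index */
	if (abort_after < 2)
		return false;	/* aborted between the two writes */
	*p2 = v2;		/* second write: publishes the slot */
	return true;
}

/* Push obj; on false the caller is expected to retry. */
static bool push_sim(struct freelist *fl, void *obj, int abort_after)
{
	intptr_t i = fl->index;

	if (i >= MAXLEN)
		return false;
	return finish2_sim(&fl->slot[i], obj, &fl->index, i + 1, abort_after);
}
```

With this ordering, an aborted push leaves `index` unchanged, so a retry simply overwrites the same slot.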