Re: [PATCH v2] clarify usage expectations for cnt32_to_63()

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Mon, 10 Nov 2008 15:22:21 -0800

On Mon, 10 Nov 2008 18:15:32 -0500 (EST)
Nicolas Pitre <nico@xxxxxxx> wrote:

> On Mon, 10 Nov 2008, Andrew Morton wrote:
> 
> > On Mon, 10 Nov 2008 16:34:54 -0500 (EST)
> > Nicolas Pitre <nico@xxxxxxx> wrote:
> > 
> > > > It is far better to make the management of the state explicit and at
> > > > the control of the caller.  Get the caller to allocate the state and
> > > > pass its address into this function.  Simple, clear, explicit and
> > > > robust.
> > > 
> > > Sigh...  What about this compromize then?
> > > 
> > > diff --git a/include/linux/cnt32_to_63.h b/include/linux/cnt32_to_63.h
> > > index 7605fdd..74ce767 100644
> > > --- a/include/linux/cnt32_to_63.h
> > > +++ b/include/linux/cnt32_to_63.h
> > > @@ -32,8 +32,9 @@ union cnt32_to_63 {
> > >  
> > >  
> > >  /**
> > > - * cnt32_to_63 - Expand a 32-bit counter to a 63-bit counter
> > > + * __cnt32_to_63 - Expand a 32-bit counter to a 63-bit counter
> > >   * @cnt_lo: The low part of the counter
> > > + * @cnt_hi_p: Pointer to storage for the extended part of the counter
> > >   *
> > >   * Many hardware clock counters are only 32 bits wide and therefore have
> > >   * a relatively short period making wrap-arounds rather frequent.  This
> > > @@ -75,16 +76,31 @@ union cnt32_to_63 {
> > >   * clear-bit instruction. Otherwise caller must remember to clear the top
> > >   * bit explicitly.
> > >   */
> > > -#define cnt32_to_63(cnt_lo) \
> > > +#define __cnt32_to_63(cnt_lo, cnt_hi_p) \
> > >  ({ \
> > > -	static u32 __m_cnt_hi; \
> > >  	union cnt32_to_63 __x; \
> > > -	__x.hi = __m_cnt_hi; \
> > > +	__x.hi = *(cnt_hi_p); \
> > >   	smp_rmb(); \
> > >  	__x.lo = (cnt_lo); \
> > >  	if (unlikely((s32)(__x.hi ^ __x.lo) < 0)) \
> > > -		__m_cnt_hi = __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31); \
> > > +		*(cnt_hi_p) = __x.hi = (__x.hi ^ 0x80000000) + (__x.hi >> 31); \
> > >  	__x.val; \
> > >  })
> > 
> > This references its second argument twice, which can cause correctness
> > or efficiency problems.
> > 
> > There is no reason that this had to be implemented in cpp. 
> > Implementing it in C will fix the above problem.
> 
> No, it won't, for correctness and efficiency reasons.
> 
> And I've explained why already.

I'd be very surprised if you've really found a case where a macro is
faster than an inlined function.  I don't think that has happened
before.

--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html