Re: Suggestion to Improve performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/27/2017 04:17 AM, Mohit Agrawal wrote:
> Niels,
> 
>    Thanks for your reply, I think these built-in function provides by
> gcc and it should support most of the architecture.
>    In your view what could be the archietecure that does not support
> these builtin function ??
see


https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins,


https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins,
and

  https://llvm.org/docs/Atomics.html

The _legacy_ __sync*() functions have been superceded by the __atomic*()
functions.

A quick search seems to suggest that ARM has atomic insns since armv6.
Fedora supports armv7hl and aarch64 and IMO ARM should be okay in this
regard.

A ten year old gcc bug report
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34115) speaks to __sync*
(and now __atomic* ?) not being supported on the i386 and by extension
to i686 because of how glibc was built for most distributions back then.
We (gluster) are conservative in this regard, but frankly, since hardly
anyone runs 32-bit these days, the perf hit of not using atomic is not a
serious concern.

It's easy to fall into the trap of saying "it works on my box." Where
"my box" is a x86_64 machine running latest {Fedora,Debian,whatever} but
please keep in mind that even Fedora runs on i686, x86_64, armv7hl,
aarch64, ppc64, ppc64le, and s390x these days. IMO we don't want to fill
our source with lots of #ifdef glop if we can avoid it.

Clang/LLVM supports __sync*() and __atomic*(), and between Clang and gcc
I'd say that covers pretty much all the distributions we care about,
e.g. Linux and *BSD, and even many we don't care so much about any more.

I'd say proceed with caution, but proceed. ;-)


>     
> 
> Regards
> Mohit Agrawal
> 
> On Wed, Sep 27, 2017 at 1:20 PM, Niels de Vos <ndevos@xxxxxxxxxx
> <mailto:ndevos@xxxxxxxxxx>> wrote:
> 
>     On Wed, Sep 27, 2017 at 12:55:37PM +0530, Mohit Agrawal wrote:
>     > Hi,
>     >
>     >    I was checking code of internal data structures (dict,fd,rpc_clnt etc.)
>     > those we use in glusterfs to store data.
>     >    Usually we use common pattern to take reference of data structure in
>     > xlator level, in ref function we do take lock(mutex_lock)
>     >    and update(increase) reference counter and in unref function we do take
>     > lock and decrease reference counter and
>     >    check if ref counter is become 0 after decrease then destroy object.
>     >
>     >    I think to update reference counter we don't need to take a lock, we can
>     > use atomic in-built function those
>     >    can improve performance
> 
>     The below is not portable for all architectures. However we have
>     refcount.h in libglusterfs/src/ which hides the portability things. One
>     of the big advantages to use this, is that the code for reference
>     counting is the same everywhere. Some structures have been updated with
>     GF_REF_* macros, more can surely be done.
> 
>     For other more basic counters that do not function as reference counter,
>     the libglusterfs/src/atomic.h macros can be used. The number of lock
>     instructions on modern architectures can be reduced considerably this
>     way. It will likely come with a performance increase, but the usage of a
>     standard API makes the code simpler to understand and that is my main
>     interest :)
> 
>     Obviously I'm all for replacing the lock+count+unlock sequences for many
>     structures!
> 
>     Thanks,
>     Niels
> 
> 
>     >
>     >    For ex: Below is a example for specific to dict_ref/unref
>     >    To increase refCount we can use below built-in function
>     >    dict_ref
>     >    {
>     >        __atomic_add_fetch (&this->refcount, 1, __ATOMIC_SEQ_CST);
>     >
>     >    }
>     >
>     >    dict_unref
>     >    {
>     >       __atomic_sub_fetch (&this->refcount, 1, __ATOMIC_SEQ_CST);
>     >       __atomic_load (&this->refcount, &ref, __ATOMIC_SEQ_CST);
>     >    }
>     >
>     >    In the same way we can use for all other shared data-structure also in
>     > case of take/release reference.
>     >
>     >    I have not tested yet how much performance improvement we can gain but i
>     > think there should be some improvement.
>     >   Please share your input on this, appreciate your input.
>     >
>     > Regards
>     > Mohit Agrawal
> 
>     > _______________________________________________
>     > Gluster-devel mailing list
>     > Gluster-devel@xxxxxxxxxxx <mailto:Gluster-devel@xxxxxxxxxxx>
>     > http://lists.gluster.org/mailman/listinfo/gluster-devel
>     <http://lists.gluster.org/mailman/listinfo/gluster-devel>
> 
> 
> 
> 
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://lists.gluster.org/mailman/listinfo/gluster-devel
> 

-- 

Kaleb
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel




[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux