Re: GlusterD 2.0 status updates

On 09/07/2015 10:38 PM, Vijay Bellur wrote:
On Monday 07 September 2015 09:38 PM, Jeff Darcy wrote:
Agree here. I think we can use Go as the other non-C language for
components like glusterd, geo-replication etc. and retire the python
implementation of gsyncd that exists in 3.x over time.

That sounds like a pretty major effort.  Are you suggesting that it
(or even part of it) should be part of 4.0?


Not for the first cut of 4.0, but we can take this up as a goal for a subsequent 4.x release. We do not have a lot of non-C code: syncdaemon and glusterfind are probably the only non-C pieces that exist today in the codebase (excluding tests and glupy). The Python implementation of gsyncd is not one of my favorite pieces of code in the repository. I find it obscure and am slightly concerned about it from a maintainability perspective. A refactor might make it easier to manage over time.

The Geo-replication codebase is now a lot better than in previous
releases. It is unfortunate that today the Geo-rep codebase is an
example of how not to write Python; hopefully this will change in the
future.

I don't think Geo-replication's performance problems are due to the
language. It was written without any advanced features or external
libraries just to support Python 2.4. We have not spent enough
resources/time on improving Geo-replication; the existing codebase
was hacked (during Gluster 3.5, I guess) to make Geo-replication
distributed and consume the changelog. I think modularizing the
codebase will help improve maintainability.

We faced issues with threading because of the GIL, but we can use
multiprocessing/queues and other techniques to avoid threading. If
we migrate to Python 3 (not soon), the asyncio library is available
in the standard library for writing better concurrent applications.
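
To illustrate the multiprocessing/queues idea, here is a minimal sketch,
not taken from the Geo-rep code. The names count_entries, worker and
process_batches and the record strings are hypothetical, and it assumes
Python 3 on a POSIX host (it asks for the "fork" start method). Each
worker is a separate process with its own interpreter, so the GIL of one
does not block the others.

```python
import multiprocessing as mp


def count_entries(batch):
    """Stand-in for CPU-bound work on one batch of made-up records."""
    return sum(1 for record in batch if record.startswith("E "))


def worker(tasks, results):
    # Runs in a child process; None is the stop sentinel.
    for index, batch in iter(tasks.get, None):
        results.put((index, count_entries(batch)))


def process_batches(batches, num_workers=2):
    """Fan a list of batches out to worker processes, keep input order."""
    ctx = mp.get_context("fork")  # assumption: POSIX host
    tasks, results = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(tasks, results))
             for _ in range(num_workers)]
    for proc in procs:
        proc.start()
    for item in enumerate(batches):
        tasks.put(item)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    # Drain the results before joining so no worker blocks on a full pipe.
    collected = dict(results.get() for _ in batches)
    for proc in procs:
        proc.join()
    return [collected[i] for i in range(len(batches))]


if __name__ == "__main__":
    print(process_batches([["E a", "D b"], ["E c", "E d", "M e"], []]))
```

The parent only moves small messages through the queues; the heavy work
stays in the children, which is the part threads could not run in
parallel under the GIL.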

I think Python is still a good choice if we rewrite Geo-replication
today for the following reasons.

1. Good, stable library support (zerorpc/zeromq, paramiko, pyxattr, etc.)
2. Performance was/is not really a concern, since it uses external
   tools like rsync for syncing. Changelog parsing is written in
   C (libgfchangelog).
3. We can expect good community contribution.
4. It is easy to learn, and resources are easy to find.
5. There is good material in the existing codebase that can be
   reused.
6. It is very easy to integrate with libraries written in C; if
   performance is required for some portion, we can write it in C and
   use it from Python, for example the xsync crawl.
7. For performance, we can use Cython (http://cython.org/) to generate
   performant Python extensions.
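
As a rough illustration of point 6, this sketch calls a C function from
Python through the standard ctypes module. It uses strlen from the C
library purely as a stand-in; it is not libgfchangelog or the xsync
crawl, the wrapper name c_strlen is hypothetical, and it assumes a
POSIX system.

```python
import ctypes
import ctypes.util

# Load the C standard library. If find_library returns None, CDLL(None)
# on POSIX falls back to the symbols already linked into this process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a byte string as computed by C strlen."""
    return libc.strlen(data)


print(c_strlen(b"gluster"))  # prints 7
```

A hot path written in C could be exposed the same way, with the Python
side only declaring argument and return types.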

I am sure Go will have the same advantages; we may have to
evaluate once again before we really start rewriting Geo-rep.



Python will still be present as part of our distaf test infrastructure.
We can possibly port our bash test scripts to Python and retire the
use of bash in Gluster.next releases.

Getting rid of bash scripts sounds great, but we're talking about 420
scripts - many of them quite opaque and thus tedious to port to
another language/framework.  I suspect that some of these tests will
linger in their current form for quite some time to come.


Yes, I agree with you. We can look at writing new tests in Python and do a lazy conversion of existing test units to Python.

-Vijay
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel


regards
Aravinda
http://aravindavk.in

