Fwd: [GSoC] Implement a Cassandra/NoSQL Connector or Translator for GlusterFS

jilinxpd at gmail.com (Jilin Xpd) · Sat, 27 Apr 2013 09:48:33 +0800

Hi, all,

I'm applying for the GSOC project "*Implement a Cassandra/NoSQL Connector
or Translator for GlusterFS*".
I have completed my GSOC proposal and would like to post it here. Any
suggestions are welcome.

Here is my application in fedora project wiki:
https://fedoraproject.org/wiki/GSOC_2013/Student_Application_Jilinxpd

Here is my application with the *proposal* in google-melange:
https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/jilinxpd/18001

Best regards,
Peidong

---------- Forwarded message ----------
From: Jilin Xpd <jilinxpd at gmail.com>
Date: 2013/4/25
Subject: Fwd: [GSoC] Implement a Cassandra/NoSQL Connector or Translator
for GlusterFS
To: avati at redhat.com, Anand Babu Periasamy <abperiasamy at gmail.com>,
johnmark at redhat.com
Cc: Buddhike Kurera <bckurera at fedoraproject.org>

Dear mentors,

I'm Peidong, the guy applying for the GSOC project "*Implement a
Cassandra/NoSQL Connector or Translator for GlusterFS*".
I have finished my proposal and hope you can help review it. Thank you very
much!

Here is my application in fedora project wiki:
https://fedoraproject.org/wiki/GSOC_2013/Student_Application_Jilinxpd

Here is my application with the *proposal* in google-melange:
https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/jilinxpd/18001

Best Regards,
Peidong

---------- Forwarded message ----------
From: Jilin Xpd <jilinxpd at gmail.com>
Date: 2013/4/23
Subject: Fwd: [GSoC] Implement a Cassandra/NoSQL Connector or Translator
for GlusterFS
To: avati at redhat.com, abperiasamy at gmail.com, johnmark at redhat.com
Cc: Buddhike Kurera <bckurera at fedoraproject.org>

Dear mentors,

I'm a student who would like to apply for the GSOC project "*Implement a
Cassandra/NoSQL Connector or Translator for GlusterFS*".
I contacted Mr Walker earlier, but he hasn't replied yet.
As I'm now writing my proposal, I have some questions about this project.
Would you kindly help me with them? Thanks very much!

My questions are as follows:

(1) As I understand it, the project is to write a storage translator for
GlusterFS, so that GlusterFS can use Cassandra as its backend storage.
One of the benefits is that legacy applications which are incompatible with
NoSQL can now store key-value pairs into Cassandra indirectly.
Am I right?

(2) Since users will only store key-value pairs as files in our system,
they may not use directories, file attributes or extended file attributes.
Do we need to provide fops to support these features?
If we do, directories do not look very difficult to support, since a
directory can map to a super column and a column family in Cassandra.
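
A minimal sketch of the directory mapping raised in question (2), written in
Python for brevity rather than the C a real translator would need. It assumes
a fixed three-level layout (top-level directory as column family,
second-level directory as super column, file name as column) and uses an
in-memory dict in place of a real Cassandra client. The names Location,
path_to_location and FakeStore are hypothetical and exist only for
illustration.

```python
from typing import Dict, List, NamedTuple


class Location(NamedTuple):
    """Where a file would live under the assumed three-level mapping."""
    column_family: str   # top-level directory
    super_column: str    # second-level directory
    column: str          # file name


def path_to_location(path: str) -> Location:
    """Split '/cf/sc/name' into its three parts (assumed layout)."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 3:
        raise ValueError("this sketch only handles paths of the form /dir/subdir/file")
    return Location(*parts)


class FakeStore:
    """In-memory stand-in for Cassandra; not a real client API."""

    def __init__(self) -> None:
        # column family -> super column -> column -> value
        self.data: Dict[str, Dict[str, Dict[str, bytes]]] = {}

    def write(self, path: str, content: bytes) -> None:
        loc = path_to_location(path)
        family = self.data.setdefault(loc.column_family, {})
        family.setdefault(loc.super_column, {})[loc.column] = content

    def read(self, path: str) -> bytes:
        loc = path_to_location(path)
        return self.data[loc.column_family][loc.super_column][loc.column]

    def listdir(self, dirpath: str) -> List[str]:
        """List file names under '/cf/sc' by reading one super column."""
        parts = [p for p in dirpath.split("/") if p]
        if len(parts) != 2:
            raise ValueError("this sketch only lists second-level directories")
        family, super_column = parts
        return sorted(self.data[family][super_column])
```

Under this assumption a readdir on a second-level directory becomes a read of
one super column, which is why directory support looks cheap. Deeper
directory trees would need a different scheme, which this sketch does not
attempt.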

Those are all my questions. Thanks for your time!

I'm still designing and writing my proposal. I will post it to you all as
soon as I finish.

Best regards,
Peidong

---------- Forwarded message ----------
From: Jilin Xpd <jilinxpd at gmail.com>
Date: 2013/4/22
Subject: [GSoC] Implement a Cassandra/NoSQL Connector or Translator for
GlusterFS
To: johnmark at redhat.com

Hi, Mr Walker,

I'm Peidong Xie, a third-year master's student at the Institute of Software,
Chinese Academy of Sciences.

Sorry for contacting you so late. I want to express my interest in the
idea "*Implement a Cassandra/NoSQL Connector or Translator for GlusterFS*".

I have read the documents on the GlusterFS website, from which I learned
about the GlusterFS architecture and how to write translators.
I also read through the code of the posix translator and the bdb translator,
and figured out the skeleton of a storage translator.

I noticed that GlusterFS used to have bdb as one of its storage backends,
but it is now obsolete. For implementing a Cassandra translator for
GlusterFS, I think the bdb translator is a good reference.
Cassandra doesn't provide a native interface for C. There is a C++ client
(libQtCassandra), but it involves 3rd party libraries, so I think it's
better to use the raw Thrift API in GlusterFS.

I have participated in several projects, and most of my work is related to
file systems:

(1) In 2011, together with another student, I developed a shared fs based
on FUSE. It is used to store libvirt checkpoint files and image files, so
that multiple VMs can read/write a checkpoint or image simultaneously. The
key idea is to split the whole file into small blocks and cache them in
memory, so that VMs can share the file blocks. COW is used to make sure one
VM's writes won't affect the others.

(2) During last year's GSoC, I made smbfs (the CIFS client) in illumos
support mmap. First, I implemented mmap with block i/o; the main work was to
implement the VFS interfaces, such as smbfs_mmap, smbfs_getpage and
smbfs_putpage. Second, I added page cache support to file i/o, mainly by
modifying smbfs_read and smbfs_write. With mmap, smbfs can cache files in
memory and reduce the i/o requests over the wire, so i/o efficiency
increases.

(3) Last year, I spent some time porting ecryptfs-utils to RedFlagLinux,
making it work with ecryptfs to support encrypted home directories.
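
The block-sharing idea in item (1) above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code of that
project: the class names SharedFile and VMView, the tiny block size, and the
restriction to writes inside the existing file are all hypothetical choices
made to keep the example short.

```python
BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


class SharedFile:
    """A file split into fixed-size blocks held in memory and shared by all VMs."""

    def __init__(self, data: bytes, block_size: int = BLOCK_SIZE) -> None:
        self.block_size = block_size
        self.blocks = [data[i:i + block_size]
                       for i in range(0, len(data), block_size)]


class VMView:
    """One VM's view: shared blocks plus private copies of blocks it has written."""

    def __init__(self, shared: SharedFile) -> None:
        self.shared = shared
        self.private = {}  # block index -> bytearray (copy-on-write copies)

    def read_block(self, index: int) -> bytes:
        # Prefer this VM's private copy; otherwise fall back to the shared block.
        if index in self.private:
            return bytes(self.private[index])
        return self.shared.blocks[index]

    def write(self, offset: int, payload: bytes) -> None:
        # Assumes the write stays inside the existing file (no growth).
        size = self.shared.block_size
        for i, byte in enumerate(payload):
            index, pos = divmod(offset + i, size)
            if index not in self.private:
                # Copy the shared block on first write, leaving the original intact.
                self.private[index] = bytearray(self.shared.blocks[index])
            self.private[index][pos] = byte

    def read_all(self) -> bytes:
        return b"".join(self.read_block(i)
                        for i in range(len(self.shared.blocks)))
```

Only the blocks a VM actually writes are duplicated, so unmodified blocks
stay shared in memory across all VMs.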

Currently, I concentrate on storage issues in big data. I have studied
distributed systems such as hdfs, hbase, mongodb and cassandra, and storage
engines such as bdb and leveldb.

I hope my project experience and background knowledge will help with
"Implement a Cassandra/NoSQL Connector or Translator for GlusterFS".
I haven't finished my proposal yet, but I will finish it in one or two days.

Best regards,
Peidong