Re: Feature help

On 11/01/2014 10:20 AM, Rudra Siva wrote:
Hi,

I'm very interested in helping with this feature by way of development
help, testing and/or benchmarking.

Features/Feature Smallfile Perf

One of the things I was looking into was the possibility of adding a few
API calls to libgfapi to allow reading and writing multiple small
files as objects - just as librados does for Ceph - cutting out FUSE
and other semantics that tend to be overhead for really small files.
I don't know what else I will have to add to libgfapi to support
this.
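For context, here is a minimal sketch (not part of the proposal) of roughly what a libgfapi consumer does per small file today, using existing calls (glfs_open, glfs_read, glfs_close); the 'fs' handle is assumed to be already set up with glfs_new/glfs_init. The point the proposal targets is that every object costs at least a lookup/open, a read and a close:

/* Per-file read path via libgfapi today: each small file costs at
 * least a lookup/open, a read and a close on the wire.
 * 'fs' is an already-initialized handle (glfs_new + glfs_init). */
#include <glusterfs/api/glfs.h>
#include <fcntl.h>

static ssize_t read_one_small_file(glfs_t *fs, const char *path,
                                   void *buf, size_t len)
{
    glfs_fd_t *fd = glfs_open(fs, path, O_RDONLY);
    if (!fd)
        return -1;
    ssize_t n = glfs_read(fd, buf, len, 0);   /* whole-object read */
    glfs_close(fd);
    return n;
}

A batched object API would amortize those per-object round trips across many names in one call.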

The response below is based on reading this mail and the other mail you sent, titled "libgfapi object api", which I believe expands on the actual APIs you are thinking of. (The following commentary is meant to glean more information, as this is something that could help small-file performance, and it may well be the result of my own misunderstanding :) )

- Who are the consumers of such an API?
The way I see it, FUSE does not have a direct way to use this enhancement, unless we think of ways, like the ones Ben proposed, to defer and detect small file creates.

Neither do the NFS or SMB protocol implementations.

Swift has a use case here, as it needs to put/get objects atomically and would benefit from a single API rather than ploughing through multiple calls and ensuring atomicity using renames (again stated by Ben in the other mail). BUT, we cannot always have the entire object in hand before invoking the API (consider an object 1 GB in size). So Swift would use this only when it does have the entire object, and otherwise fall back to optimizations like the ones Ben suggested for FUSE.
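To make the rename point concrete, a minimal sketch of the atomic "put" a consumer like Swift does today through existing libgfapi calls (glfs_creat, glfs_write, glfs_fsync, glfs_close, glfs_rename); the paths and the 'fs' handle are illustrative only:

/* Rename-based atomic put as done today; a single put-object API
 * would collapse this sequence into one call. */
#include <glusterfs/api/glfs.h>
#include <fcntl.h>

static int put_object_atomic(glfs_t *fs, const char *tmp_path,
                             const char *final_path,
                             const void *data, size_t len)
{
    glfs_fd_t *fd = glfs_creat(fs, tmp_path, O_WRONLY, 0644);
    if (!fd)
        return -1;
    if (glfs_write(fd, data, len, 0) != (ssize_t)len) {
        glfs_close(fd);
        return -1;
    }
    glfs_fsync(fd);                    /* make the data durable first */
    glfs_close(fd);
    /* The object becomes visible only via the atomic rename. */
    return glfs_rename(fs, tmp_path, final_path);
}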

Hence the question, who are the consumers of this API?

- Do these interfaces create the files if absent on writes?

IOW, is this only for existing objects/files, or does it extend the use case to creating and writing files as objects?


The following is what I was thinking - please feel free to correct me
or guide me if someone has already done some groundwork on this.

For read, multiple objects can be provided and they should be
separated for read from the appropriate brick based on the DHT flag - this
will help avoid multiple lookups from all servers. In the absence of
DHT they would be sent to all bricks, but only the ones that contain the
object would respond (it's more like a multiple-file lookup request).
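Purely as an illustration of the kind of interface the quoted proposal seems to describe, here is a hypothetical batched whole-object read; glfs_object_req and glfs_objects_read are NOT existing libgfapi symbols, and store_path mirrors the parameter name used in the other mail:

/* Hypothetical only -- not existing libgfapi symbols. */
struct glfs_object_req {
    const char *name;   /* object/file name under a common store path */
    void       *buf;    /* caller buffer for the whole object */
    size_t      len;    /* buffer size; no offsets, no partial reads */
    ssize_t     ret;    /* per-object result: bytes read or -errno */
};

/* One call for N objects; the library would batch the lookups and
 * route each name to the brick/subvolume that DHT places it on. */
int glfs_objects_read(glfs_t *fs, const char *store_path,
                      struct glfs_object_req *reqs, int count);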

The quoted section above is sketchy on details for me, but the following question does crop up:
- What do you mean by "separated for read from the appropriate brick based on the DHT flag"?

If the *objects* array is a list of names of objects/files under *store_path*, we still need to determine which DHT subvolume these exist on (which could be AFR subvolumes) and then read from the right subvolume. This information could already be cached on the inode in the client stack by DHT, which would avoid the lookup anyway; if not, the names need to be looked up and found in the appropriate subvolumes. What is it that we are trying to avoid or optimize here, and how?
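As a very simplified illustration of that placement decision (not GlusterFS's actual code or hash): real DHT hashes the file name and matches the hash against the per-directory layout ranges stored in the trusted.glusterfs.dht xattr on the bricks; the sketch below just hashes the name over a fixed subvolume count. Grouping the requested names by subvolume this way client-side is what would let a batched call send one request per brick instead of one per object.

/* Simplified placement illustration only; real DHT consults
 * per-directory layout ranges, and uses its own hash function. */
#include <stdint.h>
#include <string.h>

static int pick_subvolume(const char *name, int nr_subvols)
{
    uint32_t h = 2166136261u;           /* FNV-1a, not DHT's real hash */
    for (size_t i = 0; i < strlen(name); i++) {
        h ^= (unsigned char)name[i];
        h *= 16777619u;
    }
    return (int)(h % (uint32_t)nr_subvols);
}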


For write, same as in the case of read: complete object writes only (no
partial updates, file offsets, etc.).

For delete, most of the lookup and batching logic remains the same.
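Completing the earlier hypothetical sketch (again, none of these are existing libgfapi symbols), the write and delete counterparts would follow the same batched, whole-object shape:

/* Hypothetical, mirroring the read sketch above: whole-object
 * writes and batched deletes, no offsets or partial updates. */
int glfs_objects_write(glfs_t *fs, const char *store_path,
                       struct glfs_object_req *reqs, int count);
int glfs_objects_delete(glfs_t *fs, const char *store_path,
                        const char **names, int count);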

I can help with testing, documentation or benchmarks if someone has
already done some work.

There was a mention of writing a feature page for this enhancement; I would suggest doing that, even if it is premature, so that the details are better elaborated and understood (by me, at least).

HTH,
Shyam
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel



