Re: BlueStore

On Thu, 24 Mar 2016, Allen Samuels wrote:
> Just wanted to drop a quick note and ask for some feedback.
> 
> We've been digging into the BlueStore code and there's a list of things that
> we either missed, don't understand or believe need to be addressed before
> BlueStore can be production ready.
> 
> (1) Block Allocator.
> 
> As I've described earlier, the memory performance of the current allocator
> is a concern. IMO this MUST be fixed before production status.

Agreed.  Let's make it a goal to settle on the approach by the end 
of the hackathon next Friday.

> (2) BlueFS Cache Memory Consumption
> 
> There doesn't appear to be a way to limit the memory consumption of cached
> BlueFS files.

The only data cache is the page cache address_space associated with the
block device, which is handled by the kernel.  You're right that there
isn't a way to prevent it from consuming otherwise unused memory, but the
kernel can still reclaim that memory for anything else that needs it,
so I don't think that is a problem.

Eventually we may need finer-grained control of the cache, but I don't 
think we are anywhere close to that point yet...

> (3) BlueFS Journal Compaction
> 
> Current implementation periodically re-compacts the BlueFS journal (log) and
> inhibits other write operations during this time. Not clear if this has a
> significant impact on front-end latency or not.

The plan has always been to make this an async process, but it wasn't 
trivial enough to do in the first pass.  Some but not all of the 
infrastructure is there.

> (4) WAL Throttling
> 
> There doesn't appear to be a mechanism to limit the amount of outstanding
> WAL data. Some limit is needed here.

There are throttle_wal_{ops,bytes} throttles.  Nothing very sophisticated.  
Sam and Somnath, we should revisit the overall throttling approach given 
what you've learned with FileStore.
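
To make the discussion concrete, here is a minimal sketch of a throttle
that bounds outstanding WAL work by both op count and byte count.  This
is not the BlueStore code: the class and method names are hypothetical,
and the only thing taken from this thread is the idea of separate ops
and bytes limits.

```python
import threading


class WalThrottle:
    """Hypothetical ops+bytes throttle (illustration only, not Ceph's Throttle)."""

    def __init__(self, max_ops, max_bytes):
        self.max_ops = max_ops
        self.max_bytes = max_bytes
        self.cur_ops = 0
        self.cur_bytes = 0
        self._cond = threading.Condition()

    def _fits(self, nbytes):
        # Always admit when idle, so a request larger than max_bytes
        # cannot block forever.
        if self.cur_ops == 0:
            return True
        return (self.cur_ops + 1 <= self.max_ops
                and self.cur_bytes + nbytes <= self.max_bytes)

    def _take(self, nbytes):
        self.cur_ops += 1
        self.cur_bytes += nbytes

    def try_get(self, nbytes):
        """Reserve one op of nbytes without blocking; return True on success."""
        with self._cond:
            if not self._fits(nbytes):
                return False
            self._take(nbytes)
            return True

    def get(self, nbytes):
        """Reserve one op of nbytes, blocking until both limits allow it."""
        with self._cond:
            while not self._fits(nbytes):
                self._cond.wait()
            self._take(nbytes)

    def put(self, nbytes):
        """Release one op of nbytes once its WAL work has been applied."""
        with self._cond:
            self.cur_ops -= 1
            self.cur_bytes -= nbytes
            self._cond.notify_all()
```

A submitter would call get() before queueing a WAL op and put() after
the op has been applied, so whichever limit is hit first applies the
back-pressure.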

> (5) Compression
> 
> While technically not required to make BlueStore production ready, I think
> there will be enough revision of the code and data structures in order to
> support compression that we really want to get this into BlueStore
> before it goes "GA", so that we don't have to re-stabilize it.

Agreed, and I would combine (5.5) Checksums with this as the data 
structures and overall IO flow are essentially the same.
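
As a rough illustration of why the two fit together, the sketch below
keeps a compression flag and a checksum in the same per-blob record and
handles both on the same write and read paths.  Everything here is an
assumption made for the example: the Blob record, the function names,
and the choice of zlib and CRC-32 are hypothetical, not the planned
BlueStore on-disk format.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Blob:
    """Hypothetical per-blob metadata (illustration only)."""
    stored: bytes      # bytes as they would sit on disk
    logical_len: int   # length of the original data
    compressed: bool   # whether 'stored' is zlib-compressed
    csum: int          # CRC-32 over 'stored'


def write_blob(data: bytes) -> Blob:
    packed = zlib.compress(data)
    # Keep the compressed form only if it actually saves space.
    compressed = len(packed) < len(data)
    stored = packed if compressed else data
    # Checksum what is stored, so a read can verify before decompressing.
    return Blob(stored, len(data), compressed, zlib.crc32(stored))


def read_blob(blob: Blob) -> bytes:
    if zlib.crc32(blob.stored) != blob.csum:
        raise IOError("checksum mismatch")
    data = zlib.decompress(blob.stored) if blob.compressed else blob.stored
    if len(data) != blob.logical_len:
        raise IOError("length mismatch after decompression")
    return data
```

In this sketch the checksum covers the stored (possibly compressed)
bytes, which lets a read detect corruption before any decompression is
attempted.  Checksumming the logical data instead is the other obvious
option.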

> (6) Erasure Coding Overwrites
> 
> As previously documented, this feature requires support in BlueStore.
> Fortunately, implementing support is easy and non-disruptive. Hence, one
> might argue that it doesn't belong on this list.

Yeah, this one should be quite straightforward.  We can add it sooner 
rather than later, but there won't be any users (aside from the 
unit/functional tests), so I'm fine waiting.

sage