Re: architecture help (iscsi, rbd, backups?)

There is also a direct RBD client for MS Windows, though it's relatively young.

> On Apr 27, 2023, at 18:20, Bailey Allison <ballison@xxxxxxxxxxxx> wrote:
> 
> Hey Angelo,
> 
> Just to make sure I'm understanding correctly, the main idea for the use
> case is to present Ceph storage to Windows clients over SMB?
> 
> If so, you can absolutely use CephFS to get that done. This is something we
> do all the time in our cluster configurations; when we need to present Ceph
> storage to Windows clients as a file server, it's our standard choice. To
> your point about security/ACLs, we can join the Samba server to an existing
> Active Directory and then assign permissions through Windows.
> 
> I will provide a high-level overview of an average setup to hopefully
> explain it better, and of course if you have any questions please let me
> know. I understand this is a very different setup from what you currently
> have planned, but it's an alternative that could prove useful in your case.
> 
> Essentially how it works is: we have a Ceph cluster with CephFS configured,
> we map CephFS kernel mounts onto some gateway nodes, and from there we
> expose SMB shares to clients via Samba, with CTDB for high availability.
> 
> i.e.:
> 
> ceph cluster > ceph fs > map cephfs kernel mount on linux client > create
> smb share on top of cephfs kernel mount > connect to samba share with
> windows clients.
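> 
> To make that a bit more concrete, a trimmed-down sketch of the Samba side on
> one of those gateway nodes could look like the following (paths, share name
> and the CephFS mount point are just placeholders, not a complete config):
> 
>   # /etc/samba/smb.conf -- CephFS kernel mount assumed at /mnt/cephfs
>   [global]
>       clustering = yes              # run under CTDB for HA
>       vfs objects = acl_xattr       # keep NT ACLs in extended attributes
>       map acl inherit = yes
>       store dos attributes = yes
> 
>   [research]
>       path = /mnt/cephfs/research
>       read only = no
> 
>   # CTDB then needs /etc/ctdb/nodes (private IPs of all gateways) and
>   # /etc/ctdb/public_addresses (floating IPs the clients connect to),
>   # plus a recovery lock on shared storage, e.g. somewhere on the CephFS
>   # mount itself.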
> 
> The SMB gateway nodes hosting Samba can also be joined to an Active
> Directory, which lets you set Windows ACLs for more in-depth permission
> control.
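> 
> The AD side is the usual Samba "security = ads" setup; roughly something
> like this (realm, workgroup and the admin account are placeholders, and the
> winbind/idmap config for mapping AD users to UIDs is omitted):
> 
>   # in the [global] section of smb.conf on each gateway
>   security = ads
>   realm = AD.EXAMPLE.COM
>   workgroup = EXAMPLE
> 
>   # join the domain once per gateway, then manage permissions from a
>   # Windows box via the share's security tab (or the usual MMC tools)
>   net ads join -U Administrator
>   net ads testjoin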
> 
> Also, +1 for the RBD driver on Windows; it's something we use a lot as well
> and have had a lot of success with.
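> 
> For reference, with the Ceph for Windows client installed (it ships the WNBD
> driver), attaching an image from a Windows box is roughly this; pool/image
> names are placeholders:
> 
>   # on any Ceph admin host: create the image
>   rbd create --size 10T rbd/winfs01
> 
>   # on the Windows box (PowerShell), with ceph.conf + keyring in place
>   rbd device map rbd/winfs01
>   rbd device list
>   # the image then shows up as a regular disk to format in Disk Management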
> 
> Again, please let me know if you need any insight or clarification, or have
> any further questions. Hope this is of assistance.
> 
> Regards,
> 
> Bailey
> 
> -----Original Message-----
>> From: Angelo Höngens <angelo@xxxxxxxxxx> 
>> Sent: April 27, 2023 6:06 PM
>> To: ceph-users@xxxxxxx
>> Subject:  architecture help (iscsi, rbd, backups?)
>> 
>> Hey guys and girls,
>> 
>> I'm working on a project to build storage for one of our departments, and I
> want to ask you guys and girls for input on the high-level overview part.
> It's a long one; I hope you read along and comment.
>> 
>> SUMMARY
>> 
>> I made a plan last year to build a 'storage solution' including Ceph and
> some Windows VMs to expose the data over SMB to clients. A year later I
> finally have the hardware, I've built a Ceph cluster, and I'm doing tests.
> Ceph itself runs great, but when I wanted to start exposing the data using
> iSCSI to our VMware farm, I ran into some issues. I know the iSCSI gateways
> will introduce some new performance bottlenecks, but I'm seeing really slow
> performance; I'm still working on that.
>> 
>> But then I ran into the warning on the iscsi gateway page: "The iSCSI
> gateway is in maintenance as of November 2022. This means that it is no
> longer in active development and will not be updated to add new features.".
> Wait, what? Why!? What does this mean? Does this mean that iSCSI is now
> 'feature complete' and will still be supported the next 5 years, or will it
> be deprecated in the future? I tried searching, but couldn't find any info
> on the decision and the roadmap.
>> 
>> My goal is to build a future-proof setup, and using deprecated components
> should not be part of that of course.
>> 
>> If the iscsi gateway will still be supported the next few years and I can
> iron out the performance issues, I can still go on with my original plan. If
> not, I have to go back to the drawing board. And maybe you guys would advise
> me to take another route anyway.
>> 
>> GOALS
>> 
>> My goals/considerations are:
>> 
>> - we want >1PB of storage capacity for cheap (on a tight budget) for
> research data. Most of it is 'store once, read sometimes'. <1% of the data
> is 'hot'.
>> - focus is on capacity, but it would be nice to have > 200MB/s of
> sequential write/read performance and not 'totally suck' on random i/o. Yes,
> not very well quantified, but ah. Sequential writes are most important.
>> - end users all run Windows computers (mostly VDIs), and a lot of
> applications require SMB shares.
>> - security is a big thing: we want really tight ACLs, specific monitoring
> agents, etc.
>> - our data is incredibly important to us, so we still want the 3-2-1 backup
> rule: a primary storage solution, a second storage solution in a different
> place, and some of the data that is not reproducible also written to
> tape. We also want to be protected from ransomware and user errors (so no
> direct replication to the second storage).
>> - I like open source, reliability, no fork-lift upgrades, no vendor
> lock-in, blah, well, I'm on the ceph list here, no need to convince you guys
> ;)
>> - We're hiring a commercial company to do Ceph maintenance and support for
> when I'm on leave or leaving the company, but they won't support clients,
> backup software, etc., so I want something as simple as possible. We do have
> multiple Windows/VMware admins, but no other real Linux gurus.
>> 
>> THE INITIAL PLAN
>> 
>> Given these considerations, I ordered two identical clusters, each
> consisting of 3 monitor nodes and 8 OSD nodes. Each OSD node has 2 SSDs and
> 10 capacity disks (EC 4:2 for the data), and each node is connected using a
> 2x25Gbps bond. Ceph is running like a charm. Now I just have to think about
> exposing the data to end users, and I've been testing different setups.
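>> 
>> For context, the EC part is roughly the standard recipe; profile and pool
> names below are placeholders, not necessarily what I used:
> 
>   ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
>   # PG count is just an example; the autoscaler can manage it
>   ceph osd pool create rbd-data 128 128 erasure ec42
>   ceph osd pool set rbd-data allow_ec_overwrites true
>   ceph osd pool application enable rbd-data rbd
>   # RBD keeps metadata in a replicated pool (here an existing pool named
>   # 'rbd') and writes the data to the EC pool via --data-pool:
>   rbd create --size 100T --data-pool rbd-data rbd/datastore01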
>> 
>> My original plan was to expose for example 10x100TB RBD images using iSCSI
> to our VMware farm, format the LUNs with VMFS6, and run for example 2
> Windows file servers per datastore on that, with a single DFS namespace for
> end users. Then back up the file servers using our existing Veeam
> infrastructure to RGW running on the second cluster with an immutable
> bucket. This way we would have easily defined security boundaries: the
> clients can only reach the file servers, the file servers only see their
> local VMDKs, ESX only sees the LUNs on the iSCSI target, etc. If a file
> server were compromised, it would have no access to Ceph. We get easy
> incremental backups, immutability for ransomware protection, etc. And the
> best part is that the Ceph admin can worry about Ceph, the VMware admin can
> focus on ESX, VMFS and all the VMware stuff, and the Windows admins can
> focus on the Windows boxes, Windows-specific ACLs, tools, Veeam backups and
> so on.
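>> 
>> The immutable bucket part would be plain S3 object lock on RGW, if I read
> the docs right; something like this (endpoint and bucket name are
> placeholders, and Veeam can also drive the immutability itself when the
> bucket is added as object storage):
> 
>   aws --endpoint-url https://rgw.backup.example s3api create-bucket \
>       --bucket veeam-repo --object-lock-enabled-for-bucket
>   aws --endpoint-url https://rgw.backup.example s3api \
>       put-object-lock-configuration --bucket veeam-repo \
>       --object-lock-configuration \
>       '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'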
>> 
>> CURRENT SITUATION
>> 
>> I'm building out this plan now, but I'm running into issues with iSCSI. Are
> any of you doing something similar? What is your iSCSI performance compared
> to direct RBD?
>> 
>> In regard to performance: I took 2 test Windows VMs, put one on an iSCSI
> datastore and gave the other direct RBD access using the Windows RBD driver,
> created a share on each box and pushed data to it, and (of course) I see
> different results. Copying some ISO images over SMB to the 'Windows VM
> running direct RBD' I see around 800MB/s write and 200MB/s read, which is
> pretty okay. When I send data to the 'Windows VM running on top of iSCSI' it
> starts writing at around 350MB/s, but after 10-20 seconds it drops to
> 100MB/s and won't go faster. Reads are anywhere from 40MB/s to 80MB/s, which
> is not really acceptable.
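>> 
>> If it helps to compare notes: a quick way to separate "the cluster is slow"
> from "the gateway is slow" would be to benchmark outside of VMware entirely,
> something like this (pool/image names are placeholders):
> 
>   # raw cluster baseline, 4MB objects
>   rados bench -p rbd 30 write --no-cleanup
>   rados bench -p rbd 30 seq
> 
>   # RBD image baseline from a Linux box, bypassing the iSCSI gateway
>   rbd bench --io-type write --io-size 4M --io-total 10G rbd/testimg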
>> 
>> Another really viable and performant scenario would be to have the Windows
> file servers connect to RBD directly with the Windows RBD driver. It seems
> to work well, it's fast, and you don't have the bottleneck that the iSCSI
> gateway creates. But I see this driver is still in beta. Is anyone using
> this in production? What are your experiences? We would miss out on the
> separation of layers and thus have less security, but at the same time it
> really increases efficiency and performance.
>> 
>> And if I use RBD, then VMware won't see the storage, and I cannot do an
> image backup using Veeam. I could of course back up the RBD images with
> tools like restic or backy to RGW running on the second cluster with
> immutable buckets. What are your experiences? Is it easy to do differential
> backups of lots of 50TB RBD images? The change rate is usually like 0.005%
> per day or something ;)
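>> 
>> From what I've read, the building blocks for that would be RBD snapshots
> plus export-diff, roughly like this (image and snapshot names are
> placeholders):
> 
>   # day 1: snapshot and full export
>   rbd snap create rbd/fileserver01@day1
>   rbd export rbd/fileserver01@day1 fileserver01-day1.img
> 
>   # day 2: snapshot and export only the blocks changed since day1
>   rbd snap create rbd/fileserver01@day2
>   rbd export-diff --from-snap day1 rbd/fileserver01@day2 day1-to-day2.diff
> 
>   # restore side: replay the diff onto an image that already has @day1
>   rbd import-diff day1-to-day2.diff rbd/fileserver01-restore
> 
> The exported files would then be what gets pushed to RGW (or fed to restic),
> rather than the raw image.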
>> 
>> By the way, we also thought about CephFS, but we have some complex stuff
> going on with extended ACLs that I don't think will play nice with CephFS,
> and I think it's a lot more complex to back up CephFS than block images.
>> 
>> If you made it here, thank you for your time! I hope you can share thoughts
> on my questions!
>> 
>> Angelo.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



