Re: architecture help (iscsi, rbd, backups?)

Hi Angelo,

You can always use Samba to serve shares; it works well with AD, if that
is needed.  You may want to benchmark your prototypes in a setting as
close to production as possible.

--
Alex Gorbachev
ISS Storcium
iss-integration.com



On Sat, Apr 29, 2023 at 10:58 PM Angelo Hongens <angelo@xxxxxxxxxx> wrote:

>
> Thanks Alex, interesting perspectives.
>
> I had already thought about Proxmox as well, and that would also work
> quite nicely. I think that would be the most performant option for
> putting VMs on RBD.
>
> But my entire goal was to run SMB servers on top of that hypervisor
> layer, to serve SMB shares to Windows.
>
> So I think Bailey's suggestion makes more sense: use CephFS with Linux
> SMB gateways, which cuts out a layer in between and should greatly
> improve performance.
>
> That also has the benefit of letting me use a single 1PB CephFS
> filesystem served by multiple SMB gateways, instead of my initial plan
> of having something like 10x100TB Windows SMB file servers (I would not
> dare run a single 1PB Windows VM with an NTFS disk).
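>
> Carving that single filesystem into per-department directories with
> CephFS quotas would then give me roughly the same boundaries as the
> 10x100TB servers. Below is a minimal sketch of the idea using the
> official libcephfs Python binding; the paths and the 100TB cap are
> placeholders, and the client key needs the 'p' flag in its MDS caps to
> set quota xattrs:
>
>     # sketch: create a share directory and cap it at ~100TB
>     # (paths and sizes are hypothetical examples)
>     import cephfs
>
>     fs = cephfs.LibCephFS(conffile='/etc/ceph/ceph.conf')
>     fs.mount()  # default filesystem, mounted at its root
>
>     share = '/shares/research'
>     fs.mkdirs(share, 0o770)
>
>     # CephFS quotas are plain xattrs on the directory
>     quota = 100 * 1024**4  # ~100TiB, placeholder
>     fs.setxattr(share, 'ceph.quota.max_bytes', str(quota).encode(), 0)
>
>     fs.unmount()
>     fs.shutdown()
>
> Each SMB gateway would then just export its own subdirectory tree.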
>
>
>
> Angelo.
>
> On 27/04/2023 20:05, Alex Gorbachev wrote:
> > Hi Angelo,
> >
> > Just some thoughts to consider from our experience with similar setups:
> >
> > 1. Use Proxmox instead of VMware, or anything KVM-based.  These VMs can
> > consume Ceph directly, and provide the same level of service (some may
> > say better) for live migration, hyperconvergence, etc.  Then you run
> > Windows VMs in KVM, bring RBD storage to them as virtual disks, and
> > share it out as needed.
> >
> > 2. Use NFS - all modern Windows OSs support it.  You can use any NFS
> > gateway you like, or set up your own machine or cluster (which is what
> > we did with Storcium) and export your storage as needed.
> >
> > 3. If you must use VMware, you can present datastores via NFS as well;
> > this has a lot of indirection but is easier to manage.
> >
> > --
> > Alex Gorbachev
> > ISS Storcium
> > https://www.iss-integration.com
> >
> >
> >
> > On Thu, Apr 27, 2023 at 5:06 PM Angelo Höngens <angelo@xxxxxxxxxx> wrote:
> >
> >     Hey guys and girls,
> >
> >     I'm working on a project to build storage for one of our
> >     departments, and I want to ask you guys and girls for input on the
> >     high-level overview part. It's a long one; I hope you read along
> >     and comment.
> >
> >     SUMMARY
> >
> >     I made a plan last year to build a 'storage solution' including
> >     Ceph and some Windows VMs to expose the data over SMB to clients.
> >     A year later I finally have the hardware, built a Ceph cluster,
> >     and I'm doing tests. Ceph itself runs great, but when I wanted to
> >     start exposing the data using iSCSI to our VMware farm, I ran into
> >     some issues. I know the iSCSI gateways will introduce some new
> >     performance bottlenecks, but I'm seeing really slow performance;
> >     I'm still working on that.
> >
> >     But then I ran into the warning on the iSCSI gateway page: "The
> >     iSCSI gateway is in maintenance as of November 2022. This means
> >     that it is no longer in active development and will not be updated
> >     to add new features." Wait, what? Why!? What does this mean? Does
> >     this mean that iSCSI is now 'feature complete' and will still be
> >     supported for the next 5 years, or will it be deprecated in the
> >     future? I tried searching, but couldn't find any info on the
> >     decision and the roadmap.
> >
> >     My goal is to build a future-proof setup, and deprecated
> >     components should of course not be part of that.
> >
> >     If the iSCSI gateway will still be supported for the next few
> >     years and I can iron out the performance issues, I can still go on
> >     with my original plan. If not, I have to go back to the drawing
> >     board. And maybe you guys would advise me to take another route
> >     anyway.
> >
> >     GOALS
> >
> >     My goals/considerations are:
> >
> >     - we want >1PB of storage capacity for cheap (on a tight budget)
> >     for research data. Most of it is 'store once, read sometimes'.
> >     <1% of the data is 'hot'.
> >     - focus is on capacity, but it would be nice to have >200MB/s of
> >     sequential write/read performance and not 'totally suck' on
> >     random i/o. Yes, not very well quantified, but ah. Sequential
> >     writes are most important.
> >     - end users all run Windows computers (mostly VDIs) and a lot of
> >     applications require SMB shares.
> >     - security is a big thing, we want really tight ACLs, specific
> >     monitoring agents, etc.
> >     - our data is incredibly important to us, so we still want the
> >     3-2-1 backup rule: a primary storage solution, a second storage
> >     solution in a different place, and some of the data that is not
> >     reproducible is also written to tape. We also want to be
> >     protected from ransomware or user errors (so no direct
> >     replication to the second storage).
> >     - I like open source, reliability, no fork-lift upgrades, no
> >     vendor lock-in, blah, well, I'm on the Ceph list here, no need to
> >     convince you guys ;)
> >     - We're hiring a commercial company to do Ceph maintenance and
> >     support for when I'm on leave or leaving the company, but they
> >     won't support clients, backup software, etc., so I want something
> >     as simple as possible. We do have multiple Windows/VMware admins,
> >     but no other real Linux gurus.
> >
> >     THE INITIAL PLAN
> >
> >     Given these considerations, I ordered two identical clusters, each
> >     consisting of 3 monitor nodes and 8 OSD nodes. Each OSD node has 2
> >     SSDs and 10 capacity disks (EC 4:2 for the data), and each node is
> >     connected using a 2x25Gbps bond. Ceph is running like a charm. Now
> >     I just have to think about exposing the data to end users, and
> >     I've been testing different setups.
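> >
> >     As a back-of-the-envelope check against the >1PB goal, this is the
> >     math for that layout (the 18TB disk size is only an assumed
> >     example, I'm not listing the actual drive size here):
> >
> >         # usable capacity estimate for 8 OSD nodes with EC 4:2
> >         # (the disk size is a hypothetical example)
> >         nodes, disks_per_node = 8, 10
> >         disk_tb = 18                      # assumed drive size
> >         k, m = 4, 2                       # EC data/coding chunks
> >
> >         raw_tb = nodes * disks_per_node * disk_tb
> >         usable_tb = raw_tb * k / (k + m)  # ~67% EC efficiency
> >         print(f"raw {raw_tb} TB, usable ~{usable_tb:.0f} TB")
> >         # -> raw 1440 TB, usable ~960 TB (before other overhead)
> >
> >     So the EC overhead alone costs about a third of the raw capacity.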
> >
> >     My original plan was to expose, for example, 10x100TB RBD images
> >     using iSCSI to our VMware farm, format the LUNs with VMFS6, and
> >     run for example 2 Windows file servers per datastore, with a
> >     single DFS namespace for end users. Then back up the file servers
> >     using our existing Veeam infrastructure to RGW running on the
> >     second cluster with an immutable bucket. This way we would have
> >     easily defined security boundaries: the clients can only reach the
> >     file servers, the file servers only see their local VMDKs, ESX
> >     only sees the LUNs on the iSCSI target, etc. If a file server were
> >     compromised, it would have no access to Ceph. We would have easy
> >     incremental backups, immutability for ransomware protection, etc.
> >     And the best part is that the Ceph admin can worry about Ceph, the
> >     VMware admin can focus on ESX, VMFS and all the VMware stuff, and
> >     the Windows admins can focus on the Windows boxes, Windows-specific
> >     ACLs and tools, Veeam backups, and so on.
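> >
> >     For reference, this is roughly what the immutable Veeam target on
> >     the second cluster would look like, sketched with boto3 against
> >     RGW's S3 API. The endpoint, credentials, bucket name and 30-day
> >     retention below are placeholders, not our real values:
> >
> >         # sketch: object-lock-enabled bucket on the backup cluster RGW
> >         # (endpoint, keys, bucket name and retention are hypothetical)
> >         import boto3
> >
> >         s3 = boto3.client(
> >             's3',
> >             endpoint_url='https://rgw.backup.example.com',
> >             aws_access_key_id='VEEAM_ACCESS_KEY',
> >             aws_secret_access_key='VEEAM_SECRET_KEY',
> >         )
> >
> >         # object lock can only be enabled when the bucket is created
> >         s3.create_bucket(Bucket='veeam-immutable',
> >                          ObjectLockEnabledForBucket=True)
> >
> >         # optional backstop: a default retention rule; backup software
> >         # that manages immutability per object may not want this
> >         s3.put_object_lock_configuration(
> >             Bucket='veeam-immutable',
> >             ObjectLockConfiguration={
> >                 'ObjectLockEnabled': 'Enabled',
> >                 'Rule': {'DefaultRetention': {'Mode': 'COMPLIANCE',
> >                                               'Days': 30}},
> >             },
> >         )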
> >
> >     CURRENT SITUATION
> >
> >     I'm building out this plan now, but I'm running into issues with
> >     iSCSI. Are any of you doing something similar? What is your iSCSI
> >     performance compared to direct RBD?
> >
> >     In regard to performance: if I take two test Windows VMs, put one
> >     on an iSCSI datastore and give the other direct RBD access using
> >     the Windows RBD driver, then create a share on each box and push
> >     data to it, I see different results (of course). Copying some ISO
> >     images over SMB to the 'Windows VM running direct RBD' I see
> >     around 800MB/s write and 200MB/s read, which is pretty okay. When
> >     I send data to the 'Windows VM running on top of iSCSI' it starts
> >     writing at around 350MB/s, but after 10-20 seconds it drops to
> >     100MB/s and won't go faster. Reads are anywhere from 40MB/s to
> >     80MB/s, which is not really acceptable.
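> >
> >     For comparison, a raw librbd baseline from a Linux test box could
> >     be measured with something like this minimal sketch using the
> >     official rbd Python binding (the pool name, image name and 10GiB
> >     test size are placeholders):
> >
> >         # sketch: sequential 4MiB writes straight through librbd
> >         # (pool/image names and the test size are hypothetical)
> >         import os
> >         import time
> >         import rados
> >         import rbd
> >
> >         CHUNK = 4 * 1024 ** 2   # 4MiB per write
> >         TOTAL = 10 * 1024 ** 3  # 10GiB test image
> >
> >         cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
> >         cluster.connect()
> >         ioctx = cluster.open_ioctx('testpool')
> >
> >         rbd.RBD().create(ioctx, 'bench-img', TOTAL)
> >         with rbd.Image(ioctx, 'bench-img') as img:
> >             data = os.urandom(CHUNK)  # avoid compression effects
> >             start = time.time()
> >             offset = 0
> >             while offset < TOTAL:
> >                 img.write(data, offset)
> >                 offset += CHUNK
> >             img.flush()
> >             secs = time.time() - start
> >             print(f"{TOTAL / secs / 1e6:.0f} MB/s sequential write")
> >
> >         rbd.RBD().remove(ioctx, 'bench-img')
> >         ioctx.close()
> >         cluster.shutdown()
> >
> >     That at least gives a baseline to compare the iSCSI and in-guest
> >     numbers against.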
> >
> >     Another really viable and performant scenario would be to have the
> >     Windows file servers connect to RBD directly with the Windows RBD
> >     driver. It seems to work well, it's fast, and you don't have the
> >     bottleneck that the iSCSI gateway creates. But I see this driver
> >     is still in beta. Is anyone using it in production? What are your
> >     experiences? We would miss out on the separation of layers and
> >     thus have less security, but at the same time, it really increases
> >     efficiency and performance.
> >
> >     And if I use RBD, then VMware won't see the storage, and I cannot
> >     do an image backup using Veeam. I could of course back up the RBD
> >     images with tools like restic or backy to RGW running on the
> >     second cluster with immutable buckets. What are your experiences?
> >     Is it easy to do differential backups of lots of 50TB RBD images?
> >     The change rate is usually something like 0.005% per day ;)
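> >
> >     From what I understand, RBD snapshots plus 'rbd export-diff
> >     --from-snap' are the usual building blocks for incrementals. To
> >     get a feel for the actual change rate between two daily snapshots,
> >     something like this sketch with the rbd Python binding should work
> >     (pool, image and snapshot names are placeholders):
> >
> >         # sketch: count bytes changed between two daily snapshots
> >         # (pool, image and snapshot names are hypothetical)
> >         import rados
> >         import rbd
> >
> >         cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
> >         cluster.connect()
> >         ioctx = cluster.open_ioctx('filedata')
> >
> >         changed = 0
> >
> >         def on_extent(offset, length, exists):
> >             # called for every extent that differs between snapshots
> >             global changed
> >             changed += length
> >
> >         # open the image at the newer snapshot, diff from the older one
> >         with rbd.Image(ioctx, 'fileserver01',
> >                        snapshot='daily-2023-04-28') as img:
> >             img.diff_iterate(0, img.size(), 'daily-2023-04-27',
> >                              on_extent)
> >
> >         print(f"changed since previous snap: {changed / 1e9:.2f} GB")
> >
> >         ioctx.close()
> >         cluster.shutdown()
> >
> >     An actual incremental could then read just those extents (or
> >     simply shell out to rbd export-diff) and push them to the object
> >     store.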
> >
> >     By the way, we also thought about CephFS, but we have some complex
> >     stuff going on with extended ACLs that I don't think will play
> >     nice with CephFS, and I think it's a lot more complex to back up
> >     CephFS than block images.
> >
> >     If you made it here, thank you for your time! I hope you can share
> >     thoughts on my questions!
> >
> >     Angelo.
> >     _______________________________________________
> >     ceph-users mailing list -- ceph-users@xxxxxxx
> >     To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



