Thanks :) If someone can help regarding the question below, that would be great!

"
> > For VMs, I am trying to visualize how the RBD device would be exposed.
> > Where does the driver live exactly? If it's exposed via libvirt and
> > QEMU, does the kernel driver run in the host OS and communicate with
> > a backend Ceph cluster? If yes, does libRBD provide a target (SCSI?)
> > interface which the kernel driver connects to? Trying to visualize
> > what the stack looks like, and the flow of IOs for block devices.
"

MW

On Fri, Oct 10, 2014 at 10:18 AM, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> Just curious, what kind of applications use RBD? It can't be
>> applications which need high-speed SAN storage performance
>> characteristics?
>
>
> Most people seem to be using it as storage for OpenStack.
>
> I've heard about people using RBD + Heartbeat to build an HA NFS server
> while they wait for CephFS to be production ready.
>
> People that are re-exporting images via iSCSI and Fibre Channel are probably
> doing something different. If I had to hazard a guess, I'd guess that
> they're running some sort of HA clustered service, like a database. That's
> the traditional use for shared storage.
>
>
>>
>> For VMs, I am trying to visualize how the RBD device would be exposed.
>> Where does the driver live exactly? If it's exposed via libvirt and
>> QEMU, does the kernel driver run in the host OS and communicate with
>> a backend Ceph cluster? If yes, does libRBD provide a target (SCSI?)
>> interface which the kernel driver connects to? Trying to visualize
>> what the stack looks like, and the flow of IOs for block devices.
>
>
> I'll have to leave that for others to answer.
>
>
>> >> b. If it is strongly consistent, is that the case across sites also?
>> >> How can it be performant across geo sites if that is the case? If it's
>> >> choosing consistency over availability and partition tolerance... For object
>> >> storage, I read somewhere that it is now eventually consistent (local CP,
>> >> remotely AP) via DR. It gets a bit confusing with all the literature out
>> >> there. If it is DR, isn't that slightly different from the Swift case?
>> >
>> >
>> > If you're referring to RadosGW Federation, no. That replication is async.
>> > The replication has several delays built in, so the fastest you could see
>> > your data show up in the secondary is about a minute. Longer if the file
>> > takes a while to transfer, or you have a lot of activity to replicate.
>> >
>> > Each site is still CP. There is just a delay getting data from the
>> > primary to the secondary.
>>
>> In that case, it is like Swift, only done differently. The async makes
>> it eventually consistent across sites, no?
>
>
> I'm not sure regarding Swift. Also outside my experience.
>
> But yes, the async replication is eventually consistent, with no guarantee.
> Problems during replication can cause the clusters to get out of sync. The
> replication agent will retry failures, but it doesn't store that information
> anywhere. If you restart the replication agent while it has known failures,
> those failures won't be retried. Every one of the errors is logged, so I
> was able to manually download & re-upload the file to the primary cluster,
> which triggered re-replication.
>
> So far, all of the inconsistencies have shown up by comparing bucket
> listings. I'm in the process of manually verifying checksums (my application
> stores a SHA256 for every object uploaded).
> So far, I haven't had any failures in files that were marked as successfully
> replicated.
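To make the RBD question above a bit more concrete: as I understand it, when libvirt/QEMU expose an RBD image to a guest, QEMU itself links against librbd in userspace and speaks the RADOS protocol directly to the OSDs, so no kernel RBD driver is needed on the host; the krbd kernel module ("rbd map", which creates /dev/rbdN) is a separate, alternative path. Below is a minimal sketch of the userspace librbd path using the python-rbd bindings; the pool name "rbd" and image name "myimage" are placeholders I made up for illustration:

    import rados
    import rbd

    # Connect to the cluster exactly the way a librbd client (e.g. QEMU) would:
    # over the network to the monitors/OSDs, entirely in userspace.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('rbd')          # pool name (placeholder)
        try:
            rbd.RBD().create(ioctx, 'myimage', 4 * 1024 ** 3)   # 4 GiB image
            image = rbd.Image(ioctx, 'myimage')
            try:
                # Reads and writes go straight to the OSDs via librbd;
                # no /dev/rbdN device and no host kernel driver involved.
                image.write(b'hello from userspace', 0)
                print(image.read(0, 20))
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

The guest never sees librbd at all: QEMU presents the image as an ordinary virtio/IDE/SCSI disk, so the "target" the guest talks to lives inside QEMU rather than in the host kernel.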
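Regarding the bucket-listing and checksum comparison Craig describes: here is a minimal sketch (not his actual tooling) of how one might diff the same bucket between the primary and secondary RadosGW zones with boto. The endpoints, credentials, and bucket name are placeholders; a full check would re-download each object and compare it against the SHA256 the application stored at upload time, as noted in the comments.

    import boto
    import boto.s3.connection

    def list_keys(host, access_key, secret_key, bucket_name):
        """Return {key_name: etag} for every object in the bucket."""
        conn = boto.connect_s3(
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key,
            host=host,
            calling_format=boto.s3.connection.OrdinaryCallingFormat(),
        )
        bucket = conn.get_bucket(bucket_name)
        return {key.name: key.etag for key in bucket.list()}

    primary   = list_keys('rgw-primary.example.com',   'PRIMARY_KEY',   'PRIMARY_SECRET',   'mybucket')
    secondary = list_keys('rgw-secondary.example.com', 'SECONDARY_KEY', 'SECONDARY_SECRET', 'mybucket')

    # Objects missing from the secondary, or present with a different ETag.
    for name, etag in sorted(primary.items()):
        if name not in secondary:
            print('missing on secondary: %s' % name)
        elif secondary[name] != etag:
            print('etag mismatch: %s' % name)
            # ETags are not reliable for multipart uploads; a stronger check
            # re-downloads the object and compares it to the stored SHA256.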