Re: RBD journaling benchmarks

--------------------------------------------------
From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
Sent: Thursday, July 13, 2017 4:45 AM
To: "Maged Mokhtar" <mmokhtar@xxxxxxxxxxx>
Cc: "Mohamad Gebai" <mgebai@xxxxxxxx>; "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: RBD journaling benchmarks

> On Mon, Jul 10, 2017 at 3:41 PM, Maged Mokhtar <mmokhtar@xxxxxxxxxxx> wrote:
>> On 2017-07-10 20:06, Mohamad Gebai wrote:
>>
>>
>> On 07/10/2017 01:51 PM, Jason Dillaman wrote:
>>
>> On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar <mmokhtar@xxxxxxxxxxx> wrote:
>>
>> These are significant differences, to the point where it may not make sense
>> to use rbd journaling / mirroring unless there is only 1 active client.
>>
>> I interpreted the results as meaning that the same RBD image was being concurrently
>> used by two fio jobs -- which we strongly recommend against since it
>> will result in the exclusive-lock ping-ponging back and forth between
>> the two clients / jobs. Each fio RBD job should utilize its own
>> backing image to avoid such a scenario.
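
As a concrete sketch of that advice (the pool, client, and image names here
are illustrative assumptions, not from the benchmark setup), each fio job
can be pointed at its own image via the rbd ioengine:

cat > two-images.fio <<'EOF'
[global]
ioengine=rbd
clientname=admin
pool=rbd
rw=randwrite
bs=4k
iodepth=32

# each job gets a dedicated backing image, so the exclusive
# lock never has to migrate between the two clients
[writer1]
rbdname=image1

[writer2]
rbdname=image2
EOF
fio two-images.fio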
>>
>>
>> That is correct. The single-job runs are more representative of the
>> overhead of journaling alone, and it is worth noting the (expected)
>> inefficiency of multiple clients using the same RBD image, as explained
>> by Jason.
>>
>> Mohamad
>>
>> Yes, I expected a penalty, but not one this large. There are use cases
>> that would benefit from concurrent access to the same block device: in
>> VMware and Hyper-V, several hypervisors can share the same device,
>> formatted with a clustered file system such as Microsoft CSV (Cluster
>> Shared Volumes) or VMFS, which creates a volume/datastore that houses
>> many VMs.
>
> Both of these use-cases would first need support for active/active
> iSCSI. While A/A iSCSI via MPIO is trivial to enable, getting it to
> properly handle failure conditions without the possibility of data
> corruption is not, since it relies heavily on arbitrary initiator- and
> target-based timers. The only realistic and safe solution is to rely
> on an MCS-based active/active implementation.
The case also applies to active/passive iSCSI: you still have many
initiators/hypervisors writing concurrently to the same RBD image through a
clustered file system (CSV/VMFS).

>> I was wondering if such a setup could be supported in the future, and
>> whether there could be a way to minimize the overhead of the exclusive
>> lock: for example, by distributing sequence numbers to the different
>> active client writers and having each writer maintain its own journal. I
>> doubt the overhead would then reach the values you showed.
>
> The journal used by the librbd mirroring feature was designed to
> support multiple concurrent writers. Of course, that original design
> was more in line with the goal of supporting multiple images within a
> consistency group.
Yes, but they would still suffer a performance penalty. My understanding is
that they would need the lock while writing data to the journal entries, and
would thus be waiting turns. Or do they need the lock only for journal
metadata, such as generating a sequence number?
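
For reference, a rough sketch of how the per-image journal in question is
enabled and inspected from the rbd CLI (pool and image names are illustrative):

# journaling requires the exclusive-lock feature on the image
rbd feature enable rbd/image1 exclusive-lock
rbd feature enable rbd/image1 journaling

# show the journal's metadata and its registered clients
rbd journal info --pool rbd --image image1
rbd journal status --pool rbd --image image1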

>> Maged
>>
>>
>
> --
> Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
