Re: RBD journal draft design

On Tue, Jun 9, 2015 at 12:08 PM, Jason Dillaman <dillaman@xxxxxxxxxx> wrote:
>> I must not be being clear. Tell me if this scenario is possible:
>>
>> * Client A writes to file foo many times and it is journaled to object set 1.
>> * Client B writes to file bar many times and it starts journaling to
>> object set 1, but hits the end and moves on to object set 2.
>> * Client A hits a synchronization point in its higher-level logic.
>> * Client A fsyncs file foo to object set 1 and then
>> * Client B hits the synchronization point, fsyncs file bar to object
>> set 2, and sends data back to Client A.
>> * Client A fsyncs the receipt of its data stream to object set 1, and
>> only then gets sent on to object set 2.
>> * The journal copier runs and migrates object set 1 to a remote data
>> center, then the data center explodes.
>> * In the remote data center they fail over and client A thinks it has
>> reached a synchronization point and gotten an acknowledgement that
>> client B has never heard of.
>>
>> Does that being a problem make sense? I don't think handling it is
>> overly complicated and it's kind of important.
>> -Greg
>
> Seems this case is solved if you delay the completion of client B's flush (fsync) until the "active set updated" notification is successfully delivered.  In that case, client A would know that it needs to re-read the active set collection and thus needs to now write to object set 2.  Thoughts?

Honestly, at this point my head's a little wrapped around itself and
I'm not sure. :) I think that however it's set up, we want to switch
from one object set to the next coherently (i.e., no writing to object
set 2 for write N and object set 1 for write N+1) and that we force
each client to switch at the same point. I guess in general the
penalty for having to re-send ops when we find out late that the
object set is full probably wouldn't be a big deal? But I'm not sure
whether sending notifies on the objects is the best option or whether
something else is.
-Greg
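
A minimal sketch of the "switch coherently, re-send when the set turns out
to be full" idea discussed above. It is a toy in-memory model, not the RBD
journal implementation: the names Journal, Client, StaleObjectSet and the
capacity parameter are all hypothetical, and rejecting an append stands in
for whatever mechanism (notify or otherwise) tells a client that the active
set has changed.

```python
class StaleObjectSet(Exception):
    """Raised when an append targets an object set that is full or closed."""


class Journal:
    """Toy journal split into fixed-capacity object sets (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active_set = 1
        self.sets = {1: []}

    def append(self, set_id, entry):
        # A closed set never accepts another entry, so write N+1 cannot
        # land in set 1 once write N has landed in set 2.
        if set_id != self.active_set:
            raise StaleObjectSet(self.active_set)
        # A full set is closed and the next one opened; the caller re-sends.
        if len(self.sets[set_id]) >= self.capacity:
            self.active_set += 1
            self.sets[self.active_set] = []
            raise StaleObjectSet(self.active_set)
        self.sets[set_id].append(entry)


class Client:
    """Caches the active set and re-sends when told that the cache is stale."""

    def __init__(self, name, journal):
        self.name = name
        self.journal = journal
        self.cached_set = journal.active_set
        self.resends = 0

    def write(self, payload):
        while True:
            try:
                self.journal.append(self.cached_set, (self.name, payload))
                return self.cached_set
            except StaleObjectSet as stale:
                # Late discovery: refresh the cached set and re-send the op.
                self.cached_set = stale.args[0]
                self.resends += 1


journal = Journal(capacity=3)
a = Client("A", journal)
b = Client("B", journal)

placements = [
    a.write("foo-1"),
    b.write("bar-1"),
    b.write("bar-2"),
    b.write("bar-3"),  # set 1 is full: B advances the journal and re-sends
    a.write("foo-2"),  # A still has set 1 cached: rejected, then re-sent
]
print(placements, a.resends, b.resends)
```

In this model each client pays at most one re-send per switch, and the
sequence of set numbers that writes land in never decreases, which is the
coherence property asked for above.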