nice, you don't have two writers.

2011/1/31 Alexander Schreiber <als@xxxxxxxxxxxxxxx>:
> On Mon, Jan 31, 2011 at 12:45:31PM -0200, Roberto Spadim wrote:
>> i think filesystem is a problem...
>> you can't have two writers over a filesystem that allows only one, or
>> you will have filesystem crash (a lot of fsck repair... local cache
>> and other features), maybe a gfs ocfs or another is a better
>> solution...
>
> No, for _our_ use case (replicated disks for VMs running under Xen
> with live migration) the filesystem just _does_ _not_ _matter_ _at_
> _all_. Due to the way Xen live migration works, there is only one
> writer at any one time: the VM "owning" the virtual disk provided
> by drbd.
>
> To illustrate the point, a very short summary of what happens during
> Xen live migration in our setup:
>  - VM is to be migrated from host A to host B, with the virtual block
>    device for the instance being provided by a drbd pair running on
>    those hosts
>  - host A/B are configured primary/secondary
>  - we reconfigure drbd to primary/primary
>  - start Xen live migration
>  - Xen creates a target VM on host B, this VM is not yet running
>  - Xen syncs live VM memory from host A to host B
>  - when most of the memory is synced over, Xen suspends execution of
>    the VM on host A
>  - Xen copies the remaining dirty VM memory from host A to host B
>  - Xen resumes VM execution on host B, destroys the source VM
>    on host A, Xen live migration is completed
>  - we reconfigure drbd on hosts A/B to secondary/primary
>
> There is no concurrent access to the virtual block device here anywhere.
> And the only reason we go primary/primary during live migration is that
> for Xen to attach the disks to the target VM, they have to be available
> and accessible on the target node - as well as on the source node where
> they are currently attached to the source VM.
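
To make the single-writer argument concrete, here is a minimal Python sketch
that replays the steps listed above as a toy state model. It does not talk to
Xen or drbd at all: the Host class, the vm_state values and the function names
are made up for illustration, and the model assumes a VM can only write to its
disk while it is running.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical model of one node; not a real Xen or drbd object."""
    name: str
    drbd_role: str          # "primary" or "secondary"
    vm_state: str = "none"  # "none", "paused" or "running"


def writers(hosts):
    """Names of hosts whose VM is running (assumed: only those can write)."""
    return [h.name for h in hosts if h.vm_state == "running"]


def live_migrate(src, dst):
    """Apply the steps from the mail in order, yielding a label after each."""
    dst.drbd_role = "primary"          # disk must be attachable on both nodes
    yield "drbd primary/primary"
    dst.vm_state = "paused"            # target VM exists but does not run yet
    yield "target VM created"
    yield "memory synced while source runs"
    src.vm_state = "paused"            # source VM is suspended
    yield "source VM suspended"
    yield "remaining dirty memory copied"
    dst.vm_state = "running"           # execution moves to the target
    src.vm_state = "none"              # source VM is destroyed
    yield "target VM resumed, source VM destroyed"
    src.drbd_role = "secondary"        # back to a single primary
    yield "drbd secondary/primary"


def check_single_writer(src, dst):
    """Run the migration and assert that at most one VM can write per step."""
    log = []
    for step in live_migrate(src, dst):
        active = writers([src, dst])
        assert len(active) <= 1, f"two writers after step: {step}"
        log.append((step, src.drbd_role, dst.drbd_role, active))
    return log


if __name__ == "__main__":
    a = Host("A", "primary", "running")
    b = Host("B", "secondary")
    for entry in check_single_writer(a, b):
        print(entry)
```

In this model both nodes are drbd primary for most of the sequence, but the
list of writers never holds more than one host, which is the point made in
the quoted mail.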
>
> Now, if you were doing things like, say, using a primary/primary drbd
> setup for NFS servers serving in parallel from two hosts, then yes,
> you'd have to take special steps with a proper parallel filesystem
> to avoid corruption. But this is a completely different problem.
>
> Kind regards,
>           Alex.
>>
>> 2011/1/31 Alexander Schreiber <als@xxxxxxxxxxxxxxx>:
>> > On Mon, Jan 31, 2011 at 06:42:44AM -0200, Denis wrote:
>> >> 2011/1/29 Alexander Schreiber <als@xxxxxxxxxxxxxxx>:
>> >> > On Sat, Jan 29, 2011 at 12:23:14PM -0200, Denis wrote:
>> >> >> 2011/1/29 Alexander Schreiber <als@xxxxxxxxxxxxxxx>
>> >> >>
>> >> >> >
>> >> >> > plain disk performance for writes, while reads should be reasonably
>> >> >> > close to the plain disk performance - drbd optimizes reads by just reading
>> >> >> > from the local disk if it can.
>> >> >> >
>> >> >> >
>> >> >> However, I have not used it in active-active fashion. Have you? If yes,
>> >> >> what is your overall experience?
>> >> >
>> >> > We are using drbd to provide mirrored disks for virtual machines running
>> >> > under Xen. 99% of the time, the drbd devices run in primary/secondary
>> >> > mode (aka active/passive), but they are switched to primary/primary
>> >> > (aka active/active) for live migrations of domains, as that needs the
>> >> > disks to be available on both nodes. From our experience, if the drbd
>> >> > device is healthy, this is very reliable. No experience with running
>> >> > drbd in primary/primary config for any extended period of time, though
>> >> > (the live migrations are usually over after a few seconds to a minute at
>> >> > most, then the drbd devices go back to primary/secondary).
>> >>
>> >> What filesystem are you using to enable the primary-primary mode? Have
>> >> you evaluated it against any other available option?
>> >
>> > The filesystem is whatever the VM is using, usually ext3.
>> > But the filesystem doesn't matter in our use case at all, because:
>> >  - the backing stores for drbd are logical volumes
>> >  - the drbd block devices are directly exported as block devices
>> >    to the VMs
>> > The filesystem is only active inside the VM - and the VM is not aware of
>> > the drbd primary/secondary -> primary/primary -> primary/secondary dance
>> > that happens "outside" to enable live migration.
>
> --
> "Opportunity is missed by most people because it is dressed in overalls and
>  looks like work."                                      -- Thomas A. Edison
>
>

--
Roberto Spadim
Spadim Technology / SPAEmpresarial
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html