On 2010-11-06T05:32:03, Neil Brown <neilb@xxxxxxx> wrote:

> Hi Lars,
> the only issue that occurs to me is that if you want to report the first
> success, then you need to copy the data to a private buffer before
> submitting the write. Then wait for all writes to complete before freeing
> the buffer. If you just return the first write the page would be unlocked
> and so could be changed while another path was still writing it out.

Right. This is, in a way, a mix of MPIO / RAID1 handling. We'd indeed
need to have the write block several times - thankfully, we write really
rarely and only one sector at a time, so the memory consumption is
trivial.

(However, we _really_ want to get those writes to disk. Right away.)

> Finding a way to signal 'write all paths' sounds tricky. This flag needs to
> be state of the file descriptor, not the whole device, so it would need to be
> an fcntl rather than an ioctl. And defining new fcntls is a lot harder
> because they need to be more generic - you cannot really make them device
> specific...

> Might it make sense to configure a range of the device where writes always
> went down all paths? That would seem to fit with your problem description
> and might be easiest??

Technically, it'd be possible, because that section is contiguous on the
disk, yes. (Note that we don't open a real file in a file system, but
use a raw block device; however, that could be a partition on top of
MPIO.)

But I'm a bit unclear how we'd define that; clearly, we don't want to
bypass multipathd management of the MPIO mapping, that being the whole
point why we don't just handle that in user-space ;-)

Hrm. I already have a dm-linear mapping (thanks to kpartx; otherwise
it's trivially introduced). I could modify that to include a special
flag that would mangle the bios that pass through - so I could set a bio
flag that multipath could then act on ...? (There's precedent; the
failfast bio flag.)
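To make the buffer-lifetime point concrete, here is a rough user-space sketch in Python of the private-buffer scheme Neil describes: copy the sector, send it down every path, report the first success, and only let go of the copy once all writes have finished. This is an illustration only, not kernel code and not how dm-multipath works; `write_all_paths` and `wait_all` are made-up names, and the "paths" are assumed to be plain file descriptors opened for writing.

```python
import os
import queue
import threading


def write_all_paths(fds, data, offset=0):
    """Hypothetical helper: write `data` to every fd, return on first success.

    Returns (ok, wait_all). `ok` is True once any one path has written and
    synced the data, False if every path failed. `wait_all()` blocks until
    the remaining writes have completed.
    """
    # Private copy: the caller may modify `data` as soon as we return,
    # while slower paths are still writing from `private`.
    private = bytes(data)
    done = queue.Queue()

    def worker(fd):
        try:
            os.pwrite(fd, private, offset)
            os.fsync(fd)  # we really want the write on disk right away
            done.put(True)
        except OSError:
            done.put(False)

    threads = [threading.Thread(target=worker, args=(fd,)) for fd in fds]
    for t in threads:
        t.start()

    # Report the first success; give up only when every path has failed.
    ok = False
    for _ in threads:
        if done.get():
            ok = True
            break

    def wait_all():
        # The private copy stays referenced by the workers until they
        # finish; after this returns it can be garbage-collected.
        for t in threads:
            t.join()

    return ok, wait_all
```

The caller is free to reuse its own buffer once `write_all_paths` returns, but must call `wait_all()` before treating the write as complete on every path. With one sector per write, the extra copy costs next to nothing.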
Regards,
    Lars

--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde