Hi Bart,

On Sun, Sep 27, 2020 at 2:03 AM Bart Van Assche <bvanassche@xxxxxxx> wrote:
>
> On 2020-09-09 04:42, Danil Kipnis wrote:
> > On Fri, Sep 4, 2020 at 5:33 PM Bart Van Assche <bvanassche@xxxxxxx> wrote:
> >> On 2020-09-04 04:35, Danil Kipnis wrote:
> >>> On Thu, Sep 3, 2020 at 1:07 AM Bart Van Assche <bvanassche@xxxxxxx> wrote:
> >>>> How will it be guaranteed that the resulting software does
> >>>> not suffer from the problems that have been solved by the introduction
> >>>> of the DRBD activity log
> >>>> (https://www.linbit.com/drbd-user-guide/users-guide-drbd-8-4/#s-activity-log)?
> >>>
> >>> The above would require some kind of activity log also, I'm afraid.
> >>
> >> How about collaborating with the DRBD team? My concern is that otherwise
> >> we will end up with two drivers in the kernel that implement block device
> >> replication between servers connected over a network.
> >
> > I have two general understanding questions:
> > - What is the conceptual difference between DRBD and an md-raid1 with
> > one local leg and one remote (imported over srp/nvmeof/rnbd)?
>
> I'm not sure there is a conceptual difference. But there will be a big
> difference in recovery speed after a temporary network outage (assuming that
> the md-raid write intent bitmap has been disabled).

I think RMR is conceptually different from either of these setups (DRBD, or
md-raid over srp/iser/nvmeof/rnbd devices) in that the logic required for
replication policies (coding) lives inside the RDMA subsystem, which would
potentially allow offloading it to the underlying RDMA devices. A user of
RDMA-enabled devices can then use them for both block IO transport and
replication. Another difference is that one can put a volume manager on top
of RMR and have it work as a distributed one.

>
> > - Is this possible to setup an md-raid1 on a client sitting on top of
> > two remote DRBD devices, which are configured in "active-active" mode?
>
> I don't think that DRBD supports this.
> From the DRBD source code:
> "this code path is to recover from a situation that "should not happen":
> concurrent writes in multi-primary setup."

This means md-raid on top of two block devices imported over RDMA has half
the write latency of DRBD, while having twice the recovery latency. RMR
would allow a single hop for both write IO and resync IO.

Thank you,
Danil.