Re: does the RBD client block write when the Watcher times out?

Hi all,

I now understand that the watcher does not prevent multiple mounts. Based
on the feedback I received, I will consider countermeasures.

Thank you for your valuable insights.
Yuma.

On Thu, May 23, 2024 at 21:15 Frank Schilder <frans@xxxxxx> wrote:
>
> Hi, we ran into the same issue, and there is actually another use case:
> live-migration of VMs. Live-migration requires an RBD image to be mapped
> on two clients simultaneously, so multiple mappings are intentional. If
> multiple clients map an image in RW mode, the Ceph back-end cycles the
> write lock between the clients so that each of them can flush its writes.
> Coordinating access is the job of the orchestrator; in this case, it
> explicitly manages the write lock during live-migration so that writes
> occur in the correct order.
>
> It's not a Ceph job, it's an orchestration job. The RBD interface just
> provides the tools to do it; for example, you can attach information that
> helps you hunt down dead-looking clients and kill them properly before
> mapping an image somewhere else.
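>
> To make that concrete, here is a minimal sketch of inspecting and
> breaking an advisory lock with the librbd Python bindings. list_lockers()
> and break_lock() are real python-rbd calls; the conffile path, pool name,
> and image name are placeholders for illustration:
>
>     import rados
>     import rbd
>
>     cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>     cluster.connect()
>     ioctx = cluster.open_ioctx('rbd')         # pool name: placeholder
>     image = rbd.Image(ioctx, 'vm-disk-0')     # image name: placeholder
>     try:
>         # Return shape can vary by release: a dict with a 'lockers' key,
>         # or an empty result when the image is unlocked.
>         lockinfo = image.list_lockers() or {}
>         for client, cookie, addr in lockinfo.get('lockers', []):
>             print('locker: %s cookie=%s addr=%s' % (client, cookie, addr))
>             # Only after confirming out-of-band that the holder is dead:
>             # image.break_lock(client, cookie)
>     finally:
>         image.close()
>         ioctx.close()
>         cluster.shutdown()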
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Ilya Dryomov <idryomov@xxxxxxxxx>
> Sent: Thursday, May 23, 2024 2:05 PM
> To: Yuma Ogami
> Cc: ceph-users@xxxxxxx
> Subject:  Re: does the RBD client block write when the Watcher times out?
>
> On Thu, May 23, 2024 at 4:48 AM Yuma Ogami <yuma.ogami.cybozu@xxxxxxxxx> wrote:
> >
> > Hello.
> >
> > I'm currently verifying the behavior of RBD on failure, and I'm
> > wondering about the consistency of RBD images after network failures.
> > As a result of my investigation, I found that RBD sets a Watcher on an
> > RBD image when a client mounts the volume, to prevent multiple mounts. In
>
> Hi Yuma,
>
> The watcher is created to watch for updates (technically, to listen to
> notifications) on the RBD image, not to prevent multiple mounts.  RBD
> allows the same image to be mapped multiple times on the same node or
> on different nodes.
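>
> As an illustration, each mapping registers its own watcher, and you can
> list them from the librbd Python bindings. Note that watchers_list()
> exists only in recent python-rbd releases, so treat its availability,
> and the pool/image names below, as assumptions for your environment:
>
>     import rados
>     import rbd
>
>     cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>     cluster.connect()
>     ioctx = cluster.open_ioctx('rbd')         # pool name: placeholder
>     image = rbd.Image(ioctx, 'vm-disk-0')     # image name: placeholder
>     try:
>         # One entry per mapped client; multiple mappings simply show up
>         # as multiple watchers, none of which blocks the others.
>         for watcher in image.watchers_list():
>             print(watcher['addr'], watcher['id'], watcher['cookie'])
>     finally:
>         image.close()
>         ioctx.close()
>         cluster.shutdown()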
>
> > addition, I found that if the client is isolated from the network for
> > a long time, the Watcher is released. However, the client still has
> > this image mounted. In this situation, another client can also mount
> > this image, and if the image is writable from both clients, data
> > corruption occurs. Could you tell me whether this is a realistic scenario?
>
> Yes, this is a realistic scenario which can occur even if the client
> isn't isolated from the network.  If the user does this, it's up to the
> user to ensure that everything remains consistent.  One use case for
> mapping the same image on multiple nodes is a clustered (also referred
> to as a shared disk) filesystem, such as OCFS2.
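>
> One way a user can enforce that coordination is with RBD's advisory
> locks, e.g. taking an exclusive lock before mounting read-write. A
> minimal sketch follows; lock_exclusive(), unlock() and the ImageBusy
> exception are python-rbd API, while the cookie, pool, and image names
> are placeholders. The lock is advisory, so it only protects you if
> every client cooperates:
>
>     import rados
>     import rbd
>
>     cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
>     cluster.connect()
>     ioctx = cluster.open_ioctx('rbd')          # pool name: placeholder
>     image = rbd.Image(ioctx, 'shared-disk')    # image name: placeholder
>     try:
>         try:
>             image.lock_exclusive('node-a')     # cookie naming this holder
>         except rbd.ImageBusy:
>             raise SystemExit('another client holds the lock; not mapping')
>         # ... map, mount, and write while holding the lock ...
>         image.unlock('node-a')
>     finally:
>         image.close()
>         ioctx.close()
>         cluster.shutdown()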
>
> Thanks,
>
>                 Ilya
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx