On Thu, Jun 8, 2023 at 1:07 AM Bart Van Assche <bvanassche@xxxxxxx> wrote:
>
> On 6/7/23 08:55, Jianlin Lv wrote:
> > 1. MegaRAID adapters associated with 24 local disks. The disks are named
> > sequentially as "sda," "sdb," and so on, up to "sdx."
> > 2. SATA controllers associated with the root disk, named "sdy."
> >
> > Both the MegaRAID adapters and the SATA controller (PCH) are accessed via
> > the PCIe bus. In theory, depending on their PCIe bus ID in ascending order,
> > the devices should be initialized in ascending order as well.
>
> Hmm ... I don't think there is anything that prevents the PCIe maintainer
> from changing the PCIe probing behavior from synchronous to asynchronous?
> In other words, I don't think it is safe to assume that PCIe devices are
> always scanned in the same order.
>
> > For cloud deployment, the local volume provisioner detects and creates PVs
> > for each local disk (from sda to sdx) on the host, and it cleans up the
> > disks when they are released.
> > This requires the logical names of the disks to be deterministic.
>
> I see two possible solutions:
> - Change the volume provisioner such that it uses disk references that do
>   not depend on the probing order, e.g. /dev/disk/by-id/...

Yes, the "/dev/disk/by-id/" links can uniquely identify SCSI devices.
However, I don't think they are suitable for the volume provisioner workflow.
For nodes of the same SKU, a unified YAML file is defined to instruct the
volume provisioner on how to manage the local disks. If we use WWIDs, a
unique YAML file would need to be defined for each node. This approach
becomes impractical when dealing with a large number of worker nodes.

Jianlin

> - Implement an algorithm in systemd that makes disk names predictable.
>   An explanation of how predictable names work for network interfaces is
>   available here: https://wiki.debian.org/NetworkInterfaceNames. The
>   systemd documentation about predictable network names is available here:
>   https://www.freedesktop.org/software/systemd/man/systemd.net-naming-scheme.html
>
> These alternatives have the advantage that disk scanning remains asynchronous.
>
> Thanks,
>
> Bart.
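
As a rough illustration of the first suggestion quoted above, the sketch below lists the symlinks under /dev/disk/by-id and resolves each one to whatever kernel name it currently points at. The function name map_persistent_names is hypothetical, and the code assumes a Linux host where udev populates that directory; it is not taken from any volume provisioner. It also does not address the per-node YAML concern raised in the reply, since the link names themselves still differ between nodes.

```python
import os


def map_persistent_names(by_id_dir="/dev/disk/by-id"):
    """Return {persistent link name: resolved device path} for by_id_dir.

    Hypothetical helper for illustration only. The link names stay the same
    across reboots, while the resolved kernel names (sda, sdb, ...) may
    change with probing order.
    """
    mapping = {}
    if not os.path.isdir(by_id_dir):
        # Directory is absent (e.g. non-Linux host or no udev): nothing to map.
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            # realpath follows the symlink to the current device node.
            mapping[name] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for name, dev in map_persistent_names().items():
        print(f"{name} -> {dev}")
```

Run on a node, this prints one line per persistent link together with the device node it resolves to at that moment, so the same link can be followed even if the sdX assignment changes after a reboot.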