Qu Wenruo - 26.06.21, 02:27:54 CEST:
> On 2021/6/26 3:06 AM, Martin Steigerwald wrote:
> > Hi!
> >
> > I found repeatedly that Baloo indexes the same files twice or even
> > more often after a while.
> >
> > I reported this upstream in:
> >
> > Bug 438434 - Baloo appears to be indexing twice the number of files
> > than are actually in my home directory
> >
> > https://bugs.kde.org/show_bug.cgi?id=438434
> >
> > And got back that if the device number changes, Baloo will think it
> > has new files even though the path is still the same. And found over
> > time that the device number for the single BTRFS filesystem on a
> > NVMe SSD in a ThinkPad T14 Gen1 AMD can change. It is not (maybe
> > yet) RAID 1. I do have BTRFS RAID 1 in another laptop and there I
> > also had this issue already.
>
> Since btrfs has multi-device support by default, it reports an
> anonymous device number, just as if you use a filesystem over LVM.

Ah, this! I forgot to mention that here, although I mentioned it in the
bug report: I use BTRFS on top of LVM on top of LUKS based dm-crypt on a
partition on the NVMe SSD.

I'd use a different approach if there were one that gave me full disk
encryption. I am not willing to use ecryptfs on top of BTRFS and as far
as I know BTRFS cannot yet encrypt by itself.

I still think this could give a fixed order of loading:

1. Unlock LUKS.

2. Activate LVM logical volumes. No idea whether that happens in a fixed
   order though or whether it can have a different order on each boot.

3. Mount BTRFS. /home is always on the same subvolume. So that should
   not change.

> The problem is why the anonymous device number changes.

Good question. Maybe I have an idea about that. See below.

> > I argued that a desktop application has no business to rely on a
> > device number and got back that search/indexing is in the middle
> > between an application and system software. And that Baloo needs an
> > "invariant" for a file.
> > See comment #11 of that bug report:
> >
> > https://bugs.kde.org/show_bug.cgi?id=438434#c11
>
> Well, a lot of tools rely on the device number to distinguish
> filesystem boundaries, like find.
> Thus it's a little hard to argue.
>
> But on the other hand, it also means Baloo can't handle regular fs
> over LVM cases well either.

Yes. It also could not handle a driver loading race condition with two
or more different controllers in a desktop machine.

> > I got the suggestion to try to find a way to tell the kernel to use
> > a fixed device number.
>
> I don't think it's possible for btrfs, as each subvolume gets its
> anonymous device number assigned when it is first read.
>
> Thus it's really hard to make it fixed, as the reason for anonymous
> device numbers is to avoid conflicts.

Fair enough.

> > I still think, an application or an infrastructure service for a
> > desktop environment or even anything else in user space should not
> > rely on a device number to be fixed and never change upon reboots.
>
> Well, LVM/device mapper is doing the same thing, a lot of behavior
> change is never a good idea for the kernel.
>
> Thus for use cases where we really need a proper mapping, we use
> hashes, not just device number, like what we did in dupremover.

I think I suggested that some time ago.

> > Another question would be whether I could somehow make sure that the
> > device number does not change, even if just as a work-around.
>
> If you really just want a fixed device number, you can ensure that by:
>
> - Make sure all users of anonymous devices get a fixed sequence
>   Things like device mapper/LVM, btrfs should get loaded/initialized
>   in a fixed order.

Ah, I see.

> - Make sure the subvolume you care about always gets mounted/read
>   before any other subvolumes
>   So that the target subvolume always gets the first device number in
>   the pool.

Hmm, that may be a pointer.
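
A minimal sketch, assuming Python 3 on Linux, that touches both points
above. device_number() prints the major:minor pair that stat() reports
for a path, so it can be run after each boot to see whether the number
really changes. content_key() only illustrates the general idea of
identifying a file by a hash of its content instead of by device
number. Both function names are made up for illustration, and this is
not meant to describe how Baloo or any btrfs tool actually does it.

```python
import hashlib
import os
import sys


def device_number(path):
    """Return (major, minor) of the device number stat() reports for path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


def content_key(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the content of the file at path.

    The result depends only on the bytes in the file, not on st_dev or
    st_ino, so it does not change when the device number changes.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Print the device number for each path given on the command line,
    # or for the current directory if none was given.
    for p in sys.argv[1:] or ["."]:
        major, minor = device_number(p)
        print(f"{p}: st_dev = {major}:{minor}")
```

Running it with /home as argument after a few reboots and comparing the
output would show whether the device number is stable on a given setup.
Hashing every file is of course much more expensive than a stat() call,
so a real indexer would probably need something cheaper.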

This is what I currently have in fstab:

/dev/nvme/home  /home       btrfs  lazytime,compress=zstd  0  0
/dev/nvme/home  /zeit/home  btrfs  subvol=zeit             0  0

The first line uses the default subvolume, which I changed accordingly
after creating this BTRFS. I use this approach to keep (temporary)
snapshots separated from the directory tree in /home.

Could it be that the order of these two mounts is not the same on every
boot? I use Devuan with Runit, so the mounting would be done by some
init scripts (instead of Systemd).

I am not aware of an fstab option to mount this one first and the other
one second, but I could set the second mount to noauto and mount it when
I need it.

> But this also means, all later subvolumes not in the fixed
> mount/read sequence can not get a fixed number.

I somehow thought this would get complicated.

Best,
-- 
Martin