Re: Assumption on fixed device numbers in Plasma's desktop search Baloo

Qu Wenruo - 26.06.21, 02:27:54 CEST:
> On 2021/6/26 3:06 AM, Martin Steigerwald wrote:
> > Hi!
> > 
> > I found repeatedly that Baloo indexes the same files twice or even
> > more often after a while.
> > 
> > I reported this upstream in:
> > 
> > Bug 438434 - Baloo appears to be indexing twice the number of files
> > than are actually in my home directory
> > 
> > https://bugs.kde.org/show_bug.cgi?id=438434
> > 
> > And got back that if the device number changes, Baloo will think it
> > has new files even though the path is still the same. And found over
> > time that the device number for the single BTRFS filesystem on a
> > NVMe SSD in a ThinkPad T14 Gen1 AMD can change. It is not (maybe
> > yet) RAID 1. I do have BTRFS RAID 1 in another laptop and there I
> > also had this issue already.
> 
> Since btrfs has multi-device support by default, it reports an anonymous
> device number, just as if you used a filesystem over LVM.

Ah, this!

I forgot to mention that: I use BTRFS on top of LVM on top of LUKS-based
dm-crypt on a partition on the NVMe SSD. Sorry, I mentioned it in the bug
report but not here. I'd use a different approach if there were one that
gave me full disk encryption. I am not willing to use ecryptfs on top of
BTRFS and as far as I know BTRFS cannot yet encrypt by itself.

I still think this could give a fixed order of loading:

1. Unlock LUKS.

2. Activate LVM logical volumes. No idea whether that happens in a fixed 
order though or whether it can have a different order on each boot.

3. Mount BTRFS. /home is always on the same subvolume. So that should 
not change.

> The problem is why the anonymous device number changes.

Good question. Maybe I have an idea about that. See below.

> > I argued that a desktop application has no business to rely on a
> > device number and got back that search/indexing is in the middle
> > between an application and system software. And that Baloo needs an
> > "invariant" for a file. See comment #11 of that bug report:
> > 
> > https://bugs.kde.org/show_bug.cgi?id=438434#c11
> 
> Well, a lot of tools rely on the device number to distinguish
> filesystem boundaries, like find.
> Thus it's a little hard to argue.
> 
> But on the other hand, it also means baloo can't handle regular fs
> over LVM cases well either.

Yes. Also it could not handle the case of a driver loading race 
condition with two or more different controllers in a desktop machine.
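
For illustration, here is a minimal Python sketch of the kind of check
being discussed: comparing the device number that stat() reports for two
paths, which is what find relies on for -xdev. The function names are my
own and this is not Baloo's code.

```python
import os


def device_of(path):
    """Return (major, minor) of the device number stat() reports for path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


def same_filesystem(a, b):
    """True if both paths report the same st_dev.

    The value is only meaningful for the lifetime of the current mount;
    it is not guaranteed to be the same after a reboot.
    """
    return os.stat(a).st_dev == os.stat(b).st_dev


if __name__ == "__main__":
    home = os.path.expanduser("~")
    print(home, device_of(home))
    print("same device as /:", same_filesystem(home, "/"))
```

Running this before and after a reboot would show whether the number
reported for /home has moved.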

> > I got the suggestion to try to find a way to tell the kernel to use
> > a fixed device number.
> 
> I don't think it's possible for btrfs, as each subvolume gets its
> anonymous device number assigned when it is first read.
> 
> Thus it's really hard to make it fixed, as the reason for anonymous
> device number is to avoid conflicts.

Fair enough.

> > I still think, an application or an infrastructure service for a
> > desktop environment or even anything else in user space should not
> > rely on a device number to be fixed and never change upon reboots.
> 
> Well, LVM/device mapper is doing the same thing, and changing that
> much behavior is never a good idea for the kernel.
> 
> Thus for use cases where we really need a proper mapping, we use
> hashes, not just the device number, like what we did in duperemove.

I think I suggested that some time ago.
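
As a rough sketch of what a hash-based identifier could look like, in
Python: it derives an ID from the file content alone, so it does not
depend on the device number at all. The function name content_id is
hypothetical, and I am not claiming this is how duperemove or Baloo
work internally.

```python
import hashlib
import sys


def content_id(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file's content.

    The file is read in chunks so large files do not have to fit in memory.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    for name in sys.argv[1:]:
        print(content_id(name), name)
```

The trade-off is that such an ID changes whenever the file is edited and
is identical for two files with the same content, so an indexer would
probably need to combine it with the path or other metadata.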

> > Another question would be whether I could somehow make sure that the
> > device number does not change, even if just as a work-around.
> 
> If you really just want a fixed device number, you can ensure that by:
> 
> - Make sure all users of anonymous devices get a fixed sequence
>    Things like device mapper/LVM, btrfs should get loaded/initialized
>    in a fixed order.

Ah, I see.

> - Make sure the subvolume you care about always gets mounted/read
>    before any other subvolumes
>    So that the target subvolume always gets the first device number in
> the pool.

Hmm, that may be a pointer. This is what I currently have in fstab:

/dev/nvme/home /home btrfs lazytime,compress=zstd 0 0
/dev/nvme/home /zeit/home btrfs subvol=zeit 0 0

The first line mounts the default subvolume, which I changed
accordingly after creating this BTRFS filesystem. I use this approach to
keep (temporary) snapshots separate from the directory tree in /home.

Could it be that this order between these two mounts is not the same on 
every boot? I use Devuan with Runit, so the mounting would happen by 
some init scripts (instead of Systemd).

I am not aware of an option for fstab to mount this one first and then 
the other second, but I could set the second mount to noauto and mount 
it when I need it.
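
To see which device number each of these mounts actually got on a given
boot, something like the following Python sketch could be used. It
parses /proc/self/mountinfo, whose third field is the major:minor pair
and whose fields after the " - " separator start with the filesystem
type and the mount source. The function name is my own invention.

```python
def parse_mountinfo(text):
    """Parse the content of /proc/self/mountinfo into a list of dicts."""
    mounts = []
    for line in text.splitlines():
        if " - " not in line:
            continue  # skip blank or malformed lines
        left, right = line.split(" - ", 1)
        fields = left.split()
        fstype, source = right.split()[:2]
        mounts.append({
            "dev": fields[2],          # major:minor as seen by stat()
            "root": fields[3],         # root of the mount within the fs
            "mount_point": fields[4],
            "fstype": fstype,
            "source": source,
        })
    return mounts


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        for m in parse_mountinfo(f.read()):
            if m["fstype"] == "btrfs":
                print(m["dev"], m["mount_point"], m["root"], m["source"])
```

Comparing the output across a few boots would show whether /home and
/zeit/home swap their numbers.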

>    But this also means, all later subvolumes not in the fixed
> mount/read sequence can not get a fixed number.

I somehow thought this would get complicated.

Best,
-- 
Martin




