On Wed, Mar 09, 2022 at 04:02:11PM +0100, Richard Weinberger wrote: > Bruce, > > ----- Ursprüngliche Mail ----- > > Von: "bfields" <bfields@xxxxxxxxxxxx> > >> Concurrent access to the database is synchronized using a shared rwlock (on > >> shared memory). > >> reexpdb_init.lock is used to make sure that creating and initializing the shared > >> memory/lock > >> happens once. > > > > Could you point me to sqlite documentation that explains why the user > > would need to do their own locking? > > https://www.sqlite.org/rescode.html#busy > > > I assumed sqlite would do any necessary locking for you. It seems like > > a core function for a database. > > Well, SQLite does locking but no queuing. > So, as soon somebody is writing the data base it is locked and all other > read/writes will fail either with SQLITE_BUSY or SQLITE_LOCKED. > It is up to the user how to react on that. > > That's why I chose to use a shared rwlock where a task can *wait* upon > conflicting access. > > Maybe there is a better way do it, dunno. Oh, got it, thanks for the explanation. Assuming writes are rare, maybe a dumb retry loop would be adequate. Sounds like that's what we'd need anyway if we were to share the database between cooperating re-export servers. (Would we have a performance problem in that case, if several reexport servers start at once and all start trying to populate the shared database? I don't know.) Anyway, it's a judgement call, fair enough. Might be worth a brief comment, at least. > >> > What are the two tables used for? Naively I'd've thought the > >> > "subvolumes" table was redundant. > >> > >> fsidnums is used to store generated and predefined fsid numbers. > >> It is only used in reexport modes auto-fsidnum and predefined-fsidnum. > >> > >> subvolumes contains a list of subvolumes which a are likely in use by > >> a client. Up start all these paths will get touched such that they can > >> be exported. > > > > The fsidnums also contains that same list of paths, right? So I don't > > understand why we need both. > > In the current design generated fsidnums will stay forever while the paths > in subvolumes can get cleaned. > > > Also, if we're depending on touching all the paths on startup, something > > is wrong. > > I think we talked about that already and agreed that it should work without > touching. So far I didn't had a chance to investigate into this. OK. Do you think you could look into that, and strip this down to the one auto-fsidnum case, and then continue the discussion? I think that'd clarify things. As I say, I wouldn't necessarily be opposed to later adding a reexport= option back in, but for now I'd first like to see if we can find the simplest patches that will solve the problem in one good-enough way. > > What we want to do is touch the path when we get an upcall for the given > > fsid. That way we don't have to assume, for example, that the system > > will never expire mounts that haven't been used recently. > > > >> >> +/* > >> >> + * This query is a little tricky. We use SQL to find and claim the smallest > >> >> free fsid number. > >> > > >> > Yes, that is a little tricky. Is it necessary? My SQL Is rusty, but > >> > the database should be able to pick a unique value for us, shouldn't it? > >> > >> SQLite can generate a unique value, but we cannot select the range. > >> It will give a value between 0 and 2^64. > >> We need an id between 1 and 2^32. > > > > Surely that CHECK constraint doesn't somehow cause sqlite to generate > > non-unique primary keys? At worst I'd think it would cause INSERTs to > > fail if the ordinary primary-key-choosing algorithm chooses something > > over 2^32. > > The CHECK is just a paranoid check. My SQL INSERT generates ids starting with 1. > Sure, if you run it 2^32 times, it will fail due to the CHECK. OK. --b.