Re: [RFC PATCH 1/6] Implement reexport helper library

Richard Weinberger <richard@xxxxxx> · Wed, 9 Mar 2022 10:43:34 +0100 (CET)

Bruce,

----- Ursprüngliche Mail -----
> Von: "bfields" <bfields@xxxxxxxxxxxx>
> On Thu, Feb 17, 2022 at 02:15:26PM +0100, Richard Weinberger wrote:
>> +#define REEXPDB_SHM_NAME "/nfs_reexport_db_lock"
>> +#define REEXPDB_SHM_SZ 4096
>> +#define REEXPDB_INIT_LOCK NFS_STATEDIR "/reexpdb_init.lock"
>> +#define REEXPDB_DBFILE NFS_STATEDIR "/reexpdb.sqlite3"
> 
> I don't know much about sqlite--why do we need to do our own file
> locking?  If we do need to do it ourself, could we lock the database
> file instead instead of using a separate lock file?

Concurrent access to the database is synchronized using a shared rwlock (on shared memory).
reexpdb_init.lock is used to make sure that creating and initializing the shared memory/lock
happens once.

>> +static const char initdb_sql[] = "CREATE TABLE IF NOT EXISTS fsidnums (num
>> INTEGER PRIMARY KEY CHECK (num > 0 AND num < 4294967296), path TEXT UNIQUE);
>> CREATE TABLE IF NOT EXISTS subvolumes (path TEXT PRIMARY KEY); CREATE INDEX IF
>> NOT EXISTS idx_ids_path ON fsidnums (path);";
> 
> I'd personally find it easier to read if these were defined in the place
> where they're used.  (And, honestly, if this is just used once, maybe
> the definition is unnecessary.)

Ok.

> What are the two tables used for?  Naively I'd've thought the
> "subvolumes" table was redundant.

fsidnums is used to store generated and predefined fsid numbers.
It is only used in reexport modes auto-fsidnum and predefined-fsidnum.

subvolumes contains a list of subvolumes which a are likely in use by
a client. Up start all these paths will get touched such that they can
be exported.

>> +/*
>> + * This query is a little tricky. We use SQL to find and claim the smallest
>> free fsid number.
> 
> Yes, that is a little tricky.  Is it necessary?  My SQL Is rusty, but
> the database should be able to pick a unique value for us, shouldn't it?

SQLite can generate a unique value, but we cannot select the range.
It will give a value between 0 and 2^64.
We need an id between 1 and 2^32. 

>> + * To find a free fsid the fsidnums is left joined to itself but with an offset
>> of 1.
>> + * Everything after the UNION statement is to handle the corner case where
>> fsidnums
>> + * is empty. In this case we want 1 as first fsid number.
>> + */
>> +static const char new_fsidnum_by_path_sql[] = "INSERT INTO fsidnums VALUES
>> ((SELECT ids1.num + 1 FROM fsidnums AS ids1 LEFT JOIN fsidnums AS ids2 ON
>> ids2.num = ids1.num + 1 WHERE ids2.num IS NULL UNION SELECT 1 WHERE NOT EXISTS
>> (SELECT NULL FROM fsidnums WHERE num = 1) LIMIT 1), ?1) RETURNING num;";
>> +static const char fsidnum_by_path_sql[] = "SELECT num FROM fsidnums WHERE path
>> = ?1;";
>> +static const char add_crossed_volume_sql[] = "REPLACE INTO subvolumes
>> VALUES(?1);";
>> +static const char drop_crossed_volume_sql[] = "DELETE FROM subvolumes WHERE
>> path = ?1;";
>> +static const char get_crossed_volumes_sql[] = "SELECT path from subvolumes;";
> ...
>> +/*
>> + * reexpdb_init - Initialize reexport database
>> + *
>> + * Setup shared lock (database is concurrently used by multiple processes),
> 
> So, this should all work when rpc.mountd is run with --num_threads > 1?

Yes, that's why we need the shared rwlock.

Thanks,
//richard