On 10/21/2011 01:10 PM, Trond Myklebust wrote:
> On Fri, 2011-10-21 at 12:09 -0400, Nikolaus Rath wrote:
>> On 10/21/2011 12:00 PM, Trond Myklebust wrote:
>>> On Fri, 2011-10-21 at 09:54 -0400, Nikolaus Rath wrote:
>>>> Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> writes:
>>>>> On Thu, 2011-10-20 at 16:37 -0400, Nikolaus Rath wrote:
>>>>>> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> writes:
>>>>>>> On Thu, Oct 20, 2011 at 01:21:31PM -0400, Nikolaus Rath wrote:
>>>>>>>> I'm working on a FUSE file system that stores file system metadata in an
>>>>>>>> SQL database (http://code.google.com/p/s3ql/). Not having to keep track
>>>>>>>> of inode generation numbers would keep the code much simpler, because I
>>>>>>>> want to delete inode-rows from the SQL table when the last reference to
>>>>>>>> the inode is deleted (so I can't keep track of the generation no).
>>>>>>>
>>>>>>> You can use current time, or a counter, or something, as the generation
>>>>>>> number.
>>>>>>
>>>>>> With current time I'm screwed if the system clock doesn't have
>>>>>> sufficiently fine granularity. With a counter, I either have to remember
>>>>>> counter values per-inode even after the inode is deleted, or the global
>>>>>> counter will overflow at some point (in which case I may just as well
>>>>>> require unique inodes in the first place).
>>>>>
>>>>> The filehandle is between 32 (NFSv2) and 128(NFSv4) bytes long. How long
>>>>> do you expect it to take you to create+destroy between 2^256 and 2^1024
>>>>> inodes? I'm guessing that we'll all be long dead and the universe will
>>>>> have undergone heat death before that happens...
>>>>
>>>> Please stop assuming that I'm stupid or haven't thought about the
>>>> problem at all.
>>>> The bottleneck is not the length of the NFS file handle,
>>>> but the length of the inode and generation number (both of which are
>>>> restricted to 32bit by FUSE) together with the requirement that not only
>>>> both of them together need to be unique forever, but the inode also
>>>> needs to be unique at any given instant (so they cannot be trivially
>>>> combined to form a 64bit value).
>>>
>>> No. The point is you don't need a generation number if you don't want to
>>> implement one...
>>>
>>> You can use any unique identifier + the inode number, and the unique
>>> identifier is only limited by the size of the filehandle.
>>
>> So how do you choose the unique identifier? It's limited by FUSE to
>> 32bit and therefore can't be a global counter, it can't be a timestamp
>
> AFAICS fuse gives you a 64-bit inode number and a 32-bit generation
> counter.

Yes, with 64bit inodes everything would be fine. But fuse uses 'long'
for inodes, so on 32bit systems you only have 32bit inodes even if ino_t
is 64bit.

> IOW: start allocating inode numbers incrementally from 0 - 2^64, then
> each time you overflow the 64-bit inode number counter, bump the
> generation number. You'll have to skip those inode numbers that are
> already allocated in the subsequent generations, but the total number of
> unique combinations is still likely to be more than large enough not to
> be a worry.

Yes, as I said earlier, it is possible to do with the available 32 + 32
bits, but it does introduce additional complexity.

>> because the system clock may not have enough resolution, and it can't be
>> a per-inode counter because then I can't discard the counter after the
>> inode has been deleted.
>
> If you need more unique values, then modify fuse to allow your
> filesystem to manage the exportfs interface. The fuse ABI is versioned,
> and can be extended to support new features.

FUSE 3 will have 64bit inodes, and I don't think this feature would make
it into 2.x.
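
For illustration only, here is a rough sketch of the wrap-around scheme
discussed above, squeezed into 32 + 32 bits. It is not code from s3ql or
FUSE; the class name InodeAllocator, its methods, and the choice of 1 as
the first inode number are all made up for this example. The idea is that
only two global integers (the next inode number and the current
generation) have to be remembered, so nothing per-inode needs to survive
deletion. The "live" set stands in for the rows of the inode table.

```python
class InodeAllocator:
    """Hypothetical allocator: wrapping inode counter plus a global generation."""

    def __init__(self, ino_bits=32, gen_bits=32, first_ino=1):
        # Assumption: both fields are limited to 32 bits, as described above.
        self.max_ino = (1 << ino_bits) - 1
        self.max_gen = (1 << gen_bits) - 1
        self.first_ino = first_ino
        self.next_ino = first_ino   # global state #1
        self.generation = 0         # global state #2
        self.live = set()           # inode numbers currently in use

    def allocate(self):
        """Return an (inode, generation) pair that was never handed out before."""
        total = self.max_ino - self.first_ino + 1
        # Try each inode number at most once per call.
        for _ in range(total):
            if self.next_ino > self.max_ino:
                # Inode counter overflowed: start over in a new generation.
                if self.generation == self.max_gen:
                    raise OverflowError("generation counter exhausted")
                self.next_ino = self.first_ino
                self.generation += 1
            ino = self.next_ino
            self.next_ino += 1
            # Skip numbers still held by inodes from earlier generations.
            if ino not in self.live:
                self.live.add(ino)
                return ino, self.generation
        raise RuntimeError("all inode numbers are in use")

    def release(self, ino):
        """Forget a deleted inode; no per-inode counter is kept afterwards."""
        self.live.remove(ino)
```

Within one generation the inode counter only moves forward, and the
generation itself never decreases, so no (inode, generation) pair repeats
until the generation counter runs out, while the skip step keeps inode
numbers unique at any given instant. The extra complexity is the skip
loop and the persistence of the two counters.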
Best,

   -Nikolaus

-- 
»Time flies like an arrow, fruit flies like a Banana.«

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C