Content-Type: text/plain; charset=US-ASCII

Well... to make life interesting, I've decided that it's probably a good idea
to split cachefs into two:

 (*) FS-Cache

     An intermediary layer that sits between netfs's and cache backends. It
     doesn't care how a cache operates, as long as it provides certain
     operations.

 (*) CacheFS

     A cache backend that uses a block device for its data storage.

See what you think. There wasn't really much work involved - it mostly split
naturally.

The attached patch includes NFS on cachefs. AFS on CacheFS won't work with
this patch yet.

David

--Multipart_Mon_Oct__4_17:32:25_2004-1
Content-Type: text/plain; type=patch; charset=US-ASCII
Content-Disposition: attachment; filename="fscache-269rc2mm4.diff"
Content-Transfer-Encoding: 7bit

diff -uNr linux-2.6.9-rc2-mm4/Documentation/filesystems/cachefs.txt linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/cachefs.txt --- linux-2.6.9-rc2-mm4/Documentation/filesystems/cachefs.txt 2004-09-27 11:23:36.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/cachefs.txt 1970-01-01 01:00:00.000000000 +0100 @@ -1,892 +0,0 @@ - =========================== - CacheFS: Caching Filesystem - =========================== - -======== -OVERVIEW -======== - -CacheFS is a general purpose cache for network filesystems, though it could be -used for caching other things such as ISO9660 filesystems too. - -CacheFS uses a block device directly rather than a bunch of files under an -already mounted filesystem. For why this is so, see further on. If necessary, -however, a file can be loopback mounted as a cache. - -CacheFS does not follow the idea of completely loading every netfs file opened -into the cache before it can be operated upon, and then serving the pages out -of CacheFS rather than the netfs because: - - (1) It must be practical to operate without a cache. - - (2) The size of any accessible file must not be limited to the size of the - cache. 
- - (3) The combined size of all opened files (this includes mapped libraries) - must not be limited to the size of the cache. - - (4) The user should not be forced to download an entire file just to do a - one-off access of a small portion of it. - -It rather serves the cache out in PAGE_SIZE chunks as and when requested by -the netfs('s) using it. - - -CacheFS provides the following facilities: - - (1) More than one block device can be mounted as a cache. - - (2) Caches can be mounted / unmounted at any time. - - (3) The netfs is provided with an interface that allows either party to - withdraw caching facilities from a file (required for (2)). - - (4) The interface to the netfs returns as few errors as possible, preferring - rather to let the netfs remain oblivious. - - (5) Cookies are used to represent files and indexes to the netfs. The simplest - cookie is just a NULL pointer - indicating nothing cached there. - - (6) The netfs is allowed to propose - dynamically - any index hierarchy it - desires, though it must be aware that the index search function is - recursive and stack space is limited. - - (7) Data I/O is done direct to and from the netfs's pages. The netfs indicates - that page A is at index B of the data-file represented by cookie C, and - that it should be read or written. CacheFS may or may not start I/O on - that page, but if it does, a netfs callback will be invoked to indicate - completion. - - (8) Cookies can be "retired" upon release. At this point CacheFS will mark - them as obsolete and the index hierarchy rooted at that point will get - recycled. - - (9) The netfs provides a "match" function for index searches. In addition to - saying whether a match was made or not, this can also specify that an - entry should be updated or deleted. - -(10) All metadata modifications (this includes index contents) are performed - as journalled transactions. These are replayed on mounting. 
- - -============================================= -WHY A BLOCK DEVICE? WHY NOT A BUNCH OF FILES? -============================================= - -CacheFS is backed by a block device rather than being backed by a bunch of -files on a filesystem. This confers several advantages: - - (1) Performance. - - Going directly to a block device means that we can DMA directly to/from - the the netfs's pages. If another filesystem was managing the backing - store, everything would have to be copied between pages. Whilst DirectIO - does exist, it doesn't appear easy to make use of in this situation. - - New address space or file operations could be added to make it possible to - persuade a backing discfs to generate block I/O directly to/from disc - blocks under its control, but that then means the discfs has to keep track - of I/O requests to pages not under its control. - - Furthermore, we only have to do one lot of readahead calculations, not - two; in the discfs backing case, the netfs would do one and the discfs - would do one. - - (2) Memory. - - Using a block device means that we have a lower memory usage - all data - pages belong to the netfs we're backing. If we used a filesystem, we would - have twice as many pages at certain points - one from the netfs and one - from the backing discfs. In the backing discfs model, under situations of - memory pressure, we'd have to allocate or keep around a discfs page to be - able to write out a netfs page; or else we'd need to be able to punch a - hole in the backing file. - - Furthermore, whilst we have to keep a CacheFS inode around in memory for - every netfs inode we're backing, a backing discfs would have to keep the - dentry and possibly a file struct too. - - (3) Holes. - - The cache uses holes to indicate to the netfs that it hasn't yet - downloaded the data for that page. - - Since CacheFS is its own filesystem, it can support holes in files - trivially. 
Running on top of another discfs would limit us to using ones - that can support holes. - - Furthermore, it would have to be made possible to detect holes in a discfs - file, rather than just seeing zero filled blocks. - - (4) Data Consistency. - - Cachefs uses a pair of journals to keep track of the state of the cache - and all the pages contained therein. This means that it doesn't get into - an inconsistent state in the on-disc cache and it doesn't lose disc space. - - CacheFS takes especial care between the allocation of a block and its - splicing into the on-disc pointer tree, and the data having been written - to disc. If power is interrupted and then restored, the journals are - replayed and if it is seen that a block was allocated but not written it - is then punched out. Being backed by a discfs, I'm not certain what will - happen. It may well be possible to mark a discfs's journal, if it has one, - but how does the discfs deal with those marks? This also limits consistent - caching to running on journalled discfs's where there's a function to - write extraordinary marks into the journal. - - The alternative would be to keep flags in the superblock, and to - re-initialise the cache if it wasn't cleanly unmounted. - - Knowing that your cache is in a good state is vitally important if you, - say, put /usr on AFS. Some organisations put everything barring /etc, - /sbin, /lib and /var on AFS and have an enormous cache on every - computer. Imagine if the power goes out and renders every cache - inconsistent, requiring all the computers to re-initialise their caches - when the power comes back on... - - (5) Recycling. - - Recycling is simple on CacheFS. It can just scan the metadata index to - look for inodes that require reclamation/recycling; and it can also build - up a list of the least recently used inodes so that they can be reclaimed - later to make space. 
- - Doing this on a discfs would require a search going down through a nest - of directories, and would probably have to be done in userspace. - - (6) Disc Space. - - Whilst the block device does set a hard ceiling on the amount of space - available, CacheFS can guarantee that all that space will be available to - the cache. On a discfs-backed cache, the administrator would probably want - to set a cache size limit, but the system wouldn't be able guarantee that - all that space would be available to the cache - not unless that cache was - on a partition of its own. - - Furthermore, with a discfs-backed cache, if the recycler starts to reclaim - cache files to make space, the freed blocks may just be eaten directly by - userspace programs, potentially resulting in the entire cache being - consumed. Alternatively, netfs operations may end up being held up because - the cache can't get blocks on which to store the data. - - (7) Users. - - Users can't so easily go into CacheFS and run amok. The worst they can do - is cause bits of the cache to be recycled early. With a discfs-backed - cache, they can do all sorts of bad things to the files belonging to the - cache, and they can do this quite by accident. - - -On the other hand, there would be some advantages to using a file-based cache -rather than a blockdev-based cache: - - (1) Having to copy to a discfs's page would mean that a netfs could just make - the copy and then assume its own page is ready to go. - - (2) Backing onto a discfs wouldn't require a committed block device. You would - just nominate a directory and go from there. With CacheFS you have to - repartition or install an extra drive to make use of it in an existing - system (though the loopback device offers a way out). - - (3) CacheFS requires the netfs to store a key in any pertinent index entry, - and it also permits a limited amount arbitrary data to be stored there. 
- - A discfs could be requested to store the netfs's data in xattrs, and the - filename could be used to store the key, though the key would have to be - rendered as text not binary. Likewise indexes could be rendered as - directories with xattrs. - - (4) You could easily make your cache bigger if the discfs has plenty of space, - you could even go across multiple mountpoints. - - -====================== -GENERAL ON-DISC LAYOUT -====================== - -The filesystem is divided into a number of parts: - - 0 +---------------------------+ - | Superblock | - 1 +---------------------------+ - | Update Journal | - +---------------------------+ - | Validity Journal | - +---------------------------+ - | Write-Back Journal | - +---------------------------+ - | | - | Data | - | | - END +---------------------------+ - -The superblock contains the filesystem ID tags and pointers to all the other -regions. - -The update journal consists of a set of entries of sector size that keep track -of what changes have been made to the on-disc filesystem, but not yet -committed. - -The validity journal contains records of data blocks that have been allocated -but not yet written. Upon journal replay, all these blocks will be detached -from their pointers and recycled. - -The writeback journal keeps track of changes that have been made locally to -data blocks, but that have not yet been committed back to the server. This is -not yet implemented. - -The journals are replayed upon mounting to make sure that the cache is in a -reasonable state. - -The data region holds a number of things: - - (1) Index Files - - These are files of entries used by CacheFS internally and by filesystems - that wish to cache data here (such as AFS) to keep track of what's in - the cache at any given time. - - The first index file (inode 1) is special. It holds the CacheFS-specific - metadata for every file in the cache (including direct, single-indirect - and double-indirect block pointers). 
- - The second index file (inode 2) is also special. It has an entry for - each filesystem that's currently holding data in this cache. - - Every allocated entry in an index has an inode bound to it. This inode is - either another index file or it is a data file. - - (2) Cached Data Files - - These are caches of files from remote servers. Holes in these files - represent blocks not yet obtained from the server. - - (3) Indirection Blocks - - Should a file have more blocks than can be pointed to by the few - pointers in its storage management record, then indirection blocks will - be used to point to further data or indirection blocks. - - Three levels of indirection are currently supported: - - - single indirection - - double indirection - - (4) Allocation Nodes and Free Blocks - - The free blocks of the filesystem are kept in two single-branched - "trees". One tree is the blocks that are ready to be allocated, and the - other is the blocks that have just been recycled. When the former tree - becomes empty, the latter tree is decanted across. - - Each tree is arranged as a chain of "nodes", each node points to the next - node in the chain (unless it's at the end) and also up to 1022 free - blocks. - -Note that all blocks are PAGE_SIZE in size. The blocks are numbered starting -with the superblock at 0. Using 32-bit block pointers, a maximum number of -0xffffffff blocks can be accessed, meaning that the maximum cache size is ~16TB -for 4KB pages. - - -======== -MOUNTING -======== - -Since CacheFS is actually a quasi-filesystem, it requires a block device behind -it. The way to give it one is to mount it as cachefs type on a directory -somewhere. The mounted filesystem will then present the user with a set of -directories outlining the index structure resident in the cache. Indexes -(directories) and files can be turfed out of the cache by the sysadmin through -the use of rmdir and unlink. 
- -For instance, if a cache contains AFS data, the user might see the following: - - root>mount -t cachefs /dev/hdg9 /cache-hdg9 - root>ls -1 /cache-hdg9 - afs - root>ls -1 /cache-hdg9/afs - cambridge.redhat.com - root>ls -1 /cache-hdg9/afs/cambridge.redhat.com - root.afs - root.cell - -However, a block device that's going to be used for a cache must be prepared -before it can be mounted initially. This is done very simply by: - - echo "cachefs___" >/dev/hdg9 - -During the initial mount, the basic structure will be scribed into the cache, -and then a background thread will "recycle" the as-yet unused data blocks. - - -====================== -NETWORK FILESYSTEM API -====================== - -There is, of course, an API by which a network filesystem can make use of the -CacheFS facilities. This is based around a number of principles: - - (1) Every file and index is represented by a cookie. This cookie may or may - not have anything associated with it, but the netfs doesn't need to care. - - (2) Barring the top-level index (one entry per cached netfs), the index - hierarchy for each netfs is structured according the whim of the netfs. - - (3) Any netfs page being backed by the cache must have a small token - associated with it (possibly pointed to by page->private) so that CacheFS - can keep track of it. - -This API is declared in <linux/cachefs.h>. - - -NETWORK FILESYSTEM DEFINITION ------------------------------ - -CacheFS needs a description of the network filesystem. This is specified using -a record of the following structure: - - struct cachefs_netfs { - const char *name; - unsigned version; - struct cachefs_netfs_operations *ops; - struct cachefs_cookie *primary_index; - ... - }; - -This first three fields should be filled in before registration, and the fourth -will be filled in by the registration function; any other fields should just be -ignored and are for internal use only. 
- -The fields are: - - (1) The name of the netfs (used as the key in the toplevel index). - - (2) The version of the netfs (if the name matches but the version doesn't, the - entire on-disc hierarchy for this netfs will be scrapped and begun - afresh). - - (3) The operations table is defined as follows: - - struct cachefs_netfs_operations { - struct cachefs_page *(*get_page_cookie)(struct page *page); - }; - - The functions here must all be present. Currently the only one is: - - (a) get_page_cookie(): Get the token used to bind a page to a block in a - cache. This function should allocate it if it doesn't exist. - - Return -ENOMEM if there's not enough memory and -ENODATA if the page - just shouldn't be cached. - - Set *_page_cookie to point to the token and return 0 if there is now a - cookie. Note that the netfs must keep track of the cookie itself (and - free it later). page->private can be used for this (see below). - - (4) The cookie representing the primary index will be allocated according to - another parameter passed into the registration function. - -For example, kAFS (linux/fs/afs/) uses the following definitions to describe -itself: - - static struct cachefs_netfs_operations afs_cache_ops = { - .get_page_cookie = afs_cache_get_page_cookie, - }; - - struct cachefs_netfs afs_cache_netfs = { - .name = "afs", - .version = 0, - .ops = &afs_cache_ops, - }; - - -INDEX DEFINITION ----------------- - -Indexes are used for two purposes: - - (1) To speed up the finding of a file based on a series of keys (such as AFS's - "cell", "volume ID", "vnode ID"). - - (2) To make it easier to discard a subset of all the files cached based around - a particular key - for instance to mirror the removal of an AFS volume. - -However, since it's unlikely that any two netfs's are going to want to define -their index hierarchies in quite the same way, CacheFS tries to impose as few -restraints as possible on how an index is structured and where it is placed in -the tree. 
The netfs can even mix indexes and data files at the same level, but -it's not recommended. - -There are some limits on indexes: - - (1) All entries in any given index must be the same size. An array of such - entries needn't fit exactly into a page, but they will be not laid across - a page boundary. - - The netfs supplies a blob of data for each index entry, and CacheFS - provides an inode number and a flag. - - (2) The entries in one index can be of a different size to the entries in - another index. - - (3) The entry data must be journallable, and thus must be able to fit into an - update journal entry - this limits the maximum size to a little over 400 - bytes at present. - - (4) The index data must start with the key. The layout of the key is described - in the index definition, and this is used to display the key in some - appropriate way. - - (5) The depth of the index tree should be judged with care as the search - function is recursive. Too many layers will run the kernel out of stack. - -To define an index, a structure of the following type should be filled out: - - struct cachefs_index_def - { - uint8_t name[8]; - uint16_t data_size; - struct { - uint8_t type; - uint16_t len; - } keys[4]; - - cachefs_match_val_t (*match)(void *target_netfs_data, - const void *entry); - - void (*update)(void *source_netfs_data, void *entry); - }; - -This has the following fields: - - (1) The name of the index (NUL terminated unless all 8 chars are used). - - (2) The size of the data blob provided by the netfs. - - (3) A definition of the key(s) at the beginning of the blob. The netfs is - permitted to specify up to four keys. The total length must not exceed the - data size. It is assumed that the keys will be laid end to end in order, - starting at the first byte of the data. - - The type field specifies the way the data should be displayed. 
It can be - one of: - - (*) CACHEFS_INDEX_KEYS_NOTUSED - key field not used - (*) CACHEFS_INDEX_KEYS_BIN - display byte-by-byte in hex - (*) CACHEFS_INDEX_KEYS_ASCIIZ - NUL-terminated ASCII - (*) CACHEFS_INDEX_KEYS_IPV4ADDR - display as IPv4 address - (*) CACHEFS_INDEX_KEYS_IPV6ADDR - display as IPv6 address - - (4) A function to compare an in-page-cache index entry blob with the data - passed to the cookie acquisition function. This function can also be used - to extract data from the blob and copy it into the netfs's structures. - - The values this function can return are: - - (*) CACHEFS_MATCH_FAILED - failed to match - (*) CACHEFS_MATCH_SUCCESS - successful match - (*) CACHEFS_MATCH_SUCCESS_UPDATE - successful match, entry needs update - (*) CACHEFS_MATCH_SUCCESS_DELETE - entry should be deleted - - For example, in linux/fs/afs/vnode.c: - - static cachefs_match_val_t - afs_vnode_cache_match(void *target, const void *entry) - { - const struct afs_cache_vnode *cvnode = entry; - struct afs_vnode *vnode = target; - - if (vnode->fid.vnode != cvnode->vnode_id) - return CACHEFS_MATCH_FAILED; - - if (vnode->fid.unique != cvnode->vnode_unique || - vnode->status.version != cvnode->data_version) - return CACHEFS_MATCH_SUCCESS_DELETE; - - return CACHEFS_MATCH_SUCCESS; - } - - (5) A function to initialise or update an in-page-cache index entry blob from - netfs data passed to CacheFS by the netfs. This function should not assume - that there's any data yet in the in-page-cache. 
- - Continuing the above example: - - static void afs_vnode_cache_update(void *source, void *entry) - { - struct afs_cache_vnode *cvnode = entry; - struct afs_vnode *vnode = source; - - cvnode->vnode_id = vnode->fid.vnode; - cvnode->vnode_unique = vnode->fid.unique; - cvnode->data_version = vnode->status.version; - } - -To finish the above example, the index definition for the "vnode" level is as -follows: - - struct cachefs_index_def afs_vnode_cache_index_def = { - .name = "vnode", - .data_size = sizeof(struct afs_cache_vnode), - .keys[0] = { CACHEFS_INDEX_KEYS_BIN, 4 }, - .match = afs_vnode_cache_match, - .update = afs_vnode_cache_update, - }; - -The first element of struct afs_cache_vnode is the vnode ID. - -And for contrast, the cell index definition is: - - struct cachefs_index_def afs_cache_cell_index_def = { - .name = "cell_ix", - .data_size = sizeof(afs_cell_t), - .keys[0] = { CACHEFS_INDEX_KEYS_ASCIIZ, 64 }, - .match = afs_cell_cache_match, - .update = afs_cell_cache_update, - }; - -The cell index is the primary index for kAFS. - - -NETWORK FILESYSTEM (UN)REGISTRATION ------------------------------------ - -The first step is to declare the network filesystem to the cache. This also -involves specifying the layout of the primary index (for AFS, this would be the -"cell" level). - -The registration function is: - - int cachefs_register_netfs(struct cachefs_netfs *netfs, - struct cachefs_index_def *primary_idef); - -It just takes pointers to the netfs definition and the primary index -definition. It returns 0 or an error as appropriate. - -For kAFS, registration is done as follows: - - ret = cachefs_register_netfs(&afs_cache_netfs, - &afs_cache_cell_index_def); - -The last step is, of course, unregistration: - - void cachefs_unregister_netfs(struct cachefs_netfs *netfs); - - -INDEX REGISTRATION ------------------- - -The second step is to inform cachefs about part of an index hierarchy that can -be used to locate files. 
This is done by requesting a cookie for each index in -the path to the file: - - struct cachefs_cookie * - cachefs_acquire_cookie(struct cachefs_cookie *iparent, - struct cachefs_index_def *idef, - void *netfs_data); - -This function creates an index entry in the index represented by iparent, -loading the associated blob by calling iparent's update method with the -supplied netfs_data. - -It also creates a new index inode, formatted according to the definition -supplied in idef. The new cookie is then returned in *_cookie. - -Note that this function never returns an error - all errors are handled -internally. It may also return CACHEFS_NEGATIVE_COOKIE. It is quite acceptable -to pass this token back to this function as iparent (or even to the relinquish -cookie, read page and write page functions - see below). - -Note also that no indexes are actually created on disc until a data file needs -to be created somewhere down the hierarchy. Furthermore, an index may be -created in several different caches independently at different times. This is -all handled transparently, and the netfs doesn't see any of it. - -For example, with AFS, a cell would be added to the primary index. 
This index -entry would have a dependent inode containing a volume location index for the -volume mappings within this cell: - - cell->cache = - cachefs_acquire_cookie(afs_cache_netfs.primary_index, - &afs_vlocation_cache_index_def, - cell); - -Then when a volume location was accessed, it would be entered into the cell's -index and an inode would be allocated that acts as a volume type and hash chain -combination: - - vlocation->cache = - cachefs_acquire_cookie(cell->cache, - &afs_volume_cache_index_def, - vlocation); - -And then a particular flavour of volume (R/O for example) could be added to -that index, creating another index for vnodes (AFS inode equivalents): - - volume->cache = - cachefs_acquire_cookie(vlocation->cache, - &afs_vnode_cache_index_def, - volume); - - -DATA FILE REGISTRATION ----------------------- - -The third step is to request a data file be created in the cache. This is -almost identical to index cookie acquisition. The only difference is that a -NULL index definition is passed. - - vnode->cache = - cachefs_acquire_cookie(volume->cache, - NULL, - vnode); - - - -PAGE ALLOC/READ/WRITE ---------------------- - -And the fourth step is to propose a page be cached. There are two functions -that are used to do this. - -Firstly, the netfs should ask CacheFS to examine the caches and read the -contents cached for a particular page of a particular file if present, or else -allocate space to store the contents if not: - - typedef - void (*cachefs_rw_complete_t)(void *cookie_data, - struct page *page, - void *end_io_data, - int error); - - int cachefs_read_or_alloc_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp); - -The cookie argument must specify a data file cookie, the page specified will -have the data loaded into it (and is also used to specify the page number), and -the gfp argument is used to control how any memory allocations made are satisfied. 
- -If the cookie indicates the inode is not cached: - - (1) The function will return -ENOBUFS. - -Else if there's a copy of the page resident on disc: - - (1) The function will submit a request to read the data off the disc directly - into the page specified. - - (2) The function will return 0. - - (3) When the read is complete, end_io_func() will be invoked with: - - (*) The netfs data supplied when the cookie was created. - - (*) The page descriptor. - - (*) The data passed to the above function. - - (*) An argument that's 0 on success or negative for an error. - - If an error occurs, it should be assumed that the page contains no usable - data. - -Otherwise, if there's not a copy available on disc: - - (1) A block may be allocated in the cache and attached to the inode at the - appropriate place. - - (2) The validity journal will be marked to indicate this page does not yet - contain valid data. - - (3) The function will return -ENODATA. - - -Secondly, if the netfs changes the contents of the page (either due to an -initial download or if a user performs a write), then the page should be -written back to the cache: - - int cachefs_write_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp); - -The cookie argument must specify a data file cookie, the page specified should -contain the data to be written (and is also used to specify the page number), -and the gfp argument is used to control how any memory allocations made are -satisfied. - -If the cookie indicates the inode is not cached then: - - (1) The function will return -ENOBUFS. - -Else if there's a block allocated on disc to hold this page: - - (1) The function will submit a request to write the data to the disc directly - from the page specified. - - (2) The function will return 0. 
- - (3) When the write is complete: - - (a) Any associated validity journal entry will be cleared (the block now - contains valid data as far as CacheFS is concerned). - - (b) end_io_func() will be invoked with: - - (*) The netfs data supplied when the cookie was created. - - (*) The page descriptor. - - (*) The data passed to the above function. - - (*) An argument that's 0 on success or negative for an error. - - If an error happens, it can be assumed that the page has been - discarded from the cache. - - -PAGE UNCACHING --------------- - -To uncache a page, this function should be called: - - void cachefs_uncache_page(struct cachefs_cookie *cookie, - struct page *page); - -This detaches the page specified from the data file indicated by the cookie and -unbinds it from the underlying block. - -Note that pages can't be explicitly detached from the a data file. The whole -data file must be retired (see the relinquish cookie function below). - -Furthermore, note that this does not cancel the asynchronous read or write -operation started by the read/alloc and write functions. - - -INDEX AND DATA FILE UPDATE --------------------------- - -To request an update of the index data for an index or data file, the following -function should be called: - - void cachefs_update_cookie(struct cachefs_cookie *cookie); - -This function will refer back to the netfs_data pointer stored in the cookie by -the acquisition function to obtain the data to write into each revised index -entry. The update method in the parent index definition will be called to -transfer the data. - - -INDEX AND DATA FILE UNREGISTRATION ----------------------------------- - -To get rid of a cookie, this function should be called. - - void cachefs_relinquish_cookie(struct cachefs_cookie *cookie, - int retire); - -If retire is non-zero, then the index or file will be marked for recycling, and -all copies of it will be removed from all active caches in which it is present. 
- -If retire is zero, then the inode may be available again next the the -acquisition function is called. - -One very important note - relinquish must NOT be called unless all "child" -indexes, files and pages have been relinquished first. - - -PAGE TOKEN MANAGEMENT ---------------------- - -As previously mentioned, the netfs must keep a token associated with each page -currently actively backed by the cache. This is used by CacheFS to go from a -page to the internal representation of the underlying block and back again. It -is particularly important for managing the withdrawal of a cache whilst it is -in active service (eg: it got unmounted). - -The token is this: - - struct cachefs_page { - ... - }; - -Note that all fields are for internal CacheFS use only. - -The token only needs to be allocated when CacheFS asks for it. This it will do -by calling the get_page_cookie() method in the netfs definition ops table. Once -allocated, the same token should be presented every time the method is called -again for a particular page. - -The token should be retained by the netfs, and should be deleted only after the -page has been uncached. - -One way to achieve this is to attach the token to page->private (and set the -PG_private bit on the page) once allocated. Shortcut routines are provided by -CacheFS to do this. Firstly, to retrieve if present and allocate if not: - - struct cachefs_page *cachefs_page_get_private(struct page *page, - unsigned gfp); - -Secondly to retrieve if present and BUG if not: - - static inline - struct cachefs_page *cachefs_page_grab_private(struct page *page); - -To clean up the tokens, the netfs inode hosting the page should be provided -with address space operations that circumvent the buffer-head operations for a -page. For instance: - - struct address_space_operations afs_fs_aops = { - ... 
- .sync_page = block_sync_page, - .set_page_dirty = __set_page_dirty_nobuffers, - .releasepage = afs_file_releasepage, - .invalidatepage = afs_file_invalidatepage, - }; - - static int afs_file_invalidatepage(struct page *page, - unsigned long offset) - { - struct afs_vnode *vnode = AFS_FS_I(page->mapping->host); - int ret = 1; - - BUG_ON(!PageLocked(page)); - if (!PagePrivate(page)) - return 1; - cachefs_uncache_page(vnode->cache,page); - if (offset == 0) - return 1; - BUG_ON(!PageLocked(page)); - if (PageWriteback(page)) - return 0; - return page->mapping->a_ops->releasepage(page, 0); - } - - static int afs_file_releasepage(struct page *page, int gfp_flags) - { - struct cachefs_page *token; - struct afs_vnode *vnode = AFS_FS_I(page->mapping->host); - - if (PagePrivate(page)) { - cachefs_uncache_page(vnode->cache, page); - token = (struct cachefs_page *) page->private; - page->private = 0; - ClearPagePrivate(page); - if (token) - kfree(token); - } - return 0; - } - - -INDEX AND DATA FILE INVALIDATION --------------------------------- - -There is no direct way to invalidate an index subtree or a data file. To do -this, the caller should relinquish and retire the cookie they have, and then -acquire a new one. diff -uNr linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/backend-api.txt linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/backend-api.txt --- linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/backend-api.txt 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/backend-api.txt 2004-10-04 17:24:21.770274843 +0100 @@ -0,0 +1,313 @@ + ========================== + FS-CACHE CACHE BACKEND API + ========================== + +The FS-Cache system provides an API by which actual caches can be supplied to +FS-Cache for it to then serve out to network filesystems and other interested +parties.: + +This API is declared in <linux/fscache-cache.h>. 
+ + +==================================== +INITIALISING AND REGISTERING A CACHE +==================================== + +To start off, a cache definition must be initialised and registered for each +cache the backend wants to make available. For instance, CacheFS does this in +the fill_super() operation on mounting. + +The cache definition (struct fscache_cache) should be initialised by calling: + + void fscache_init_cache(struct fscache_cache *cache, + struct fscache_cache_ops *ops, + unsigned fsdef_ino, + const char *idfmt, + ...) + +Where: + + (*) "cache" is a pointer to the cache definition; + + (*) "ops" is a pointer to the table of operations that the backend supports on + this cache; + + (*) "fsdef_ino" is the reference number of the FileSystem DEFinition index + (the top-level index), which in CacheFS is its inode number; + + (*) and a format and printf-style arguments for constructing a label for the + cache. + + +The cache should then be registered with FS-Cache by passing a pointer to the +previously initialised cache definition to: + + void fscache_add_cache(struct fscache_cache *cache) + + +===================== +UNREGISTERING A CACHE +===================== + +A cache can be withdrawn from the system by calling this function with a +pointer to the cache definition: + + void fscache_withdraw_cache(struct fscache_cache *cache) + +In CacheFS's case, this is called by put_super(). + +It is possible to check to see if a cache has been withdrawn by calling: + + int fscache_is_cache_withdrawn(struct fscache_cache *cache) + +Which will return non-zero if it has been, zero if it is still active. + + +================== +FS-CACHE UTILITIES +================== + +FS-Cache provides some utilities that a cache backend may make use of: + + (*) Find parent of node. + + struct fscache_node *fscache_find_parent_node(struct fscache_node *node) + + This allows a backend to find the logical parent of an index or data file + in the cache hierarchy. 
+ + (*) Allocate a page token. + + struct fscache_page *fscache_page_get_private(struct page *page, + unsigned gfp); + + If the page has a page token attached, then this is returned by this + function. If it doesn't have one, then a page token is allocated with the + specified allocation flags and attached to the page's private value. The + error ENOMEM is returned if there's no memory available. + + (*) Grab an existing page token. + + struct fscache_page *fscache_page_grab_private(struct page *page) + + This function returns a pointer to the page token attached to the page's + private value if it exists, and BUG's if it does not. + + +======================== +RELEVANT DATA STRUCTURES +======================== + + (*) Index/Data file FS-Cache representation cookie. + + struct fscache_cookie { + struct fscache_index_def *idef; + struct fscache_netfs *netfs; + void *netfs_data; + ... + }; + + The fields that might be of use to the backend describe the index + definition (indexes only), the netfs definition and the netfs's data for + this cookie. The index definition contains a number of functions supplied + by the netfs for matching index entries; these are required to provide + some of the cache operations. + + (*) Cached search result. + + struct fscache_search_result { + unsigned ino; + ... + }; + + This is used by FS-Cache to keep track of what nodes it has found in what + caches. Some of the cache operations set the "cache node number" held + therein. + + (*) In-cache node representation. + + struct fscache_node { + struct fscache_cookie *cookie; + unsigned long flags; + #define FSCACHE_NODE_ISINDEX 0 + ... + }; + + Each node contains a pointer to the cookie that represents the index or + data file it is backing. It also contains a flag that indicates whether + this is an index or not. This should be initialised by calling + fscache_node_init(node). + + (*) Filesystem definition (FSDEF) index entry representation. 
+ + struct fscache_fsdef_index_entry { + uint8_t name[24]; /* name of netfs */ + uint32_t version; /* version of layout */ + }; + + This structure defines the layout of the data in the FSDEF index + maintained by the FS-Cache facility for distinguishing between the caches + for separate netfs's. + + +================ +CACHE OPERATIONS +================ + +The cache backend provides FS-Cache with a table of operations that can be +performed on the denizens of the cache. These are held in a structure of type: + + struct fscache_cache_ops + + (*) Name of cache provider [mandatory]. + + const char *name + + This isn't strictly an operation, but should be pointed at a string naming + the backend. + + (*) Node lookup [mandatory]. + + struct fscache_node *(*lookup_node)(struct fscache_cache *cache, + unsigned ino) + + This method is used to turn a logical cache node number into a handle on a + representation of that node. + + (*) Increment node refcount [mandatory]. + + struct fscache_node *(*grab_node)(struct fscache_node *node) + + This method is called to increment the reference count on a node. It may + fail (for instance if the cache is being withdrawn). + + (*) Lock/Unlock node [mandatory]. + + void (*lock_node)(struct fscache_node *node) + void (*unlock_node)(struct fscache_node *node) + + These methods are used to exclusively lock a node. It must be possible to + schedule with the lock held, so a spinlock isn't sufficient. + + (*) Unreference node [mandatory]. + + void (*put_node)(struct fscache_node *node) + + This method is used to discard a reference to a node. The node may be + destroyed when all the references held by FS-Cache are released. + + (*) Search an index [mandatory]. + + int (*index_search)(struct fscache_node *index, + struct fscache_cookie *cookie, + struct fscache_search_result *result) + + This method is called to search an index for a node that matches the + criteria attached to the cookie (cookie->netfs_data).
This should be + matched by calling index->cookie->idef->match(). + + The cache backend is responsible for dealing with the match result, + including updating or discarding existing index entries. An index entry + can be updated by calling index->cookie->idef->update(). + + If the search is successful, the node number should be stored in + result->ino and zero returned. If not successful, error ENOENT should be + returned if no entry was found, or some other error otherwise. + + (*) Create a new node [mandatory]. + + int (*index_add)(struct fscache_node *index, + struct fscache_cookie *cookie, + struct fscache_search_result *result) + + This method is called to create a new node on disc and add an entry for it + to the specified index. The index entry for the new node should be + obtained by calling index->cookie->idef->update() and passing it the + argument cookie. + + If successful, the node number should be stored in result->ino and zero + should be returned. + + (*) Update a node [mandatory]. + + int (*index_update)(struct fscache_node *index, + struct fscache_node *node) + + This is called to update the on-disc index entry for the specified + node. The new information should be in node->cookie->netfs_data. This can + be obtained by calling index->cookie->idef->update() and passing it + node->cookie. + + (*) Synchronise a cache to disc [mandatory]. + + void (*sync)(struct fscache_cache *cache) + + This is called to ask the backend to synchronise a cache with disc. + + (*) Dissociate a cache [mandatory]. + + void (*dissociate_pages)(struct fscache_cache *cache) + + This is called to ask the cache to dissociate all netfs pages from + mappings to disc. It is assumed that the backend cache will have some way + of finding all the page tokens that refer to its own blocks. + + (*) Request page be read from cache [mandatory]. 
+ + int (*read_or_alloc_page)(struct fscache_node *node, + struct page *page, + struct fscache_page *pageio, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) + + This is called to attempt to read a netfs page from disc, or to allocate a + backing block if not. FS-Cache will have done as much checking as it can + before calling, but most of the work belongs to the backend. + + If there's no page on disc, then -ENODATA should be returned if the + backend managed to allocate a backing block; -ENOBUFS or -ENOMEM if it + didn't. + + If there is a page on disc, then a read operation should be queued and 0 + returned. When the read finishes, end_io_func() should be called with the + following arguments: + + (*end_io_func)(node->cookie->netfs_data, + page, + end_io_data, + error); + + (*) Request page be written to cache [mandatory]. + + int (*write_page)(struct fscache_node *node, + struct page *page, + struct fscache_page *pageio, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) + + This is called to write from a page on which there was a previously + successful read_or_alloc_page() call. FS-Cache filters out pages that + don't have mappings. + + If there's no block on disc available, then -ENOBUFS should be returned + (or -ENOMEM if there wasn't any memory to be had). + + If the write operation could be queued, then 0 should be returned. When + the write completes, end_io_func() should be called with the following + arguments: + + (*end_io_func)(node->cookie->netfs_data, + page, + end_io_data, + error); + + (*) Discard mapping [mandatory]. + + void (*uncache_page)(struct fscache_node *node, + struct fscache_page *page_token) + + This is called when a page is being booted from the pagecache. The cache + backend needs to break the links between the page token and whatever + internal representations it maintains. 
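The return-value contract described above for read_or_alloc_page() and write_page() can be sketched in userspace C. This is only an illustration under stated assumptions, not CacheFS code: toy_cache, toy_read_or_alloc_page(), toy_write_page(), toy_count_completion() and the fixed-size block table are all hypothetical, and the completion callback is invoked synchronously rather than from a queued I/O. Only the error codes and the callback argument order follow the text.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_NBLOCKS   4    /* hypothetical capacity of the toy cache */
#define TOY_PAGE_SIZE 16   /* tiny "page" to keep the example small */

/* Completion callback with the argument order used in the text. */
typedef void (*toy_rw_complete_t)(void *cookie_data, void *page,
                                  void *end_io_data, int error);

struct toy_block {
	int in_use;              /* block is bound to a page index */
	int has_data;            /* block holds valid data */
	unsigned long index;     /* page index within the data file */
	char data[TOY_PAGE_SIZE];
};

struct toy_cache {
	struct toy_block blocks[TOY_NBLOCKS];
};

/*
 * Returns 0 if the page was cached (callback invoked with error 0),
 * -ENODATA if nothing was cached but a backing block is now reserved,
 * -ENOBUFS if nothing was cached and no block could be reserved.
 */
int toy_read_or_alloc_page(struct toy_cache *cache, unsigned long index,
                           void *page, void *cookie_data,
                           toy_rw_complete_t end_io_func, void *end_io_data)
{
	struct toy_block *free_block = NULL;
	size_t i;

	assert(cache && page && end_io_func);

	for (i = 0; i < TOY_NBLOCKS; i++) {
		struct toy_block *b = &cache->blocks[i];

		if (!b->in_use) {
			if (!free_block)
				free_block = b;
			continue;
		}
		if (b->index != index)
			continue;
		if (!b->has_data)
			return -ENODATA;  /* reserved earlier, never written */
		memcpy(page, b->data, TOY_PAGE_SIZE);
		end_io_func(cookie_data, page, end_io_data, 0);
		return 0;
	}

	if (!free_block)
		return -ENOBUFS;
	free_block->in_use = 1;
	free_block->has_data = 0;
	free_block->index = index;
	return -ENODATA;
}

/* Write to a block reserved by an earlier toy_read_or_alloc_page(). */
int toy_write_page(struct toy_cache *cache, unsigned long index,
                   const void *page)
{
	size_t i;

	assert(cache && page);

	for (i = 0; i < TOY_NBLOCKS; i++) {
		struct toy_block *b = &cache->blocks[i];

		if (b->in_use && b->index == index) {
			memcpy(b->data, page, TOY_PAGE_SIZE);
			b->has_data = 1;
			return 0;
		}
	}
	return -ENOBUFS;  /* no block was reserved for this page */
}

/* Example completion handler: counts successful completions. */
void toy_count_completion(void *cookie_data, void *page,
                          void *end_io_data, int error)
{
	(void)cookie_data;
	(void)page;
	if (error == 0)
		++*(int *)end_io_data;
}
```

In this sketch a first read of an uncached page returns -ENODATA and reserves a block, a later write fills it, and only then does a read return 0 and fire the callback. A real backend would queue the I/O and report completion asynchronously.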
diff -uNr linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/cachefs.txt linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/cachefs.txt --- linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/cachefs.txt 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/cachefs.txt 2004-10-04 17:25:36.878085706 +0100 @@ -0,0 +1,274 @@ + =========================== + CacheFS: Caching Filesystem + =========================== + +======== +OVERVIEW +======== + +CacheFS is a backend for the general filesystem cache facility. + +CacheFS uses a block device directly rather than a bunch of files under an +already mounted filesystem. For why this is so, see further on. If necessary, +however, a file can be loopback mounted as a cache. + + +CacheFS provides the following facilities: + + (1) More than one block device can be mounted as a cache. + + (2) Caches can be mounted / unmounted at any time. + + (3) All metadata modifications (this includes index contents) are performed + as journalled transactions. These are replayed on mounting. + + +============================================= +WHY A BLOCK DEVICE? WHY NOT A BUNCH OF FILES? +============================================= + +CacheFS is backed by a block device rather than being backed by a bunch of +files on a filesystem. This confers several advantages: + + (1) Performance. + + Going directly to a block device means that we can DMA directly to/from + the netfs's pages. If another filesystem was managing the backing + store, everything would have to be copied between pages. Whilst DirectIO + does exist, it doesn't appear easy to make use of in this situation. + + New address space or file operations could be added to make it possible to + persuade a backing discfs to generate block I/O directly to/from disc + blocks under its control, but that then means the discfs has to keep track + of I/O requests to pages not under its control.
+ + Furthermore, we only have to do one lot of readahead calculations, not + two; in the discfs backing case, the netfs would do one and the discfs + would do one. + + (2) Memory. + + Using a block device means that we have a lower memory usage - all data + pages belong to the netfs we're backing. If we used a filesystem, we would + have twice as many pages at certain points - one from the netfs and one + from the backing discfs. In the backing discfs model, under situations of + memory pressure, we'd have to allocate or keep around a discfs page to be + able to write out a netfs page; or else we'd need to be able to punch a + hole in the backing file. + + Furthermore, whilst we have to keep a CacheFS inode around in memory for + every netfs inode we're backing, a backing discfs would have to keep the + dentry and possibly a file struct too. + + (3) Holes. + + The cache uses holes to indicate to the netfs that it hasn't yet + downloaded the data for that page. + + Since CacheFS is its own filesystem, it can support holes in files + trivially. Running on top of another discfs would limit us to using ones + that can support holes. + + Furthermore, it would have to be made possible to detect holes in a discfs + file, rather than just seeing zero filled blocks. + + (4) Data Consistency. + + CacheFS uses a pair of journals to keep track of the state of the cache + and all the pages contained therein. This means that it doesn't get into + an inconsistent state in the on-disc cache and it doesn't lose disc space. + + CacheFS takes especial care between the allocation of a block and its + splicing into the on-disc pointer tree, and the data having been written + to disc. If power is interrupted and then restored, the journals are + replayed and if it is seen that a block was allocated but not written it + is then punched out. Being backed by a discfs, I'm not certain what will + happen.
It may well be possible to mark a discfs's journal, if it has one, + but how does the discfs deal with those marks? This also limits consistent + caching to running on journalled discfs's where there's a function to + write extraordinary marks into the journal. + + The alternative would be to keep flags in the superblock, and to + re-initialise the cache if it wasn't cleanly unmounted. + + Knowing that your cache is in a good state is vitally important if you, + say, put /usr on AFS. Some organisations put everything barring /etc, + /sbin, /lib and /var on AFS and have an enormous cache on every + computer. Imagine if the power goes out and renders every cache + inconsistent, requiring all the computers to re-initialise their caches + when the power comes back on... + + (5) Recycling. + + Recycling is simple on CacheFS. It can just scan the metadata index to + look for inodes that require reclamation/recycling; and it can also build + up a list of the least recently used inodes so that they can be reclaimed + later to make space. + + Doing this on a discfs would require a search going down through a nest + of directories, and would probably have to be done in userspace. + + (6) Disc Space. + + Whilst the block device does set a hard ceiling on the amount of space + available, CacheFS can guarantee that all that space will be available to + the cache. On a discfs-backed cache, the administrator would probably want + to set a cache size limit, but the system wouldn't be able to guarantee that + all that space would be available to the cache - not unless that cache was + on a partition of its own. + + Furthermore, with a discfs-backed cache, if the recycler starts to reclaim + cache files to make space, the freed blocks may just be eaten directly by + userspace programs, potentially resulting in the entire cache being + consumed. Alternatively, netfs operations may end up being held up because + the cache can't get blocks on which to store the data. + + (7) Users.
+ + Users can't so easily go into CacheFS and run amok. The worst they can do + is cause bits of the cache to be recycled early. With a discfs-backed + cache, they can do all sorts of bad things to the files belonging to the + cache, and they can do this quite by accident. + + +On the other hand, there would be some advantages to using a file-based cache +rather than a blockdev-based cache: + + (1) Having to copy to a discfs's page would mean that a netfs could just make + the copy and then assume its own page is ready to go. + + (2) Backing onto a discfs wouldn't require a committed block device. You would + just nominate a directory and go from there. With CacheFS you have to + repartition or install an extra drive to make use of it in an existing + system (though the loopback device offers a way out). + + (3) CacheFS requires the netfs to store a key in any pertinent index entry, + and it also permits a limited amount of arbitrary data to be stored there. + + A discfs could be requested to store the netfs's data in xattrs, and the + filename could be used to store the key, though the key would have to be + rendered as text not binary. Likewise indexes could be rendered as + directories with xattrs. + + (4) You could easily make your cache bigger if the discfs has plenty of space; + you could even go across multiple mountpoints. + + +====================== +GENERAL ON-DISC LAYOUT +====================== + +The filesystem is divided into a number of parts: + + 0 +---------------------------+ + | Superblock | + 1 +---------------------------+ + | Update Journal | + +---------------------------+ + | Validity Journal | + +---------------------------+ + | Write-Back Journal | + +---------------------------+ + | | + | Data | + | | + END +---------------------------+ + +The superblock contains the filesystem ID tags and pointers to all the other +regions.
+ +The update journal consists of a set of entries of sector size that keep track +of what changes have been made to the on-disc filesystem, but not yet +committed. + +The validity journal contains records of data blocks that have been allocated +but not yet written. Upon journal replay, all these blocks will be detached +from their pointers and recycled. + +The writeback journal keeps track of changes that have been made locally to +data blocks, but that have not yet been committed back to the server. This is +not yet implemented. + +The journals are replayed upon mounting to make sure that the cache is in a +reasonable state. + +The data region holds a number of things: + + (1) Index Files + + These are files of entries used by CacheFS internally and by filesystems + that wish to cache data here (such as AFS) to keep track of what's in + the cache at any given time. + + The first index file (inode 1) is special. It holds the CacheFS-specific + metadata for every file in the cache (including direct, single-indirect + and double-indirect block pointers). + + The second index file (inode 2) is also special. It has an entry for + each filesystem that's currently holding data in this cache. + + Every allocated entry in an index has an inode bound to it. This inode is + either another index file or it is a data file. + + (2) Cached Data Files + + These are caches of files from remote servers. Holes in these files + represent blocks not yet obtained from the server. + + (3) Indirection Blocks + + Should a file have more blocks than can be pointed to by the few + pointers in its storage management record, then indirection blocks will + be used to point to further data or indirection blocks. + + Two levels of indirection are currently supported: + + - single indirection + - double indirection + + (4) Allocation Nodes and Free Blocks + + The free blocks of the filesystem are kept in two single-branched + "trees".
One tree is the blocks that are ready to be allocated, and the + other is the blocks that have just been recycled. When the former tree + becomes empty, the latter tree is decanted across. + + Each tree is arranged as a chain of "nodes", each node points to the next + node in the chain (unless it's at the end) and also up to 1022 free + blocks. + +Note that all blocks are PAGE_SIZE in size. The blocks are numbered starting +with the superblock at 0. Using 32-bit block pointers, a maximum number of +0xffffffff blocks can be accessed, meaning that the maximum cache size is ~16TB +for 4KB pages. + + +======== +MOUNTING +======== + +Since CacheFS is actually a quasi-filesystem, it requires a block device behind +it. The way to give it one is to mount it as cachefs type on a directory +somewhere. The mounted filesystem will then present the user with a set of +directories outlining the index structure resident in the cache. Indexes +(directories) and files can be turfed out of the cache by the sysadmin through +the use of rmdir and unlink. + +For instance, if a cache contains AFS data, the user might see the following: + + root>mount -t cachefs /dev/hdg9 /cache-hdg9 + root>ls -1 /cache-hdg9 + afs + root>ls -1 /cache-hdg9/afs + cambridge.redhat.com + root>ls -1 /cache-hdg9/afs/cambridge.redhat.com + root.afs + root.cell + +However, a block device that's going to be used for a cache must be prepared +before it can be mounted initially. This is done very simply by: + + echo "cachefs___" >/dev/hdg9 + +During the initial mount, the basic structure will be scribed into the cache, +and then a background thread will "recycle" the as-yet unused data blocks. 
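The size limit quoted in the on-disc layout notes above (0xffffffff blocks of PAGE_SIZE each, about 16TB for 4KB pages) can be checked with a few lines of arithmetic. This is a minimal sketch; the names TOY_MAX_BLOCKS and max_cache_bytes() are hypothetical and do not come from CacheFS.

```c
#include <assert.h>
#include <stdint.h>

/* Largest block count reachable with 32-bit block pointers (from the text). */
#define TOY_MAX_BLOCKS 0xffffffffULL

/* Upper bound on the cache size in bytes for a given block (page) size. */
uint64_t max_cache_bytes(uint64_t page_size)
{
	/* Page sizes are assumed to be non-zero powers of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return TOY_MAX_BLOCKS * page_size;
}
```

For 4096-byte pages this gives 17592186040320 bytes, which is one page short of 2^44 bytes (16TB in the text's units).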
diff -uNr linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/fscache.txt linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/fscache.txt --- linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/fscache.txt 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/fscache.txt 2004-10-04 15:49:44.108031173 +0100 @@ -0,0 +1,74 @@ + ========================== + General Filesystem Caching + ========================== + +======== +OVERVIEW +======== + +This facility is a general purpose cache for network filesystems, though it +could be used for caching other things such as ISO9660 filesystems too. + +FS-Cache mediates between cache backends (such as CacheFS) and network +filesystems. FS-Cache does not follow the idea of completely loading every +netfs file opened in its entirety into a cache before permitting it to be +accessed and then serving the pages out of that cache rather than the netfs +inode because: + + (1) It must be practical to operate without a cache. + + (2) The size of any accessible file must not be limited to the size of the + cache. + + (3) The combined size of all opened files (this includes mapped libraries) + must not be limited to the size of the cache. + + (4) The user should not be forced to download an entire file just to do a + one-off access of a small portion of it (such as might be done with the + "file" program). + +It instead serves the cache out in PAGE_SIZE chunks as and when requested by +the netfs('s) using it. + + +FS-Cache provides the following facilities: + + (1) More than one cache can be used at once. + + (2) Caches can be added / removed at any time. + + (3) The netfs is provided with an interface that allows either party to + withdraw caching facilities from a file (required for (2)). + + (4) The interface to the netfs returns as few errors as possible, preferring + rather to let the netfs remain oblivious. 
+ + (5) Cookies are used to represent files and indexes to the netfs. The simplest + cookie is just a NULL pointer - indicating nothing cached there. + + (6) The netfs is allowed to propose - dynamically - any index hierarchy it + desires, though it must be aware that the index search function is + recursive and stack space is limited. + + (7) Data I/O is done direct to and from the netfs's pages. The netfs indicates + that page A is at index B of the data-file represented by cookie C, and + that it should be read or written. The cache backend may or may not start + I/O on that page, but if it does, a netfs callback will be invoked to + indicate completion. The I/O may be either synchronous or asynchronous. + + (8) Cookies can be "retired" upon release. At this point FS-Cache will mark + them as obsolete and the index hierarchy rooted at that point will get + recycled. + + (9) The netfs provides a "match" function for index searches. In addition to + saying whether a match was made or not, this can also specify that an + entry should be updated or deleted. + + +The netfs API to FS-Cache can be found in: + + Documentation/filesystems/caching/netfs-api.txt + +The cache backend API to FS-Cache can be found in: + + Documentation/filesystems/caching/backend-api.txt diff -uNr linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/netfs-api.txt linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/netfs-api.txt --- linux-2.6.9-rc2-mm4/Documentation/filesystems/caching/netfs-api.txt 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/Documentation/filesystems/caching/netfs-api.txt 2004-10-04 16:50:10.373849277 +0100 @@ -0,0 +1,575 @@ + =============================== + FS-CACHE NETWORK FILESYSTEM API + =============================== + +There's an API by which a network filesystem can make use of the FS-Cache +facilities. This is based around a number of principles: + + (1) Every file and index is represented by a cookie. 
This cookie may or may + not have anything associated with it, but the netfs doesn't need to care. + + (2) Barring the top-level index (one entry per cached netfs), the index + hierarchy for each netfs is structured according to the whim of the netfs. + + (3) Any netfs page being backed by the cache must have a small token + associated with it (possibly pointed to by page->private) so that FS-Cache + can keep track of it. + +This API is declared in <linux/fscache.h>. + + +============================= +NETWORK FILESYSTEM DEFINITION +============================= + +FS-Cache needs a description of the network filesystem. This is specified using +a record of the following structure: + + struct fscache_netfs { + const char *name; + unsigned version; + struct fscache_netfs_operations *ops; + struct fscache_cookie *primary_index; + ... + }; + +The first three fields should be filled in before registration, and the fourth +will be filled in by the registration function; any other fields should just be +ignored and are for internal use only. + +The fields are: + + (1) The name of the netfs (used as the key in the toplevel index). + + (2) The version of the netfs (if the name matches but the version doesn't, the + entire on-disc hierarchy for this netfs will be scrapped and begun + afresh). + + (3) The operations table is defined as follows: + + struct fscache_netfs_operations { + struct fscache_page *(*get_page_cookie)(struct page *page); + }; + + The functions here must all be present. Currently the only one is: + + (a) get_page_token(): Get the token used to bind a page to a block in a + cache. This function should allocate it if it doesn't exist. + + Return -ENOMEM if there's not enough memory and -ENODATA if the page + just shouldn't be cached. + + Set *_page_token to point to the token and return 0 if there is now a + token. Note that the netfs must keep track of the token itself (and + free it later). page->private can be used for this (see below).
+ + (4) The cookie representing the primary index will be allocated according to + another parameter passed into the registration function. + +For example, kAFS (linux/fs/afs/) uses the following definitions to describe +itself: + + static struct fscache_netfs_operations afs_cache_ops = { + .get_page_token = afs_cache_get_page_token, + }; + + struct fscache_netfs afs_cache_netfs = { + .name = "afs", + .version = 0, + .ops = &afs_cache_ops, + }; + + +================ +INDEX DEFINITION +================ + +Indexes are used for two purposes: + + (1) To speed up the finding of a file based on a series of keys (such as AFS's + "cell", "volume ID", "vnode ID"). + + (2) To make it easier to discard a subset of all the files cached based around + a particular key - for instance to mirror the removal of an AFS volume. + +However, since it's unlikely that any two netfs's are going to want to define +their index hierarchies in quite the same way, FS-Cache tries to impose as few +restraints as possible on how an index is structured and where it is placed in +the tree. The netfs can even mix indexes and data files at the same level, but +it's not recommended. + +There are some limits on indexes: + + (1) All entries in any given index must be the same size. The netfs supplies a + blob of data for each index entry. + + (2) The entries in one index can be of a different size to the entries in + another index. + + (3) The entry data must be atomically journallable, so it is limited to 400 + bytes at present. + + (4) The index data must start with the key. The layout of the key is described + in the index definition, and this is used to display the key in some + appropriate way. + + (5) The depth of the index tree should be judged with care as the search + function is recursive. Too many layers will run the kernel out of stack. 
+ +To define an index, a structure of the following type should be filled out: + + struct fscache_index_def + { + uint8_t name[8]; + uint16_t data_size; + struct { + uint8_t type; + uint16_t len; + } keys[4]; + + fscache_match_val_t (*match)(void *target_netfs_data, + const void *entry); + + void (*update)(void *source_netfs_data, void *entry); + }; + +This has the following fields: + + (1) The name of the index (NUL terminated unless all 8 chars are used). + + (2) The size of the data blob provided by the netfs. + + (3) A definition of the key(s) at the beginning of the blob. The netfs is + permitted to specify up to four keys. The total length must not exceed the + data size. It is assumed that the keys will be laid end to end in order, + starting at the first byte of the data. + + The type field specifies the way the data should be displayed. It can be + one of: + + (*) FSCACHE_INDEX_KEYS_NOTUSED - key field not used + (*) FSCACHE_INDEX_KEYS_BIN - display byte-by-byte in hex + (*) FSCACHE_INDEX_KEYS_ASCIIZ - NUL-terminated ASCII + (*) FSCACHE_INDEX_KEYS_IPV4ADDR - display as IPv4 address + (*) FSCACHE_INDEX_KEYS_IPV6ADDR - display as IPv6 address + + (4) A function to compare an in-page-cache index entry blob with the data + passed to the cookie acquisition function. This function can also be used + to extract data from the blob and copy it into the netfs's structures. 
+ + The values this function can return are: + + (*) FSCACHE_MATCH_FAILED - failed to match + (*) FSCACHE_MATCH_SUCCESS - successful match + (*) FSCACHE_MATCH_SUCCESS_UPDATE - successful match, entry needs update + (*) FSCACHE_MATCH_SUCCESS_DELETE - entry should be deleted + + For example, in linux/fs/afs/vnode.c: + + static fscache_match_val_t + afs_vnode_cache_match(void *target, const void *entry) + { + const struct afs_cache_vnode *cvnode = entry; + struct afs_vnode *vnode = target; + + if (vnode->fid.vnode != cvnode->vnode_id) + return FSCACHE_MATCH_FAILED; + + if (vnode->fid.unique != cvnode->vnode_unique || + vnode->status.version != cvnode->data_version) + return FSCACHE_MATCH_SUCCESS_DELETE; + + return FSCACHE_MATCH_SUCCESS; + } + + (5) A function to initialise or update an in-page-cache index entry blob from + netfs data passed to FS-Cache by the netfs. This function should not assume + that there's any data yet in the in-page-cache. + + Continuing the above example: + + static void afs_vnode_cache_update(void *source, void *entry) + { + struct afs_cache_vnode *cvnode = entry; + struct afs_vnode *vnode = source; + + cvnode->vnode_id = vnode->fid.vnode; + cvnode->vnode_unique = vnode->fid.unique; + cvnode->data_version = vnode->status.version; + } + +To finish the above example, the index definition for the "vnode" level is as +follows: + + struct fscache_index_def afs_vnode_cache_index_def = { + .name = "vnode", + .data_size = sizeof(struct afs_cache_vnode), + .keys[0] = { FSCACHE_INDEX_KEYS_BIN, 4 }, + .match = afs_vnode_cache_match, + .update = afs_vnode_cache_update, + }; + +The first element of struct afs_cache_vnode is the vnode ID. 
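The documented limits on an index definition (entry data of at most 400 bytes, and keys whose total length does not exceed the data size) can be sketched as a userspace check. This is an illustration only: struct toy_index_def mirrors just the size and key fields of struct fscache_index_def, toy_index_def_valid() is a hypothetical helper that FS-Cache is not known to provide, and the assumption that a type value of 0 means "key not used" is made purely for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_ENTRY_SIZE 400   /* journalling limit quoted in the text */
#define TOY_NKEYS          4     /* up to four keys per index definition */

/* Cut-down mirror of the size and key fields of struct fscache_index_def. */
struct toy_index_def {
	uint8_t  name[8];
	uint16_t data_size;
	struct {
		uint8_t  type;   /* 0 = key slot not used (assumed encoding) */
		uint16_t len;
	} keys[TOY_NKEYS];
};

/* Returns 1 if the definition respects the documented limits, else 0. */
int toy_index_def_valid(const struct toy_index_def *def)
{
	unsigned int total = 0;
	int i;

	assert(def);

	if (def->data_size > TOY_MAX_ENTRY_SIZE)
		return 0;

	/* Keys are laid end to end from the first byte of the entry. */
	for (i = 0; i < TOY_NKEYS; i++)
		if (def->keys[i].type != 0)
			total += def->keys[i].len;

	return total <= def->data_size;
}
```

With the "vnode" example above, a 4-byte binary key at the start of a 12-byte entry (assuming three 32-bit fields in struct afs_cache_vnode) passes the check, whereas a key longer than the entry or an entry larger than 400 bytes does not.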
+ +And for contrast, the cell index definition is: + + struct fscache_index_def afs_cache_cell_index_def = { + .name = "cell_ix", + .data_size = sizeof(struct afs_cell), + .keys[0] = { FSCACHE_INDEX_KEYS_ASCIIZ, 64 }, + .match = afs_cell_cache_match, + .update = afs_cell_cache_update, + }; + +The cell index is the primary index for kAFS. + + +=================================== +NETWORK FILESYSTEM (UN)REGISTRATION +=================================== + +The first step is to declare the network filesystem to the cache. This also +involves specifying the layout of the primary index (for AFS, this would be the +"cell" level). + +The registration function is: + + int fscache_register_netfs(struct fscache_netfs *netfs, + struct fscache_index_def *primary_idef); + +It just takes pointers to the netfs definition and the primary index +definition. It returns 0 or an error as appropriate. + +For kAFS, registration is done as follows: + + ret = fscache_register_netfs(&afs_cache_netfs, + &afs_cache_cell_index_def); + +The last step is, of course, unregistration: + + void fscache_unregister_netfs(struct fscache_netfs *netfs); + + +================== +INDEX REGISTRATION +================== + +The second step is to inform FS-Cache about part of an index hierarchy that can +be used to locate files. This is done by requesting a cookie for each index in +the path to the file: + + struct fscache_cookie * + fscache_acquire_cookie(struct fscache_cookie *iparent, + struct fscache_index_def *idef, + void *netfs_data); + +This function creates an index entry in the index represented by iparent, +loading the associated blob by calling iparent's update method with the +supplied netfs_data. + +It also creates a new index inode, formatted according to the definition +supplied in idef. The new cookie is then returned in *_cookie. + +Note that this function never returns an error - all errors are handled +internally. It may also return FSCACHE_NEGATIVE_COOKIE. 
It is quite acceptable +to pass this token back to this function as iparent (or even to the relinquish +cookie, read page and write page functions - see below). + +Note also that no indexes are actually created on disc until a data file needs +to be created somewhere down the hierarchy. Furthermore, an index may be +created in several different caches independently at different times. This is +all handled transparently, and the netfs doesn't see any of it. + +For example, with AFS, a cell would be added to the primary index. This index +entry would have a dependent inode containing a volume location index for the +volume mappings within this cell: + + cell->cache = + fscache_acquire_cookie(afs_cache_netfs.primary_index, + &afs_vlocation_cache_index_def, + cell); + +Then when a volume location was accessed, it would be entered into the cell's +index and an inode would be allocated that acts as a volume type and hash chain +combination: + + vlocation->cache = + fscache_acquire_cookie(cell->cache, + &afs_volume_cache_index_def, + vlocation); + +And then a particular flavour of volume (R/O for example) could be added to +that index, creating another index for vnodes (AFS inode equivalents): + + volume->cache = + fscache_acquire_cookie(vlocation->cache, + &afs_vnode_cache_index_def, + volume); + + +====================== +DATA FILE REGISTRATION +====================== + +The third step is to request a data file be created in the cache. This is +almost identical to index cookie acquisition. The only difference is that a +NULL index definition is passed. + + vnode->cache = + fscache_acquire_cookie(volume->cache, + NULL, + vnode); + + +===================== +PAGE ALLOC/READ/WRITE +===================== + +And the fourth step is to propose a page be cached. There are two functions +that are used to do this. 
+ +Firstly, the netfs should ask FS-Cache to examine the caches and read the +contents cached for a particular page of a particular file if present, or else +allocate space to store the contents if not: + + typedef + void (*fscache_rw_complete_t)(void *cookie_data, + struct page *page, + void *end_io_data, + int error); + + int fscache_read_or_alloc_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp); + +The cookie argument must specify a data file cookie, the page specified will +have the data loaded into it (and is also used to specify the page number), and +the gfp argument is used to control how any memory allocations made are satisfied. + +If the cookie indicates the inode is not cached: + + (1) The function will return -ENOBUFS. + +Else if there's a copy of the page resident on disc: + + (1) The function will submit a request to read the data off the disc directly + into the page specified. + + (2) The function will return 0. + + (3) When the read is complete, end_io_func() will be invoked with: + + (*) The netfs data supplied when the cookie was created. + + (*) The page descriptor. + + (*) The data passed to the above function. + + (*) An argument that's 0 on success or negative for an error. + + If an error occurs, it should be assumed that the page contains no usable + data. + +Otherwise, if there's not a copy available on disc: + + (1) A block may be allocated in the cache and attached to the inode at the + appropriate place. + + (2) The validity journal will be marked to indicate this page does not yet + contain valid data. + + (3) The function will return -ENODATA. 
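The three outcomes listed above can be told apart by the return value alone. The following is a minimal userspace sketch of that dispatch, not kernel code: the enum netfs_read_action and the helper netfs_classify_read_result() are hypothetical names introduced here for illustration, and treating any other negative value as a reason to fall back to the server is an assumption rather than something the text specifies.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names, for illustration only; none of this is part of FS-Cache. */
enum netfs_read_action {
	NETFS_WAIT_FOR_END_IO,	/* 0: read submitted; end_io_func() reports completion */
	NETFS_FETCH_THEN_WRITE,	/* -ENODATA: no copy on disc; fetch from the server,
				 * then offer the page to fscache_write_page() */
	NETFS_FETCH_UNCACHED,	/* -ENOBUFS: inode not cached; fetch from the server only */
	NETFS_FETCH_ON_ERROR,	/* assumption: any other error also falls back to the server */
};

/* Map the documented return values of fscache_read_or_alloc_page() to an action. */
static enum netfs_read_action netfs_classify_read_result(int ret)
{
	assert(ret <= 0);	/* the function returns 0 or a negative errno */

	switch (ret) {
	case 0:
		return NETFS_WAIT_FOR_END_IO;
	case -ENODATA:
		return NETFS_FETCH_THEN_WRITE;
	case -ENOBUFS:
		return NETFS_FETCH_UNCACHED;
	default:
		return NETFS_FETCH_ON_ERROR;
	}
}
```

A netfs readpage routine could switch on the result in this way immediately after the call, keeping the cache strictly optional: every branch other than the first ends in an ordinary read from the server.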
+ + +Secondly, if the netfs changes the contents of the page (either due to an +initial download or if a user performs a write), then the page should be +written back to the cache: + + int fscache_write_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp); + +The cookie argument must specify a data file cookie, the page specified should +contain the data to be written (and is also used to specify the page number), +and the gfp argument is used to control how any memory allocations made are +satisfied. + +If the cookie indicates the inode is not cached then: + + (1) The function will return -ENOBUFS. + +Else if there's a block allocated on disc to hold this page: + + (1) The function will submit a request to write the data to the disc directly + from the page specified. + + (2) The function will return 0. + + (3) When the write is complete: + + (a) Any associated validity journal entry will be cleared (the block now + contains valid data as far as FS-Cache is concerned). + + (b) end_io_func() will be invoked with: + + (*) The netfs data supplied when the cookie was created. + + (*) The page descriptor. + + (*) The data passed to the above function. + + (*) An argument that's 0 on success or negative for an error. + + If an error happens, it can be assumed that the page has been + discarded from the cache. + + +============== +PAGE UNCACHING +============== + +To uncache a page, this function should be called: + + void fscache_uncache_page(struct fscache_cookie *cookie, + struct page *page); + +This detaches the page specified from the data file indicated by the cookie and +unbinds it from the underlying block. + +Note that pages can't be explicitly detached from a data file. The whole +data file must be retired (see the relinquish cookie function below).
+ +Furthermore, note that this does not cancel the asynchronous read or write +operation started by the read/alloc and write functions. + + +========================== +INDEX AND DATA FILE UPDATE +========================== + +To request an update of the index data for an index or data file, the following +function should be called: + + void fscache_update_cookie(struct fscache_cookie *cookie); + +This function will refer back to the netfs_data pointer stored in the cookie by +the acquisition function to obtain the data to write into each revised index +entry. The update method in the parent index definition will be called to +transfer the data. + + +================================== +INDEX AND DATA FILE UNREGISTRATION +================================== + +To get rid of a cookie, this function should be called: + + void fscache_relinquish_cookie(struct fscache_cookie *cookie, + int retire); + +If retire is non-zero, then the index or file will be marked for recycling, and +all copies of it will be removed from all active caches in which it is present. + +If retire is zero, then the inode may be available again the next time the +acquisition function is called. + +One very important note - relinquish must NOT be called unless all "child" +indexes, files and pages have been relinquished first. + + +===================== +PAGE TOKEN MANAGEMENT +===================== + +As previously mentioned, the netfs must keep a token associated with each page +currently actively backed by the cache. This is used by FS-Cache to go from a +page to the internal representation of the underlying block and back again. It +is particularly important for managing the withdrawal of a cache whilst it is +in active service (eg: it got unmounted). + +The token is this: + + struct fscache_page { + ... + }; + +Note that all fields are for internal FS-Cache use only. + +The token only needs to be allocated when FS-Cache asks for it.
This it will do +by calling the get_page_cookie() method in the netfs definition ops table. Once +allocated, the same token should be presented every time the method is called +again for a particular page. + +The token should be retained by the netfs, and should be deleted only after the +page has been uncached. + +One way to achieve this is to attach the token to page->private (and set the +PG_private bit on the page) once allocated. Shortcut routines are provided by +FS-Cache to do this. Firstly, to retrieve if present and allocate if not: + + struct fscache_page *fscache_page_get_private(struct page *page, + unsigned gfp); + +Secondly to retrieve if present and BUG if not: + + static inline + struct fscache_page *fscache_page_grab_private(struct page *page); + +To clean up the tokens, the netfs inode hosting the page should be provided +with address space operations that circumvent the buffer-head operations for a +page. For instance: + + struct address_space_operations afs_fs_aops = { + ... 
+ .sync_page = block_sync_page, + .set_page_dirty = __set_page_dirty_nobuffers, + .releasepage = afs_file_releasepage, + .invalidatepage = afs_file_invalidatepage, + }; + + static int afs_file_invalidatepage(struct page *page, + unsigned long offset) + { + struct afs_vnode *vnode = AFS_FS_I(page->mapping->host); + int ret = 1; + + BUG_ON(!PageLocked(page)); + if (!PagePrivate(page)) + return 1; + fscache_uncache_page(vnode->cache,page); + if (offset == 0) + return 1; + BUG_ON(!PageLocked(page)); + if (PageWriteback(page)) + return 0; + return page->mapping->a_ops->releasepage(page, 0); + } + + static int afs_file_releasepage(struct page *page, int gfp_flags) + { + struct fscache_page *token; + struct afs_vnode *vnode = AFS_FS_I(page->mapping->host); + + if (PagePrivate(page)) { + fscache_uncache_page(vnode->cache, page); + token = (struct fscache_page *) page->private; + page->private = 0; + ClearPagePrivate(page); + if (token) + kfree(token); + } + return 0; + } + + +================================ +INDEX AND DATA FILE INVALIDATION +================================ + +There is no direct way to invalidate an index subtree or a data file. To do +this, the caller should relinquish and retire the cookie they have, and then +acquire a new one. 
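The relinquish-then-reacquire sequence can be sketched in userspace as follows. Everything here is a stand-in: struct model_cookie, model_acquire(), model_relinquish() and model_invalidate() are hypothetical names that only mimic the roles of fscache_acquire_cookie() and fscache_relinquish_cookie(). The real functions operate on kernel objects, take an index definition as well, and cannot be run outside the kernel.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model for illustration only; NOT the kernel's struct fscache_cookie. */
struct model_cookie {
	struct model_cookie *iparent;	/* index holding this entry */
	void *netfs_data;		/* back pointer to the netfs object */
	unsigned children;		/* cookies acquired beneath this one */
};

static unsigned model_retired_count;	/* how many cookies were retired */

static struct model_cookie *model_acquire(struct model_cookie *iparent,
					  void *netfs_data)
{
	struct model_cookie *cookie = calloc(1, sizeof(*cookie));

	if (!cookie)
		return NULL;	/* plays the part of a negative cookie */
	cookie->iparent = iparent;
	cookie->netfs_data = netfs_data;
	if (iparent)
		iparent->children++;
	return cookie;
}

static void model_relinquish(struct model_cookie *cookie, int retire)
{
	if (!cookie)
		return;
	/* children must have been relinquished before their parent */
	assert(cookie->children == 0);
	if (cookie->iparent)
		cookie->iparent->children--;
	if (retire)
		model_retired_count++;
	free(cookie);
}

/* Invalidate by retiring the old cookie and acquiring a fresh one in its place. */
static struct model_cookie *model_invalidate(struct model_cookie *cookie)
{
	struct model_cookie *iparent;
	void *netfs_data;

	if (!cookie)
		return NULL;
	iparent = cookie->iparent;
	netfs_data = cookie->netfs_data;
	model_relinquish(cookie, 1);
	return model_acquire(iparent, netfs_data);
}
```

The ordering is the point of the sketch: the parent and netfs_data are saved before the old cookie is retired, and the caller stores the returned cookie in place of the old one, much as kAFS stores its cookies in vnode->cache.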
diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/block.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/block.c --- linux-2.6.9-rc2-mm4/fs/cachefs/block.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/block.c 2004-09-30 20:08:55.000000000 +0100 @@ -40,12 +40,12 @@ */ static int cachefs_block_dummy_filler(void *data, struct page *page) { - struct cachefs_page *pageio; + struct fscache_page *pageio; _enter("%p,{%lu}", data, page->index); /* we need somewhere to note journal ACKs that need to be made */ - pageio = cachefs_page_get_private(page, GFP_KERNEL); + pageio = fscache_page_get_private(page, GFP_KERNEL); if (IS_ERR(pageio)) return PTR_ERR(pageio); @@ -67,7 +67,7 @@ int cachefs_block_set(struct cachefs_super *super, struct cachefs_block *block, struct page *page, - struct cachefs_page *pageio) + struct fscache_page *pageio) { DECLARE_WAITQUEUE(myself,current); @@ -137,7 +137,7 @@ int cachefs_block_set2(struct cachefs_super *super, cachefs_blockix_t bix, struct page *page, - struct cachefs_page *pageio, + struct fscache_page *pageio, struct cachefs_block **_block) { struct cachefs_block *block; @@ -369,7 +369,7 @@ /* duplicate the page if it's flagged copy-on-write */ if (test_bit(CACHEFS_BLOCK_COW, &block->flags)) { - struct cachefs_page *newpageio; + struct fscache_page *newpageio; mapping = super->imisc->i_mapping; @@ -378,7 +378,7 @@ if (!newpage) goto error; - if (cachefs_page_get_private(newpage, &newpageio, + if (fscache_page_get_private(newpage, &newpageio, mapping_gfp_mask(mapping)) < 0) goto error_page; @@ -614,15 +614,18 @@ /* * withdraw from active service all the blocks residing on a device */ -void cachefs_block_withdraw(struct cachefs_super *super) +void cachefs_block_dissociate(struct fscache_cache *cache) { struct cachefs_block *block, *xblock; - struct cachefs_page *pageio; + struct cachefs_super *super; + struct fscache_page *pageio; struct rb_node *node; unsigned long flags; DECLARE_WAITQUEUE(myself, current); + super = 
container_of(cache, struct cachefs_super, cache); + _enter(""); /* first thing to do is mark all blocks withdrawn @@ -705,4 +708,4 @@ _leave(""); -} /* end cachefs_block_withdraw() */ +} /* end cachefs_block_dissociate() */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/cachefs-int.h linux-2.6.9-rc2-mm4-fscache/fs/cachefs/cachefs-int.h --- linux-2.6.9-rc2-mm4/fs/cachefs/cachefs-int.h 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/cachefs-int.h 2004-10-04 16:15:28.481959199 +0100 @@ -12,7 +12,7 @@ #ifndef _LINUX_CACHEFS_INT_H #define _LINUX_CACHEFS_INT_H -#include <linux/cachefs.h> +#include <linux/fscache-cache.h> #include <linux/timer.h> #include <linux/bio.h> #include "cachefs-layout.h" @@ -28,9 +28,9 @@ struct cachefs_super; struct cachefs_block; struct cachefs_inode; -struct cachefs_search_result; struct cachefs_transaction; +extern struct fscache_cache_ops cachefs_cache_ops; extern struct address_space_operations cachefs_indr_io_addrspace_operations; extern struct address_space_operations cachefs_linear_io_addrspace_operations; extern struct file_operations cachefs_root_file_operations; @@ -46,7 +46,7 @@ extern int cachefs_io_dummy_filler(void *data, struct page *page); extern int cachefs_indr_io_get_block(struct inode *inode, struct page *page, - struct cachefs_page *pageio, int create); + struct fscache_page *pageio, int create); struct cachefs_reclaimable { unsigned ino; @@ -59,8 +59,8 @@ */ struct cachefs_super { + struct fscache_cache cache; /* cache handle */ struct super_block *sb; - struct list_head mnt_link; /* link in list of mounted caches */ struct cachefs_inode *imetadata; /* the metadata records file */ struct inode *imisc; /* an inode covering the whole blkdev */ @@ -70,15 +70,10 @@ #define CACHEFS_SUPER_DO_RECLAIM 2 /* T if should do reclamation */ #define CACHEFS_SUPER_RCM_IMM_SCAN 3 /* T if should scan for immediately * reclaimable inodes */ -#define CACHEFS_SUPER_WITHDRAWN 4 /* T if cache has been withdrawn */ 
-#define CACHEFS_SUPER_REPLAYING_UJNL 5 /* T if replaying u-journal */ +#define CACHEFS_SUPER_REPLAYING_UJNL 4 /* T if replaying u-journal */ int bio_wr_barrier; /* command to submit a write barrier BIO */ - /* index management */ - struct list_head ino_list; /* list of data/index inodes */ - spinlock_t ino_list_lock; - /* block allocation and recycling management */ struct rb_root blk_tree; /* block mapping tree */ rwlock_t blk_tree_lock; @@ -191,10 +186,6 @@ struct cachefs_ondisc_superblock *layout; }; -extern void cachefs_add_cache(struct cachefs_super *super, - struct cachefs_search_result *srch); -extern void cachefs_withdraw_cache(struct cachefs_super *super); - extern void cachefs_recycle_unready_blocks(struct cachefs_super *super); extern void cachefs_recycle_transfer_stack(struct cachefs_super *super); extern void cachefs_recycle_reclaim(struct cachefs_super *super); @@ -235,7 +226,7 @@ struct list_head batch_link; /* link in batch writer's list */ struct page *page; /* current data for this block */ struct page *writeback; /* source of writeback for this block */ - struct cachefs_page *ref; /* netfs's ref to this page */ + struct fscache_page *ref; /* netfs's ref to this page */ rwlock_t ref_lock; /* lock governing ref pointer */ struct cachefs_vj_entry *vjentry; /* invalid block record */ }; @@ -254,12 +245,12 @@ extern int cachefs_block_set(struct cachefs_super *super, struct cachefs_block *block, struct page *page, - struct cachefs_page *pageio); + struct fscache_page *pageio); extern int cachefs_block_set2(struct cachefs_super *super, cachefs_blockix_t bix, struct page *page, - struct cachefs_page *pageio, + struct fscache_page *pageio, struct cachefs_block **_block); extern int cachefs_block_read(struct cachefs_super *super, @@ -308,7 +299,7 @@ static inline struct cachefs_block *__cachefs_get_page_block(struct page *page) { BUG_ON(!PagePrivate(page)); - return ((struct cachefs_page *) page->private)->mapped_block; + return ((struct fscache_page *) 
page->private)->mapped_block; } static inline void cachefs_page_modify(struct cachefs_super *super, @@ -317,38 +308,10 @@ cachefs_block_modify(super, __cachefs_get_page_block(*page), page); } -extern void cachefs_block_withdraw(struct cachefs_super *super); +extern void cachefs_block_dissociate(struct fscache_cache *cache); -/*****************************************************************************/ -/* - * data file or index object cookie - * - a file will only appear in one cache - * - a request to cache a file may or may not be honoured, subject to - * constraints such as disc space - * - indexes files are created on disc just-in-time - */ -struct cachefs_cookie -{ - atomic_t usage; /* number of users of this cookie */ - atomic_t children; /* number of children of this cookie */ - struct cachefs_index_def *idef; /* index definition */ - struct cachefs_cookie *iparent; /* index holding this entry */ - struct list_head search_results; /* results of searching iparent */ - struct list_head backing_inodes; /* inode(s) backing this file/index */ - struct rw_semaphore sem; - struct cachefs_netfs *netfs; /* owner network fs definition */ - void *netfs_data; /* back pointer to netfs */ -}; - -struct cachefs_search_result { - struct list_head link; /* link in search_results */ - struct cachefs_super *super; /* superblock searched */ - unsigned ino; /* inode number (or 0 if negative) */ -}; - -extern kmem_cache_t *cachefs_cookie_jar; - -extern void cachefs_cookie_init_once(void *_cookie, kmem_cache_t *cachep, unsigned long flags); +#define cachefs_mapped_block(PGIO) ((struct cachefs_block *) (PGIO)->mapped_block) +#define cachefs_mapped_bix(PGIO) (((struct cachefs_block *) (PGIO)->mapped_block)->bix) /*****************************************************************************/ /* @@ -357,6 +320,7 @@ struct cachefs_inode { struct inode vfs_inode; /* VFS inode record for this file */ + struct fscache_node node; /* fscache handle */ struct cachefs_block *metadata; /* 
block containing metadata */ struct page *metadata_page; /* page mapped to metadata block */ @@ -366,16 +330,6 @@ unsigned short index_dsize; /* size of data in each index entry */ unsigned short index_esize; /* size of index entries */ unsigned short index_epp; /* number of index entries per page */ - - unsigned long flags; -#define CACHEFS_ACTIVE_INODE_ISINDEX 0 /* T if inode is index file (F if file) */ -#define CACHEFS_ACTIVE_INODE_RELEASING 1 /* T if inode is being released */ -#define CACHEFS_ACTIVE_INODE_RECYCLING 2 /* T if inode is being retired */ -#define CACHEFS_ACTIVE_INODE_WITHDRAWN 3 /* T if inode has been withdrawn */ - - struct list_head super_link; /* link in super->ino_list */ - struct list_head cookie_link; /* link in cookie->backing_inodes */ - struct cachefs_cookie *cookie; /* netfs's file/index object */ }; extern struct inode_operations cachefs_status_inode_operations; @@ -466,16 +420,16 @@ extern void cachefs_withdraw_inode(struct cachefs_inode *inode); -extern int cachefs_index_search(struct cachefs_inode *index, - struct cachefs_cookie *target, - unsigned *_entry, - unsigned *_ino); - -extern int cachefs_index_add(struct cachefs_inode *index, - struct cachefs_cookie *cookie, - unsigned *_newino); +extern int cachefs_index_search(struct fscache_node *node, + struct fscache_cookie *target, + struct fscache_search_result *result); + +extern int cachefs_index_add(struct fscache_node *node, + struct fscache_cookie *cookie, + struct fscache_search_result *result); -extern int cachefs_index_update(struct cachefs_inode *index); +extern int cachefs_index_update(struct fscache_node *ixnode, + struct fscache_node *node); extern int cachefs_index_reclaim_one_entry(struct cachefs_super *super, struct cachefs_transaction **_trans); @@ -591,7 +545,7 @@ static inline void cachefs_trans_affects_page(struct cachefs_transaction *trans, - struct cachefs_page *pageio, + struct fscache_page *pageio, unsigned offset, unsigned size) { @@ -614,7 +568,7 @@ { struct 
cachefs_super *super = trans->super; cachefs_trans_affects_page(trans, - cachefs_page_grab_private( + fscache_page_grab_private( virt_to_page(super->layout)), 0, super->sb->s_blocksize); diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/cachefs-layout.h linux-2.6.9-rc2-mm4-fscache/fs/cachefs/cachefs-layout.h --- linux-2.6.9-rc2-mm4/fs/cachefs/cachefs-layout.h 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/cachefs-layout.h 2004-09-30 19:18:08.000000000 +0100 @@ -143,17 +143,6 @@ /*****************************************************************************/ /* - * on-disc cached network filesystem definition record - * - each entry resides in its own sector - */ -struct cachefs_ondisc_fsdef -{ - uint8_t name[24]; /* name of netfs */ - uint32_t version; /* version of layout */ -}; - -/*****************************************************************************/ -/* * Free blocks are kept in pair of a very one sided trees (more horsetail * plants than trees) * diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/index.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/index.c --- linux-2.6.9-rc2-mm4/fs/cachefs/index.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/index.c 2004-10-04 15:09:45.926004147 +0100 @@ -20,17 +20,15 @@ */ #include <linux/module.h> -#include <linux/init.h> #include <linux/sched.h> -#include <linux/completion.h> #include <linux/slab.h> #include <linux/bio.h> #include <linux/circ_buf.h> #include "cachefs-int.h" -struct cachefs_index_search_record { - struct cachefs_cookie *index; - struct cachefs_cookie *target; +struct fscache_index_search_record { + struct fscache_cookie *index; + struct fscache_cookie *target; struct cachefs_inode *iinode; unsigned entsize; unsigned ino; @@ -42,7 +40,7 @@ * mark an inode/index entry pair for deletion when so requested by the match * function supplied by the netfs */ -static void cachefs_index_search_delete(struct cachefs_index_search_record *rec, +static void 
cachefs_index_search_delete(struct fscache_index_search_record *rec, struct page *ixpage, unsigned ixentry, unsigned ixoffset, @@ -69,7 +67,7 @@ return; } - BUG_ON(!list_empty(&inode->cookie_link)); + BUG_ON(!list_empty(&inode->node.cookie_link)); /* create a transaction to record the reclamation */ ret = -ENOMEM; @@ -87,7 +85,7 @@ trans->jentry->auxblock = inode->metadata->bix; trans->jentry->auxentry = inode->metadata_offset; - cachefs_trans_affects_page(trans, cachefs_page_grab_private(ixpage), + cachefs_trans_affects_page(trans, fscache_page_grab_private(ixpage), ixoffset, sizeof(*xent)); cachefs_trans_affects_inode(trans, inode); @@ -152,7 +150,7 @@ * mark an inode/index entry pair for deletion when so requested by the match * function supplied by the netfs */ -static void cachefs_index_search_update(struct cachefs_index_search_record *rec, +static void cachefs_index_search_update(struct fscache_index_search_record *rec, struct page *ixpage, unsigned ixentry, unsigned ixoffset, @@ -182,7 +180,7 @@ trans->jentry->entry = ixoffset; trans->jentry->count = rec->iinode->index_dsize; - cachefs_trans_affects_page(trans, cachefs_page_grab_private(ixpage), + cachefs_trans_affects_page(trans, fscache_page_grab_private(ixpage), ixoffset, sizeof(*xent)); /* have the netfs transcribe the update into the transaction */ @@ -225,14 +223,14 @@ unsigned long offset, unsigned long size) { - struct cachefs_index_search_record *rec; + struct fscache_index_search_record *rec; unsigned long stop, tmp, esize; void *content; int ret; _enter(",{%lu},%lu,%lu", page->index, offset, size); - rec = (struct cachefs_index_search_record *) desc->arg.buf; + rec = (struct fscache_index_search_record *) desc->arg.buf; ret = size; /* round up to the first record boundary after the offset */ @@ -257,7 +255,7 @@ for (; offset + esize <= stop; offset += esize) { struct cachefs_ondisc_index_entry *xent = content + offset; - cachefs_match_val_t result; + fscache_match_val_t result; unsigned ixentry; 
/* ignore invalid entries */ @@ -273,13 +271,13 @@ xent->u.data); switch (result) { - case CACHEFS_MATCH_SUCCESS_UPDATE: + case FSCACHE_MATCH_SUCCESS_UPDATE: /* the netfs said that it matched, but needs * updating */ cachefs_index_search_update(rec, page, ixentry, offset, xent->ino); - case CACHEFS_MATCH_SUCCESS: + case FSCACHE_MATCH_SUCCESS: /* the netfs said that it matched */ rec->entry = tmp; rec->ino = xent->ino; @@ -299,13 +297,13 @@ ret = 0; break; - case CACHEFS_MATCH_SUCCESS_DELETE: + case FSCACHE_MATCH_SUCCESS_DELETE: /* the netfs said that it matched, but this entry * should be marked obsolete */ cachefs_index_search_delete(rec, page, ixentry, offset, xent->ino); - case CACHEFS_MATCH_FAILED: + case FSCACHE_MATCH_FAILED: /* the netfs said there wasn't a valid match */ default: break; @@ -330,26 +328,23 @@ * - returns 0 if found, and stores the entry number in *_entry and the inode * number of the backing file in *_ino */ -int cachefs_index_search(struct cachefs_inode *index, - struct cachefs_cookie *target, - unsigned *_entry, - unsigned *_ino) +int cachefs_index_search(struct fscache_node *node, + struct fscache_cookie *target, + struct fscache_search_result *result) { - struct cachefs_index_search_record rec; + struct fscache_index_search_record rec; + struct cachefs_inode *index; struct file_ra_state ra; read_descriptor_t desc; loff_t pos; int ret; + index = container_of(node, struct cachefs_inode, node); + _enter("{%s,%lu,%Lu}", - index->cookie->idef->name, + index->node.cookie->idef->name, index->vfs_inode.i_ino, - i_size_read(index->vfs_inode)); - - if (_entry) - *_entry = UINT_MAX; - if (_ino) - *_ino = 0; + i_size_read(&index->vfs_inode)); ret = -ENOENT; if (i_size_read(&index->vfs_inode) == 0) @@ -357,7 +352,7 @@ /* prepare a record of what we want to do */ rec.iinode = index; - rec.index = index->cookie; + rec.index = index->node.cookie; rec.target = target; rec.entsize = rec.iinode->index_esize; rec.entry = UINT_MAX; @@ -388,11 +383,7 @@ else { 
/* we found an entry */ BUG_ON(rec.ino == 0); - - if (_entry) - *_entry = rec.entry; - if (_ino) - *_ino = rec.ino; + result->ino = rec.ino; ret = 0; } @@ -408,12 +399,12 @@ */ static int cachefs_index_preinit_page(void *data, struct page *page) { - struct cachefs_page *pageio; + struct fscache_page *pageio; _enter(",%p{%lu}", page, page->index); /* attach a mapping cookie to the page */ - pageio = cachefs_page_get_private(page, GFP_KERNEL); + pageio = fscache_page_get_private(page, GFP_KERNEL); if (IS_ERR(pageio)) { _leave(" = %ld", PTR_ERR(pageio)); return PTR_ERR(pageio); @@ -458,7 +449,7 @@ cachefs_metadata_postread(iinode, metadata); _debug("free entry: %u [size %Lu]", - newentry, i_size_read(iinode->vfs_inode)); + newentry, i_size_read(&iinode->vfs_inode)); /* extend the index file if there are no new entries */ if (newentry == UINT_MAX) { @@ -479,7 +470,7 @@ i_size_read(&iinode->vfs_inode) + PAGE_SIZE); ret = cachefs_indr_io_get_block(&iinode->vfs_inode, page, - cachefs_page_grab_private(page), + fscache_page_grab_private(page), 1); if (ret < 0) { i_size_write(&iinode->vfs_inode, @@ -561,24 +552,27 @@ * - if an inode is successfully allocated *_newino will be set with the inode * number */ -int cachefs_index_add(struct cachefs_inode *index, - struct cachefs_cookie *cookie, - unsigned *_newino) +int cachefs_index_add(struct fscache_node *node, + struct fscache_cookie *cookie, + struct fscache_search_result *result) +// unsigned *_newino) { struct cachefs_ondisc_index_entry *xent; struct cachefs_ondisc_ujnl_index *jindex; struct cachefs_ondisc_metadata *metadata; - struct cachefs_search_result *srch; struct cachefs_transaction *trans; struct cachefs_super *super; + struct cachefs_inode *index; struct page *inopage, *ixpage; unsigned ino, ixentry, offset, inonext, ixnext, ino_offset; int ret, loop; + index = container_of(node, struct cachefs_inode, node); + _enter("{%lu},{%s},", - index->vfs_inode.i_ino, index->cookie->idef->name); + index->vfs_inode.i_ino, 
index->node.cookie->idef->name); - *_newino = 0; +// *_newino = 0; super = index->vfs_inode.i_sb->s_fs_info; inopage = NULL; @@ -627,9 +621,9 @@ trans->jentry->upblock = index->metadata->bix; trans->jentry->upentry = index->metadata_offset; - cachefs_trans_affects_page(trans, cachefs_page_grab_private(ixpage), + cachefs_trans_affects_page(trans, fscache_page_grab_private(ixpage), offset, index->index_esize); - cachefs_trans_affects_page(trans, cachefs_page_grab_private(inopage), + cachefs_trans_affects_page(trans, fscache_page_grab_private(inopage), ino_offset, super->layout->metadata_size); cachefs_trans_affects_inode(trans, index); @@ -642,12 +636,12 @@ jindex->next_ino = inonext; jindex->next_index = ixnext; - index->cookie->idef->update(cookie->netfs_data, jindex->data); + index->node.cookie->idef->update(cookie->netfs_data, jindex->data); /* if we're adding a new index, we store its definition in the journal * too */ if (cookie->idef) { - struct cachefs_index_def *definition = cookie->idef; + struct fscache_index_def *definition = cookie->idef; jindex->def.dsize = definition->data_size; jindex->def.esize = definition->data_size; @@ -725,15 +719,8 @@ cachefs_trans_commit(trans); trans = NULL; - /* add the new inode to the cookie's list of search results */ - list_for_each_entry(srch, &cookie->search_results, link) { - if (srch->super == super) { - srch->ino = ino; - break; - } - } - - *_newino = ino; +// *_newino = ino; + result->ino = ino; error: cachefs_trans_put(trans); @@ -750,38 +737,29 @@ * update the index entry for an index or data file from the associated netfs * data */ -int cachefs_index_update(struct cachefs_inode *inode) +int cachefs_index_update(struct fscache_node *ixnode, + struct fscache_node *node) { struct cachefs_ondisc_index_entry *xent; struct cachefs_ondisc_metadata *meta; - struct cachefs_cookie *cookie = inode->cookie; + struct fscache_cookie *cookie = node->cookie; struct cachefs_super *super; - struct cachefs_inode *index; + struct 
cachefs_inode *index, *inode; struct cachefs_block *block; struct page *ixpage; unsigned offs; int ret; - _enter(""); + index = container_of(ixnode, struct cachefs_inode, node); + inode = container_of(node, struct cachefs_inode, node); + + _enter(","); super = inode->vfs_inode.i_sb->s_fs_info; - if (test_bit(CACHEFS_SUPER_WITHDRAWN, &super->flags)) + if (fscache_is_cache_withdrawn(&super->cache)) return 0; - /* the index entry for this inode lives in the parent index inode */ - list_for_each_entry(index, - &cookie->iparent->backing_inodes, - cookie_link) { - if (index->vfs_inode.i_sb == inode->vfs_inode.i_sb) - goto found_parent_index_inode; - } - - /* hmmm... the parent inode is strangely absent */ - BUG(); - return -ENOENT; - - found_parent_index_inode: /* find the entry number of this inode's index entry */ meta = cachefs_metadata_preread(inode); offs = meta->pindex_entry; diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/indirection-io.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/indirection-io.c --- linux-2.6.9-rc2-mm4/fs/cachefs/indirection-io.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/indirection-io.c 2004-09-30 17:17:21.000000000 +0100 @@ -36,7 +36,7 @@ struct cachefs_io_block_path { struct page *page; - struct cachefs_page *pageio; /* page => block mapping */ + struct fscache_page *pageio; /* page => block mapping */ cachefs_blockix_t bix; /* block number for this level */ unsigned offset; /* offset into parent pointer block */ @@ -84,7 +84,7 @@ unsigned nr_pages, sector_t *last_block_in_bio) { - struct cachefs_page *pageio; + struct fscache_page *pageio; struct inode *inode = page->mapping->host; sector_t last_block; int ret; @@ -92,7 +92,7 @@ _enter(""); /* get the page mapping cookie */ - pageio = cachefs_page_get_private(page, GFP_KERNEL); + pageio = fscache_page_get_private(page, GFP_KERNEL); if (IS_ERR(pageio)) { ret = PTR_ERR(pageio); goto error; @@ -128,7 +128,7 @@ */ if (!*_bio) goto allocate_new_bio; - else if 
(*last_block_in_bio + 1 != pageio->mapped_block->bix) + else if (*last_block_in_bio + 1 != cachefs_mapped_bix(pageio)) goto dispatch_bio; /* add the page to the current BIO */ @@ -138,12 +138,12 @@ /* dispatch the BIO immediately if the current page lives on an * indirection chain boundary */ - if (test_bit(CACHEFS_PAGE_BOUNDARY, &pageio->flags)) { + if (test_bit(FSCACHE_PAGE_BOUNDARY, &pageio->flags)) { submit_bio(READ, *_bio); *_bio = NULL; } else { - *last_block_in_bio = pageio->mapped_block->bix; + *last_block_in_bio = cachefs_mapped_bix(pageio); } _leave(" = 0"); @@ -154,7 +154,7 @@ submit_bio(READ, *_bio); allocate_new_bio: ret = cachefs_io_alloc(inode->i_sb, - pageio->mapped_block->bix, + cachefs_mapped_bix(pageio), nr_pages, GFP_KERNEL, _bio); if (ret < 0) { *_bio = NULL; @@ -168,8 +168,7 @@ */ hole: ret = -ENODATA; - if (test_bit(CACHEFS_ACTIVE_INODE_ISINDEX, - &CACHEFS_FS_I(inode)->flags)) { + if (test_bit(FSCACHE_NODE_ISINDEX, &CACHEFS_FS_I(inode)->node.flags)) { printk("CacheFS: found unexpected hole in index/metadata file:" " ino=%lu pg=%lu\n", inode->i_ino, page->index); @@ -395,7 +394,7 @@ &block, &step->page); if (ret < 0) goto error_block; - step->pageio = cachefs_page_grab_private(step->page); + step->pageio = fscache_page_grab_private(step->page); } else { ret = cachefs_block_set2(super, jentry->block, @@ -581,7 +580,7 @@ * index and must be initialised as part of the final journalling mark */ int cachefs_indr_io_get_block(struct inode *vfs_inode, struct page *page, - struct cachefs_page *pageio, int create) + struct fscache_page *pageio, int create) { struct cachefs_io_block_path path[4]; struct cachefs_inode *inode = CACHEFS_FS_I(vfs_inode); @@ -688,10 +687,10 @@ path[pix].offset += inode->metadata_offset; down_read(&inode->metadata_sem); - path[pix + 1].pageio = cachefs_page_grab_private(inode->metadata_page); + path[pix + 1].pageio = fscache_page_grab_private(inode->metadata_page); up_read(&inode->metadata_sem); - path[pix + 1].bix = path[pix 
+ 1].pageio->mapped_block->bix; + path[pix + 1].bix = cachefs_mapped_bix(path[pix + 1].pageio); ret = 0; for (; pix >= 0; pix--) { @@ -784,7 +783,7 @@ } if (!step->pageio) { - step->pageio = __cachefs_page_grab_private(step->page); + step->pageio = __fscache_page_grab_private(step->page); if (!step->pageio) { printk("step level %u" " { ptr={%lu}+%u / bix=%u }", @@ -812,21 +811,22 @@ return ret; } else if (path[0].flags & CACHEFS_BLOCK_INIT_NETFSDATA) { - set_bit(CACHEFS_BLOCK_NETFSDATA, &pageio->mapped_block->flags); + set_bit(CACHEFS_BLOCK_NETFSDATA, + &cachefs_mapped_block(pageio)->flags); } /* got the block - set the block offset in the page mapping record */ if (path[0].flags & CACHEFS_BLOCK_NEW) - set_bit(CACHEFS_PAGE_NEW, &pageio->flags); + set_bit(FSCACHE_PAGE_NEW, &pageio->flags); _debug("notboundary = %u", notboundary); if (!notboundary) - set_bit(CACHEFS_PAGE_BOUNDARY, &pageio->flags); + set_bit(FSCACHE_PAGE_BOUNDARY, &pageio->flags); _leave(" = 0 [bix=%u %c%c]", - pageio->mapped_block->bix, - test_bit(CACHEFS_PAGE_BOUNDARY, &pageio->flags) ? 'b' : '-', - test_bit(CACHEFS_PAGE_NEW, &pageio->flags) ? 'n' : '-' + cachefs_mapped_bix(pageio), + test_bit(FSCACHE_PAGE_BOUNDARY, &pageio->flags) ? 'b' : '-', + test_bit(FSCACHE_PAGE_NEW, &pageio->flags) ? 
'n' : '-' ); return 0; diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/inode.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/inode.c --- linux-2.6.9-rc2-mm4/fs/cachefs/inode.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/inode.c 2004-10-04 15:18:31.795339335 +0100 @@ -125,7 +125,7 @@ inode->index_esize = inode->index_dsize; inode->index_epp = PAGE_SIZE / inode->index_esize; - __set_bit(CACHEFS_ACTIVE_INODE_ISINDEX, &inode->flags); + __set_bit(FSCACHE_NODE_ISINDEX, &inode->node.flags); /* read the block containing this inode's meta-data from disc */ pos = inode->vfs_inode.i_ino << super->layout->metadata_bits; @@ -262,7 +262,7 @@ inode->vfs_inode.i_op = &cachefs_root_inode_operations; inode->vfs_inode.i_fop = &cachefs_root_file_operations; - __set_bit(CACHEFS_ACTIVE_INODE_ISINDEX, &inode->flags); + __set_bit(FSCACHE_NODE_ISINDEX, &inode->node.flags); } _leave(" = 0"); @@ -297,10 +297,12 @@ /* deal with an existing inode */ if (!(inode->vfs_inode.i_state & I_NEW)) { - _leave(" = 0 [exist]"); + _leave(" = %p [exist]", inode); return inode; } + inode->node.cache = &super->cache; + /* new inode - attempt to find in the on-disc catalogue */ switch (ino) { /* they've asked for the virtual inode that mirrors the @@ -346,7 +348,7 @@ /* success */ unlock_new_inode(&inode->vfs_inode); - _leave(" = %p", inode); + _leave(" = %p [new]", inode); return inode; /* failure */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/interface.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/interface.c --- linux-2.6.9-rc2-mm4/fs/cachefs/interface.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/interface.c 2004-10-04 17:06:22.775465935 +0100 @@ -1,6 +1,6 @@ -/* interface.c: network FS interface to cache +/* interface.c: filesystem cache interface * - * Copyright (C) 2003 Red Hat, Inc. All Rights Reserved. + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. 
* Written by David Howells (dhowells@xxxxxxxxxx) * * This program is free software; you can redistribute it and/or @@ -10,980 +10,120 @@ */ #include <linux/module.h> +#include <linux/sched.h> +#include <linux/slab.h> +#include <linux/bio.h> #include "cachefs-int.h" struct cachefs_io_end { - cachefs_rw_complete_t func; + fscache_rw_complete_t func; void *data; void *cookie_data; struct cachefs_block *block; }; -LIST_HEAD(cachefs_netfs_list); -LIST_HEAD(cachefs_cache_list); -DECLARE_RWSEM(cachefs_addremove_sem); - -kmem_cache_t *cachefs_cookie_jar; - -static cachefs_match_val_t cachefs_fsdef_index_match(void *target, - const void *entry); - -static void cachefs_fsdef_index_update(void *source, void *entry); - -static struct cachefs_index_def cachefs_fsdef_index_def = { - .name = ".fsdef", - .data_size = sizeof(struct cachefs_ondisc_fsdef), - .match = cachefs_fsdef_index_match, - .update = cachefs_fsdef_index_update -}; - -static struct cachefs_cookie cachefs_fsdef_index = { - .usage = ATOMIC_INIT(1), - .idef = &cachefs_fsdef_index_def, - .sem = __RWSEM_INITIALIZER(cachefs_fsdef_index.sem), - .search_results = LIST_HEAD_INIT(cachefs_fsdef_index.search_results), - .backing_inodes = LIST_HEAD_INIT(cachefs_fsdef_index.backing_inodes), -}; - -static void __cachefs_cookie_put(struct cachefs_cookie *cookie); -static inline void cachefs_cookie_put(struct cachefs_cookie *cookie) -{ - BUG_ON(atomic_read(&cookie->usage) <= 0); - - if (atomic_dec_and_test(&cookie->usage)) - __cachefs_cookie_put(cookie); - -} - -/*****************************************************************************/ -/* - * register a network filesystem for caching - */ -int __cachefs_register_netfs(struct cachefs_netfs *netfs, - struct cachefs_index_def *primary_idef) -{ - struct cachefs_netfs *ptr; - int ret; - - _enter("{%s}", netfs->name); - - INIT_LIST_HEAD(&netfs->link); - - /* allocate a cookie for the primary index */ - netfs->primary_index = - kmem_cache_alloc(cachefs_cookie_jar, SLAB_KERNEL); - 
- if (!netfs->primary_index) { - _leave(" = -ENOMEM"); - return -ENOMEM; - } - - /* initialise the primary index cookie */ - memset(netfs->primary_index, 0, sizeof(*netfs->primary_index)); - - atomic_set(&netfs->primary_index->usage, 1); - atomic_set(&netfs->primary_index->children, 0); - - netfs->primary_index->idef = primary_idef; - netfs->primary_index->iparent = &cachefs_fsdef_index; - netfs->primary_index->netfs = netfs; - netfs->primary_index->netfs_data = netfs; - - atomic_inc(&netfs->primary_index->iparent->usage); - atomic_inc(&netfs->primary_index->iparent->children); - - INIT_LIST_HEAD(&netfs->primary_index->search_results); - INIT_LIST_HEAD(&netfs->primary_index->backing_inodes); - init_rwsem(&netfs->primary_index->sem); - - /* check the netfs type is not already present */ - down_write(&cachefs_addremove_sem); - - ret = -EEXIST; - list_for_each_entry(ptr, &cachefs_netfs_list,link) { - if (strcmp(ptr->name, netfs->name) == 0) - goto already_registered; - } - - list_add(&netfs->link, &cachefs_netfs_list); - ret = 0; - - printk("CacheFS: netfs '%s' registered for caching\n", netfs->name); - - already_registered: - up_write(&cachefs_addremove_sem); - - if (ret < 0) { - kmem_cache_free(cachefs_cookie_jar, netfs->primary_index); - netfs->primary_index = NULL; - } - - _leave(" = %d", ret); - return ret; - -} /* end __cachefs_register_netfs() */ - -EXPORT_SYMBOL(__cachefs_register_netfs); - -/*****************************************************************************/ -/* - * unregister a network filesystem from the cache - * - all cookies must have been released first - */ -void __cachefs_unregister_netfs(struct cachefs_netfs *netfs) -{ - _enter("{%s.%u}", netfs->name, netfs->version); - - down_write(&cachefs_addremove_sem); - - list_del(&netfs->link); - cachefs_relinquish_cookie(netfs->primary_index, 0); - - up_write(&cachefs_addremove_sem); - - printk("CacheFS: netfs '%s' unregistered from caching\n", netfs->name); - - _leave(""); - -} /* end 
__cachefs_unregister_netfs() */ - -EXPORT_SYMBOL(__cachefs_unregister_netfs); - -/*****************************************************************************/ -/* - * declare a mounted cache as being open for business - * - try not to allocate memory as disposing of the superblock is a pain - */ -void cachefs_add_cache(struct cachefs_super *super, - struct cachefs_search_result *srch) -{ - struct cachefs_inode *ifsdef; - - _enter(""); - - /* prepare an active-inode record for the FSDEF index of this cache */ - ifsdef = cachefs_iget(super, CACHEFS_INO_FSDEF_CATALOGUE); - if (IS_ERR(ifsdef)) - /* there shouldn't be an error as FSDEF is the root dir of the - * FS and so should already be in core */ - BUG(); - - if (!cachefs_igrab(ifsdef)) - BUG(); - - ifsdef->cookie = &cachefs_fsdef_index; - - srch->super = super; - srch->ino = CACHEFS_INO_FSDEF_CATALOGUE; - - down_write(&cachefs_addremove_sem); - - /* add the superblock to the list */ - list_add(&super->mnt_link, &cachefs_cache_list); - - /* add the cache's netfs definition index inode to the superblock's - * list */ - spin_lock(&super->ino_list_lock); - list_add_tail(&ifsdef->super_link, &super->ino_list); - spin_unlock(&super->ino_list_lock); - - /* add the cache's netfs definition index inode to the top level index - * cookie as a known backing inode */ - down_write(&cachefs_fsdef_index.sem); - - list_add_tail(&srch->link, &cachefs_fsdef_index.search_results); - list_add_tail(&ifsdef->cookie_link, - &cachefs_fsdef_index.backing_inodes); - atomic_inc(&cachefs_fsdef_index.usage); - - up_write(&cachefs_fsdef_index.sem); - - up_write(&cachefs_addremove_sem); - - _leave(""); - -} /* end cachefs_add_cache() */ - /*****************************************************************************/ /* - * withdraw an unmounted cache from the active service + * look up the nominated node for this cache */ -void cachefs_withdraw_cache(struct cachefs_super *super) +static struct fscache_node *cachefs_lookup_node(struct 
fscache_cache *cache, + unsigned ino) { + struct cachefs_super *super; struct cachefs_inode *inode; - _enter(""); - - /* make the cache unavailable for cookie acquisition */ - set_bit(CACHEFS_SUPER_WITHDRAWN, &super->flags); - - down_write(&cachefs_addremove_sem); - list_del_init(&super->mnt_link); - up_write(&cachefs_addremove_sem); - - /* mark all inodes as being withdrawn */ - spin_lock(&super->ino_list_lock); - list_for_each_entry(inode, &super->ino_list, super_link) { - set_bit(CACHEFS_ACTIVE_INODE_WITHDRAWN, &inode->flags); - } - spin_unlock(&super->ino_list_lock); - - /* make sure all pages pinned by operations on behalf of the netfs are - * written to disc */ - cachefs_trans_sync(super, CACHEFS_TRANS_SYNC_WAIT_FOR_ACK); - - /* mark all active blocks as being withdrawn */ - cachefs_block_withdraw(super); - - /* we now have to destroy all the active inodes pertaining to this - * superblock */ - spin_lock(&super->ino_list_lock); - - while (!list_empty(&super->ino_list)) { - inode = list_entry(super->ino_list.next, struct cachefs_inode, - super_link); - list_del(&inode->super_link); - spin_unlock(&super->ino_list_lock); - - /* we've extracted an active inode from the tree - now dispose - * of it */ - cachefs_withdraw_inode(inode); - cachefs_iput(inode); - - spin_lock(&super->ino_list_lock); - } - - spin_unlock(&super->ino_list_lock); - - _leave(""); - -} /* end cachefs_withdraw_cache() */ - -/*****************************************************************************/ -/* - * withdraw an inode from active service - * - need break the links to a cached object cookie - * - called under two situations: - * (1) recycler decides to reclaim an in-use inode - * (2) a cache is unmounted - * - have to take care as the cookie can be being relinquished by the netfs - * simultaneously - * - the active inode is pinned by the caller holding a refcount on it - */ -void cachefs_withdraw_inode(struct cachefs_inode *inode) -{ - struct cachefs_search_result *srch; - struct 
cachefs_cookie *cookie, *xcookie = NULL; - - _enter("{ino=%lu cnt=%u}", - inode->vfs_inode.i_ino, atomic_read(&inode->vfs_inode.i_count)); - - /* first of all we have to break the links between the inode and the - * cookie - * - we have to hold both semaphores BUT we have to get the cookie sem - * FIRST - */ - down(&inode->vfs_inode.i_sem); - - cookie = inode->cookie; - if (cookie) { - /* pin the cookie so that is doesn't escape */ - atomic_inc(&cookie->usage); - - /* re-order the locks to avoid deadlock */ - up(&inode->vfs_inode.i_sem); - down_write(&cookie->sem); - down(&inode->vfs_inode.i_sem); - - /* erase references from the inode to the cookie */ - list_del_init(&inode->cookie_link); - - xcookie = inode->cookie; - inode->cookie = NULL; - - /* delete the search result record for this inode from the - * cookie's list */ - list_for_each_entry(srch, &cookie->search_results, link) { - if (srch->super == inode->vfs_inode.i_sb->s_fs_info) - break; - } - - list_del(&srch->link); - dbgfree(srch); - kfree(srch); - - up_write(&cookie->sem); - } - - up(&inode->vfs_inode.i_sem); - - /* we've broken the links between cookie and inode */ - if (xcookie) { - cachefs_cookie_put(xcookie); - cachefs_iput(inode); - } - - /* unpin the cookie */ - if (cookie) - cachefs_cookie_put(cookie); - - _leave(""); - -} /* end cachefs_withdraw_inode() */ - -/*****************************************************************************/ -/* - * search for representation of an object in its parent cache - * - the cookie must be locked by the caller - * - returns -ENODATA if the object or one of its ancestors doesn't exist - */ -static int cachefs_search_for_object(struct cachefs_cookie *cookie, - struct cachefs_super *super) -{ - struct cachefs_search_result *srch; - struct cachefs_cookie *iparent; - struct cachefs_inode *ipinode, *inode; - int ret; - - iparent = cookie->iparent; - if (!iparent) - return 0; /* FSDEF entries don't have a parent */ - - _enter("{%s/%s},", - iparent->idef->name, - 
cookie->idef ? (char *) cookie->idef->name : "<file>"); - - /* see if there's a search result for this object already */ - list_for_each_entry(srch, &cookie->search_results, link) { - _debug("check entry %p x %p [ino %u]", - cookie, super, srch->ino); - - if (srch->super == super) { - _debug("found entry"); - - if (srch->ino) { - _leave(" = 0 [found ino %u]", srch->ino); - return 0; - } - - /* entry is negative */ - _leave(" = -ENODATA"); - return -ENODATA; - } - } - - /* allocate an initially negative entry for this object */ - _debug("alloc entry %p x %p", cookie, super); - - srch = kmalloc(sizeof(*srch), GFP_KERNEL); - if (!srch) { - _leave(" = -ENOMEM"); - return -ENOMEM; - } - - memset(srch, 0, sizeof(*srch)); - - srch->super = super; - srch->ino = 0; - INIT_LIST_HEAD(&srch->link); - - /* we need see if there's an entry for this cache in this object's - * parent index, so the first thing to do is to see if the parent index - * is represented on disc - */ - down_read(&iparent->sem); - - ret = cachefs_search_for_object(iparent, super); - if (ret < 0) { - if (ret != -ENODATA) - goto error; - - /* set a negative entry */ - list_add_tail(&srch->link, &cookie->search_results); - goto done; - } - - /* find the parent's backing inode */ - list_for_each_entry(ipinode, &iparent->backing_inodes, cookie_link) { - if (ipinode->vfs_inode.i_sb->s_fs_info == super) - goto found_parent_entry; - } - - BUG(); - - found_parent_entry: - _debug("found_parent_entry"); - - /* search the parent index for a reference compatible with this - * object */ - ret = cachefs_index_search(ipinode, cookie, NULL, &srch->ino); - switch (ret) { - default: - goto error; - - case 0: - /* found - allocate an inode */ - inode = cachefs_iget(super, srch->ino); - if (IS_ERR(inode)) { - ret = PTR_ERR(inode); - goto error; - } - - down(&inode->vfs_inode.i_sem); + _enter("%p,%d", cache, ino); - BUG_ON(!list_empty(&inode->cookie_link)); - - /* attach the inode to the superblock's inode list */ - if 
(list_empty(&inode->super_link)) { - if (!cachefs_igrab(inode)) - goto igrab_failed_upput; - - spin_lock(&super->ino_list_lock); - list_add_tail(&inode->super_link, &super->ino_list); - spin_unlock(&super->ino_list_lock); - } - - /* attach the inode to the cookie */ - inode->cookie = cookie; - list_add_tail(&srch->link, &cookie->search_results); - list_add_tail(&inode->cookie_link, &cookie->backing_inodes); - atomic_inc(&cookie->usage); - - up(&inode->vfs_inode.i_sem); - break; - - case -ENOENT: - /* we can at least set a valid negative entry */ - list_add_tail(&srch->link, &cookie->search_results); - ret = -ENODATA; - break; - } - - done: - up_read(&iparent->sem); - _leave(" = %d", ret); - return ret; - - igrab_failed_upput: - up(&inode->vfs_inode.i_sem); - cachefs_iput(inode); - ret = -ENOENT; - error: - up_read(&iparent->sem); - dbgfree(srch); - kfree(srch); - _leave(" = %d", ret); - return ret; - -} /* end cachefs_search_for_object() */ - -/*****************************************************************************/ -/* - * instantiate the object in the specified cache - * - the cookie must be write-locked by the caller - * - search must have been performed first (so lists of search results are - * filled out) - * - all parent index objects are instantiated if necessary - */ -static int cachefs_instantiate_object(struct cachefs_cookie *cookie, - struct cachefs_super *super) -{ - struct cachefs_search_result *srch; - struct cachefs_cookie *iparent; - struct cachefs_inode *ipinode, *inode; - int ret; + super = container_of(cache, struct cachefs_super, cache); - iparent = cookie->iparent; - if (!iparent) - return 0; /* FSDEF entries don't have a parent */ - - _enter("{%s/%s},", - iparent->idef->name, - cookie->idef ? 
(char *) cookie->idef->name : "<file>"); - - /* find the search result for this object */ - list_for_each_entry(srch, &cookie->search_results, link) { - if (srch->super == super) - goto found_search_result; - } - - BUG(); - - found_search_result: - if (srch->ino) { - /* it was instantiated already */ - _leave(" = 0 [found ino %u]", srch->ino); - return 0; - } - - /* we need to insert an entry for this cache in the object's parent - * index, so the first thing to do is make sure that the parent index - * is represented on disc - */ - down_write(&iparent->sem); - - ret = cachefs_instantiate_object(iparent, super); - if (ret < 0) - goto error; - - /* the parent index's inode should now be available */ - list_for_each_entry(ipinode, &iparent->backing_inodes, cookie_link) { - if (ipinode->vfs_inode.i_sb->s_fs_info == super) - goto found_parent_inode; - } - - BUG(); - - found_parent_inode: - _debug("found_parent_inode: ino=%lu", ipinode->vfs_inode.i_ino); - - BUG_ON(ipinode->cookie != iparent); - - /* allocate an entry within the parent index inode */ - ret = cachefs_index_add(ipinode, cookie, &srch->ino); - if (ret < 0) - goto error; - - /* we're going to need an in-memory reflection of the inode too */ - inode = cachefs_iget(super, srch->ino); + inode = cachefs_iget(super, ino); if (IS_ERR(inode)) { - ret = PTR_ERR(inode); - goto error_x; /* uh-oh... 
our search record is now wrong */ - } - - /* keep track of it */ - down(&inode->vfs_inode.i_sem); - - BUG_ON(!list_empty(&inode->cookie_link)); - - /* attach to the superblock's inode list */ - if (list_empty(&inode->super_link)) { - if (!cachefs_igrab(inode)) - goto error_xi; - - spin_lock(&super->ino_list_lock); - list_add_tail(&inode->super_link, &super->ino_list); - spin_unlock(&super->ino_list_lock); - } - - /* attach to the cookie's search result list */ - inode->cookie = cookie; - list_add_tail(&inode->cookie_link, &cookie->backing_inodes); - atomic_inc(&cookie->usage); - - /* done */ - up(&inode->vfs_inode.i_sem); - up_write(&iparent->sem); - _leave(" = 0 [new]"); - return 0; - - /* if we get an error after having instantiated an inode on disc, just - * discard the search record so we find it next time */ - error_xi: - up(&inode->vfs_inode.i_sem); - cachefs_iput(inode); - ret = -ENOENT; - error_x: - list_del(&srch->link); - dbgfree(srch); - kfree(srch); - srch = NULL; - error: - up_write(&iparent->sem); - _leave(" = %d", ret); - return ret; - -} /* end cachefs_instantiate_object() */ - -/*****************************************************************************/ -/* - * select a cache on which to store a file - * - the cache addremove semaphore must be at least read-locked by the caller - */ -static struct cachefs_super *cachefs_select_cache_for_file(void) -{ - struct cachefs_super *super; - - _enter(""); - - /* TODO: make more intelligent than just choosing the first cache */ - super = NULL; - if (!list_empty(&cachefs_cache_list)) - super = list_entry(cachefs_cache_list.next, - struct cachefs_super, - mnt_link); - - _leave(" = %p", super); - return super; - -} /* end cachefs_select_cache_for_file() */ - -/*****************************************************************************/ -/* - * request a cookie to represent a data file or an index - * - iparent specifies the parent index to pin in memory - * - the top level index cookie for each netfs is 
stored in the cachefs_netfs - * struct upon registration - * - idef is NULL for a data file - * - idef points to the definition for an index - * - the netfs_data will be passed to the functions pointed to in *idef - * - all attached caches will be searched to see if they contain this object - * - index objects aren't stored on disc until there's a dependent file that - * needs storing - * - file objects are stored in a selected cache immediately, and all the - * indexes forming the path to it are instantiated if necessary - * - we never let on to the netfs about errors - * - we may set a negative cookie pointer, but that's okay - */ -struct cachefs_cookie *__cachefs_acquire_cookie(struct cachefs_cookie *iparent, - struct cachefs_index_def *idef, - void *netfs_data) -{ - struct cachefs_cookie *cookie; - struct cachefs_super *super; - int ret = 0; - - _enter("{%s},{%s},%p", - iparent ? (char *) iparent->idef->name : "<no-parent>", - idef ? (char *) idef->name : "<file>", - netfs_data); - - /* if it's going to be an index then validate the index data */ - if (idef) { - int dsize; - int loop; - - if (!idef->name[0]) { - printk("CacheFS: %s.%s.%p: nameless index\n", - iparent->netfs->name, - iparent->idef->name, - idef); - return CACHEFS_NEGATIVE_COOKIE; - } - - dsize = CACHEFS_ONDISC_UJNL_MIN_REC_SIZE - - sizeof(struct cachefs_ondisc_update_journal); - - if (idef->data_size > dsize) { - printk("CacheFS: %s.%s.%s:" - " index data size exceeds maximum %u>%d\n", - iparent->netfs->name, - iparent->idef->name, - idef->name, - idef->data_size, - dsize); - return CACHEFS_NEGATIVE_COOKIE; - } - - for (loop = 0; loop < 4; loop++) { - if (idef->keys[loop].type >= - CACHEFS_INDEX_KEYS__LAST) { - printk("CacheFS: %s.%s.%s:" - " index type %u unsupported\n", - iparent->netfs->name, - iparent->idef->name, - idef->name, - idef->keys[loop].type); - return CACHEFS_NEGATIVE_COOKIE; - } - - dsize -= idef->keys[loop].len; - if (dsize < 0) { - printk("CacheFS: %s.%s.%s:" - " index key size 
exceeds data size\n", - iparent->netfs->name, - iparent->idef->name, - idef->name); - return CACHEFS_NEGATIVE_COOKIE; - } - } - } - - /* if there's no parent cookie, then we don't create one here either */ - if (iparent == CACHEFS_NEGATIVE_COOKIE) { - _leave(" [no parent]"); - return CACHEFS_NEGATIVE_COOKIE; - } - - /* allocate and initialise a cookie */ - cookie = kmem_cache_alloc(cachefs_cookie_jar, SLAB_KERNEL); - if (!cookie) { - _leave(" [ENOMEM]"); - return CACHEFS_NEGATIVE_COOKIE; + _leave(" = %ld [error]", PTR_ERR(inode)); + return ERR_PTR(PTR_ERR(inode)); } - atomic_set(&cookie->usage, 1); - atomic_set(&cookie->children, 0); + _leave(" = %p", &inode->node); + return &inode->node; - atomic_inc(&iparent->usage); - atomic_inc(&iparent->children); - - cookie->idef = idef; - cookie->iparent = iparent; - cookie->netfs = iparent->netfs; - cookie->netfs_data = netfs_data; - - /* now we need to see whether the backing objects for this cookie yet - * exist, if not there'll be nothing to search */ - down_read(&cachefs_addremove_sem); - - if (list_empty(&cachefs_cache_list)) { - up_read(&cachefs_addremove_sem); - _leave(" [no caches]"); - return cookie; - } - - down_write(&cookie->sem); - - /* search every cache we know about to see if the object is already - * present */ - list_for_each_entry(super, &cachefs_cache_list, mnt_link) { - ret = cachefs_search_for_object(cookie, super); - switch (ret) { - case 0: - if (!cookie->idef) - break; /* only want the first file entry */ - case -ENODATA: - ret = 0; - continue; - default: - goto error; - } - } - - /* if the object is a cookie then we need do nothing more here - we - * create indexes on disc when we need them as an index may exist in - * multiple caches */ - if (cookie->idef) - goto done; - - /* the object is a file - we need to select a cache in which to store - * it */ - ret = -ENOMEDIUM; - super = cachefs_select_cache_for_file(); - if (!super) - goto error; /* couldn't decide on a cache */ - - /* create a file 
index entry on disc, along with all the indexes - * required to find it again later */ - ret = cachefs_instantiate_object(cookie, super); - if (ret == 0) - goto done; - - error: - printk("CacheFS: error from cache fs: %d\n", ret); - if (cookie) { - kmem_cache_free(cachefs_cookie_jar, cookie); - cookie = CACHEFS_NEGATIVE_COOKIE; - atomic_dec(&iparent->usage); - atomic_dec(&iparent->children); - } - - done: - up_write(&cookie->sem); - up_read(&cachefs_addremove_sem); - _leave(" = %p", cookie); - return cookie; - -} /* end __cachefs_acquire_cookie() */ - -EXPORT_SYMBOL(__cachefs_acquire_cookie); +} /* end cachefs_lookup_node() */ /*****************************************************************************/ /* - * release a cookie back to the cache - * - the object will be marked as recyclable on disc if retire is true - * - all dependents of this cookie must have already been unregistered - * (indexes/files/pages) + * increment the usage count on this inode (may fail if unmounting) */ -void __cachefs_relinquish_cookie(struct cachefs_cookie *cookie, int retire) +static struct fscache_node *cachefs_grab_node(struct fscache_node *node) { struct cachefs_inode *inode; + struct fscache_node *ret; - _enter("{%s},%d", - cookie && cookie->idef ? 
(char *) cookie->idef->name : "<file>", - retire); - - if (cookie == CACHEFS_NEGATIVE_COOKIE) { - _leave(" [no cookie]"); - return; - } + _enter("%p", node); - if (atomic_read(&cookie->children) != 0) { - printk("CacheFS: cookie still has children\n"); - BUG(); - } - - /* detach pointers back to netfs */ - down_write(&cookie->sem); - - cookie->netfs_data = NULL; - cookie->idef = NULL; - - /* queue retired objects for recycling */ - if (retire) { - list_for_each_entry(inode, - &cookie->backing_inodes, - cookie_link) { - set_bit(CACHEFS_ACTIVE_INODE_RECYCLING, &inode->flags); - } - } - - /* break links with all the active inodes */ - while (!list_empty(&cookie->backing_inodes)) { - inode = list_entry(cookie->backing_inodes.next, - struct cachefs_inode, - cookie_link); - - /* detach each cache inode from the object cookie */ - set_bit(CACHEFS_ACTIVE_INODE_RELEASING, &inode->flags); - - list_del_init(&inode->cookie_link); - - down(&inode->vfs_inode.i_sem); - inode->cookie = NULL; - up(&inode->vfs_inode.i_sem); - - if (atomic_dec_and_test(&cookie->usage)) - /* the cookie refcount shouldn't be reduced to 0 yet */ - BUG(); + inode = container_of(node, struct cachefs_inode, node); + inode = cachefs_igrab(inode); + ret = (inode ? &inode->node : NULL); - cachefs_iput(inode); - } - - up_write(&cookie->sem); - - if (cookie->iparent) - atomic_dec(&cookie->iparent->children); - - /* finally dispose of the cookie */ - cachefs_cookie_put(cookie); - - _leave(""); - -} /* end __cachefs_relinquish_cookie() */ + _leave(" = %p", ret); + return ret; -EXPORT_SYMBOL(__cachefs_relinquish_cookie); +} /* end cachefs_grab_node() */ /*****************************************************************************/ /* - * update the index entries backing a cookie + * lock a semaphore on a node */ -void __cachefs_update_cookie(struct cachefs_cookie *cookie) +static void cachefs_lock_node(struct fscache_node *node) { struct cachefs_inode *inode; - _enter("{%s}", - cookie && - cookie->idef ? 
(char *) cookie->idef->name : "<file>"); - - if (cookie == CACHEFS_NEGATIVE_COOKIE) { - _leave(" [no cookie]"); - return; - } - - down_read(&cookie->sem); - down_read(&cookie->iparent->sem); - - /* update the index entry on disc in each cache backing this cookie */ - list_for_each_entry(inode, &cookie->backing_inodes, cookie_link) { - cachefs_index_update(inode); - } - - up_read(&cookie->iparent->sem); - up_read(&cookie->sem); - _leave(""); - -} /* end __cachefs_update_cookie() */ - -EXPORT_SYMBOL(__cachefs_update_cookie); + _enter("%p", node); -/*****************************************************************************/ -/* - * see if the netfs definition matches - */ -static cachefs_match_val_t cachefs_fsdef_index_match(void *target, - const void *entry) -{ - const struct cachefs_ondisc_fsdef *fsdef = entry; - struct cachefs_netfs *netfs = target; - - _enter("%p,%p", target, entry); - - /* name and version must both match with what's on disc */ - _debug("{%s.%u},{%s.%u}", - netfs->name, netfs->version, fsdef->name, fsdef->version); - - if (strncmp(netfs->name, fsdef->name, sizeof(fsdef->name)) != 0) { - _leave(" = FAILED"); - return CACHEFS_MATCH_FAILED; - } - - if (netfs->version == fsdef->version) { - _leave(" = SUCCESS"); - return CACHEFS_MATCH_SUCCESS; - } - - /* an entry of the same name but different version is scheduled for - * deletion */ - _leave(" = SUCCESS_DELETE"); - return CACHEFS_MATCH_SUCCESS_DELETE; + inode = container_of(node, struct cachefs_inode, node); + down(&inode->vfs_inode.i_sem); -} /* end cachefs_fsdef_index_match() */ +} /* end cachefs_lock_node() */ /*****************************************************************************/ /* - * update the netfs definition to be stored on disc + * unlock a semaphore on a node */ -static void cachefs_fsdef_index_update(void *source, void *entry) +static void cachefs_unlock_node(struct fscache_node *node) { - struct cachefs_ondisc_fsdef *fsdef = entry; - struct cachefs_netfs *netfs = source; - - 
_enter("{%s.%u},", netfs->name, netfs->version); + struct cachefs_inode *inode; - /* install the netfs name and version in the top-level index entry */ - strncpy(fsdef->name, netfs->name, sizeof(fsdef->name)); + _enter("%p", node); - fsdef->version = netfs->version; + inode = container_of(node, struct cachefs_inode, node); + up(&inode->vfs_inode.i_sem); -} /* end cachefs_fsdef_index_update() */ +} /* end cachefs_unlock_node() */ /*****************************************************************************/ /* - * destroy a cookie + * dispose of a reference to a node */ -static void __cachefs_cookie_put(struct cachefs_cookie *cookie) +static void cachefs_put_node(struct fscache_node *node) { - _enter(""); - - if (cookie->iparent) - cachefs_cookie_put(cookie->iparent); - - kmem_cache_free(cachefs_cookie_jar, cookie); + _enter("%p", node); - _leave(""); + if (node) + cachefs_iput(container_of(node, struct cachefs_inode, node)); -} /* end __cachefs_cookie_put() */ +} /* end cachefs_put_node() */ /*****************************************************************************/ /* - * initialise an cookie jar slab element prior to any use + * sync a cache */ -void cachefs_cookie_init_once(void *_cookie, kmem_cache_t *cachep, - unsigned long flags) +static void cachefs_sync(struct fscache_cache *cache) { - struct cachefs_cookie *cookie = _cookie; + _enter("%p", cache); - if ((flags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) == - SLAB_CTOR_CONSTRUCTOR) { - memset(cookie, 0, sizeof(*cookie)); - INIT_LIST_HEAD(&cookie->search_results); - INIT_LIST_HEAD(&cookie->backing_inodes); - init_rwsem(&cookie->sem); - } + /* make sure all pages pinned by operations on behalf of the netfs are + * written to disc */ + cachefs_trans_sync(container_of(cache, struct cachefs_super, cache), + CACHEFS_TRANS_SYNC_WAIT_FOR_ACK); -} /* end cachefs_cookie_init_once() */ +} /* end cachefs_sync() */ /*****************************************************************************/ /* @@ -1024,9 +164,7 
@@ /*****************************************************************************/ /* * read a page from the cache or allocate a block in which to store it - * - if the cookie is not backed by a file: - * - -ENOBUFS will be returned and nothing more will be done - * - else if the page is backed by a block in the cache: + * - if the page is backed by a block in the cache: * - a read will be started which will call end_io_func on completion * - the wb-journal will be searched for an entry pertaining to this block * - if an entry is found: @@ -1038,44 +176,22 @@ * - the v-journal will be marked to note the block contains invalid data * - -ENODATA will be returned */ -int __cachefs_read_or_alloc_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp) +static int cachefs_read_or_alloc_page(struct fscache_node *node, + struct page *page, + struct fscache_page *pageio, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) { struct cachefs_io_end *end_io = NULL; struct cachefs_inode *inode; - struct cachefs_block *block; - struct cachefs_page *pageio; + struct cachefs_block *block = NULL; struct bio *bio = NULL; int ret; - _enter("%p,{%lu},", cookie, page->index); - - if (cookie == CACHEFS_NEGATIVE_COOKIE) { - _leave(" -ENOBUFS [no cookie]"); - return -ENOBUFS; /* no actual cookie */ - } - - BUG_ON(cookie->idef); /* not supposed to use this for indexes */ - - /* get the cache-cookie for this page */ - pageio = cookie->netfs->ops->get_page_cookie(page); - if (IS_ERR(pageio)) { - _leave(" = %ld", PTR_ERR(pageio)); - return PTR_ERR(pageio); - } - - /* prevent the file from being uncached whilst we access it */ - block = NULL; - down_read(&cookie->sem); + _enter(""); - /* if there's no disc space whatsoever backing this file, then leave - * now */ - ret = -ENOBUFS; - if (list_empty(&cookie->backing_inodes)) - goto error; + inode = container_of(node, struct cachefs_inode, 
node); /* handle the case of there already being a mapping, * - must protect against cache removal @@ -1084,22 +200,14 @@ read_lock(&pageio->lock); block = pageio->mapped_block; - if (block && !test_bit(CACHEFS_SUPER_WITHDRAWN, &block->super->flags)) + if (block && !fscache_is_cache_withdrawn(&block->super->cache)) goto available_on_disc; /* already mapped */ read_unlock(&pageio->lock); block = NULL; /* we don't know of a backing page, but there may be one recorded on - * disc... and if there isn't we'll request one be allocated */ - _debug("igrab"); - inode = cachefs_igrab(list_entry(cookie->backing_inodes.next, - struct cachefs_inode, - cookie_link)); - ret = -ENOBUFS; - if (!inode) - goto error; - + * disc... and if there isn't we'll request that one be allocated */ _debug("get block"); down(&inode->vfs_inode.i_sem); @@ -1109,13 +217,13 @@ if (ret < 0) goto error_i; - if (!test_and_clear_bit(CACHEFS_PAGE_NEW, &pageio->flags)) { + if (!test_and_clear_bit(FSCACHE_PAGE_NEW, &pageio->flags)) { /* there was data - pin the block underlying it and read */ read_lock(&pageio->lock); block = pageio->mapped_block; if (block && - !test_bit(CACHEFS_SUPER_WITHDRAWN, &block->super->flags)) + !fscache_is_cache_withdrawn(&block->super->cache)) goto available_on_disc_i; /* it went out of service for some reason */ @@ -1127,15 +235,13 @@ /* we allocated a new block, but didn't assign any data to it */ up(&inode->vfs_inode.i_sem); - cachefs_iput(inode); /* point the mapped block at its referencer */ - write_lock(&pageio->mapped_block->ref_lock); - pageio->mapped_block->ref = pageio; - write_unlock(&pageio->mapped_block->ref_lock); + write_lock(&cachefs_mapped_block(pageio)->ref_lock); + cachefs_mapped_block(pageio)->ref = pageio; + write_unlock(&cachefs_mapped_block(pageio)->ref_lock); - _debug("no data [bix=%u ref=%p]", pageio->mapped_block->bix, pageio); - up_read(&cookie->sem); + _debug("no data [bix=%u ref=%p]", cachefs_mapped_bix(pageio), pageio); /* tell the caller we've 
allocated a block, but we don't have any data * for them */ @@ -1147,7 +253,6 @@ available_on_disc_i: _debug("available_i"); up(&inode->vfs_inode.i_sem); - cachefs_iput(inode); available_on_disc: _debug("available"); @@ -1166,7 +271,7 @@ end_io->func = end_io_func; end_io->data = end_io_data; - end_io->cookie_data = cookie->netfs_data; + end_io->cookie_data = node->cookie->netfs_data; end_io->block = block; /* dispatch an operation to the block device */ @@ -1187,7 +292,6 @@ submit_bio(READ, bio); _debug("done"); - up_read(&cookie->sem); /* point the mapped block at its referencer */ write_lock(&block->ref_lock); @@ -1205,10 +309,8 @@ error_i: _debug("error_i"); up(&inode->vfs_inode.i_sem); - cachefs_iput(inode); error: _debug("error"); - up_read(&cookie->sem); cachefs_block_put(block); if (bio) bio_put(bio); @@ -1219,9 +321,7 @@ _leave(" = %d", ret); return ret; -} /* end __cachefs_read_or_alloc_page() */ - -EXPORT_SYMBOL(__cachefs_read_or_alloc_page); +} /* end cachefs_read_or_alloc_page() */ /*****************************************************************************/ /* @@ -1282,41 +382,25 @@ * be erased * - returns 0 */ -int __cachefs_write_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp) +static int cachefs_write_page(struct fscache_node *node, + struct page *page, + struct fscache_page *pageio, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) { struct cachefs_io_end *end_io = NULL; struct cachefs_block *block; - struct cachefs_page *pageio; struct bio *bio = NULL; int ret; - _enter("%p,{%lu},", cookie, page->index); - - if (cookie == CACHEFS_NEGATIVE_COOKIE) { - _leave(" -ENOBUFS [no cookie]"); - return -ENOBUFS; /* no actual cookie */ - } - - BUG_ON(cookie->idef); /* not supposed to use this for indexes */ - - /* get the cache-cookie for this page */ - pageio = cookie->netfs->ops->get_page_cookie(page); - if (IS_ERR(pageio)) { - _leave(" = 
%ld", PTR_ERR(pageio)); - return PTR_ERR(pageio); - } + _enter(""); - /* prevent the file from been uncached whilst we deal with it */ - down_read(&cookie->sem); read_lock(&pageio->lock); /* only write if there's somewhere to write to */ - block = pageio->mapped_block; - if (!block || test_bit(CACHEFS_SUPER_WITHDRAWN, &block->super->flags)) + block = cachefs_mapped_block(pageio); + if (!block || fscache_is_cache_withdrawn(&block->super->cache)) goto no_block; /* pin the block and drop the lock */ @@ -1334,7 +418,7 @@ end_io->func = end_io_func; end_io->data = end_io_data; - end_io->cookie_data = cookie->netfs_data; + end_io->cookie_data = node->cookie->netfs_data; end_io->block = block; /* dispatch an operation to the block device */ @@ -1355,11 +439,10 @@ if (!bio_add_page(bio, page, PAGE_SIZE, 0)) BUG(); - //dump_bio(bio,1); + //dump_bio(bio, 1); submit_bio(WRITE, bio); /* tell the caller it's in progress */ - up_read(&cookie->sem); _leave(" = 0"); return 0; @@ -1368,7 +451,6 @@ clear_bit(CACHEFS_BLOCK_NETFSBUSY, &block->flags); wake_up(&block->writewq); cachefs_block_put(block); - up_read(&cookie->sem); if (bio) bio_put(bio); if (end_io) { @@ -1381,40 +463,23 @@ /* tell the caller there wasn't a block to write into */ no_block: read_unlock(&pageio->lock); - up_read(&cookie->sem); _leave(" = -ENOBUFS"); return -ENOBUFS; -} /* end __cachefs_write_page() */ - -EXPORT_SYMBOL(__cachefs_write_page); +} /* end cachefs_write_page() */ /*****************************************************************************/ /* - * remove a page from the cache + * detach a backing block from a page * - if the block backing the page still has a vjentry then the block will be * recycled */ -void __cachefs_uncache_page(struct cachefs_cookie *cookie, struct page *page) +static void cachefs_uncache_page(struct fscache_node *node, + struct fscache_page *pageio) { struct cachefs_block *block, *xblock; - struct cachefs_page *pageio; - - _enter(",{%lu}", page->index); - - if (cookie == 
CACHEFS_NEGATIVE_COOKIE) { - _leave(" [no cookie]"); - return; - } - BUG_ON(cookie->idef); /* not supposed to use this for indexes */ - - /* get the cache-cookie for this page */ - pageio = cookie->netfs->ops->get_page_cookie(page); - if (IS_ERR(pageio)) { - _leave(" [get_page_cookie() = %ld]", PTR_ERR(pageio)); - return; - } + _enter(""); /* un-cross-link the page cookie and the block */ xblock = NULL; @@ -1448,8 +513,22 @@ } _leave(""); - return; -} /* end __cachefs_uncache_page() */ +} /* end cachefs_uncache_page() */ -EXPORT_SYMBOL(__cachefs_uncache_page); +struct fscache_cache_ops cachefs_cache_ops = { + .name = "cachefs", + .lookup_node = cachefs_lookup_node, + .grab_node = cachefs_grab_node, + .lock_node = cachefs_lock_node, + .unlock_node = cachefs_unlock_node, + .put_node = cachefs_put_node, + .index_search = cachefs_index_search, + .index_add = cachefs_index_add, + .index_update = cachefs_index_update, + .sync = cachefs_sync, + .dissociate_pages = cachefs_block_dissociate, + .read_or_alloc_page = cachefs_read_or_alloc_page, + .write_page = cachefs_write_page, + .uncache_page = cachefs_uncache_page, +}; diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/journal.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/journal.c --- linux-2.6.9-rc2-mm4/fs/cachefs/journal.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/journal.c 2004-10-04 15:18:28.975573470 +0100 @@ -430,7 +430,7 @@ offset = (trans->index << super->sb->s_blocksize_bits) & ~PAGE_MASK; jentry = kmap_atomic(trans->jpage, KM_USER0) + offset; memcpy(jentry, trans->jentry, super->sb->s_blocksize); - kunmap_atomic(trans->jpage, KM_USER0); + kunmap_atomic(jentry, KM_USER0); SetPageWriteback(trans->jpage); @@ -1678,11 +1678,11 @@ unsigned int bytes_done, int error) { - kenter("%p{%lx},%u,%d", bio, bio->bi_flags, bytes_done, error); + _enter("%p{%lx},%u,%d", bio, bio->bi_flags, bytes_done, error); /* we're only interested in completion */ if (bio->bi_size > 0) { - kleave(" = 1"); + _leave(" = 
1"); return 1; } @@ -1690,7 +1690,7 @@ end_page_writeback(bio->bi_io_vec[0].bv_page); bio_put(bio); - kleave(" = 0"); + _leave(" = 0"); return 0; } /* end cachefs_trans_ack_written() */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/linear-io.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/linear-io.c --- linux-2.6.9-rc2-mm4/fs/cachefs/linear-io.c 2004-09-27 11:23:55.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/linear-io.c 2004-09-30 17:18:42.000000000 +0100 @@ -57,14 +57,14 @@ cachefs_blockix_t *last_block_in_bio) { struct cachefs_block *block; - struct cachefs_page *pageio; + struct fscache_page *pageio; struct inode *inode = page->mapping->host; int ret; _enter(""); /* get the page mapping cookie */ - pageio = cachefs_page_get_private(page, GFP_KERNEL); + pageio = fscache_page_get_private(page, GFP_KERNEL); if (IS_ERR(pageio)) { ret = PTR_ERR(pageio); goto error; @@ -176,7 +176,7 @@ int cachefs_linear_io_readpage(struct file *file, struct page *page) { struct cachefs_block *block; - struct cachefs_page *pageio; + struct fscache_page *pageio; struct inode *inode = page->mapping->host; struct bio *bio; int ret; @@ -184,7 +184,7 @@ _enter(",{%lu}", page->index); /* get the page mapping cookie */ - pageio = cachefs_page_get_private(page, GFP_KERNEL); + pageio = fscache_page_get_private(page, GFP_KERNEL); if (IS_ERR(pageio)) { _leave(" = %ld [pgp]", PTR_ERR(pageio)); return PTR_ERR(pageio); diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/main.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/main.c --- linux-2.6.9-rc2-mm4/fs/cachefs/main.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/main.c 2004-09-30 17:19:20.000000000 +0100 @@ -36,21 +36,8 @@ { int ret; - /* create ourselves a cookie jar and a block jar */ + /* create a block jar */ ret = -ENOMEM; - cachefs_cookie_jar = - kmem_cache_create("cachefs_cookie_jar", - sizeof(struct cachefs_cookie), - 0, - SLAB_HWCACHE_ALIGN, - cachefs_cookie_init_once, - NULL); - if (!cachefs_cookie_jar) { - 
printk(KERN_NOTICE - "CacheFS: Failed to allocate a cookie jar\n"); - goto error; - } - cachefs_block_jar = kmem_cache_create("cachefs_block_jar", sizeof(struct cachefs_block), @@ -61,7 +48,7 @@ if (!cachefs_block_jar) { printk(KERN_NOTICE "CacheFS: Failed to allocate a block jar\n"); - goto error_cookie_jar; + goto error; } /* initialise the filesystem */ @@ -75,8 +62,6 @@ error_block_jar: kmem_cache_destroy(cachefs_block_jar); - error_cookie_jar: - kmem_cache_destroy(cachefs_cookie_jar); error: printk(KERN_ERR "CacheFS: failed to register: %d\n", ret); return ret; @@ -92,7 +77,6 @@ cachefs_fs_exit(); kmem_cache_destroy(cachefs_block_jar); - kmem_cache_destroy(cachefs_cookie_jar); } /* end cachefs_exit() */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/misc.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/misc.c --- linux-2.6.9-rc2-mm4/fs/cachefs/misc.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/misc.c 2004-09-30 19:06:08.000000000 +0100 @@ -30,10 +30,10 @@ * get a page caching token from for a page, allocating it and attaching it to * the page's private pointer if it doesn't exist */ -struct cachefs_page * __cachefs_page_get_private(struct page *page, +struct fscache_page * __cachefs_page_get_private(struct page *page, unsigned gfp_flags) { - struct cachefs_page *pageio = (struct cachefs_page *) page->private; + struct fscache_page *pageio = (struct fscache_page *) page->private; if (!pageio) { pageio = kmalloc(sizeof(*pageio), gfp_flags); @@ -145,7 +145,7 @@ */ int cachefs_invalidatepage(struct page *page, unsigned long offset) { - struct cachefs_page *pageio; + struct fscache_page *pageio; int ret = 1; _enter("{%lu},%lu", page->index, offset); @@ -153,7 +153,7 @@ BUG_ON(!PageLocked(page)); if (PagePrivate(page)) { - pageio = (struct cachefs_page *) page->private; + pageio = (struct fscache_page *) page->private; pageio->flags = 0; /* we release page attachments only if the entire page is being @@ -179,14 +179,14 @@ int 
cachefs_releasepage(struct page *page, int gfp_flags) { struct cachefs_block *block; - struct cachefs_page *pageio; + struct fscache_page *pageio; _enter("{%lu},%x", page->index, gfp_flags); /* detach the page mapping cookie and mapped block */ if (PagePrivate(page)) { /* detach the mapped block from the page if there is one */ - pageio = (struct cachefs_page *) page->private; + pageio = (struct fscache_page *) page->private; page->private = 0; ClearPagePrivate(page); diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/recycling.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/recycling.c --- linux-2.6.9-rc2-mm4/fs/cachefs/recycling.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/recycling.c 2004-09-30 17:37:41.000000000 +0100 @@ -561,7 +561,7 @@ cachefs_metadata_postread(iinode, metadata); cachefs_trans_affects_page(trans, - cachefs_page_grab_private(ixpage), + fscache_page_grab_private(ixpage), trans->jentry->entry, trans->jentry->count); diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/replay.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/replay.c --- linux-2.6.9-rc2-mm4/fs/cachefs/replay.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/replay.c 2004-10-04 13:33:04.681820540 +0100 @@ -1638,7 +1638,7 @@ cachefs_trans_replays_effect(trans, ptrblock, "ptr"); } - kunmap_atomic(ptrpage, KM_USER0); + kunmap_atomic(pbix, KM_USER0); } /* make sure the vjournal entry is cleared */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/rootdir.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/rootdir.c --- linux-2.6.9-rc2-mm4/fs/cachefs/rootdir.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/rootdir.c 2004-09-30 17:48:40.000000000 +0100 @@ -722,7 +722,7 @@ trans->jentry->block = __cachefs_get_page_block(ixpage)->bix; cachefs_trans_affects_inode(trans, inode); - cachefs_trans_affects_page(trans, cachefs_page_grab_private(ixpage), + cachefs_trans_affects_page(trans, fscache_page_grab_private(ixpage), trans->jentry->entry, 
sizeof(*xent)); /* write the transaction mark to the journal */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/super.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/super.c --- linux-2.6.9-rc2-mm4/fs/cachefs/super.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/super.c 2004-10-04 17:20:14.014693842 +0100 @@ -133,6 +133,13 @@ sb = get_sb_bdev(fs_type, flags, dev_name, options, cachefs_fill_super); + _debug("backing nodes %p: %p,%p -> %p,%p", + &fscache_fsdef_index.backing_nodes, + fscache_fsdef_index.backing_nodes.next, + fscache_fsdef_index.backing_nodes.prev, + fscache_fsdef_index.backing_nodes.next->next, + fscache_fsdef_index.backing_nodes.next->prev); + _leave(" = %p", sb); return sb; @@ -222,7 +229,6 @@ */ static int cachefs_fill_super(struct super_block *sb, void *_data, int silent) { - struct cachefs_search_result *srch = NULL; struct cachefs_super *super = NULL; struct cachefs_inode *inode = NULL, *inode2; struct dentry *root = NULL; @@ -267,10 +273,6 @@ super->vjnl_count = CACHEFS_ONDISC_VJNL_ENTS; - srch = kmalloc(sizeof(*srch), GFP_KERNEL); - if (!srch) - goto error; - /* initialise the superblock */ sb->s_magic = CACHEFS_FS_MAGIC; sb->s_op = &cachefs_super_ops; @@ -278,10 +280,13 @@ super->sb = sb; super->ujnl_step = bdev_hardsect_size(super->sb->s_bdev); - INIT_LIST_HEAD(&super->mnt_link); - - INIT_LIST_HEAD(&super->ino_list); - spin_lock_init(&super->ino_list_lock); + fscache_init_cache(&super->cache, + &cachefs_cache_ops, + CACHEFS_INO_FSDEF_CATALOGUE, + "%02x:%02x", + MAJOR(sb->s_dev), + MINOR(sb->s_dev) + ); rwlock_init(&super->blk_tree_lock); @@ -455,17 +460,12 @@ goto error; } - cachefs_add_cache((struct cachefs_super *) sb->s_fs_info, srch); + fscache_add_cache(&super->cache); _leave(" = 0 [super=%p]", super); return 0; error: - if (srch) { - dbgfree(srch); - kfree(srch); - } - if (super) { if (super->dmn_task) { super->dmn_die = 1; @@ -628,7 +628,7 @@ metadata->mtime = CURRENT_TIME.tv_sec; metadata->atime = 
CURRENT_TIME.tv_sec; - metadata->index.dsize = sizeof(struct cachefs_ondisc_fsdef); + metadata->index.dsize = sizeof(struct fscache_fsdef_index_entry); metadata->index.esize = sizeof(struct cachefs_ondisc_index_entry); metadata->index.esize += metadata->index.dsize; metadata->index.keys[0] = CACHEFS_ONDISC_INDEXKEY_ASCIIZ | 24; @@ -805,7 +805,7 @@ BUG_ON(!super); /* detach the cache from all cookies that reference it */ - cachefs_withdraw_cache(super); + fscache_withdraw_cache(&super->cache); /* wait for validity journalling to be sorted */ if (!list_empty(&super->vjnl_unallocq) || @@ -897,14 +897,14 @@ { struct cachefs_inode *inode = _inode; + _enter("%p,,1", _inode); + if ((flags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) == SLAB_CTOR_CONSTRUCTOR) { memset(inode, 0, sizeof(*inode)); inode_init_once(&inode->vfs_inode); init_rwsem(&inode->metadata_sem); - - INIT_LIST_HEAD(&inode->cookie_link); - INIT_LIST_HEAD(&inode->super_link); + fscache_node_init(&inode->node); } } /* end cachefs_i_init_once() */ @@ -922,6 +922,7 @@ if (!inode) return NULL; + _leave(" = %p", &inode->vfs_inode); return &inode->vfs_inode; } /* end cachefs_alloc_inode() */ diff -uNr linux-2.6.9-rc2-mm4/fs/cachefs/vjournal.c linux-2.6.9-rc2-mm4-fscache/fs/cachefs/vjournal.c --- linux-2.6.9-rc2-mm4/fs/cachefs/vjournal.c 2004-09-27 11:23:56.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/cachefs/vjournal.c 2004-10-04 13:34:40.114592740 +0100 @@ -307,7 +307,7 @@ ptr = kmap_atomic(vjentry->vpage, KM_USER0); memset(ptr + vjentry->ventry, 0, sizeof(struct cachefs_ondisc_validity_journal)); - kunmap_atomic(vjentry->vpage, KM_USER0); + kunmap_atomic(ptr, KM_USER0); /* queue the transaction to be written to disc */ cachefs_trans_commit(trans); @@ -380,7 +380,7 @@ struct cachefs_ondisc_validity_journal *vjmark; struct cachefs_vj_entry *vjentry; struct cachefs_super *super = (struct cachefs_super *) desc->arg.buf; - struct cachefs_page *pageio; + struct fscache_page *pageio; unsigned long stop; void 
*data; int ret; @@ -395,7 +395,7 @@ stop = offset + size; - pageio = cachefs_page_grab_private(page); + pageio = fscache_page_grab_private(page); cachefs_block_set(super, pageio->mapped_block, page, pageio); data = kmap(page); @@ -483,7 +483,7 @@ /* validate it */ ret = -EINVAL; - if (inode->flags & CACHEFS_ACTIVE_INODE_ISINDEX) { + if (inode->node.flags & FSCACHE_NODE_ISINDEX) { printk("CacheFS: Index inode %x has block in v-journal\n", vjentry->ino); goto error2; @@ -606,10 +606,10 @@ /* get the block number for this level */ if (!step->bix) { - u8 *data = kmap(step[1].page); + u8 *data = kmap_atomic(step[1].page, KM_USER0); step->bix = *(cachefs_blockix_t *)(data + step->offset); - kunmap(step[1].page); + kunmap_atomic(data, KM_USER0); } /* allocate this block if necessary */ diff -uNr linux-2.6.9-rc2-mm4/fs/fscache/cookie.c linux-2.6.9-rc2-mm4-fscache/fs/fscache/cookie.c --- linux-2.6.9-rc2-mm4/fs/fscache/cookie.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/fscache/cookie.c 2004-10-04 15:10:07.165240574 +0100 @@ -0,0 +1,1000 @@ +/* cookie.c: general filesystem cache cookie management + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include <linux/module.h> +#include "fscache-int.h" + +LIST_HEAD(fscache_netfs_list); +LIST_HEAD(fscache_cache_list); +DECLARE_RWSEM(fscache_addremove_sem); + +kmem_cache_t *fscache_cookie_jar; + +static void fscache_withdraw_node(struct fscache_cache *cache, + struct fscache_node *node); + +/*****************************************************************************/ +/* + * register a network filesystem for caching + */ +int __fscache_register_netfs(struct fscache_netfs *netfs, + struct fscache_index_def *primary_idef) +{ + struct fscache_netfs *ptr; + int ret; + + _enter("{%s}", netfs->name); + + INIT_LIST_HEAD(&netfs->link); + + /* allocate a cookie for the primary index */ + netfs->primary_index = + kmem_cache_alloc(fscache_cookie_jar, SLAB_KERNEL); + + if (!netfs->primary_index) { + _leave(" = -ENOMEM"); + return -ENOMEM; + } + + /* initialise the primary index cookie */ + memset(netfs->primary_index, 0, sizeof(*netfs->primary_index)); + + atomic_set(&netfs->primary_index->usage, 1); + atomic_set(&netfs->primary_index->children, 0); + + netfs->primary_index->idef = primary_idef; + netfs->primary_index->iparent = &fscache_fsdef_index; + netfs->primary_index->netfs = netfs; + netfs->primary_index->netfs_data = netfs; + + atomic_inc(&netfs->primary_index->iparent->usage); + atomic_inc(&netfs->primary_index->iparent->children); + + rwlock_init(&netfs->primary_index->lock); + init_rwsem(&netfs->primary_index->sem); + INIT_LIST_HEAD(&netfs->primary_index->search_results); + INIT_LIST_HEAD(&netfs->primary_index->backing_nodes); + + /* check the netfs type is not already present */ + down_write(&fscache_addremove_sem); + + ret = -EEXIST; + list_for_each_entry(ptr, &fscache_netfs_list, link) { + if (strcmp(ptr->name, netfs->name) == 0) + goto already_registered; + } + + list_add(&netfs->link, &fscache_netfs_list); + ret = 0; + + printk("Fscache: netfs '%s' registered for caching\n", netfs->name); + + already_registered: + up_write(&fscache_addremove_sem); + 
+ if (ret < 0) { + netfs->primary_index->iparent = NULL; + __fscache_cookie_put(netfs->primary_index); + netfs->primary_index = NULL; + } + + _leave(" = %d", ret); + return ret; + +} /* end __fscache_register_netfs() */ + +EXPORT_SYMBOL(__fscache_register_netfs); + +/*****************************************************************************/ +/* + * unregister a network filesystem from the cache + * - all cookies must have been released first + */ +void __fscache_unregister_netfs(struct fscache_netfs *netfs) +{ + _enter("{%s.%u}", netfs->name, netfs->version); + + down_write(&fscache_addremove_sem); + + list_del(&netfs->link); + fscache_relinquish_cookie(netfs->primary_index, 0); + + up_write(&fscache_addremove_sem); + + printk("Fscache: netfs '%s' unregistered from caching\n", netfs->name); + + _leave(""); + +} /* end __fscache_unregister_netfs() */ + +EXPORT_SYMBOL(__fscache_unregister_netfs); + +/*****************************************************************************/ +/* + * initialise a cache record + */ +void fscache_init_cache(struct fscache_cache *cache, + struct fscache_cache_ops *ops, + unsigned fsdef_ino, + const char *idfmt, + ...) 
+{ + va_list va; + + memset(cache, 0, sizeof(*cache)); + + cache->ops = ops; + + va_start(va, idfmt); + vsnprintf(cache->identifier, sizeof(cache->identifier), idfmt, va); + va_end(va); + + INIT_LIST_HEAD(&cache->link); + INIT_LIST_HEAD(&cache->node_list); + spin_lock_init(&cache->node_list_lock); + + INIT_LIST_HEAD(&cache->fsdef_srch.link); + cache->fsdef_srch.cache = cache; + cache->fsdef_srch.ino = fsdef_ino; + +} /* end fscache_init_cache() */ + +EXPORT_SYMBOL(fscache_init_cache); + +/*****************************************************************************/ +/* + * declare a mounted cache as being open for business + */ +void fscache_add_cache(struct fscache_cache *cache) +{ + struct fscache_node *ifsdef; + + BUG_ON(!cache->ops); + + _enter("{%s.%s}", cache->ops->name, cache->identifier); + + /* prepare an active-node record for the FSDEF index of this cache */ + ifsdef = cache->ops->lookup_node(cache, cache->fsdef_srch.ino); + BUG_ON(IS_ERR(ifsdef)); /* there shouldn't be an error as FSDEF is the + * root dir of the FS and so should already be + * in core */ + + if (!cache->ops->grab_node(ifsdef)) + BUG(); + + ifsdef->cookie = &fscache_fsdef_index; + + down_write(&fscache_addremove_sem); + + /* add the cache to the list */ + list_add(&cache->link, &fscache_cache_list); + + /* add the cache's netfs definition index node to the cache's + * list */ + spin_lock(&cache->node_list_lock); + list_add_tail(&ifsdef->cache_link, &cache->node_list); + spin_unlock(&cache->node_list_lock); + + /* add the cache's netfs definition index node to the top level index + * cookie as a known backing node */ + down_write(&fscache_fsdef_index.sem); + + list_add_tail(&cache->fsdef_srch.link, + &fscache_fsdef_index.search_results); + list_add_tail(&ifsdef->cookie_link, + &fscache_fsdef_index.backing_nodes); + + atomic_inc(&fscache_fsdef_index.usage); + + /* done */ + up_write(&fscache_fsdef_index.sem); + up_write(&fscache_addremove_sem); + _leave(""); + +} /* end 
fscache_add_cache() */ + +EXPORT_SYMBOL(fscache_add_cache); + +/*****************************************************************************/ +/* + * withdraw an unmounted cache from the active service + */ +void fscache_withdraw_cache(struct fscache_cache *cache) +{ + struct fscache_node *node; + + _enter(""); + + /* make the cache unavailable for cookie acquisition */ + set_bit(FSCACHE_CACHE_WITHDRAWN, &cache->flags); + + down_write(&fscache_addremove_sem); + list_del_init(&cache->link); + up_write(&fscache_addremove_sem); + + /* mark all nodes as being withdrawn */ + spin_lock(&cache->node_list_lock); + list_for_each_entry(node, &cache->node_list, cache_link) { + set_bit(FSCACHE_NODE_WITHDRAWN, &node->flags); + } + spin_unlock(&cache->node_list_lock); + + /* make sure all pages pinned by operations on behalf of the netfs are + * written to disc */ + cache->ops->sync(cache); + + /* dissociate all the netfs pages backed by this cache from the block + * mappings in the cache */ + cache->ops->dissociate_pages(cache); + + /* we now have to destroy all the active nodes pertaining to this + * cache */ + spin_lock(&cache->node_list_lock); + + while (!list_empty(&cache->node_list)) { + node = list_entry(cache->node_list.next, struct fscache_node, + cache_link); + list_del(&node->cache_link); + spin_unlock(&cache->node_list_lock); + + /* we've extracted an active node from the tree - now dispose + * of it */ + fscache_withdraw_node(cache, node); + cache->ops->put_node(node); + + spin_lock(&cache->node_list_lock); + } + + spin_unlock(&cache->node_list_lock); + + _leave(""); + +} /* end fscache_withdraw_cache() */ + +EXPORT_SYMBOL(fscache_withdraw_cache); + +/*****************************************************************************/ +/* + * withdraw a node from active service + * - need to break the links to a cached object cookie + * - called under two situations: + * (1) recycler decides to reclaim an in-use node + * (2) a cache is unmounted + * - have to take care as 
the cookie can be being relinquished by the netfs + * simultaneously + * - the active node is pinned by the caller holding a refcount on it + */ +static void fscache_withdraw_node(struct fscache_cache *cache, + struct fscache_node *node) +{ + struct fscache_search_result *srch; + struct fscache_cookie *cookie, *xcookie = NULL; + + _enter(""); + + /* first of all we have to break the links between the node and the + * cookie + * - we have to hold both semaphores BUT we have to get the cookie sem + * FIRST + */ + cache->ops->lock_node(node); + + cookie = node->cookie; + if (cookie) { + /* pin the cookie so that it doesn't escape */ + atomic_inc(&cookie->usage); + + /* re-order the locks to avoid deadlock */ + cache->ops->unlock_node(node); + down_write(&cookie->sem); + cache->ops->lock_node(node); + + /* erase references from the node to the cookie */ + list_del_init(&node->cookie_link); + + xcookie = node->cookie; + node->cookie = NULL; + + /* delete the search result record for this node from the + * cookie's list */ + list_for_each_entry(srch, &cookie->search_results, link) { + if (srch->cache == cache) + goto found_record; + } + BUG(); + + found_record: + list_del_init(&srch->link); + + if (srch != &cache->fsdef_srch) { + dbgfree(srch); + kfree(srch); + } + + up_write(&cookie->sem); + } + + cache->ops->unlock_node(node); + + /* we've broken the links between cookie and node */ + _debug("broken links"); + + if (xcookie) { + fscache_cookie_put(xcookie); + cache->ops->put_node(node); + } + + /* unpin the cookie */ + if (cookie) + fscache_cookie_put(cookie); + + _leave(""); + +} /* end fscache_withdraw_node() */ + +/*****************************************************************************/ +/* + * search for representation of an object in its parent cache + * - the cookie must be locked by the caller + * - returns -ENODATA if the object or one of its ancestors doesn't exist + */ +static int fscache_search_for_object(struct fscache_cookie *cookie, + struct 
fscache_cache *cache) +{ + struct fscache_search_result *srch; + struct fscache_cookie *iparent; + struct fscache_node *ipnode, *node; + int ret; + + iparent = cookie->iparent; + if (!iparent) { + /* FSDEF entries don't have a parent */ + _enter("{.fsdef},%s.%s", + cache->ops->name, cache->identifier); + BUG_ON(list_empty(&cookie->backing_nodes)); + BUG_ON(list_empty(&cookie->search_results)); + _leave(" = 0 [.fsdef]"); + return 0; + } + + _enter("{%s/%s},%s.%s", + iparent->idef->name, + cookie->idef ? (char *) cookie->idef->name : "<file>", + cache->ops->name, cache->identifier); + + /* see if there's a search result for this object already */ + list_for_each_entry(srch, &cookie->search_results, link) { + _debug("check entry %p x %p [ino %u]", + cookie, cache, srch->ino); + + if (srch->cache == cache) { + _debug("found entry"); + + if (srch->ino) { + _leave(" = 0 [found ino %u]", srch->ino); + return 0; + } + + /* entry is negative */ + _leave(" = -ENODATA"); + return -ENODATA; + } + } + + /* allocate an initially negative entry for this object */ + _debug("alloc entry %p x %p", cookie, cache); + + srch = kmalloc(sizeof(*srch), GFP_KERNEL); + if (!srch) { + _leave(" = -ENOMEM"); + return -ENOMEM; + } + + srch->cache = cache; + srch->ino = 0; + INIT_LIST_HEAD(&srch->link); + + /* we need to see if there's an entry for this cache in this object's + * parent index, so the first thing to do is to see if the parent index + * is represented on disc + */ + down_read(&iparent->sem); + + _debug("backing nodes %p: %p,%p -> %p,%p", + &iparent->backing_nodes, + iparent->backing_nodes.next, + iparent->backing_nodes.prev, + iparent->backing_nodes.next->next, + iparent->backing_nodes.next->prev); + + ret = fscache_search_for_object(iparent, cache); + if (ret < 0) { + if (ret != -ENODATA) + goto error; + + /* set a negative entry */ + list_add_tail(&srch->link, &cookie->search_results); + goto done; + } + + /* find the parent's backing node */ + _debug("X backing nodes %p: %p,%p -> 
%p,%p", + &iparent->backing_nodes, + iparent->backing_nodes.next, + iparent->backing_nodes.prev, + iparent->backing_nodes.next->next, + iparent->backing_nodes.next->prev + ); + + _debug("X search results %p: %p,%p -> %p,%p", + &iparent->search_results, + iparent->search_results.next, + iparent->search_results.prev, + iparent->search_results.next->next, + iparent->search_results.next->prev); + + read_lock(&iparent->lock); + list_for_each_entry(ipnode, &iparent->backing_nodes, cookie_link) { + _debug("bnode %p -> %p", ipnode, ipnode->cache); + + if (ipnode->cache == cache) + goto found_parent_entry; + } + + BUG(); + + found_parent_entry: + read_unlock(&iparent->lock); + _debug("found_parent_entry"); + + /* search the parent index for a reference compatible with this + * object */ + ret = cache->ops->index_search(ipnode, cookie, srch); + switch (ret) { + default: + goto error; + + case 0: + /* found - allocate a node */ + node = cache->ops->lookup_node(cache, srch->ino); + if (IS_ERR(node)) { + ret = PTR_ERR(node); + goto error; + } + + cache->ops->lock_node(node); + + BUG_ON(!list_empty(&node->cookie_link)); + + /* attach the node to the cache's node list */ + if (list_empty(&node->cache_link)) { + if (!cache->ops->grab_node(node)) + goto igrab_failed_upput; + + spin_lock(&cache->node_list_lock); + list_add_tail(&node->cache_link, &cache->node_list); + spin_unlock(&cache->node_list_lock); + } + + /* attach the node to the cookie */ + node->cookie = cookie; + atomic_inc(&cookie->usage); + + write_lock(&iparent->lock); + list_add_tail(&srch->link, &cookie->search_results); + list_add_tail(&node->cookie_link, &cookie->backing_nodes); + write_unlock(&iparent->lock); + + cache->ops->unlock_node(node); + break; + + case -ENOENT: + /* we can at least set a valid negative entry */ + list_add_tail(&srch->link, &cookie->search_results); + ret = -ENODATA; + break; + } + + done: + up_read(&iparent->sem); + _leave(" = %d", ret); + return ret; + + igrab_failed_upput: + 
cache->ops->unlock_node(node); + cache->ops->put_node(node); + ret = -ENOENT; + error: + up_read(&iparent->sem); + dbgfree(srch); + kfree(srch); + _leave(" = %d", ret); + return ret; + +} /* end fscache_search_for_object() */ + +/*****************************************************************************/ +/* + * instantiate the object in the specified cache + * - the cookie must be write-locked by the caller + * - search must have been performed first (so lists of search results are + * filled out) + * - all parent index objects are instantiated if necessary + */ +static int fscache_instantiate_object(struct fscache_cookie *cookie, + struct fscache_cache *cache) +{ + struct fscache_search_result *srch; + struct fscache_cookie *iparent; + struct fscache_node *ipnode, *node; + int ret; + + iparent = cookie->iparent; + if (!iparent) + return 0; /* FSDEF entries don't have a parent */ + + _enter("{%s/%s},", + iparent->idef->name, + cookie->idef ? (char *) cookie->idef->name : "<file>"); + + /* find the search result for this object */ + list_for_each_entry(srch, &cookie->search_results, link) { + if (srch->cache == cache) + goto found_search_result; + } + + BUG(); + + found_search_result: + if (srch->ino) { + /* it was instantiated already */ + _leave(" = 0 [found ino %u]", srch->ino); + return 0; + } + + /* we need to insert an entry for this cache in the object's parent + * index, so the first thing to do is make sure that the parent index + * is represented on disc + */ + down_write(&iparent->sem); + + ret = fscache_instantiate_object(iparent, cache); + if (ret < 0) + goto error; + + /* the parent index's node should now be available */ + list_for_each_entry(ipnode, &iparent->backing_nodes, cookie_link) { + if (ipnode->cache == cache) + goto found_parent_node; + } + + BUG(); + + found_parent_node: + _debug("found_parent_node: node=%p", ipnode); + + BUG_ON(ipnode->cookie != iparent); + + /* allocate an entry within the parent index node */ + ret = 
cache->ops->index_add(ipnode, cookie, srch); + if (ret < 0) + goto error; + + /* we're going to need an in-memory reflection of the node too */ + node = cache->ops->lookup_node(cache, srch->ino); + if (IS_ERR(node)) { + ret = PTR_ERR(node); + goto error_x; /* uh-oh... our search record is now wrong */ + } + + /* keep track of it */ + cache->ops->lock_node(node); + + BUG_ON(!list_empty(&node->cookie_link)); + + /* attach to the cache's node list */ + if (list_empty(&node->cache_link)) { + if (!cache->ops->grab_node(node)) + goto error_xi; + + spin_lock(&cache->node_list_lock); + list_add_tail(&node->cache_link, &cache->node_list); + spin_unlock(&cache->node_list_lock); + } + + /* attach to the cookie's backing node list */ + node->cookie = cookie; + atomic_inc(&cookie->usage); + list_add_tail(&node->cookie_link, &cookie->backing_nodes); + + /* done */ + cache->ops->unlock_node(node); + up_write(&iparent->sem); + _leave(" = 0 [new]"); + return 0; + + /* if we get an error after having instantiated a node on disc, just + * discard the search record so we find it next time */ + error_xi: + cache->ops->unlock_node(node); + cache->ops->put_node(node); + ret = -ENOENT; + error_x: + list_del(&srch->link); + dbgfree(srch); + kfree(srch); + srch = NULL; + error: + up_write(&iparent->sem); + _leave(" = %d", ret); + return ret; + +} /* end fscache_instantiate_object() */ + +/*****************************************************************************/ +/* + * select a cache on which to store a file + * - the cache addremove semaphore must be at least read-locked by the caller + */ +static struct fscache_cache *fscache_select_cache_for_file(void) +{ + struct fscache_cache *cache; + + _enter(""); + + /* TODO: make more intelligent than just choosing the first cache */ + cache = NULL; + if (!list_empty(&fscache_cache_list)) + cache = list_entry(fscache_cache_list.next, + struct fscache_cache, + link); + + _leave(" = %p", cache); + return cache; + +} /* end 
fscache_select_cache_for_file() */ + +/*****************************************************************************/ +/* + * request a cookie to represent a data file or an index + * - iparent specifies the parent index to pin in memory + * - the top level index cookie for each netfs is stored in the fscache_netfs + * struct upon registration + * - idef is NULL for a data file + * - idef points to the definition for an index + * - the netfs_data will be passed to the functions pointed to in *idef + * - all attached caches will be searched to see if they contain this object + * - index objects aren't stored on disc until there's a dependent file that + * needs storing + * - file objects are stored in a selected cache immediately, and all the + * indexes forming the path to it are instantiated if necessary + * - we never let on to the netfs about errors + * - we may set a negative cookie pointer, but that's okay + */ +struct fscache_cookie *__fscache_acquire_cookie(struct fscache_cookie *iparent, + struct fscache_index_def *idef, + void *netfs_data) +{ + struct fscache_cookie *cookie; + struct fscache_cache *cache; + int ret = 0; + + _enter("{%s},{%s},%p", + iparent ? (char *) iparent->idef->name : "<no-parent>", + idef ? 
(char *) idef->name : "<file>", + netfs_data); + + _debug("backing nodes %p: %p,%p -> %p,%p", + &fscache_fsdef_index.backing_nodes, + fscache_fsdef_index.backing_nodes.next, + fscache_fsdef_index.backing_nodes.prev, + fscache_fsdef_index.backing_nodes.next->next, + fscache_fsdef_index.backing_nodes.next->prev); + + /* if there's no parent cookie, then we don't create one here either */ + if (iparent == FSCACHE_NEGATIVE_COOKIE) { + _leave(" [no parent]"); + return FSCACHE_NEGATIVE_COOKIE; + } + + /* if it's going to be an index then validate the index data */ + if (idef) { + size_t dsize; + int loop; + + if (!idef->name[0]) { + printk("Fscache: %s.%s.%p: nameless index\n", + iparent->netfs->name, + iparent->idef->name, + idef); + return FSCACHE_NEGATIVE_COOKIE; + } + + dsize = idef->data_size; + + for (loop = 0; loop < 4; loop++) { + if (idef->keys[loop].type >= + FSCACHE_INDEX_KEYS__LAST) { + printk("Fscache: %s.%s.%s:" + " index type %u unsupported\n", + iparent->netfs->name, + iparent->idef->name, + idef->name, + idef->keys[loop].type); + return FSCACHE_NEGATIVE_COOKIE; + } + + dsize += idef->keys[loop].len; + } + + if (dsize > 400) { + printk("Fscache: %s.%s.%s:" + " index entry size exceeds maximum %u>400\n", + iparent->netfs->name, + iparent->idef->name, + idef->name, + dsize); + return FSCACHE_NEGATIVE_COOKIE; + } + } + + /* allocate and initialise a cookie */ + cookie = kmem_cache_alloc(fscache_cookie_jar, SLAB_KERNEL); + if (!cookie) { + _leave(" [ENOMEM]"); + return FSCACHE_NEGATIVE_COOKIE; + } + + atomic_set(&cookie->usage, 1); + atomic_set(&cookie->children, 0); + + atomic_inc(&iparent->usage); + atomic_inc(&iparent->children); + + cookie->idef = idef; + cookie->iparent = iparent; + cookie->netfs = iparent->netfs; + cookie->netfs_data = netfs_data; + + /* now we need to see whether the backing objects for this cookie yet + * exist, if not there'll be nothing to search */ + down_read(&fscache_addremove_sem); + + if (list_empty(&fscache_cache_list)) { + 
up_read(&fscache_addremove_sem); + _leave(" [no caches]"); + return cookie; + } + + down_write(&cookie->sem); + + /* search every cache we know about to see if the object is already + * present */ + list_for_each_entry(cache, &fscache_cache_list, link) { + ret = fscache_search_for_object(cookie, cache); + switch (ret) { + case 0: + if (!cookie->idef) + break; /* only want the first file entry */ + case -ENODATA: + ret = 0; + continue; + default: + goto error; + } + } + + /* if the object is an index then we need do nothing more here - we + * create indexes on disc when we need them as an index may exist in + * multiple caches */ + if (cookie->idef) + goto done; + + /* the object is a file - we need to select a cache in which to store + * it */ + ret = -ENOMEDIUM; + cache = fscache_select_cache_for_file(); + if (!cache) + goto error; /* couldn't decide on a cache */ + + /* create a file index entry on disc, along with all the indexes + * required to find it again later */ + ret = fscache_instantiate_object(cookie, cache); + if (ret == 0) + goto done; + + error: + printk("Fscache: error from cache fs: %d\n", ret); + up_write(&cookie->sem); /* unlock before the cookie is freed */ + up_read(&fscache_addremove_sem); + atomic_dec(&iparent->children); + __fscache_cookie_put(cookie); + _leave(" [error %d]", ret); + return FSCACHE_NEGATIVE_COOKIE; + done: + up_write(&cookie->sem); + up_read(&fscache_addremove_sem); + _leave(" = %p", cookie); + return cookie; + +} /* end __fscache_acquire_cookie() */ + +EXPORT_SYMBOL(__fscache_acquire_cookie); + +/*****************************************************************************/ +/* + * release a cookie back to the cache + * - the object will be marked as recyclable on disc if retire is true + * - all dependents of this cookie must have already been unregistered + * (indexes/files/pages) + */ +void __fscache_relinquish_cookie(struct fscache_cookie *cookie, int retire) +{ + struct fscache_cache *cache; + struct fscache_node *node; + + _enter("%p{%s},%d", + cookie, + cookie && cookie->idef ? 
(char *) cookie->idef->name : "<file>", + retire); + + if (cookie == FSCACHE_NEGATIVE_COOKIE) { + _leave(" [no cookie]"); + return; + } + + if (atomic_read(&cookie->children) != 0) { + printk("Fscache: cookie still has children\n"); + BUG(); + } + + /* detach pointers back to netfs */ + down_write(&cookie->sem); + + cookie->netfs_data = NULL; + cookie->idef = NULL; + + read_lock(&cookie->lock); + + /* queue retired objects for recycling */ + if (retire) { + list_for_each_entry(node, + &cookie->backing_nodes, + cookie_link) { + set_bit(FSCACHE_NODE_RECYCLING, &node->flags); + } + } + + /* break links with all the active nodes */ + while (!list_empty(&cookie->backing_nodes)) { + node = list_entry(cookie->backing_nodes.next, + struct fscache_node, + cookie_link); + + /* detach each cache node from the object cookie */ + set_bit(FSCACHE_NODE_RELEASING, &node->flags); + + list_del_init(&node->cookie_link); + read_unlock(&cookie->lock); + + cache = node->cache; + cache->ops->lock_node(node); + node->cookie = NULL; + cache->ops->unlock_node(node); + + if (atomic_dec_and_test(&cookie->usage)) + /* the cookie refcount shouldn't be reduced to 0 yet */ + BUG(); + + cache->ops->put_node(node); + + read_lock(&cookie->lock); + } + + read_unlock(&cookie->lock); + up_write(&cookie->sem); + + if (cookie->iparent) + atomic_dec(&cookie->iparent->children); + + /* finally dispose of the cookie */ + fscache_cookie_put(cookie); + + _leave(""); + +} /* end __fscache_relinquish_cookie() */ + +EXPORT_SYMBOL(__fscache_relinquish_cookie); + +/*****************************************************************************/ +/* + * update the index entries backing a cookie + */ +void __fscache_update_cookie(struct fscache_cookie *cookie) +{ + struct fscache_node *ixnode, *node; + + _enter("{%s}", + cookie && + cookie->idef ? 
(char *) cookie->idef->name : "<file>"); + + if (cookie == FSCACHE_NEGATIVE_COOKIE) { + _leave(" [no cookie]"); + return; + } + + down_write(&cookie->sem); + down_write(&cookie->iparent->sem); + + /* update the index entry on disc in each cache backing this cookie */ + list_for_each_entry(node, &cookie->backing_nodes, cookie_link) { + ixnode = fscache_find_parent_node(node); + node->cache->ops->index_update(ixnode, node); + } + + up_write(&cookie->iparent->sem); + up_write(&cookie->sem); + _leave(""); + +} /* end __fscache_update_cookie() */ + +EXPORT_SYMBOL(__fscache_update_cookie); + +/*****************************************************************************/ +/* + * destroy a cookie + */ +void __fscache_cookie_put(struct fscache_cookie *cookie) +{ + struct fscache_search_result *srch; + + _enter("%p", cookie); + + if (cookie->iparent) + fscache_cookie_put(cookie->iparent); + + /* dispose of any cached search results */ + while (!list_empty(&cookie->search_results)) { + srch = list_entry(cookie->search_results.next, + struct fscache_search_result, + link); + + list_del(&srch->link); + kfree(srch); + } + + BUG_ON(!list_empty(&cookie->search_results)); + BUG_ON(!list_empty(&cookie->backing_nodes)); + kmem_cache_free(fscache_cookie_jar, cookie); + + _leave(""); + +} /* end __fscache_cookie_put() */ + +/*****************************************************************************/ +/* + * initialise a cookie jar slab element prior to any use + */ +void fscache_cookie_init_once(void *_cookie, kmem_cache_t *cachep, + unsigned long flags) +{ + struct fscache_cookie *cookie = _cookie; + + if ((flags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) == + SLAB_CTOR_CONSTRUCTOR) { + memset(cookie, 0, sizeof(*cookie)); + rwlock_init(&cookie->lock); + init_rwsem(&cookie->sem); + INIT_LIST_HEAD(&cookie->search_results); + INIT_LIST_HEAD(&cookie->backing_nodes); + } + +} /* end fscache_cookie_init_once() */ diff -uNr linux-2.6.9-rc2-mm4/fs/fscache/fscache-int.h 
linux-2.6.9-rc2-mm4-fscache/fs/fscache/fscache-int.h --- linux-2.6.9-rc2-mm4/fs/fscache/fscache-int.h 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/fscache/fscache-int.h 2004-09-30 13:50:20.000000000 +0100 @@ -0,0 +1,81 @@ +/* fscache-int.h: internal definitions + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifndef _FSCACHE_INT_H +#define _FSCACHE_INT_H + +#include <linux/fscache-cache.h> +#include <linux/timer.h> +#include <linux/bio.h> + +extern kmem_cache_t *fscache_cookie_jar; + +extern struct fscache_cookie fscache_fsdef_index; + +extern void fscache_cookie_init_once(void *_cookie, kmem_cache_t *cachep, unsigned long flags); + +extern void __fscache_cookie_put(struct fscache_cookie *cookie); + +static inline void fscache_cookie_put(struct fscache_cookie *cookie) +{ + BUG_ON(atomic_read(&cookie->usage) <= 0); + + if (atomic_dec_and_test(&cookie->usage)) + __fscache_cookie_put(cookie); + +} + +/*****************************************************************************/ +/* + * debug tracing + */ +#define dbgprintk(FMT,...) \ + printk("[%-6.6s] "FMT"\n",current->comm ,##__VA_ARGS__) +#define _dbprintk(FMT,...) do { } while(0) + +#define kenter(FMT,...) dbgprintk("==> %s("FMT")",__FUNCTION__ ,##__VA_ARGS__) +#define kleave(FMT,...) dbgprintk("<== %s()"FMT"",__FUNCTION__ ,##__VA_ARGS__) +#define kdebug(FMT,...) dbgprintk(FMT ,##__VA_ARGS__) + +#define kjournal(FMT,...) 
_dbprintk(FMT ,##__VA_ARGS__) + +#define dbgfree(ADDR) _dbprintk("%s:%d: FREEING %p",__FILE__,__LINE__,ADDR) + +#define dbgpgalloc(PAGE) \ +do { \ + _dbprintk("PGALLOC %s:%d: %p {%lx,%lu}\n", \ + __FILE__,__LINE__, \ + (PAGE),(PAGE)->mapping->host->i_ino,(PAGE)->index \ + ); \ +} while(0) + +#define dbgpgfree(PAGE) \ +do { \ + if ((PAGE)) \ + _dbprintk("PGFREE %s:%d: %p {%lx,%lu}\n", \ + __FILE__,__LINE__, \ + (PAGE), \ + (PAGE)->mapping->host->i_ino, \ + (PAGE)->index \ + ); \ +} while(0) + +#ifdef __KDEBUG +#define _enter(FMT,...) kenter(FMT,##__VA_ARGS__) +#define _leave(FMT,...) kleave(FMT,##__VA_ARGS__) +#define _debug(FMT,...) kdebug(FMT,##__VA_ARGS__) +#else +#define _enter(FMT,...) do { } while(0) +#define _leave(FMT,...) do { } while(0) +#define _debug(FMT,...) do { } while(0) +#endif + +#endif /* _FSCACHE_INT_H */ diff -uNr linux-2.6.9-rc2-mm4/fs/fscache/fsdef.c linux-2.6.9-rc2-mm4-fscache/fs/fscache/fsdef.c --- linux-2.6.9-rc2-mm4/fs/fscache/fsdef.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/fscache/fsdef.c 2004-10-04 15:09:45.928003981 +0100 @@ -0,0 +1,87 @@ +/* fsdef.c: filesystem index definition + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include <linux/module.h> +#include "fscache-int.h" + +static fscache_match_val_t fscache_fsdef_index_match(void *target, + const void *entry); + +static void fscache_fsdef_index_update(void *source, void *entry); + +static struct fscache_index_def fscache_fsdef_index_def = { + .name = ".fsdef", + .data_size = sizeof(struct fscache_fsdef_index_entry), + .match = fscache_fsdef_index_match, + .update = fscache_fsdef_index_update +}; + +struct fscache_cookie fscache_fsdef_index = { + .usage = ATOMIC_INIT(1), + .idef = &fscache_fsdef_index_def, + .lock = RW_LOCK_UNLOCKED, + .sem = __RWSEM_INITIALIZER(fscache_fsdef_index.sem), + .search_results = LIST_HEAD_INIT(fscache_fsdef_index.search_results), + .backing_nodes = LIST_HEAD_INIT(fscache_fsdef_index.backing_nodes), +}; + +EXPORT_SYMBOL(fscache_fsdef_index); + +/*****************************************************************************/ +/* + * see if the netfs definition matches + */ +static fscache_match_val_t fscache_fsdef_index_match(void *target, + const void *entry) +{ + const struct fscache_fsdef_index_entry *fsdef = entry; + struct fscache_netfs *netfs = target; + + _enter("%p,%p", target, entry); + + /* name and version must both match with what's on disc */ + _debug("{%s.%u},{%s.%u}", + netfs->name, netfs->version, fsdef->name, fsdef->version); + + if (strncmp(netfs->name, fsdef->name, sizeof(fsdef->name)) != 0) { + _leave(" = FAILED"); + return FSCACHE_MATCH_FAILED; + } + + if (netfs->version == fsdef->version) { + _leave(" = SUCCESS"); + return FSCACHE_MATCH_SUCCESS; + } + + /* an entry of the same name but different version is scheduled for + * deletion */ + _leave(" = SUCCESS_DELETE"); + return FSCACHE_MATCH_SUCCESS_DELETE; + +} /* end fscache_fsdef_index_match() */ + +/*****************************************************************************/ +/* + * update the netfs definition to be stored on disc + */ +static void fscache_fsdef_index_update(void *source, void *entry) +{ + struct 
fscache_fsdef_index_entry *fsdef = entry; + struct fscache_netfs *netfs = source; + + _enter("{%s.%u},", netfs->name, netfs->version); + + /* install the netfs name and version in the top-level index entry */ + strncpy(fsdef->name, netfs->name, sizeof(fsdef->name)); + + fsdef->version = netfs->version; + +} /* end fscache_fsdef_index_update() */ diff -uNr linux-2.6.9-rc2-mm4/fs/fscache/main.c linux-2.6.9-rc2-mm4-fscache/fs/fscache/main.c --- linux-2.6.9-rc2-mm4/fs/fscache/main.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/fscache/main.c 2004-09-30 19:11:35.000000000 +0100 @@ -0,0 +1,111 @@ +/* main.c: general filesystem caching manager + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include <linux/module.h> +#include <linux/init.h> +#include <linux/sched.h> +#include <linux/completion.h> +#include <linux/slab.h> +#include "fscache-int.h" + +int fscache_debug = 0; + +static int fscache_init(void); +static void fscache_exit(void); + +fs_initcall(fscache_init); +module_exit(fscache_exit); + +MODULE_DESCRIPTION("FS Cache Manager"); +MODULE_AUTHOR("Red Hat, Inc."); +MODULE_LICENSE("GPL"); + +/*****************************************************************************/ +/* + * initialise the fs caching module + */ +static int fscache_init(void) +{ + fscache_cookie_jar = + kmem_cache_create("fscache_cookie_jar", + sizeof(struct fscache_cookie), + 0, + SLAB_HWCACHE_ALIGN, + fscache_cookie_init_once, + NULL); + + if (!fscache_cookie_jar) { + printk(KERN_NOTICE + "Fscache: Failed to allocate a cookie jar\n"); + return -ENOMEM; + } + + printk(KERN_INFO "fscache: general fs caching registered\n"); + return 0; + +} /* end fscache_init() */ + +/*****************************************************************************/ +/* + * clean up on module removal + */ +static void __exit fscache_exit(void) +{ + printk(KERN_INFO "Fscache: general fs caching unregistering\n"); + + kmem_cache_destroy(fscache_cookie_jar); + +} /* end fscache_exit() */ + +/*****************************************************************************/ +/* + * clear the dead space between task_struct and kernel stack + * - called by supplying -finstrument-functions to gcc + */ +#if 0 +void __cyg_profile_func_enter (void *this_fn, void *call_site) +__attribute__((no_instrument_function)); + +void __cyg_profile_func_enter (void *this_fn, void *call_site) +{ + asm volatile(" movl %%esp,%%edi \n" + " andl %0,%%edi \n" + " addl %1,%%edi \n" + " movl %%esp,%%ecx \n" + " subl %%edi,%%ecx \n" + " shrl $2,%%ecx \n" + " movl $0xedededed,%%eax \n" + " rep stosl \n" + : + : "i"(~(THREAD_SIZE-1)), "i"(sizeof(struct thread_info)) + : "eax", "ecx", "edi", "memory", "cc" + ); +} + +void 
__cyg_profile_func_exit(void *this_fn, void *call_site) +__attribute__((no_instrument_function)); + +void __cyg_profile_func_exit(void *this_fn, void *call_site) +{ + asm volatile(" movl %%esp,%%edi \n" + " andl %0,%%edi \n" + " addl %1,%%edi \n" + " movl %%esp,%%ecx \n" + " subl %%edi,%%ecx \n" + " shrl $2,%%ecx \n" + " movl $0xdadadada,%%eax \n" + " rep stosl \n" + : + : "i"(~(THREAD_SIZE-1)), "i"(sizeof(struct thread_info)) + : "eax", "ecx", "edi", "memory", "cc" + ); +} +#endif diff -uNr linux-2.6.9-rc2-mm4/fs/fscache/Makefile linux-2.6.9-rc2-mm4-fscache/fs/fscache/Makefile --- linux-2.6.9-rc2-mm4/fs/fscache/Makefile 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/fscache/Makefile 2004-09-30 13:40:00.000000000 +0100 @@ -0,0 +1,13 @@ +# +# Makefile for general filesystem caching code +# + +#CFLAGS += -finstrument-functions + +fscache-objs := \ + cookie.o \ + fsdef.o \ + main.o \ + page.o + +obj-$(CONFIG_FSCACHE) := fscache.o diff -uNr linux-2.6.9-rc2-mm4/fs/fscache/page.c linux-2.6.9-rc2-mm4-fscache/fs/fscache/page.c --- linux-2.6.9-rc2-mm4/fs/fscache/page.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/fscache/page.c 2004-10-04 17:06:39.304098924 +0100 @@ -0,0 +1,231 @@ +/* page.c: general filesystem cache cookie management + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include <linux/module.h> +#include <linux/fscache-cache.h> +#include <linux/buffer_head.h> +#include "fscache-int.h" + +/*****************************************************************************/ +/* + * read a page from the cache or allocate a block in which to store it + * - we return: + * -ENOMEM - out of memory, nothing done + * -ENOBUFS - no backing node available in which to cache the block + * -ENODATA - no data available in the backing node for this block + * 0 - dispatched a read - it'll call end_io_func() when finished + */ +int __fscache_read_or_alloc_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) +{ + struct fscache_node *node; + struct fscache_page *pageio; + int ret; + + _enter("%p,{%lu},", cookie, page->index); + + if (cookie == FSCACHE_NEGATIVE_COOKIE) { + _leave(" -ENOBUFS [no cookie]"); + return -ENOBUFS; + } + + if (list_empty(&cookie->backing_nodes)) { + _leave(" -ENOBUFS [no backing nodes]"); + return -ENOBUFS; + } + + BUG_ON(cookie->idef); /* not supposed to use this for indexes */ + + /* get the cache-cookie for this page */ + pageio = cookie->netfs->ops->get_page_token(page); + if (IS_ERR(pageio)) { + _leave(" = %ld", PTR_ERR(pageio)); + return PTR_ERR(pageio); + } + + /* prevent the file from being uncached whilst we access it */ + down_read(&cookie->sem); + + ret = -ENOBUFS; + if (!list_empty(&cookie->backing_nodes)) { + /* get and pin the backing node */ + node = list_entry(cookie->backing_nodes.next, + struct fscache_node, + cookie_link); + + if (node->cache->ops->grab_node(node)) { + /* ask the cache to honour the operation */ + ret = node->cache->ops->read_or_alloc_page(node, + page, + pageio, + end_io_func, + end_io_data, + gfp); + + node->cache->ops->put_node(node); + } + + } + up_read(&cookie->sem); + _leave(" = %d", ret); + return ret; + +} /* end __fscache_read_or_alloc_page() */ + +EXPORT_SYMBOL(__fscache_read_or_alloc_page); + 
+/*****************************************************************************/ +/* + * request a page be stored in the cache + * - returns: + * -ENOMEM - out of memory, nothing done + * -ENOBUFS - no backing node available in which to cache the page + * 0 - dispatched a write - it'll call end_io_func() when finished + */ +int __fscache_write_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) +{ + struct fscache_page *pageio; + struct fscache_node *node; + int ret; + + _enter("%p,{%lu},", cookie, page->index); + + if (cookie == FSCACHE_NEGATIVE_COOKIE) { + _leave(" -ENOBUFS [no cookie]"); + return -ENOBUFS; /* no actual cookie */ + } + + BUG_ON(cookie->idef); /* not supposed to use this for indexes */ + + /* get the cache-cookie for this page */ + pageio = cookie->netfs->ops->get_page_token(page); + if (IS_ERR(pageio)) { + _leave(" = %ld", PTR_ERR(pageio)); + return PTR_ERR(pageio); + } + + /* prevent the file from being uncached whilst we deal with it */ + down_read(&cookie->sem); + + ret = -ENOBUFS; + if (!list_empty(&cookie->backing_nodes) && pageio->mapped_block) { + node = list_entry(cookie->backing_nodes.next, + struct fscache_node, + cookie_link); + + /* ask the cache to honour the operation */ + ret = node->cache->ops->write_page(node, + page, + pageio, + end_io_func, + end_io_data, + gfp); + } + + up_read(&cookie->sem); + _leave(" = %d", ret); + return ret; + +} /* end __fscache_write_page() */ + +EXPORT_SYMBOL(__fscache_write_page); + +/*****************************************************************************/ +/* + * remove a page from the cache + * - if the block backing the page still has a vjentry then the block will be + * recycled + */ +void __fscache_uncache_page(struct fscache_cookie *cookie, struct page *page) +{ + struct fscache_page *pageio; + struct fscache_node *node; + + _enter(",{%lu}", page->index); + + if (cookie == FSCACHE_NEGATIVE_COOKIE) { + 
_leave(" [no cookie]"); + return; + } + + BUG_ON(cookie->idef); /* not supposed to use this for indexes */ + + /* get the cache-cookie for this page */ + pageio = cookie->netfs->ops->get_page_token(page); + if (IS_ERR(pageio)) { + _leave(" [get_page_token() = %ld]", PTR_ERR(pageio)); + return; + } + + if (list_empty(&cookie->backing_nodes)) { + BUG_ON(pageio->mapped_block); + _leave(" [no backing]"); + return; + } + + if (!pageio->mapped_block) { + _leave(" [no mapping]"); + return; + } + + /* ask the cache to honour the operation */ + down_read(&cookie->sem); + + if (!list_empty(&cookie->backing_nodes) && pageio->mapped_block) { + node = list_entry(cookie->backing_nodes.next, + struct fscache_node, + cookie_link); + + node->cache->ops->uncache_page(node, pageio); + } + + up_read(&cookie->sem); + + _leave(""); + return; + +} /* end __fscache_uncache_page() */ + +EXPORT_SYMBOL(__fscache_uncache_page); + +/*****************************************************************************/ +/* + * get a page caching token for a page, allocating it and attaching it to + * the page's private pointer if it doesn't exist + */ +struct fscache_page * __fscache_page_get_private(struct page *page, + unsigned gfp_flags) +{ + struct fscache_page *pageio = (struct fscache_page *) page->private; + + if (!pageio) { + pageio = kmalloc(sizeof(*pageio), gfp_flags); + if (!pageio) + return ERR_PTR(-ENOMEM); + + memset(pageio, 0, sizeof(*pageio)); + rwlock_init(&pageio->lock); + + page->private = (unsigned long) pageio; + SetPagePrivate(page); + } + + return pageio; +} /* end __fscache_page_get_private() */ + +EXPORT_SYMBOL(__fscache_page_get_private); diff -uNr linux-2.6.9-rc2-mm4/fs/Kconfig linux-2.6.9-rc2-mm4-fscache/fs/Kconfig --- linux-2.6.9-rc2-mm4/fs/Kconfig 2004-09-27 11:23:57.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/Kconfig 2004-09-30 21:06:13.000000000 +0100 @@ -485,10 +485,21 @@ menu "Caches" -config CACHEFS - tristate "Filesystem caching support" +config FSCACHE 
+ tristate "General filesystem cache manager" depends on EXPERIMENTAL help + This option enables a generic filesystem caching manager that can be + used by various network and other filesystems to cache data + locally. Different sorts of caches can be plugged in, depending on the + resources available. + + See Documentation/filesystems/fscache.txt for more information. + +config CACHEFS + tristate "Filesystem caching filesystem" + depends on FSCACHE + help This filesystem acts as a cache for other filesystems - primarily networking filesystems - rather than thus allowing fast local disc to enhance the speed of slower devices. @@ -1482,6 +1493,13 @@ If unsure, say N. +config NFS_FSCACHE + bool "Provide NFS client caching support" + depends on NFS_FS && FSCACHE && EXPERIMENTAL + help + Say Y here if you want NFS data to be cached locally on disc through + the general filesystem cache manager + config NFS_DIRECTIO bool "Allow direct I/O on NFS files (EXPERIMENTAL)" depends on NFS_FS && EXPERIMENTAL @@ -1811,12 +1829,18 @@ If unsure, say N. -config AFS_CACHEFS - bool "Provide AFS client caching support through CacheFS" - depends on AFS_FS && CACHEFS && EXPERIMENTAL +config AFS_KEYS + bool "Provide AFS authentication & security key support" + depends on AFS_FS && KEYS && EXPERIMENTAL + help + Say Y here if you want AFS to attempt to use authentication. + +config AFS_FSCACHE + bool "Provide AFS client caching support" + depends on AFS_FS && FSCACHE && EXPERIMENTAL help - Say Y here if you want AFS data to be cached locally on disc through - the CacheFS filesystem. 
+ Say Y here if you want AFS data to be cached locally on disc through + the generic filesystem cache manager config RXRPC tristate diff -uNr linux-2.6.9-rc2-mm4/fs/Makefile linux-2.6.9-rc2-mm4-fscache/fs/Makefile --- linux-2.6.9-rc2-mm4/fs/Makefile 2004-09-27 11:23:57.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/Makefile 2004-09-30 13:34:53.000000000 +0100 @@ -45,6 +45,8 @@ obj-$(CONFIG_PROFILING) += dcookies.o # Do not add any filesystems before this line +obj-$(CONFIG_FSCACHE) += fscache/ +obj-$(CONFIG_CACHEFS) += cachefs/ obj-$(CONFIG_REISERFS_FS) += reiserfs/ obj-$(CONFIG_REISER4_FS) += reiser4/ obj-$(CONFIG_EXT3_FS) += ext3/ # Before ext2 so root fs can be ext3 @@ -94,4 +96,3 @@ obj-$(CONFIG_BEFS_FS) += befs/ obj-$(CONFIG_HOSTFS) += hostfs/ obj-$(CONFIG_HPPFS) += hppfs/ -obj-$(CONFIG_CACHEFS) += cachefs/ diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/file.c linux-2.6.9-rc2-mm4-fscache/fs/nfs/file.c --- linux-2.6.9-rc2-mm4/fs/nfs/file.c 2004-09-16 12:08:07.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/file.c 2004-09-30 18:50:15.000000000 +0100 @@ -27,9 +27,11 @@ #include <linux/slab.h> #include <linux/pagemap.h> #include <linux/smp_lock.h> +#include <linux/buffer_head.h> #include <asm/uaccess.h> #include <asm/system.h> +#include "nfs-fscache.h" #include "delegation.h" @@ -240,6 +242,61 @@ return status; } +#ifdef CONFIG_NFS_FSCACHE +static int nfs_invalidatepage(struct page *page, unsigned long offset) +{ + int ret = 1; + struct nfs_server *server = NFS_SERVER(page->mapping->host); + + BUG_ON(!PageLocked(page)); + + if (server->flags & NFS_MOUNT_FSCACHE) { + if (PagePrivate(page)) { + struct nfs_inode *nfsi = NFS_I(page->mapping->host); + fscache_uncache_page(nfsi->fscache, page); + + if (offset == 0) { + BUG_ON(!PageLocked(page)); + ret = 0; + if (!PageWriteback(page)) + ret = page->mapping->a_ops->releasepage(page, 0); + } + } + } else + ret = 0; + + return ret; +} +static int nfs_releasepage(struct page *page, int gfp_flags) +{ + struct fscache_page 
*pageio; + struct nfs_server *server = NFS_SERVER(page->mapping->host); + + if (server->flags & NFS_MOUNT_FSCACHE && PagePrivate(page)) { + struct nfs_inode *nfsi = NFS_I(page->mapping->host); + fscache_uncache_page(nfsi->fscache, page); + pageio = (struct fscache_page *) page->private; + page->private = 0; + ClearPagePrivate(page); + + if (pageio) + kfree(pageio); + } + + return 0; +} +static int nfs_mkwrite(struct page *page) +{ + wait_on_page_fs_misc(page); + return 0; +} +#endif + +/* + * since we use page->private for our own nefarious purposes when using fscache, we have to + * override extra address space ops to prevent fs/buffer.c from getting confused, even though we + * may not have asked its opinion + */ struct address_space_operations nfs_file_aops = { .readpage = nfs_readpage, .readpages = nfs_readpages, @@ -251,6 +308,12 @@ #ifdef CONFIG_NFS_DIRECTIO .direct_IO = nfs_direct_IO, #endif +#ifdef CONFIG_NFS_FSCACHE + .sync_page = block_sync_page, + .releasepage = nfs_releasepage, + .invalidatepage = nfs_invalidatepage, + .page_mkwrite = nfs_mkwrite, +#endif }; /* diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/inode.c linux-2.6.9-rc2-mm4-fscache/fs/nfs/inode.c --- linux-2.6.9-rc2-mm4/fs/nfs/inode.c 2004-09-27 11:23:57.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/inode.c 2004-09-30 18:51:00.000000000 +0100 @@ -41,6 +41,8 @@ #include "delegation.h" +#include "nfs-fscache.h" + #define NFSDBG_FACILITY NFSDBG_VFS #define NFS_PARANOIA 1 @@ -140,7 +142,7 @@ /* * For the moment, the only task for the NFS clear_inode method is to - * release the mmap credential + * release the mmap credential and release the inode's on-disc cache */ static void nfs_clear_inode(struct inode *inode) @@ -153,6 +155,15 @@ cred = nfsi->cache_access.cred; if (cred) put_rpccred(cred); + +#ifdef CONFIG_NFS_FSCACHE + if (NFS_SERVER(inode)->flags & NFS_MOUNT_FSCACHE) { + dprintk("nfs_clear_inode: fscache 0x%p\n", nfsi->fscache); + fscache_relinquish_cookie(nfsi->fscache, 0); + nfsi->fscache 
= NULL; + } +#endif + BUG_ON(atomic_read(&nfsi->data_updates) != 0); } @@ -462,6 +473,19 @@ server->namelen = NFS2_MAXNAMLEN; } +#ifdef CONFIG_NFS_FSCACHE + /* create a cache index for looking up filehandles */ + server->fscache = NULL; + if (server->flags & NFS_MOUNT_FSCACHE) { + server->fscache = fscache_acquire_cookie(nfs_cache_netfs.primary_index, + &nfs_cache_fh_index_def, server); + if (server->fscache == NULL) { + server->flags &= ~NFS_MOUNT_FSCACHE; + printk(KERN_WARNING "NFS: No Fscache cookie. Turning Fscache off!\n"); + } + } +#endif + sb->s_op = &nfs_sops; return nfs_sb_init(sb, authflavor); } @@ -518,7 +542,7 @@ } nfs_info[] = { { NFS_MOUNT_SOFT, ",soft", ",hard" }, { NFS_MOUNT_INTR, ",intr", "" }, - { NFS_MOUNT_POSIX, ",posix", "" }, + { NFS_MOUNT_POSIX, ",fscache", "" }, { NFS_MOUNT_TCP, ",tcp", ",udp" }, { NFS_MOUNT_NOCTO, ",nocto", "" }, { NFS_MOUNT_NOAC, ",noac", "" }, @@ -568,6 +592,14 @@ nfsi->flags |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA; else nfsi->flags |= NFS_INO_INVALID_ATTR; + +#ifdef CONFIG_NFS_FSCACHE + if (NFS_SERVER(inode)->flags & NFS_MOUNT_FSCACHE) { + fscache_relinquish_cookie(nfsi->fscache, 1); + nfsi->fscache = NULL; + } +#endif + } /* @@ -705,6 +737,16 @@ memset(nfsi->cookieverf, 0, sizeof(nfsi->cookieverf)); nfsi->cache_access.cred = NULL; +#ifdef CONFIG_NFS_FSCACHE +{ + struct nfs_server *server = NFS_SB(sb); + if (server->flags & NFS_MOUNT_FSCACHE) { + nfsi->fscache = fscache_acquire_cookie(server->fscache, NULL, nfsi); + /* XXX: Add warning when NULL is returned */ + dprintk("nfs_fhget: fscache 0x%p\n", nfsi->fscache); + } +} +#endif unlock_new_inode(inode); } else nfs_refresh_inode(inode, fattr); @@ -1009,6 +1051,18 @@ (long long)NFS_FILEID(inode)); /* This ensures we revalidate dentries */ nfsi->cache_change_attribute++; + +#ifdef CONFIG_NFS_FSCACHE + if (server->flags & NFS_MOUNT_FSCACHE) { + struct fscache_cookie *old = nfsi->fscache; + + /* retire the current fscache cache and get a new one */ + 
fscache_relinquish_cookie(nfsi->fscache, 1); + nfsi->fscache = fscache_acquire_cookie(server->fscache, NULL, nfsi); + dfprintk(PAGECACHE,"NFS: fscache: old 0x%p new 0x%p\n", + old, nfsi->fscache); + } +#endif } dfprintk(PAGECACHE, "NFS: (%s/%Ld) revalidation complete\n", inode->i_sb->s_id, @@ -1417,6 +1471,14 @@ return ERR_PTR(-EINVAL); } +#ifndef CONFIG_NFS_FSCACHE + if (data->flags & NFS_MOUNT_FSCACHE) { + printk(KERN_WARNING "NFS: kernel not compiled with CONFIG_NFS_FSCACHE\n"); + kfree(server); + return ERR_PTR(-EINVAL); + } +#endif + s = sget(fs_type, nfs_compare_super, nfs_set_super, server); if (IS_ERR(s) || s->s_root) { @@ -1449,6 +1511,11 @@ kill_anon_super(s); +#ifdef CONFIG_NFS_FSCACHE + if (server->flags & NFS_MOUNT_FSCACHE) + fscache_relinquish_cookie(server->fscache, 0); +#endif + nfs4_renewd_prepare_shutdown(server); if (server->client != NULL && !IS_ERR(server->client)) @@ -1768,6 +1835,20 @@ s = ERR_PTR(-EIO); goto out_free; } +#ifdef TODO +#ifdef CONFIG_NFS_FSCACHE + /* create a cache index for looking up filehandles */ + server->fscache = NULL; + if (server->flags & NFS_MOUNT_FSCACHE) { + server->fscache = fscache_acquire_cookie(nfs_cache_netfs.primary_index, + &nfs_cache_fh_index_def, server); + if (server->fscache == NULL) { + server->flags &= ~NFS_MOUNT_FSCACHE; + printk(KERN_WARNING "NFS: No Fscache cookie. Turning Fscache off!\n"); + } + } +#endif +#endif error = nfs4_fill_super(s, data, flags & MS_VERBOSE ? 
1 : 0); if (error) { @@ -1888,6 +1969,14 @@ { int err; +#ifdef CONFIG_NFS_FSCACHE + /* we want to be able to cache */ + err = fscache_register_netfs(&nfs_cache_netfs, + &nfs_cache_server_index_def); + if (err < 0) + goto out5; +#endif + err = nfs_init_nfspagecache(); if (err) goto out4; @@ -1923,6 +2012,10 @@ out3: nfs_destroy_nfspagecache(); out4: +#ifdef CONFIG_NFS_FSCACHE + fscache_unregister_netfs(&nfs_cache_netfs); +out5: +#endif return err; } @@ -1932,6 +2025,9 @@ nfs_destroy_readpagecache(); nfs_destroy_inodecache(); nfs_destroy_nfspagecache(); +#ifdef CONFIG_NFS_FSCACHE + fscache_unregister_netfs(&nfs_cache_netfs); +#endif #ifdef CONFIG_PROC_FS rpc_proc_unregister("nfs"); #endif diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/Makefile linux-2.6.9-rc2-mm4-fscache/fs/nfs/Makefile --- linux-2.6.9-rc2-mm4/fs/nfs/Makefile 2004-09-16 12:08:07.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/Makefile 2004-09-30 18:47:36.000000000 +0100 @@ -12,4 +12,5 @@ delegation.o idmap.o \ callback.o callback_xdr.o callback_proc.o nfs-$(CONFIG_NFS_DIRECTIO) += direct.o +nfs-$(CONFIG_NFS_FSCACHE) += nfs-fscache.o nfs-objs := $(nfs-y) diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/nfs-fscache.c linux-2.6.9-rc2-mm4-fscache/fs/nfs/nfs-fscache.c --- linux-2.6.9-rc2-mm4/fs/nfs/nfs-fscache.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/nfs-fscache.c 2004-09-30 21:33:45.000000000 +0100 @@ -0,0 +1,206 @@ +/* nfs-fscache.c: NFS filesystem cache interface + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + + +#include <linux/config.h> +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/mm.h> +#include <linux/nfs_fs.h> +#include <linux/nfs_fs_sb.h> + +#include "nfs-fscache.h" + +#define NFS_CACHE_FH_INDEX_SIZE sizeof(struct nfs_fh) + +#if 0 +#define kleave(FMT,...) \ + printk("[%-6.6s] <== %s()"FMT"\n",current->comm,__FUNCTION__ ,##__VA_ARGS__) +#else +#define kleave(FMT,...) \ + do {} while(0) +#endif + +/* + * the root index is + */ +static struct fscache_page *nfs_cache_get_page_token(struct page *page); + +static struct fscache_netfs_operations nfs_cache_ops = { + .get_page_token = nfs_cache_get_page_token, +}; + +struct fscache_netfs nfs_cache_netfs = { + .name = "nfs", + .version = 0, + .ops = &nfs_cache_ops, +}; + +/* + * the root index for the filesystem is defined by nfsd IP address and ports + */ +static fscache_match_val_t nfs_cache_server_match(void *target, + const void *entry); +static void nfs_cache_server_update(void *source, void *entry); + +struct fscache_index_def nfs_cache_server_index_def = { + .name = "servers", + .data_size = 18, + .keys[0] = { FSCACHE_INDEX_KEYS_IPV6ADDR, 16 }, + .keys[1] = { FSCACHE_INDEX_KEYS_BIN, 2 }, + .match = nfs_cache_server_match, + .update = nfs_cache_server_update, +}; + +/* + * the primary index for each server is simply made up of a series of NFS file + * handles + */ +static fscache_match_val_t nfs_cache_fh_match(void *target, const void *entry); +static void nfs_cache_fh_update(void *source, void *entry); + +struct fscache_index_def nfs_cache_fh_index_def = { + .name = "fh", + .data_size = NFS_CACHE_FH_INDEX_SIZE, + .keys[0] = { FSCACHE_INDEX_KEYS_BIN, + sizeof(struct nfs_fh) }, + .match = nfs_cache_fh_match, + .update = nfs_cache_fh_update, +}; + +/* + * get a page token for the specified page + * - the token will be attached to page->private and PG_private will be set on + * the page + */ +static struct fscache_page *nfs_cache_get_page_token(struct page *page) +{ + return 
fscache_page_get_private(page, GFP_NOIO); +} + +static const uint8_t nfs_cache_ipv6_wrapper_for_ipv4[12] = { + [0 ... 9] = 0x00, + [10 ... 11] = 0xff +}; + +/* + * match a server record obtained from the cache + */ +static fscache_match_val_t nfs_cache_server_match(void *target, + const void *entry) +{ + struct nfs_server *server = target; + const uint8_t *data = entry; + + switch (server->addr.sin_family) { + case AF_INET: + if (memcmp(data + 0, + &nfs_cache_ipv6_wrapper_for_ipv4, + 12) != 0) + break; + + if (memcmp(data + 12, &server->addr.sin_addr, 4) != 0) + break; + + if (memcmp(data + 16, &server->addr.sin_port, 2) != 0) + break; + + kleave(" = SUCCESS"); + return FSCACHE_MATCH_SUCCESS; + + case AF_INET6: + if (memcmp(data + 0, &server->addr.sin_addr, 16) != 0) + break; + + if (memcmp(data + 16, &server->addr.sin_port, 2) != 0) + break; + + kleave(" = SUCCESS"); + return FSCACHE_MATCH_SUCCESS; + + default: + break; + } + + kleave(" = FAILED"); + return FSCACHE_MATCH_FAILED; +} + +/* + * update a server record in the cache + */ +static void nfs_cache_server_update(void *source, void *entry) +{ + struct nfs_server *server = source; + uint8_t *data = entry; + + switch (server->addr.sin_family) { + case AF_INET: + memcpy(data + 0, &nfs_cache_ipv6_wrapper_for_ipv4, 12); + memcpy(data + 12, &server->addr.sin_addr, 4); + memcpy(data + 16, &server->addr.sin_port, 2); + return; + + case AF_INET6: + memcpy(data + 0, &server->addr.sin_addr, 16); + memcpy(data + 16, &server->addr.sin_port, 2); + return; + + default: + return; + } +} + +/* + * match a file handle record obtained from the cache + */ +static fscache_match_val_t nfs_cache_fh_match(void *target, const void *entry) +{ + struct nfs_inode *nfsi = target; + const uint8_t *data = entry; + int loop; + + /* check the file handle matches */ + if (memcmp(data, &nfsi->fh, sizeof(nfsi->fh)) == 0) { + + /* check the auxiliary data matches (if any) */ + for (loop = sizeof(nfsi->fh); + loop < NFS_CACHE_FH_INDEX_SIZE; +
loop++) + if (data[loop]) { + kleave(" = FAILED"); + return FSCACHE_MATCH_FAILED; + } + + kleave(" = SUCCESS"); + return FSCACHE_MATCH_SUCCESS; + } + + kleave(" = FAILED"); + return FSCACHE_MATCH_FAILED; +} + +/* + * update a fh record in the cache + */ +static void nfs_cache_fh_update(void *source, void *entry) +{ + struct nfs_inode *nfsi = source; + uint8_t *data = entry; + + /* set the file handle */ + memcpy(data, &nfsi->fh, sizeof(nfsi->fh)); + + /* just clear the auxiliary data for now */ + memset(data + sizeof(nfsi->fh), + 0, + NFS_CACHE_FH_INDEX_SIZE - sizeof(nfsi->fh)); +} diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/nfs-fscache.h linux-2.6.9-rc2-mm4-fscache/fs/nfs/nfs-fscache.h --- linux-2.6.9-rc2-mm4/fs/nfs/nfs-fscache.h 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/nfs-fscache.h 2004-09-30 21:12:13.000000000 +0100 @@ -0,0 +1,27 @@ +/* nfs-fscache.h: NFS filesystem cache interface definitions + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version.
+ */ + +#ifndef _NFS_FSCACHE_H +#define _NFS_FSCACHE_H + +#include <linux/fscache.h> + +#ifdef CONFIG_NFS_FSCACHE +#ifndef CONFIG_FSCACHE +#error "CONFIG_NFS_FSCACHE is defined but not CONFIG_FSCACHE" +#endif + +extern struct fscache_netfs nfs_cache_netfs; +extern struct fscache_index_def nfs_cache_server_index_def; +extern struct fscache_index_def nfs_cache_fh_index_def; + +#endif +#endif /* _NFS_FSCACHE_H */ diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/read.c linux-2.6.9-rc2-mm4-fscache/fs/nfs/read.c --- linux-2.6.9-rc2-mm4/fs/nfs/read.c 2004-09-16 12:08:07.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/read.c 2004-09-30 18:48:35.000000000 +0100 @@ -28,6 +28,7 @@ #include <linux/sunrpc/clnt.h> #include <linux/nfs_fs.h> #include <linux/nfs_page.h> +#include <linux/nfs_mount.h> #include <linux/smp_lock.h> #include <asm/system.h> @@ -89,6 +90,53 @@ } /* + * store a newly fetched page in fscache + */ +#ifdef CONFIG_NFS_FSCACHE +struct nfs_fscache_stats_t { + int read_from_calls; + int read_from_misses; + int read_from_hits; + int write_to_calls; + int write_to_errors; + int write_to_completes; +} nfs_fscache_stats = {0, 0, 0, 0, 0, 0}; + +static void +nfs_readpage_to_fscache_complete(void *cookie_data, struct page *page, void *data, int error) +{ +nfs_fscache_stats.write_to_completes++; + end_page_fs_misc(page); +} + +static inline void +nfs_readpage_to_fscache(struct inode *inode, struct page *page, int sync) +{ +nfs_fscache_stats.write_to_calls++; + SetPageFsMisc(page); + if (fscache_write_page(NFS_I(inode)->fscache, + page, + nfs_readpage_to_fscache_complete, + NULL, + GFP_KERNEL) != 0 + ) { +nfs_fscache_stats.write_to_errors++; + fscache_uncache_page(NFS_I(inode)->fscache, page); + ClearPageFsMisc(page); + } + + unlock_page(page); +} +#else +static inline void +nfs_readpage_to_fscache(struct inode *inode, struct page *page, int sync) +{ + BUG(); +} +#endif + + +/* * Read a page synchronously. 
*/ static int nfs_readpage_sync(struct nfs_open_context *ctx, struct inode *inode, @@ -164,6 +212,13 @@ ClearPageError(page); result = 0; + if (NFS_SERVER(inode)->flags & NFS_MOUNT_FSCACHE) + nfs_readpage_to_fscache(inode, page, 1); + else + unlock_page(page); + + return result; + io_error: unlock_page(page); nfs_readdata_free(rdata); @@ -196,6 +251,14 @@ static void nfs_readpage_release(struct nfs_page *req) { +#ifdef CONFIG_NFS_FSCACHE + struct inode *d_inode = req->wb_context->dentry->d_inode; + + if ((NFS_SERVER(d_inode)->flags & NFS_MOUNT_FSCACHE) && + PageUptodate(req->wb_page)) + nfs_readpage_to_fscache(d_inode, req->wb_page, 0); + else +#endif unlock_page(req->wb_page); nfs_clear_request(req); @@ -494,6 +557,57 @@ data->complete(data, status); } + +/* + * Read a page through the on-disc cache if possible + */ +#ifdef CONFIG_NFS_FSCACHE +static void +nfs_readpage_from_fscache_complete(void *cookie_data, struct page *page, void *data, int error) +{ + if (error) + SetPageError(page); + else + SetPageUptodate(page); + unlock_page(page); +} + +static inline int +nfs_readpage_from_fscache(struct inode *inode, struct page *page) +{ + struct fscache_page *pageio; + int ret; + +nfs_fscache_stats.read_from_calls++; + pageio = fscache_page_get_private(page, GFP_NOIO); + if (IS_ERR(pageio)) + return PTR_ERR(pageio); + + ret = fscache_read_or_alloc_page(NFS_I(inode)->fscache, + page, + nfs_readpage_from_fscache_complete, + NULL, + GFP_KERNEL); + + switch (ret) { + case 1: /* read BIO submitted and wb-journal entry found */ + BUG(); + + case 0: /* read BIO submitted (page in fscache) */ +nfs_fscache_stats.read_from_hits++; + return ret; + + case -ENOBUFS: /* inode not in cache */ + case -ENODATA: /* page not in cache */ +nfs_fscache_stats.read_from_misses++; + return 1; + + default: + return ret; + } +} +#endif + /* * Read a page over NFS. 
* We read the page synchronously in the following case: @@ -527,6 +641,15 @@ ctx = get_nfs_open_context((struct nfs_open_context *) file->private_data); if (!IS_SYNC(inode)) { +#ifdef CONFIG_NFS_FSCACHE + if (NFS_SERVER(inode)->flags & NFS_MOUNT_FSCACHE) { + error = nfs_readpage_from_fscache(inode, page); + if (error < 0) + goto out_error; + if (error == 0) + return error; + } +#endif error = nfs_readpage_async(ctx, inode, page); goto out; } @@ -557,6 +680,18 @@ unsigned int len; nfs_wb_page(inode, page); + +#ifdef CONFIG_NFS_FSCACHE + if (NFS_SERVER(inode)->flags & NFS_MOUNT_FSCACHE) { + int error = nfs_readpage_from_fscache(inode, page); + if (error < 0) + return error; + if (error == 0) { + return error; + } + } +#endif + len = nfs_page_length(inode, page); if (len == 0) return nfs_return_empty_page(page); diff -uNr linux-2.6.9-rc2-mm4/fs/nfs/write.c linux-2.6.9-rc2-mm4-fscache/fs/nfs/write.c --- linux-2.6.9-rc2-mm4/fs/nfs/write.c 2004-09-16 12:08:07.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/fs/nfs/write.c 2004-09-30 20:58:23.000000000 +0100 @@ -273,6 +273,28 @@ } /* + * store an updated page in fscache + */ +#ifdef CONFIG_NFS_FSCACHE +static void +nfs_writepage_to_fscache_complete(void *cookie_data, struct page *page, void *data, int error) +{ + /* really need to synchronise the end of writeback, probably using a page flag */ +} +static inline void +nfs_writepage_to_fscache(struct inode *inode, struct page *page) +{ + if (fscache_write_page(NFS_I(inode)->fscache, + page, + nfs_writepage_to_fscache_complete, + NULL, + GFP_KERNEL) != 0 + ) + fscache_uncache_page(NFS_I(inode)->fscache, page); +} +#endif + +/* * Write an mmapped page to the server. 
*/ int nfs_writepage(struct page *page, struct writeback_control *wbc) @@ -317,6 +339,12 @@ err = -EBADF; goto out; } + +#ifdef CONFIG_NFS_FSCACHE + if (NFS_SERVER(inode)->flags & NFS_MOUNT_FSCACHE) + nfs_writepage_to_fscache(inode, page); +#endif + lock_kernel(); if (!IS_SYNC(inode) && inode_referenced) { err = nfs_writepage_async(ctx, inode, page, 0, offset); @@ -1322,6 +1350,7 @@ (long long)NFS_FILEID(req->wb_context->dentry->d_inode), req->wb_bytes, (long long)req_offset(req)); + if (task->tk_status < 0) { req->wb_context->error = task->tk_status; nfs_inode_remove_request(req); diff -uNr linux-2.6.9-rc2-mm4/include/linux/cachefs.h linux-2.6.9-rc2-mm4-fscache/include/linux/cachefs.h --- linux-2.6.9-rc2-mm4/include/linux/cachefs.h 2004-09-27 11:24:04.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/include/linux/cachefs.h 1970-01-01 01:00:00.000000000 +0100 @@ -1,351 +0,0 @@ -/* cachefs.h: general filesystem caching interface - * - * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. - * Written by David Howells (dhowells@xxxxxxxxxx) - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. 
- */ - -#ifndef _LINUX_CACHEFS_H -#define _LINUX_CACHEFS_H - -#include <linux/config.h> -#include <linux/fs.h> -#include <linux/list.h> -#include <linux/pagemap.h> - -#ifdef CONFIG_CACHEFS_MODULE -#define CONFIG_CACHEFS -#endif - -struct cachefs_cookie; -struct cachefs_netfs; -struct cachefs_netfs_operations; -struct cachefs_page; - -#define CACHEFS_NEGATIVE_COOKIE NULL - -typedef void (*cachefs_rw_complete_t)(void *cookie_data, - struct page *page, - void *data, - int error); - -/* result of index entry comparison */ -typedef enum { - /* no match */ - CACHEFS_MATCH_FAILED, - - /* successful match */ - CACHEFS_MATCH_SUCCESS, - - /* successful match, entry requires update */ - CACHEFS_MATCH_SUCCESS_UPDATE, - - /* successful match, entry requires deletion */ - CACHEFS_MATCH_SUCCESS_DELETE, -} cachefs_match_val_t; - -/*****************************************************************************/ -/* - * cachefs index definition - * - each index file contains a number of fixed size entries - * - they don't have to fit exactly into a page, but if they don't, the gap - * at the end of the page will not be used - */ -struct cachefs_index_def -{ - /* name of index */ - uint8_t name[8]; - - /* size of data to be stored in index */ - uint16_t data_size; - - /* key description (for displaying in cache mountpoint) */ - struct { - uint8_t type; - uint16_t len; - } keys[4]; - -#define CACHEFS_INDEX_KEYS_NOTUSED 0 -#define CACHEFS_INDEX_KEYS_BIN 1 -#define CACHEFS_INDEX_KEYS_ASCIIZ 2 -#define CACHEFS_INDEX_KEYS_IPV4ADDR 3 -#define CACHEFS_INDEX_KEYS_IPV6ADDR 4 -#define CACHEFS_INDEX_KEYS__LAST CACHEFS_INDEX_KEYS_IPV6ADDR - - /* see if entry matches the specified key - * - the netfs data from the cookie being used as the target is - * presented - * - entries that aren't in use will not be presented for matching - */ - cachefs_match_val_t (*match)(void *target_netfs_data, - const void *entry); - - /* update entry from key - * - the netfs data from the cookie being used as the 
source is - * presented - */ - void (*update)(void *source_netfs_data, void *entry); -}; - -#ifdef CONFIG_CACHEFS -extern struct cachefs_cookie *__cachefs_acquire_cookie(struct cachefs_cookie *iparent, - struct cachefs_index_def *idef, - void *netfs_data); - -extern void __cachefs_relinquish_cookie(struct cachefs_cookie *cookie, - int retire); - -extern void __cachefs_update_cookie(struct cachefs_cookie *cookie); -#endif - -static inline -struct cachefs_cookie *cachefs_acquire_cookie(struct cachefs_cookie *iparent, - struct cachefs_index_def *idef, - void *netfs_data) -{ -#ifdef CONFIG_CACHEFS - if (iparent != CACHEFS_NEGATIVE_COOKIE) - return __cachefs_acquire_cookie(iparent, idef, netfs_data); -#endif - return CACHEFS_NEGATIVE_COOKIE; -} - -static inline -void cachefs_relinquish_cookie(struct cachefs_cookie *cookie, - int retire) -{ -#ifdef CONFIG_CACHEFS - if (cookie != CACHEFS_NEGATIVE_COOKIE) - __cachefs_relinquish_cookie(cookie, retire); -#endif -} - -static inline -void cachefs_update_cookie(struct cachefs_cookie *cookie) -{ -#ifdef CONFIG_CACHEFS - if (cookie != CACHEFS_NEGATIVE_COOKIE) - __cachefs_update_cookie(cookie); -#endif -} - -/*****************************************************************************/ -/* - * cachefs cached network filesystem type - * - name, version and ops must be filled in before registration - * - all other fields will be set during registration - */ -struct cachefs_netfs -{ - const char *name; /* filesystem name */ - unsigned version; /* indexing version */ - struct cachefs_cookie *primary_index; - struct cachefs_netfs_operations *ops; - struct list_head link; /* internal link */ -}; - -struct cachefs_netfs_operations -{ - /* get page-to-block mapping cookie for a page - * - one should be allocated if it doesn't exist - * - returning -ENODATA will cause this page to be ignored - * - typically, the struct will be attached to page->private - */ - struct cachefs_page *(*get_page_cookie)(struct page *page); -}; - -#ifdef 
CONFIG_CACHEFS -extern int __cachefs_register_netfs(struct cachefs_netfs *netfs, - struct cachefs_index_def *primary_idef); -extern void __cachefs_unregister_netfs(struct cachefs_netfs *netfs); -#endif - -static inline -int cachefs_register_netfs(struct cachefs_netfs *netfs, - struct cachefs_index_def *primary_idef) -{ -#ifdef CONFIG_CACHEFS - return __cachefs_register_netfs(netfs, primary_idef); -#else - return 0; -#endif -} - -static inline -void cachefs_unregister_netfs(struct cachefs_netfs *netfs) -{ -#ifdef CONFIG_CACHEFS - __cachefs_unregister_netfs(netfs); -#endif -} - -/*****************************************************************************/ -/* - * page mapping cookie - * - stores the mapping of a page to a block in the cache (may also be null) - * - note that the mapping may be removed without notice if a cache is removed - */ -struct cachefs_page -{ - struct cachefs_block *mapped_block; /* block mirroring this page */ - rwlock_t lock; - - unsigned long flags; -#define CACHEFS_PAGE_BOUNDARY 0 /* next block has a different - * indirection chain */ -#define CACHEFS_PAGE_NEW 1 /* this is a newly allocated block */ -}; - -/* - * read a page from the cache or allocate a block in which to store it - * - if the cookie is not backed by a file: - * - -ENOBUFS will be returned and nothing more will be done - * - else if the page is backed by a block in the cache: - * - a read will be started which will call end_io_func on completion - * - the wb-journal will be searched for an entry pertaining to this block - * - if an entry is found: - * - 1 will be returned [not yet supported] - * else - * - 0 will be returned - * - else if the page is unbacked: - * - a block will be allocated and attached - * - the validity journal will be marked to note the block does not yet - * contain valid data - * - -ENODATA will be returned - */ -#ifdef CONFIG_CACHEFS -extern int __cachefs_read_or_alloc_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t 
end_io_func, - void *end_io_data, - unsigned long gfp); -#endif - -static inline -int cachefs_read_or_alloc_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp) -{ -#ifdef CONFIG_CACHEFS - if (cookie != CACHEFS_NEGATIVE_COOKIE) - return __cachefs_read_or_alloc_page(cookie, page, end_io_func, - end_io_data, gfp); -#endif - return -ENOBUFS; -} - -/* - * request a page be stored in the cache - * - this request may be ignored if no cache block is currently attached, in - * which case it: - * - returns -ENOBUFS - * - if a cache block was already allocated: - * - the page cookie will be updated to reflect the block selected - * - a BIO will be dispatched to write the page (end_io_func will be called - * from the completion function) - * - end_io_func can be NULL, in which case a default function will just - * clear the writeback bit on the page - * - any associated validity journal entry will be cleared - * - returns 0 - */ -#ifdef CONFIG_CACHEFS -extern int __cachefs_write_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp); -#endif - -static inline -int cachefs_write_page(struct cachefs_cookie *cookie, - struct page *page, - cachefs_rw_complete_t end_io_func, - void *end_io_data, - unsigned long gfp) -{ -#ifdef CONFIG_CACHEFS - if (cookie != CACHEFS_NEGATIVE_COOKIE) - return __cachefs_write_page(cookie, page, end_io_func, - end_io_data, gfp); -#endif - return -ENOBUFS; -} - -/* - * indicate that caching is no longer required on a page - * - note: cannot cancel any outstanding BIOs between this page and the cache - */ -#ifdef CONFIG_CACHEFS -extern void __cachefs_uncache_page(struct cachefs_cookie *cookie, - struct page *page); -#endif - -static inline -void cachefs_uncache_page(struct cachefs_cookie *cookie, - struct page *page) -{ -#ifdef CONFIG_CACHEFS - __cachefs_uncache_page(cookie, page); -#endif -} 
- -/* - * keep track of pages changed locally but not yet committed - */ -#if 0 /* TODO */ -extern void cachefs_writeback_prepare(struct cachefs_cookie *cookie, - struct page *page, - unsigned short from, - unsigned short to); - -extern void cachefs_writeback_committed(struct cachefs_cookie *cookie, - struct page *page, - unsigned short from, - unsigned short to); - -extern void cachefs_writeback_aborted(struct cachefs_cookie *cookie, - struct page *page, - unsigned short from, - unsigned short to); -#endif - -/* - * convenience routines for mapping page->private directly to a struct - * cachefs_page - */ -static inline -struct cachefs_page *__cachefs_page_grab_private(struct page *page) -{ - return (struct cachefs_page *) (PagePrivate(page) ? page->private : 0); -} - -#define cachefs_page_grab_private(X) \ -({ \ - BUG_ON(!PagePrivate(X)); \ - __cachefs_page_grab_private(X); \ -}) - - -#ifdef CONFIG_CACHEFS -extern struct cachefs_page *__cachefs_page_get_private(struct page *page, - unsigned gfp); -#endif - -static inline -struct cachefs_page *cachefs_page_get_private(struct page *page, - unsigned gfp) -{ -#ifdef CONFIG_CACHEFS - return __cachefs_page_get_private(page, gfp); -#else - return ERR_PTR(-EIO); -#endif -} - -#endif /* _LINUX_CACHEFS_H */ diff -uNr linux-2.6.9-rc2-mm4/include/linux/fscache-cache.h linux-2.6.9-rc2-mm4-fscache/include/linux/fscache-cache.h --- linux-2.6.9-rc2-mm4/include/linux/fscache-cache.h 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/include/linux/fscache-cache.h 2004-10-04 17:23:28.357675703 +0100 @@ -0,0 +1,205 @@ +/* fscache-cache.h: general filesystem caching backing cache interface + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifndef _LINUX_FSCACHE_CACHE_H +#define _LINUX_FSCACHE_CACHE_H + +#include <linux/fscache.h> + +struct fscache_cache; +struct fscache_cache_ops; +struct fscache_node; +struct fscache_search_result; + +struct fscache_search_result { + struct list_head link; /* link in search_results */ + struct fscache_cache *cache; /* cache searched */ + unsigned ino; /* node ID (or 0 if negative) */ +}; + +struct fscache_cache { + struct fscache_cache_ops *ops; + struct list_head link; /* link in list of caches */ + size_t max_index_size; /* maximum size of index data */ + unsigned long flags; +#define FSCACHE_CACHE_WITHDRAWN 0 /* T if cache has been withdrawn */ + + char identifier[32]; /* cache label */ + + /* node management */ + struct list_head node_list; /* list of data/index nodes */ + spinlock_t node_list_lock; + struct fscache_search_result fsdef_srch; /* search result for the fsdef index */ +}; + +extern void fscache_init_cache(struct fscache_cache *cache, + struct fscache_cache_ops *ops, + unsigned fsdef_ino, + const char *idfmt, + ...) 
__attribute__ ((format (printf,4,5))); + +extern void fscache_add_cache(struct fscache_cache *cache); +extern void fscache_withdraw_cache(struct fscache_cache *cache); + +/* see if a cache has been withdrawn */ +static inline int fscache_is_cache_withdrawn(struct fscache_cache *cache) +{ + return test_bit(FSCACHE_CACHE_WITHDRAWN, &cache->flags); +} + +/*****************************************************************************/ +/* + * cache operations + */ +struct fscache_cache_ops { + /* name of cache provider */ + const char *name; + + /* look up the nominated node for this cache */ + struct fscache_node *(*lookup_node)(struct fscache_cache *cache, unsigned ino); + + /* increment the usage count on this inode (may fail if unmounting) */ + struct fscache_node *(*grab_node)(struct fscache_node *node); + + /* lock a semaphore on a node */ + void (*lock_node)(struct fscache_node *node); + + /* unlock a semaphore on a node */ + void (*unlock_node)(struct fscache_node *node); + + /* dispose of a reference to a node */ + void (*put_node)(struct fscache_node *node); + + /* search an index for an inode to back a cookie + * - the "inode number" should be set in result->ino + */ + int (*index_search)(struct fscache_node *node, struct fscache_cookie *cookie, + struct fscache_search_result *result); + + /* create a new file or inode, with an entry in the named index + * - the "inode number" should be set in result->ino + */ + int (*index_add)(struct fscache_node *node, struct fscache_cookie *cookie, + struct fscache_search_result *result); + + /* update the index entry for a node + * - the netfs's update operation should be called + */ + int (*index_update)(struct fscache_node *ixnode, + struct fscache_node *node); + + /* sync a cache */ + void (*sync)(struct fscache_cache *cache); + + /* dissociate a cache from all the pages it was backing */ + void (*dissociate_pages)(struct fscache_cache *cache); + + /* request a backing block for a page be read or allocated in the + * 
cache */ + int (*read_or_alloc_page)(struct fscache_node *node, + struct page *page, + struct fscache_page *pageio, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp); + + /* write a page to its backing block in the cache */ + int (*write_page)(struct fscache_node *node, + struct page *page, + struct fscache_page *pageio, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp); + + /* detach a backing block from a page */ + void (*uncache_page)(struct fscache_node *node, + struct fscache_page *pageio); +}; + +/*****************************************************************************/ +/* + * data file or index object cookie + * - a file will only appear in one cache + * - a request to cache a file may or may not be honoured, subject to + * constraints such as disc space + * - index files are created on disc just-in-time + */ +struct fscache_cookie { + atomic_t usage; /* number of users of this cookie */ + atomic_t children; /* number of children of this cookie */ + rwlock_t lock; /* list access lock */ + struct rw_semaphore sem; /* list creation vs scan lock */ + struct list_head search_results; /* results of searching iparent */ + struct list_head backing_nodes; /* node(s) backing this file/index */ + struct fscache_index_def *idef; /* index definition */ + struct fscache_cookie *iparent; /* index holding this entry */ + struct fscache_netfs *netfs; /* owner network fs definition */ + void *netfs_data; /* back pointer to netfs */ +}; + +extern struct fscache_cookie fscache_fsdef_index; + +/*****************************************************************************/ +/* + * on-disc cache file or index handle + */ +struct fscache_node { + unsigned long flags; +#define FSCACHE_NODE_ISINDEX 0 /* T if inode is index file (F if file) */ +#define FSCACHE_NODE_RELEASING 1 /* T if inode is being released */ +#define FSCACHE_NODE_RECYCLING 2 /* T if inode is being retired */ +#define FSCACHE_NODE_WITHDRAWN 3 /* T if
inode has been withdrawn */ + + struct list_head cache_link; /* link in cache->node_list */ + struct list_head cookie_link; /* link in cookie->backing_nodes */ + struct fscache_cache *cache; /* cache that supplied this node */ + struct fscache_cookie *cookie; /* netfs's file/index object */ +}; + +static inline +void fscache_node_init(struct fscache_node *node) +{ + node->flags = 0; + INIT_LIST_HEAD(&node->cache_link); + INIT_LIST_HEAD(&node->cookie_link); + node->cache = NULL; + node->cookie = NULL; +} + +/* find the parent index node for a node */ +static inline +struct fscache_node *fscache_find_parent_node(struct fscache_node *node) +{ + struct fscache_cookie *cookie = node->cookie; + struct fscache_cache *cache = node->cache; + struct fscache_node *parent; + + list_for_each_entry(parent, + &cookie->iparent->backing_nodes, + cookie_link + ) { + if (parent->cache == cache) + return parent; + } + + return NULL; +} + +/*****************************************************************************/ +/* + * definition of the contents of an FSDEF index entry + */ +struct fscache_fsdef_index_entry { + uint8_t name[24]; /* name of netfs */ + uint32_t version; /* version of layout */ +}; + +#endif /* _LINUX_FSCACHE_CACHE_H */ diff -uNr linux-2.6.9-rc2-mm4/include/linux/fscache.h linux-2.6.9-rc2-mm4-fscache/include/linux/fscache.h --- linux-2.6.9-rc2-mm4/include/linux/fscache.h 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/include/linux/fscache.h 2004-09-30 17:27:30.000000000 +0100 @@ -0,0 +1,351 @@ +/* fscache.h: general filesystem caching interface + * + * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@xxxxxxxxxx) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#ifndef _LINUX_FSCACHE_H +#define _LINUX_FSCACHE_H + +#include <linux/config.h> +#include <linux/fs.h> +#include <linux/list.h> +#include <linux/pagemap.h> + +#ifdef CONFIG_FSCACHE_MODULE +#define CONFIG_FSCACHE +#endif + +struct fscache_cookie; +struct fscache_netfs; +struct fscache_netfs_operations; +struct fscache_page; + +#define FSCACHE_NEGATIVE_COOKIE NULL + +typedef void (*fscache_rw_complete_t)(void *cookie_data, + struct page *page, + void *data, + int error); + +/* result of index entry comparison */ +typedef enum { + /* no match */ + FSCACHE_MATCH_FAILED, + + /* successful match */ + FSCACHE_MATCH_SUCCESS, + + /* successful match, entry requires update */ + FSCACHE_MATCH_SUCCESS_UPDATE, + + /* successful match, entry requires deletion */ + FSCACHE_MATCH_SUCCESS_DELETE, +} fscache_match_val_t; + +/*****************************************************************************/ +/* + * fscache index definition + * - each index file contains a number of fixed size entries + * - they don't have to fit exactly into a page, but if they don't, the gap + * at the end of the page will not be used + */ +struct fscache_index_def +{ + /* name of index */ + uint8_t name[8]; + + /* size of data to be stored in index */ + uint16_t data_size; + + /* key description (for displaying in cache mountpoint) */ + struct { + uint8_t type; + uint16_t len; + } keys[4]; + +#define FSCACHE_INDEX_KEYS_NOTUSED 0 +#define FSCACHE_INDEX_KEYS_BIN 1 +#define FSCACHE_INDEX_KEYS_ASCIIZ 2 +#define FSCACHE_INDEX_KEYS_IPV4ADDR 3 +#define FSCACHE_INDEX_KEYS_IPV6ADDR 4 +#define FSCACHE_INDEX_KEYS__LAST FSCACHE_INDEX_KEYS_IPV6ADDR + + /* see if entry matches the specified key + * - the netfs data from the cookie being used as the target is + * presented + * - entries that aren't in use will not be presented for matching + */ + fscache_match_val_t (*match)(void *target_netfs_data, + const void *entry); + + /* update entry from key + * - the netfs data from the cookie being used as the 
source is + * presented + */ + void (*update)(void *source_netfs_data, void *entry); +}; + +#ifdef CONFIG_FSCACHE +extern struct fscache_cookie *__fscache_acquire_cookie(struct fscache_cookie *iparent, + struct fscache_index_def *idef, + void *netfs_data); + +extern void __fscache_relinquish_cookie(struct fscache_cookie *cookie, + int retire); + +extern void __fscache_update_cookie(struct fscache_cookie *cookie); +#endif + +static inline +struct fscache_cookie *fscache_acquire_cookie(struct fscache_cookie *iparent, + struct fscache_index_def *idef, + void *netfs_data) +{ +#ifdef CONFIG_FSCACHE + if (iparent != FSCACHE_NEGATIVE_COOKIE) + return __fscache_acquire_cookie(iparent, idef, netfs_data); +#endif + return FSCACHE_NEGATIVE_COOKIE; +} + +static inline +void fscache_relinquish_cookie(struct fscache_cookie *cookie, + int retire) +{ +#ifdef CONFIG_FSCACHE + if (cookie != FSCACHE_NEGATIVE_COOKIE) + __fscache_relinquish_cookie(cookie, retire); +#endif +} + +static inline +void fscache_update_cookie(struct fscache_cookie *cookie) +{ +#ifdef CONFIG_FSCACHE + if (cookie != FSCACHE_NEGATIVE_COOKIE) + __fscache_update_cookie(cookie); +#endif +} + +/*****************************************************************************/ +/* + * fscache cached network filesystem type + * - name, version and ops must be filled in before registration + * - all other fields will be set during registration + */ +struct fscache_netfs +{ + const char *name; /* filesystem name */ + unsigned version; /* indexing version */ + struct fscache_cookie *primary_index; + struct fscache_netfs_operations *ops; + struct list_head link; /* internal link */ +}; + +struct fscache_netfs_operations +{ + /* get page-to-block mapping token for a page + * - one should be allocated if it doesn't exist + * - returning -ENODATA will cause this page to be ignored + * - typically, the struct will be attached to page->private + */ + struct fscache_page *(*get_page_token)(struct page *page); +}; + +#ifdef 
CONFIG_FSCACHE +extern int __fscache_register_netfs(struct fscache_netfs *netfs, + struct fscache_index_def *primary_idef); +extern void __fscache_unregister_netfs(struct fscache_netfs *netfs); +#endif + +static inline +int fscache_register_netfs(struct fscache_netfs *netfs, + struct fscache_index_def *primary_idef) +{ +#ifdef CONFIG_FSCACHE + return __fscache_register_netfs(netfs, primary_idef); +#else + return 0; +#endif +} + +static inline +void fscache_unregister_netfs(struct fscache_netfs *netfs) +{ +#ifdef CONFIG_FSCACHE + __fscache_unregister_netfs(netfs); +#endif +} + +/*****************************************************************************/ +/* + * page mapping cookie + * - stores the mapping of a page to a block in the cache (may also be null) + * - note that the mapping may be removed without notice if a cache is removed + */ +struct fscache_page +{ + void *mapped_block; /* block mirroring this page */ + rwlock_t lock; + + unsigned long flags; +#define FSCACHE_PAGE_BOUNDARY 0 /* next block has a different + * indirection chain */ +#define FSCACHE_PAGE_NEW 1 /* this is a newly allocated block */ +}; + +/* + * read a page from the cache or allocate a block in which to store it + * - if the cookie is not backed by a file: + * - -ENOBUFS will be returned and nothing more will be done + * - else if the page is backed by a block in the cache: + * - a read will be started which will call end_io_func on completion + * - the wb-journal will be searched for an entry pertaining to this block + * - if an entry is found: + * - 1 will be returned [not yet supported] + * else + * - 0 will be returned + * - else if the page is unbacked: + * - a block will be allocated and attached + * - the validity journal will be marked to note the block does not yet + * contain valid data + * - -ENODATA will be returned + */ +#ifdef CONFIG_FSCACHE +extern int __fscache_read_or_alloc_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + 
void *end_io_data, + unsigned long gfp); +#endif + +static inline +int fscache_read_or_alloc_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) +{ +#ifdef CONFIG_FSCACHE + if (cookie != FSCACHE_NEGATIVE_COOKIE) + return __fscache_read_or_alloc_page(cookie, page, end_io_func, + end_io_data, gfp); +#endif + return -ENOBUFS; +} + +/* + * request a page be stored in the cache + * - this request may be ignored if no cache block is currently attached, in + * which case it: + * - returns -ENOBUFS + * - if a cache block was already allocated: + * - the page cookie will be updated to reflect the block selected + * - a BIO will be dispatched to write the page (end_io_func will be called + * from the completion function) + * - end_io_func can be NULL, in which case a default function will just + * clear the writeback bit on the page + * - any associated validity journal entry will be cleared + * - returns 0 + */ +#ifdef CONFIG_FSCACHE +extern int __fscache_write_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp); +#endif + +static inline +int fscache_write_page(struct fscache_cookie *cookie, + struct page *page, + fscache_rw_complete_t end_io_func, + void *end_io_data, + unsigned long gfp) +{ +#ifdef CONFIG_FSCACHE + if (cookie != FSCACHE_NEGATIVE_COOKIE) + return __fscache_write_page(cookie, page, end_io_func, + end_io_data, gfp); +#endif + return -ENOBUFS; +} + +/* + * indicate that caching is no longer required on a page + * - note: cannot cancel any outstanding BIOs between this page and the cache + */ +#ifdef CONFIG_FSCACHE +extern void __fscache_uncache_page(struct fscache_cookie *cookie, + struct page *page); +#endif + +static inline +void fscache_uncache_page(struct fscache_cookie *cookie, + struct page *page) +{ +#ifdef CONFIG_FSCACHE + __fscache_uncache_page(cookie, page); +#endif +} + +/* + * keep 
track of pages changed locally but not yet committed + */ +#if 0 /* TODO */ +extern void fscache_writeback_prepare(struct fscache_cookie *cookie, + struct page *page, + unsigned short from, + unsigned short to); + +extern void fscache_writeback_committed(struct fscache_cookie *cookie, + struct page *page, + unsigned short from, + unsigned short to); + +extern void fscache_writeback_aborted(struct fscache_cookie *cookie, + struct page *page, + unsigned short from, + unsigned short to); +#endif + +/* + * convenience routines for mapping page->private directly to a struct + * fscache_page + */ +static inline +struct fscache_page *__fscache_page_grab_private(struct page *page) +{ + return (struct fscache_page *) (PagePrivate(page) ? page->private : 0); +} + +#define fscache_page_grab_private(X) \ +({ \ + BUG_ON(!PagePrivate(X)); \ + __fscache_page_grab_private(X); \ +}) + + +#ifdef CONFIG_FSCACHE +extern struct fscache_page *__fscache_page_get_private(struct page *page, + unsigned gfp); +#endif + +static inline +struct fscache_page *fscache_page_get_private(struct page *page, + unsigned gfp) +{ +#ifdef CONFIG_FSCACHE + return __fscache_page_get_private(page, gfp); +#else + return ERR_PTR(-EIO); +#endif +} + +#endif /* _LINUX_FSCACHE_H */ diff -uNr linux-2.6.9-rc2-mm4/include/linux/nfs_fs.h linux-2.6.9-rc2-mm4-fscache/include/linux/nfs_fs.h --- linux-2.6.9-rc2-mm4/include/linux/nfs_fs.h 2004-09-16 12:08:13.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/include/linux/nfs_fs.h 2004-09-30 13:43:00.000000000 +0100 @@ -30,6 +30,7 @@ #include <linux/nfs_xdr.h> #include <linux/rwsem.h> #include <linux/workqueue.h> +#include <linux/fscache.h> /* * Enable debugging support for nfs client. 
@@ -189,6 +190,10 @@ struct rw_semaphore rwsem; #endif /* CONFIG_NFS_V4*/ +#ifdef CONFIG_NFS_FSCACHE + struct fscache_cookie *fscache; +#endif + struct inode vfs_inode; }; diff -uNr linux-2.6.9-rc2-mm4/include/linux/nfs_fs_sb.h linux-2.6.9-rc2-mm4-fscache/include/linux/nfs_fs_sb.h --- linux-2.6.9-rc2-mm4/include/linux/nfs_fs_sb.h 2004-09-16 12:08:13.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/include/linux/nfs_fs_sb.h 2004-09-30 13:43:12.000000000 +0100 @@ -3,6 +3,7 @@ #include <linux/list.h> #include <linux/backing-dev.h> +#include <linux/fscache.h> /* * NFS client parameters stored in the superblock. @@ -46,6 +47,10 @@ that are supported on this filesystem */ #endif + +#ifdef CONFIG_NFS_FSCACHE + struct fscache_cookie *fscache; /* cache cookie */ +#endif }; /* Server capabilities */ diff -uNr linux-2.6.9-rc2-mm4/include/linux/nfs_mount.h linux-2.6.9-rc2-mm4-fscache/include/linux/nfs_mount.h --- linux-2.6.9-rc2-mm4/include/linux/nfs_mount.h 2004-06-18 13:42:15.000000000 +0100 +++ linux-2.6.9-rc2-mm4-fscache/include/linux/nfs_mount.h 2004-09-30 18:48:28.000000000 +0100 @@ -60,6 +60,7 @@ #define NFS_MOUNT_BROKEN_SUID 0x0400 /* 4 */ #define NFS_MOUNT_STRICTLOCK 0x1000 /* reserved for NFSv4 */ #define NFS_MOUNT_SECFLAVOUR 0x2000 /* 5 */ +#define NFS_MOUNT_FSCACHE NFS_MOUNT_POSIX #define NFS_MOUNT_FLAGMASK 0xFFFF #endif --Multipart_Mon_Oct__4_17:32:25_2004-1--
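
The negative-cookie convention in include/linux/fscache.h may be easier to follow in isolation, so here is a rough userspace sketch of it. It is an illustration only and not part of the patch: every mock_* name is hypothetical, the "backend" is a stub that merely distinguishes a page that already has a block from one that does not, and the three netfs actions are an assumption about how a caller might react to the documented return values of fscache_read_or_alloc_page() (0, -ENODATA, -ENOBUFS).

```c
/* Userspace mock of the inline-wrapper pattern used in fscache.h.
 * All mock_* names are hypothetical and exist only for this sketch. */
#include <errno.h>
#include <stddef.h>

/* stand-in for struct fscache_cookie; block_present models whether the
 * requested page already has a backing block in the cache */
struct mock_cookie {
	int block_present;
};

#define MOCK_NEGATIVE_COOKIE NULL	/* mirrors FSCACHE_NEGATIVE_COOKIE */

/* stand-in for __fscache_read_or_alloc_page():
 * - 0 means a read from the cache was started
 * - -ENODATA means a block was allocated but holds no valid data yet */
static int mock_backend_read_or_alloc_page(struct mock_cookie *cookie)
{
	return cookie->block_present ? 0 : -ENODATA;
}

/* mirrors the inline fscache_read_or_alloc_page(): a negative cookie never
 * reaches the cache, the caller simply gets -ENOBUFS back */
static inline int mock_read_or_alloc_page(struct mock_cookie *cookie)
{
	if (cookie != MOCK_NEGATIVE_COOKIE)
		return mock_backend_read_or_alloc_page(cookie);
	return -ENOBUFS;
}

/* what a netfs readpage path might decide to do (assumed, not from the patch) */
enum mock_action {
	MOCK_WAIT_FOR_CACHE,		/* cache read in flight */
	MOCK_FETCH_THEN_STORE,		/* read from server, then write to cache */
	MOCK_FETCH_ONLY,		/* no cache available for this page */
};

static enum mock_action mock_netfs_readpage(struct mock_cookie *cookie)
{
	switch (mock_read_or_alloc_page(cookie)) {
	case 0:
		return MOCK_WAIT_FOR_CACHE;
	case -ENODATA:
		return MOCK_FETCH_THEN_STORE;
	default:
		return MOCK_FETCH_ONLY;
	}
}
```

The point of the pattern is that the netfs never has to test whether caching is configured or whether it actually holds a cookie: a NULL cookie falls through to -ENOBUFS and the netfs just goes to the server as it would without a cache.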