Re: [rfc][patch] unlock_page speedup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Fri, 19 Dec 2008, Linus Torvalds wrote:
> 
> Hmm. Do we ever use lock_page() on anything but page-cache pages and the 
> buffer cache?

Looking closer, I don't think we do. 

The issue with using the low bits in page->mapping is that sometimes that 
field doesn't exist, and we use other members of that union:

 - spinlock_t ptr	 	- page table pages
 - struct kmem_cache *slab	- slab allocations
 - struct page *first_page	- compound tail pages

but I cannot see how lock_page() could ever be valid for any of them. We 
use lock_page() for things that we do IO on, or that are mapped into user 
space. And while we can map compound pages into user space, we'd better 
not be locking random parts of it - we have to lock the whole thing (ie 
the first one, not the tails).

And we even have a way to verify it - we can make lock_page() verify the 
page flags for at least things like "not a slab page or a compound tail". 
I guess we don't mark page table pages any special way, so I don't see how 
we can add an assert for that use, but verifying that we never do 
lock_page() on a page table page should be trivial.

So it should work.

That said, I did notice a problem. Namely that while the VM code is good 
about looking at ->mapping (because it doesn't know whether the page is 
anonymous or a true mapping), much of the filesystem code is _not_ careful 
about page->mapping, since the filesystem code knows a-priori that the 
mapping pointer must be an inode mapping (or we'd not have called it).

So filesystems do tend to do things like

	struct inode *inode = page->mapping->host;

and while the low bit of mapping is magic, those code-paths don't care 
because they depend on it being zero.

So hiding the lock bit there would involve a lot more work than I naïvely 
expected before looking closer. We'd have to change the name (to 
"_mapping", presumably), and make all users use an accessor function to 
make code like the above do

	struct inode *inode = page_mapping(page)->host;

or something (we migth want to have a "page_host_inode()" helper to do it, 
it seems to be the most common reason for accessing "->mapping" that 
there is.

So it could be done pretty mechanically, but it's still a _big_ change. 
Maybe not worth it, unless we can really translate it into some other 
advantage (ie real simplification of page flag access)

		Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux