Hi Jan,

Thanks for the excellent reply. It refreshed my concepts.
I knew about these generic_* functions; in fact, I have implemented
readpage, prepare_write and commit_write for a network file system.

The main confusion arises when locking comes into the picture. We can't
rely on cached pages on the current machine that are not under a lock,
because some other client might modify those pages. All of this makes me
think we need our own explicit caching, at least in large cluster file
systems, although I don't know how good or bad that would be.

If somebody knows about Lustre caching, I would be really pleased to
hear comments on it.

I was not clear in my previous question, sorry for that.

Thanks,
Prasanna

--- Jan Hudec <bulb@xxxxxx> wrote:

> On Thu, Jan 13, 2005 at 23:20:31 -0800, prasanna wakhare wrote:
> > Hi all,
> > I have a simple question in mind. I want to introduce client-side
> > caching in a network or cluster file system. The caches are
> > in-memory caches, not secondary caches as in AFS, but as in Sprite.
> > My question: all of the cache is in kernel space. If a large file,
> > say 2 GB, is mmapped or read/written, what would be the consequence
> > if the limit on my cache is large?
> > We know kernel-space virtual addresses run from 3 GB to 4 GB. If I
> > start allocating cache blocks as soon as I get the file from the
> > storage node, and my cache size is 5 GB, then this will hang the
> > system, as the kernel can't have that many addresses to refer to.
> > In that case we need either a CPU with 64-bit addressing or
> > something like that. But even then there is a limit on caching the
> > file data.
> > So how large should I make my cache to get the best performance?
> > I hope I'm pretty clear in what I asked?
>
> You are clear... in that you have never heard of the page cache.
>
> The page cache is handled in the generic_file_read and
> generic_file_write (and generic_file_mmap or somesuch IIRC) functions,
> if you use them for the appropriate methods in file_operations.
>
> These functions require you to implement the address_space_operations
> methods, namely readpage, writepage, prepare_write and commit_write.
>
> The file is never downloaded all at once. It is downloaded page by
> page on demand (with some readahead). So only the data actually
> requested are downloaded, and they are dropped when they have not been
> accessed for some time.
>
> The kernel uses all pages it does not need for anything else for
> caching files. In that, it considers all pages equal. No matter where
> they came from (except that it of course affects how they are written
> out), when a page is needed, the kernel chooses some page that has not
> been accessed for some period of time and drops it (writing out dirty
> data in the process). A small number of pages is kept really free so
> that allocation does not need to wait for the write-out.
>
> And the key point in all the above is that it's not your business to
> manage the cache. Leave it to the VM subsystem -- it does a better job.
>
> Btw: The kernel address space is 1G large on 32-bit platforms, but
> that is not a limit on anything -- pages (for the page cache) are
> addressed using the high memory mechanism if there are not enough
> addresses.
>
> -------------------------------------------------------------------------------
> Jan 'Bulb' Hudec <bulb@xxxxxx>

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/
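
To illustrate the behaviour Jan describes (pages fetched one at a time on
demand, the least recently used page dropped when room is needed, dirty
data written out first), here is a minimal user-space sketch in C. It is a
toy model and not kernel code: the names toy_cache, toy_get_page,
demo_readpage and demo_writepage are made up for illustration, the
4096-byte page size and 4-slot capacity are assumptions, and the real VM
uses far more elaborate reclaim than this simple LRU scan.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SZ   4096  /* assumed page size */
#define CACHE_CAP 4     /* deliberately tiny so eviction is easy to trigger */

/* Hypothetical callbacks standing in for a filesystem's readpage/writepage. */
typedef void (*readpage_fn)(long index, char *buf);
typedef void (*writepage_fn)(long index, const char *buf);

struct toy_page {
    long index;              /* page number within the file, -1 if unused */
    unsigned long last_use;  /* logical timestamp of the last access */
    int dirty;               /* modified since it was fetched? */
    char data[PAGE_SZ];
};

struct toy_cache {
    struct toy_page slot[CACHE_CAP];
    unsigned long clock;       /* increases on every access */
    unsigned long fetches;     /* how often readpage was called */
    unsigned long writebacks;  /* how often writepage was called */
    readpage_fn readpage;
    writepage_fn writepage;
};

void toy_cache_init(struct toy_cache *c, readpage_fn rd, writepage_fn wr)
{
    memset(c, 0, sizeof(*c));
    for (int i = 0; i < CACHE_CAP; i++)
        c->slot[i].index = -1;
    c->readpage = rd;
    c->writepage = wr;
}

/* Return the cached page for `index`, fetching it on a miss. */
struct toy_page *toy_get_page(struct toy_cache *c, long index)
{
    struct toy_page *victim = &c->slot[0];

    for (int i = 0; i < CACHE_CAP; i++) {
        struct toy_page *p = &c->slot[i];
        if (p->index == index) {          /* hit: just refresh the timestamp */
            p->last_use = ++c->clock;
            return p;
        }
        if (p->last_use < victim->last_use)
            victim = p;                   /* least recently used slot so far */
    }

    /* Miss: write the victim out if it is dirty, then reuse its slot. */
    if (victim->index >= 0 && victim->dirty) {
        c->writepage(victim->index, victim->data);
        c->writebacks++;
    }
    c->readpage(index, victim->data);
    c->fetches++;
    victim->index = index;
    victim->dirty = 0;
    victim->last_use = ++c->clock;
    return victim;
}

char toy_read_byte(struct toy_cache *c, long offset)
{
    assert(offset >= 0);
    return toy_get_page(c, offset / PAGE_SZ)->data[offset % PAGE_SZ];
}

void toy_write_byte(struct toy_cache *c, long offset, char value)
{
    assert(offset >= 0);
    struct toy_page *p = toy_get_page(c, offset / PAGE_SZ);
    p->data[offset % PAGE_SZ] = value;
    p->dirty = 1;
}

/* Demo backing store: page N is filled with the byte value N. */
void demo_readpage(long index, char *buf) { memset(buf, (int)index, PAGE_SZ); }
void demo_writepage(long index, const char *buf) { (void)index; (void)buf; }
```

With a 4-slot cache, touching a fifth distinct page evicts whichever
cached page was used least recently, and a page changed through
toy_write_byte is handed to the writepage callback before its slot is
reused. The cache never has to be as large as the file, which is the
point of Jan's answer.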