Hi,

On Sun, 22 Jan 2006, Jan Koss wrote:
> > (They access the block device directly, completely bypassing the
> > page cache so you are breaking cache coherency and are 100% broken
> > by design.)
>
> Oh... I thought that starting from 2.4.x there are no separate
> implementations for working with blocks and pages: when you read a
> block, the kernel reads the whole page. Am I wrong?

There is a very big difference.

If you do sb_bread() you are reading a block from the block device. And
yes, this block is attached to a page, but it is a page belonging to the
block device address space mapping. You cannot do anything to this block
other than read/write it.

If you use the page cache to access the contents of a file, then that
file (or more precisely the inode of that file) will have an address
space mapping of its own, completely independent of the address space
mapping of the block device inode. Those pages will (or will not) have
buffers attached to them (your get_block() callback is there exactly to
allow the buffers to be created and mapped if they are not there). Those
buffers will be part of the file page cache page, thus part of the
inode's address space mapping, and those buffers have no meaning other
than to say "the data in this part of the page belongs to block device
so and so and to block number so and so on that block device". So you
can change the b_blocknr on those buffers to your heart's content (well,
you need to observe the necessary locking so buffers under i/o don't get
screwed) and that is no problem.

Note that the buffers from the block device address space mapping are
COMPLETELY separate from the buffers from a file inode address space
mapping. So writes to one are NOT seen in the other and you can NEVER
mix the two forms of i/o and expect to have a working file system. You
will get random results and tons of weird data corruption that way.
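
To make the incoherence described above concrete, here is a toy
userspace model in C. It is not kernel code and does not use any kernel
API: toy_disk, toy_buffer, toy_mapping, toy_read(), toy_write(),
toy_sync() and toy_demo() are all hypothetical names invented for this
sketch. It assumes a much simplified world in which each "mapping" keeps
its own private cached copy of every block, which is roughly the
situation with the block device mapping versus a file's mapping.

```c
#include <assert.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16
#define TOY_NR_BLOCKS  4

/* The "disk": the only state that two mappings ever share. */
static char toy_disk[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];

/* One cached copy of one disk block, owned by exactly one mapping. */
struct toy_buffer {
    int  uptodate;                  /* copy has been filled from disk */
    int  dirty;                     /* copy differs from disk */
    char data[TOY_BLOCK_SIZE];
};

/* A toy "address space": a private cache, indexed by disk block. */
struct toy_mapping {
    struct toy_buffer bufs[TOY_NR_BLOCKS];
};

/* Return this mapping's copy of a block, reading the disk only once. */
struct toy_buffer *toy_read(struct toy_mapping *m, int blocknr)
{
    struct toy_buffer *b = &m->bufs[blocknr];

    if (!b->uptodate) {
        memcpy(b->data, toy_disk[blocknr], TOY_BLOCK_SIZE);
        b->uptodate = 1;
    }
    return b;
}

/* Modify this mapping's copy only; other mappings are not told. */
void toy_write(struct toy_mapping *m, int blocknr, const char *s)
{
    struct toy_buffer *b = toy_read(m, blocknr);

    strncpy(b->data, s, TOY_BLOCK_SIZE - 1);
    b->data[TOY_BLOCK_SIZE - 1] = '\0';
    b->dirty = 1;
}

/* Write this mapping's dirty copies back to the disk. */
void toy_sync(struct toy_mapping *m)
{
    for (int i = 0; i < TOY_NR_BLOCKS; i++) {
        struct toy_buffer *b = &m->bufs[i];

        if (b->dirty) {
            memcpy(toy_disk[i], b->data, TOY_BLOCK_SIZE);
            b->dirty = 0;
        }
    }
}

/* Returns 1 if a stale read was seen AND the file's update was lost. */
int toy_demo(void)
{
    struct toy_mapping bdev = {0}, file = {0};
    int stale, lost;

    memset(toy_disk, 0, sizeof(toy_disk));
    strcpy(toy_disk[2], "old");

    toy_read(&bdev, 2);             /* "sb_bread"-style access caches "old" */
    toy_write(&file, 2, "new");     /* "page cache"-style write to same block */
    stale = strcmp(toy_read(&bdev, 2)->data, "old") == 0;

    toy_sync(&file);                /* disk now holds "new" ... */
    toy_write(&bdev, 2, "clobber"); /* ... until the other cache is dirtied */
    toy_sync(&bdev);                /* and written back over it */
    lost = strcmp(toy_disk[2], "new") != 0;

    return stale && lost;
}
```

Calling toy_demo() from a main() returns 1 in this model: the read
through the second mapping still sees the old contents, and the later
writeback silently overwrites the file's update. The real kernel is far
more involved (locking, writeback ordering, buffers sharing pages), so
treat this only as an illustration of why the two access paths must not
be mixed for the same on-disk block.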

> > The only way to help you
> > is to see your whole file system code
>
> If we need some handhold for discussion, let's talk about minix v.1
> (my file system derives from this code).
> Let's suppose I want to make the block allocation algorithm in
> fs/minix/bitmap.c: minix_new_block more intelligent.
>
> I should say that the minix code uses sb_bread/brelse and works with
> pages (for example fs/minix/dir.c).

Er, not on current kernels:

$ grep bread linux-2.6/fs/minix/*
bitmap.c:       *bh = sb_bread(sb, block);
bitmap.c:       *bh = sb_bread(sb, block);
inode.c:        if (!(bh = sb_bread(s, 1)))
inode.c:        if (!(sbi->s_imap[i]=sb_bread(s, block)))
inode.c:        if (!(sbi->s_zmap[i]=sb_bread(s, block)))
itree_common.c: bh = sb_bread(sb, block_to_cpu(p->key));
itree_common.c: bh = sb_bread(inode->i_sb, nr);

Are you working on 2.4 by any chance? If you are writing a new fs I
would strongly recommend that you work on 2.6 kernels, otherwise you are
writing something that is already out of date...

The only thing minix in the current 2.6 kernel uses bread for is to read
the on-disk inodes themselves. It never uses it to access file data at
all, and I very much doubt that even old 2.4 kernels ever use bread for
anything that is not strictly metadata rather than file data.

> So instead of allocating one additional block,
> I want to "realloc" blocks, so the whole file will occupy several
> consecutive blocks.
>
> And we stop on such code
>     bh->b_blocknr = newblk;
>     unmap_underlying_metadata(bh->b_bdev, bh->b_blocknr);
>     mark_buffer_dirty(bh);
>
> And the question is how should I get this _bh_, if I cannot use
> sb_bread?

That depends entirely on which function / which call path you are in at
present. Taking minix as an example, tell me the call path where you end
up wanting to do the above and I will tell you where to get the bh
from... (-:

Btw. don't think this is all that easy.
If you want to keep whole files rather than whole pages of buffers in
consecutive blocks you are in for some very serious fun with multi-page
locking and/or complete i/o serialisation, i.e. when a write is
happening all other writes on the same file will just block...

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/