On Thu, Jan 15, 2009 at 10:49 PM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
>
> I don't think the above paragraph is an issue with re-org as currently
> designed. Neither for the ext4_defrag patchset that is under
> consideration for acceptance, nor the work the OHSM team is doing.
>

Well... it boils down to probability: the lower the level of the locks, the more complex it gets. Nick Piggin echoed this; to quote from the article "Toward better direct I/O scalability" (http://lwn.net/Articles/275185/):

"There are two common approaches to take when faced with this sort of scalability problem. One is to go with more fine-grained locking, where each lock covers a smaller part of the kernel. Splitting up locks has been happening since the initial creation of the Big Kernel Lock, which is the definitive example of coarse-grained locking. There are limits to how much fine-grained locking can help, though, and the addition of more locks comes at the cost of more complexity and more opportunities to create deadlocks."

>
> Especially with rotational media, the call stack at the filesystem

Be aware of SSDs... they are coming down very fast in terms of cost. Right now IBM is testing a 4TB SSD (discussed in a separate thread). (I am not really sure about the properties of SSDs, but I think physical contiguity of data may not matter any more, as there are no moving heads to read the data?)

> layer is just so much faster than the drive, that blocking access to
> the write queue for a few milliseconds while some block level re-org

How about doing it in memory? I.e., read the inode's blocks (which can be scattered all over the place) into memory as one contiguous chunk, then allocate the new blocks in sequence, physically contiguous, and write to them in sequence. So COPY + PHYSICAL-REORG happen at the same time, partly through memory?
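
To make the in-memory idea above a bit more concrete, here is a rough user-space sketch in Python. It is only a toy model, not how ext4_defrag or OHSM actually works: the "disk" is a plain list of blocks, and the names `find_free_run`, `reorg_in_memory`, `block_map` and `BLOCK_SIZE` are all made up for illustration. It ignores locking and concurrent writers completely.

```python
BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example stays readable


def find_free_run(free, count):
    """Return the index of the first run of `count` consecutive free blocks, or None."""
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                return run_start
        else:
            run_len = 0
    return None


def reorg_in_memory(disk, free, block_map):
    """Relocate a file's scattered blocks to one physically contiguous run.

    disk      -- list of bytes objects, one per physical block (toy model)
    free      -- list of bools, True where the physical block is unused
    block_map -- physical block numbers of the file, in logical order
    Returns the new block map (unchanged if no contiguous run is available).
    """
    # COPY: gather the scattered blocks into one contiguous in-memory buffer.
    buffer = b"".join(disk[b] for b in block_map)

    # Allocate a physically contiguous destination.
    start = find_free_run(free, len(block_map))
    if start is None:
        return block_map  # nowhere to move the file; leave it as it is

    # PHYSICAL-REORG: write the buffer out to the new blocks in sequence.
    new_map = list(range(start, start + len(block_map)))
    for i, dst in enumerate(new_map):
        disk[dst] = buffer[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        free[dst] = False

    # Release the old, scattered blocks.
    for src in block_map:
        free[src] = True
    return new_map


if __name__ == "__main__":
    disk = [b"...." for _ in range(12)]
    disk[7], disk[2], disk[9] = b"AAAA", b"BBBB", b"CCCC"
    free = [False, True, False, True, True, True,
            True, False, True, False, True, True]
    print(reorg_in_memory(disk, free, [7, 2, 9]))
```

The destination run is always taken from blocks that are currently free, so it never overlaps the source blocks, and the old blocks are released only after the new copy has been written.
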
So while this is happening, if the source blocks get modified, the in-memory copy of the destination blocks is updated immediately, with no time delay.

> Not to be snide, but if you truly feel a design that does use inode
> locking to get the job done is unacceptable, then you should post your
> objections on the ext4 list.

Sorry... I am just a newbie, and I enjoy discussing all this with those at my level. As for the ext4 list? Well, they already know that, and I quote from the same article above (http://lwn.net/Articles/275185/):

"The other approach is to do away with locking altogether; this has been the preferred way of improving scalability in recent years. That is, for example, what all of the work around read-copy-update has been doing. And this is the direction Nick has chosen to improve get_user_pages()."

I will discuss it on the list once I can understand 80% to 90% of this article, which is still far from true :-(.

Thanks.....

--
Regards,
Peter Teoh