Re: Copying Data Blocks

Peter Teoh <htmldeveloper@xxxxxxxxx> · Thu, 15 Jan 2009 23:41:37 +0800

On Thu, Jan 15, 2009 at 10:49 PM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
>
> I don't think the above paragraph is an issue with re-org as currently
> designed.  Neither for the ext4_defrag patchset that is under
> consideration for acceptance, nor the work the OHSM team is doing.
>

Well, it boils down to probability: the lower the level of the locks,
the more complex it gets. Nick Piggin echoed this, to quote from the
article:

http://lwn.net/Articles/275185/ (Toward better direct I/O scalability)

"There are two common approaches to take when faced with this sort of
scalability problem. One is to go with more fine-grained locking,
where each lock covers a smaller part of the kernel. Splitting up
locks has been happening since the initial creation of the Big Kernel
Lock, which is the definitive example of coarse-grained locking. There
are limits to how much fine-grained locking can help, though, and the
addition of more locks comes at the cost of more complexity and more
opportunities to create deadlocks. "
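
To make the trade-off in that quote concrete, here is a rough user-space sketch in Python, not kernel code. The class names CoarseTable and FineTable are made up for illustration. The coarse version has one lock for everything; the fine version has one lock per bucket, and as soon as an operation touches two buckets it has to take the locks in a fixed order or risk deadlock.

```python
import threading
from contextlib import ExitStack


class CoarseTable:
    """One lock guards every bucket (the 'big lock' style)."""

    def __init__(self, nbuckets=8):
        self.lock = threading.Lock()
        self.buckets = [dict() for _ in range(nbuckets)]

    def put(self, key, value):
        with self.lock:
            self.buckets[hash(key) % len(self.buckets)][key] = value

    def get(self, key):
        with self.lock:
            return self.buckets[hash(key) % len(self.buckets)].get(key)


class FineTable:
    """One lock per bucket: more concurrency, but more to reason about."""

    def __init__(self, nbuckets=8):
        self.locks = [threading.Lock() for _ in range(nbuckets)]
        self.buckets = [dict() for _ in range(nbuckets)]

    def _slot(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        i = self._slot(key)
        with self.locks[i]:
            self.buckets[i][key] = value

    def get(self, key):
        i = self._slot(key)
        with self.locks[i]:
            return self.buckets[i].get(key)

    def move(self, old_key, new_key):
        """Rename a key. This may touch two buckets, so two locks are needed."""
        a, b = self._slot(old_key), self._slot(new_key)
        with ExitStack() as stack:
            # Always lock in ascending bucket order. Two threads that locked
            # in opposite orders could each hold one lock and wait forever.
            for i in sorted({a, b}):
                stack.enter_context(self.locks[i])
            self.buckets[b][new_key] = self.buckets[a].pop(old_key)
```

The extra rule in move() is the kind of added complexity the article is talking about: the coarse table never needs it, because there is only one lock to take.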

>
> Especially with rotational media, the call stack at the filesystem

Be aware of SSDs: they are coming down in cost very fast. Right now IBM
is testing a 4TB SSD, as discussed in a separate thread. (I am not
really sure about the properties of SSDs, but I think physical
contiguity of data may not matter any more, as there are no moving
heads to read the data?)

> layer is just so much faster than the drive, that blocking access to
> the write queue for a few milliseconds while some block level re-org

How about doing it in memory? That is, read the inode's blocks (which
can be scattered all over the place) into memory as one contiguous
chunk, then allocate the destination blocks in sequence, physically
contiguous, and write to them in order. So COPY + PHYSICAL-REORG happen
at the same time, partly through memory. If the source blocks get
modified while this is happening, the memory for the destination blocks
is updated immediately, with no time delay.
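
A toy model of this idea, as a sketch only: the "disk" is a plain Python dict from physical block number to data, and Relocator, write_source and commit are hypothetical names, not anything from ext4 or the kernel. It is single-threaded, so it deliberately ignores the locking question discussed above; in a real implementation the mirroring step itself would need some form of synchronization.

```python
class Relocator:
    """Stage scattered blocks in memory, then write them out contiguously."""

    def __init__(self, disk, src_blocks, dst_start):
        self.disk = disk                      # dict: block number -> bytes
        self.src_blocks = list(src_blocks)    # scattered, in logical order
        self.dst_start = dst_start            # first block of the new range
        # Read the scattered source blocks into one contiguous buffer.
        self.staging = [disk[b] for b in self.src_blocks]
        self.index = {b: i for i, b in enumerate(self.src_blocks)}

    def write_source(self, block, data):
        """A write that arrives while the relocation is still in progress."""
        self.disk[block] = data
        i = self.index.get(block)
        if i is not None:
            # Mirror the change into the staged copy right away, so the
            # destination never receives stale data.
            self.staging[i] = data

    def commit(self):
        """Write the staged data to a contiguous range; return the new blocks."""
        new_blocks = []
        for i, data in enumerate(self.staging):
            self.disk[self.dst_start + i] = data
            new_blocks.append(self.dst_start + i)
        return new_blocks


disk = {7: b"A", 42: b"B", 13: b"C"}
r = Relocator(disk, [7, 42, 13], dst_start=100)
r.write_source(42, b"B2")      # source modified mid-relocation
new_blocks = r.commit()        # blocks 100, 101, 102 now hold A, B2, C
```

After commit() the caller would still have to switch the file's block map over to the new range and free the old blocks; that step is left out here.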

> Not to be snide, but if you truly feel a design that does use inode
> locking to get the job done is unacceptable, then you should post your
> objections on the ext4 list.

Sorry, I am just a newbie, and I enjoy discussing all this with those
at my level. As for the ext4 list: well, they already know that, and I
quote from the same article above:

http://lwn.net/Articles/275185/

"The other approach is to do away with locking altogether; this has
been the preferred way of improving scalability in recent years. That
is, for example, what all of the work around read-copy-update has been
doing. And this is the direction Nick has chosen to improve
get_user_pages()."
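
As far as I understand it, the read-copy-update idea is roughly this: readers never take a lock, and a writer makes a private copy, changes the copy, then publishes it by swapping a single reference. The Python sketch below is only an analogy under that assumption. Snapshot is a made-up class; it relies on reference assignment being atomic in CPython, and the garbage collector stands in for the grace-period machinery that real kernel RCU needs before it can free the old version.

```python
import threading


class Snapshot:
    """Lock-free reads over a dict that is replaced rather than modified."""

    def __init__(self, initial):
        self._current = dict(initial)          # never mutated once published
        self._writer_lock = threading.Lock()   # serializes writers only

    def read(self, key):
        snap = self._current    # one reference load, no lock taken
        return snap.get(key)

    def update(self, key, value):
        with self._writer_lock:
            new = dict(self._current)   # copy
            new[key] = value            # update the private copy
            self._current = new         # publish with a single assignment
```

A reader that grabbed the old dict just before the swap keeps seeing a consistent, if slightly stale, view, which is the price paid for not blocking.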

I will raise it on the list once I can understand 80% to 90% of this
article, which is still far from the case :-(.

Thanks.

-- 
Regards,
Peter Teoh
