Re: Kernel config option which causes reiser4 to be instable

Edward Shishkin <edward.shishkin@xxxxxxxxx> · Sun, 16 Dec 2012 16:36:38 +0100

On 12/14/2012 07:20 PM, Ivan Shapovalov wrote:
On 14 December 2012 12:07:56 Edward Shishkin wrote:
On 12/14/2012 04:14 AM, Ivan Shapovalov wrote:
On 13 December 2012 23:47:10 Edward Shishkin wrote:
On 12/11/2012 09:54 PM, Dušan Čolić wrote:
On Tue, Dec 11, 2012 at 7:33 PM, Edward Shishkin <edward.shishkin@xxxxxxxxx> wrote:
On 12/11/2012 04:08 PM, Ivan Shapovalov wrote:
Hello!

Hello.

With the help of Dušan Čolić <dusanc@xxxxxxxxx>, who provided his kernel config
diff, I've found a kernel option which, when disabled, greatly reduces
(hopefully to zero, but I need time to verify it) the corruption rate in
reiser4.

It's CONFIG_TRANSPARENT_HUGEPAGE (or something which is used by it, like
CONFIG_COMPACTION or CONFIG_MIGRATION).
For now I'm testing it with CONFIG_TRANSPARENT_HUGEPAGE disabled

How long?

For me the difference in uptime is months without vs hours with it :D
on 2.6.39.4

Hm, indeed: my setup with enabled migration can not survive even one
kernel compilation, while with disabled migration everything looks ok..

The overnight testing also showed no errors...
So shall we release reiser4-for-3.7 and announce FIXED(?) once again?

:)

I worry that migration is a mandatory option for hugepages.
Does fail_migrate_page() work with hugepages?

_Apparently_ yes. We have a counter named "compact_pagemigrate_failed" in
/proc/vmstat (documented in vm/transhuge.txt), which means that failing a page
migration is not a critical event. So hugepages and compaction will work,
albeit considerably less effectively...
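
To watch that counter while testing, something like the following might do. It is only a sketch: it assumes /proc/vmstat keeps its usual "name value" one-per-line layout, and the helper names vmstat_lookup() and vmstat_read() are made up for illustration. The counter may not exist on every kernel version, in which case the lookup simply fails.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find "name value" in /proc/vmstat-style text.
 * Returns 0 and stores the value on success, -1 if the counter is absent. */
static int vmstat_lookup(const char *text, const char *name,
                         unsigned long long *out)
{
    size_t len;
    const char *line = text;

    assert(text != NULL && name != NULL && out != NULL);
    len = strlen(name);

    while (line && *line) {
        /* Require a space after the name so prefixes do not match. */
        if (strncmp(line, name, len) == 0 && line[len] == ' ')
            return sscanf(line + len, "%llu", out) == 1 ? 0 : -1;
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return -1;
}

/* Hypothetical helper: read the file at path and look up one counter. */
static int vmstat_read(const char *path, const char *name,
                       unsigned long long *out)
{
    char buf[65536];            /* assumed to be large enough for the file */
    size_t n;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return vmstat_lookup(buf, name, out);
}
```

Calling vmstat_read("/proc/vmstat", "compact_pagemigrate_failed", &v) before and after a test run and comparing the two values would show how many migrations failed in between.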

...And I've immediately got a bunch of (presumably silly) questions

Nop. Good questions.

 while
trying to implement ->migratepage().

1) Why is it necessary to write back dirty pages before migrating them?

2) Looking at the default implementation (fallback_migrate_page()), what is
the meaning of migrating a released page?

To make sure that nobody uses the page.

Just imagine: we allocate a page, take a reference, make the page uptodate.
At this point the migration routine steals the page. Then we do kmap(), but
the virtual address is wrong. Welcome to corruption..

So, at first, the migration routine wants to make sure that the file system
doesn't use the page: try_to_release_page() checks a reference
counter (see e.g. reiser4_releasepage).
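
The idea can be modelled in userspace roughly like this. It is a toy, not kernel code: struct toy_page, toy_release_page() and toy_migrate_page() are invented names, and the single refcount field stands in for whatever the real release routine checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page (hypothetical, not the kernel's struct page). */
struct toy_page {
    int refcount;      /* references currently held by the file system */
    bool has_jnode;    /* stands in for the fs-private data on the page */
    int data;          /* stands in for the page contents */
};

/* Models the release check: refuse while somebody still uses the page. */
static bool toy_release_page(struct toy_page *pg)
{
    assert(pg != NULL);
    if (pg->refcount > 0)
        return false;          /* still in use, must not be stolen */
    pg->has_jnode = false;     /* detach the fs-private data */
    return true;
}

/* Models migration: the contents move only if the release succeeded. */
static bool toy_migrate_page(struct toy_page *old, struct toy_page *dst)
{
    assert(old != NULL && dst != NULL);
    if (!toy_release_page(old))
        return false;          /* a failed migration, not an error */
    *dst = *old;               /* copy the contents to the new page */
    return true;
}
```

While the refcount is non-zero the migration is refused and the old page keeps its private data, so a later kmap() on the old pointer still sees the right page.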

 In other words, doesn't "releasing"
page anyway mean "completely freeing" it, requiring the fs to read
corresponding data again?

A file system cannot use a pointer to a page which has been released.
We should obtain a new pointer (via find_get_page(), etc). IMHO a dirty
page is a special case (this is regarding your question #1)

3) As far as I could understand, migrating page (from fs's point of view) is
just replacing all internal pointers to the "old" page with pointers to the
new one together with calling predefined functions migrate_page_move_mapping()
and migrate_page_copy(). So here's a question - which structures of reiser4
(beyond jnode->pg) keep pointers to pages and how to access them, given a
single page?

Those pointers shouldn't be a concern, as we use them with reference
counters held. I don't see where we reuse pointers to a released page.

When a page is successfully released, we detach it from jnode (see
page_clear_jnode() in reiser4_releasepage()).

I can remember cryptcompress's struct cluster_handle which stores an array of
pages...

All cluster handles have the status of local variables. After
checkin_page_cluster() we forget about the pointers while reference
counters are still held. After checkout_page_cluster() we drop
reference counters and also forget about the pointers.

I see that the default migration routine tries to release only pages
with non-zero private info. It won't work for reiser4, as not all
our pages have non-zero private info. For files managed by the
cryptcompress plugin we allocate one jnode per page cluster (by
default 16 pages for page size 4K). And only the first page of the
cluster gets non-zero private info. So reiser4_migratepage() should
try to release _all_ pages, not only ones with non-zero private info.
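
The difference between the two policies can be illustrated with a small userspace model. This is a sketch under loud assumptions: all names (struct toy_cpage, toy_cluster_init(), the two counting functions) are hypothetical, the release check is reduced to a bare reference count, and the model deliberately ignores any other checks the kernel itself performs before moving a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CLUSTER_PAGES 16   /* default from the text: 16 pages of 4K */

/* Toy page of a cryptcompress page cluster (hypothetical model). */
struct toy_cpage {
    bool has_private;  /* non-zero private info, i.e. a jnode is attached */
    int refcount;      /* references held by the file system */
};

/* One jnode per cluster: only the first page carries private info. */
static void toy_cluster_init(struct toy_cpage pages[CLUSTER_PAGES])
{
    assert(pages != NULL);
    for (size_t i = 0; i < CLUSTER_PAGES; i++) {
        pages[i].has_private = (i == 0);
        pages[i].refcount = 1;      /* the whole cluster is in use */
    }
}

/* Simplified release check: a page may go only if nobody references it. */
static bool toy_try_release(const struct toy_cpage *pg)
{
    return pg->refcount == 0;
}

/* Default-style policy: ask the fs only about pages with private info.
 * Returns how many pages would be treated as safe to migrate. */
static size_t migratable_private_only(const struct toy_cpage pages[CLUSTER_PAGES])
{
    size_t n = 0;

    for (size_t i = 0; i < CLUSTER_PAGES; i++) {
        if (pages[i].has_private && !toy_try_release(&pages[i]))
            continue;               /* release refused, page stays */
        n++;                        /* nobody asked, or release succeeded */
    }
    return n;
}

/* Proposed policy: try to release every page, private info or not. */
static size_t migratable_release_all(const struct toy_cpage pages[CLUSTER_PAGES])
{
    size_t n = 0;

    for (size_t i = 0; i < CLUSTER_PAGES; i++)
        if (toy_try_release(&pages[i]))
            n++;
    return n;
}
```

For a cluster that is fully in use, the first policy protects only the first page and lets the other 15 through, while the second one refuses all 16.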

Still don't have ideas why we get corruption in the case of files
managed by (default) unix-file plugin (where we allocate one jnode
per page)..

Edward.

Thanks,
Ivan.

Also before the release I'll try to take a look at this:
http://marc.info/?l=reiserfs-devel&m=135402207623711&w=2

This failed path might indicate that we adjusted to fs-writeback
incorrectly.

Edward.

Regards,
Ivan.

on kernel 3.6.10, and everything seems to be OK so far (so the workaround is
version-agnostic).

Edward, are there any guesses on what can make reiser4 choke on
hugepages/compaction/migration?

TBH, no ideas. They (hugepages) are _transparent_.
It means we shouldn't suffer in theory ;)

I'm not even barely familiar with the kernel internals.

Thanks,
Ivan.
