RE: [PATCH 0/4] promote zcache from staging

Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> · Sat, 18 Aug 2012 12:09:27 -0700 (PDT)

> From: Seth Jennings [mailto:sjenning@xxxxxxxxxxxxxxxxxx]
> Sent: Friday, August 17, 2012 5:33 PM
> To: Dan Magenheimer
> Cc: Greg Kroah-Hartman; Andrew Morton; Nitin Gupta; Minchan Kim; Konrad Wilk; Robert Jennings;
> linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxx; Kurt Hackel
> Subject: Re: [PATCH 0/4] promote zcache from staging
> 
> >
> > Sorry to beat a dead horse, but I meant to report this
> > earlier in the week and got tied up by other things.
> >
> > I finally got my test scaffold set up earlier this week
> > to try to reproduce my "bad" numbers with the RHEL6-ish
> > config file.
> >
> > I found that with "make -j28" and "make -j32" I experienced
> > __DATA CORRUPTION__.  This was repeatable.
> 
> I actually hit this for the first time a few hours ago when
> I was running performance for your rewrite.  I didn't know
> what to make of it yet.  The 24-thread kernel build failed
> when both frontswap and cleancache were enabled.
> 
> > The type of error led me to believe that the problem was
> > due to concurrency of cleancache reclaim.  I did not try
> > with cleancache disabled to prove/support this theory
> > but it is consistent with the fact that you (Seth) have not
> > seen a similar problem and have disabled cleancache.
> >
> > While this problem is most likely in my code and I am
> > suitably chagrined, it re-emphasizes the fact that
> > the current zcache in staging is 20-month-old "demo"
> > code.  The proposed new zcache codebase handles concurrency
> > much more effectively.
> 
> I imagine this can be solved without rewriting the entire
> codebase.  If your new code contains a fix for this, can we
> just pull it as a single patch?

Hi Seth --

I hadn't even observed this before this week, let alone fixed it
as an individual bug.  The redesign accounts for LRU ordering
and zombie pageframes (which still have valid pointers to the
contained zbuds and possibly valid data, so can't be recycled yet),
and it handles races and concurrency carefully.

The demo codebase is pretty dumb about concurrency, really
a hack that seemed to work.  Given the above, I guess the
hack only works _most_ of the time... when it doesn't,
data corruption can occur.

It would be an interesting challenge, but likely very
time-consuming, to fix this one bug while minimizing other
changes so that the fix could be delivered as a self-contained
incremental patch.  I suspect if you try, you will learn why
the rewrite was preferable and necessary.

(Away from email for a few days very soon now.)
Dan
