Re: Problems testing Lustre filesystem with fscache / cachefiles

Thanks for the quick reply...notes below:

On Wed, Dec 17, 2008 at 1:10 PM, David Howells <dhowells@xxxxxxxxxx> wrote:

> John Groves <John@xxxxxxxxxx> wrote:
>
> > The main constraint is that I'm stuck, for now, with EL5 kernels due to
> > copious dependencies in the Lustre 1.6.* source base.  This is a big issue,
> > but it looks like it should not be a deal breaker.  I'm currently running
> > kernel 2.6.18-53.1.14.
>
> Where did you get your fscache kernel patches from?


I'm just using the fscache code that was in the stock kernel.


> The ones that come built in to the RHEL-5 kernel are unstable.  The newer ones
> are much better, but probably not usable with RHEL-5.


doh!


> > fscache / cachefiles works when /var/fscache is just a directory on my boot
> > drive (ext3).  That's nice, but my boot drive is not faster than my network
> > (I can stream about 700MB/s over infiniband, and my lustre object servers
> > can keep up with that).  My boot drive is good for about 60MB/s.
>
> Yeah.  From what I've observed, caching NFS that's coming over GigE is a
> complete waste of time if you're just looking for performance enhancements
> and there's no conflicting traffic on the wire.


In our case there are a lot of factors, and we can use really fast disk for
the cache (if that works, which it doesn't at the moment).


>
>
> >    - With ext3 on the ramdisk, cachefilesd dies on startup.
>
> When you say 'dies' does cachefilesd just die, or is there an oops?  Is
> anything dumped to dmesg?


This is all I get in /var/log/messages:

Dec 17 12:50:14 violin cachefilesd[5323]: readdir returned unknown type: errno 4 (Interrupted system call)
Dec 17 12:50:14 violin kernel: FZ- lustre_fscache_set_cookie 228 0 8241 NOT READ ONLY<6>
Dec 17 12:50:14 violin kernel: CacheFiles: File cache on 08:11 unregistering
Dec 17 12:50:14 violin kernel: FS-Cache: Withdrawing cache "mycache"

(The FZ message is mine.)  I don't know where else to look.  Looking back
through the log, cachefilesd sometimes tanked with errno 9 as well.  9 and 4
are the only cachefilesd errors I see.
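
For reference, here is a minimal sketch for decoding those two errno values
into their symbolic names.  It is plain Python, not anything from cachefilesd;
describe_errno is a hypothetical helper name, and the number-to-name mapping
assumes a Linux box.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    # errno.errorcode maps numeric values to names such as 'EINTR'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the C library's message for the same value.
    return name, os.strerror(code)


# The two values seen in the cachefilesd log entries.
for code in (4, 9):
    name, message = describe_errno(code)
    print(f"errno {code}: {name} ({message})")
```

On Linux, 4 is EINTR (matching the "Interrupted system call" text in the log)
and 9 is EBADF.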



> > Cachefiles came in the kernel, and I installed cachefilesd with yum.
>
> If it's the RHEL-5 cachefiles, you're probably doomed, unfortunately.  I know
> it's unstable, but finding the source of the instability is a pain.  The newer
> patches are much better.  I really must try backporting them.


I might consider trying the backport... Do you have a recommendation as to
which patch to start with?  Also, might it be easier just to backport
cachefiles?  I suppose that will be sensitive to the readv/aio_read change,
etc.  If I could backport a patch that was better, but not completely modern,
it might be easier...or does that sound like a mess to you?

Thanks,
John Groves
--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs
