Re: [PATCH] procfs: expose page cache contents

> what use does this information have?

There are two main ways I'd find this data (as distinct from this format)
useful:

Some applications would benefit from knowing which files are cheaper to
access. A good example is a database's query planner deciding whether to
use an index or sequentially scan a table: if the table's blocks were
resident in memory but the index's weren't, it might be faster just to
scan the table. While mmap'ing and mincore'ing the files would provide
this information for a specific file, once the combined size of the files
you're interested in exceeds the available address space (admittedly
unlikely on 64-bit machines, but easy on 32-bit ones) you'd have to
process the files in chunks; this would take much longer and so worsen
the accuracy problems you highlight.
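For reference, the per-file mmap/mincore approach might look like the
sketch below (userspace only, not part of the patch; the function name
resident_pages is mine):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count how many pages of `path` are resident in the page cache,
 * using mmap + mincore.  Returns -1 on error. */
long resident_pages(const char *path)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	struct stat st;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	if (st.st_size == 0) {
		close(fd);
		return 0;
	}

	void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return -1;

	long page = sysconf(_SC_PAGESIZE);
	size_t npages = (st.st_size + page - 1) / page;
	unsigned char *vec = malloc(npages);
	if (!vec || mincore(map, st.st_size, vec) < 0) {
		free(vec);
		munmap(map, st.st_size);
		return -1;
	}

	/* The low bit of each vector entry means "resident". */
	long resident = 0;
	for (size_t i = 0; i < npages; i++)
		if (vec[i] & 1)
			resident++;

	free(vec);
	munmap(map, st.st_size);
	return resident;
}
```

The mapping consumes a stretch of address space equal to the file's
size, which is exactly why this approach breaks down for very large
files on 32-bit machines.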

This scenario actually highlights an algorithmic problem with my
solution - it loops through the inodes of each (block-device)
super-block, querying whether any of their pages are resident. It'd be
far more efficient to look through the resident pages and see which
inodes they point at (if any), possibly by walking through the memory
zones (like /proc/zoneinfo), iterating over the per_cpu_pages and
mapping them to inodes (if applicable) via page->mapping->host?

The other use-case I had in mind was profiling existing processes that
either use memory-mapping or otherwise rely on the kernel to cache the
data they frequently use. If I'm trying to validate a process's
assumption that the page cache will help it, I'd like to verify that
the blocks it needs are actually in the page cache. This is especially
useful when two processes are competing for page cache space, and is
much more accurate (and certainly more granular) than either comparing
per-process major page fault counts or indirect timing methods (such as
measuring the process's response latency).
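For comparison, the per-process major-fault counter is already exposed
today; a minimal sketch of reading it from /proc/self/stat (majflt is
the 12th field, counting from 1; the function name major_faults is
mine):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return this process's major page fault count from /proc/self/stat,
 * or -1 on error. */
long major_faults(void)
{
	FILE *f = fopen("/proc/self/stat", "r");
	if (!f)
		return -1;

	char buf[1024];
	long majflt = -1;
	if (fgets(buf, sizeof(buf), f)) {
		/* comm (field 2) may contain spaces and parentheses,
		 * so scan from the last ')'.  Fields after it are:
		 * state ppid pgrp session tty_nr tpgid flags minflt
		 * cminflt majflt ... */
		char *p = strrchr(buf, ')');
		if (p && sscanf(p + 1,
				" %*c %*d %*d %*d %*d %*d %*u %*u %*u %ld",
				&majflt) != 1)
			majflt = -1;
	}
	fclose(f);
	return majflt;
}
```

This only counts faults that required real I/O, though - it says
nothing about pages that were found in the cache, which is the gap a
page-cache report would fill.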

> Indeed, the act of dumping this information and storing/parsing it in
> userspace will generate memory pressure and perturb the very thing you
> are trying to measure....

That's true, although the impact could be minimised by writing the
results out using O_DIRECT. Reducing the size of the /proc/page_cache
report (possibly even using a binary representation, like
/proc/???/pagemap does) would also help.
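As a sketch of how cheap such a binary format is to consume, here's a
minimal reader for the existing pagemap interface (one little-endian
64-bit entry per virtual page; bit 63 means the page is present in
RAM - the function name page_present is mine):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Look up the /proc/self/pagemap entry for a virtual address and
 * return 1 if the page is present in RAM, 0 if not, -1 on error. */
int page_present(void *addr)
{
	int fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	long page = sysconf(_SC_PAGESIZE);
	/* One 8-byte entry per virtual page, indexed by page number. */
	off_t offset = ((uintptr_t)addr / page) * sizeof(uint64_t);
	uint64_t entry;
	ssize_t n = pread(fd, &entry, sizeof(entry), offset);
	close(fd);
	if (n != (ssize_t)sizeof(entry))
		return -1;

	return (entry >> 63) & 1;  /* bit 63: page present in RAM */
}
```

A fixed-width record per page like this avoids the formatting and
parsing costs of a text report entirely.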

I understand your concerns, but I believe more transparency around what the
page cache is doing would be useful due to its significant impact on a
system's performance.

Thanks -

Nick White


