[Off-topic] Battery Backed NVRAM for journals ...

sct@redhat.com (Stephen C. Tweedie) · Tue, 26 Feb 2002 09:57:03 +0000

Hi,

On Tue, Feb 26, 2002 at 12:46:09PM +1100, Neil Brown wrote:

> This seems to work ok, but makes replay time fairly slow after a
> crash.  This seems to be due to the journal replay making 3 passes
> through the journal (I think).

No, the journal scan should be fast enough.  On the first pass,
recovery only looks at the descriptor blocks, to find the last valid
transaction in the system.  On the second, it scans those valid
transactions again, looking again only at descriptor blocks, but this
time parsing any revoke information it finds in them.  The descriptors
are a minority of the data in the journal, so they should still be in
cache.

Finally, however, we have to do the journal replay itself: looking for
the most up-to-date copy of each block in the journal and writing it
back to disk.  This is again a sequential scan through the journal,
but the writebacks are NOT sequential and can cause a lot of seeking.
I suspect that is where the bulk of the time is spent.
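
To make the three passes concrete, here is a rough toy model in Python.
It is only a sketch, not the kernel's recovery code: the dict layout,
the recover() name and the rule that a revoke in transaction T
suppresses copies logged in transactions up to and including T are all
assumptions made for illustration.

```python
def recover(journal):
    """Toy three-pass recovery over an in-memory journal (hypothetical layout).

    journal: list of transactions in log order, each a dict with
      tid        increasing transaction ID
      committed  True if the transaction was fully written
      revokes    disk block numbers revoked by this transaction
      blocks     (disk_block, data) pairs logged by this transaction
    Returns the writebacks in the order replay would issue them.
    """
    # Pass 1 (scan): walk the log until the first incomplete transaction;
    # everything before it is valid.
    valid = []
    for txn in journal:
        if not txn["committed"]:
            break
        valid.append(txn)

    # Pass 2 (revoke): walk the valid transactions again and remember the
    # latest transaction that revoked each block.
    revoked = {}
    for txn in valid:
        for blk in txn["revokes"]:
            revoked[blk] = txn["tid"]

    # Pass 3 (replay): read the log sequentially once more and write each
    # surviving copy back to its home location. Later copies of a block
    # overwrite earlier ones, so the newest copy wins.
    writes = []
    for txn in valid:
        for blk, data in txn["blocks"]:
            # Assumed rule: skip copies revoked by this or a later transaction.
            if revoked.get(blk, -1) >= txn["tid"]:
                continue
            writes.append((blk, data))
    return writes


journal = [
    {"tid": 1, "committed": True, "revokes": [],
     "blocks": [(900, "a1"), (10, "b1")]},
    {"tid": 2, "committed": True, "revokes": [10],
     "blocks": [(900, "a2"), (500, "c2")]},
    {"tid": 3, "committed": False, "revokes": [],
     "blocks": [(10, "x3")]},
]
writes = recover(journal)
# writes == [(900, "a1"), (900, "a2"), (500, "c2")]
```

The journal itself is read front to back in every pass, but the
targets of the writebacks (900, 900, 500 in this example) jump around
the disk, which is where the seeking comes from.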

However, things are slightly different with sync NFS mode: in that
case we're forcing transactions to disk much more frequently, so we
will have a higher percentage of descriptor blocks in the journal.
Regardless, the journal scan is still sequential, so it should be fast
enough: it's the seeks to the rest of the disk which are more likely
to cost a lot.
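
As a back-of-the-envelope illustration of why frequent commits raise
the share of descriptor blocks, here is a small Python sketch.  It
assumes each transaction carries at least one descriptor block plus one
commit block, and the tags_per_descriptor default of 500 is a made-up
placeholder, not a figure taken from the on-disk format.

```python
import math


def descriptor_fraction(data_blocks_per_txn, tags_per_descriptor=500):
    """Fraction of journal blocks that are descriptor blocks (toy estimate).

    Assumes data_blocks_per_txn >= 1, that one descriptor block can tag
    tags_per_descriptor data blocks (placeholder value), and that every
    transaction ends with a single commit block.
    """
    descriptors = math.ceil(data_blocks_per_txn / tags_per_descriptor)
    commit_blocks = 1
    total = descriptors + data_blocks_per_txn + commit_blocks
    return descriptors / total


small = descriptor_fraction(2)     # tiny transactions, as with frequent syncs
large = descriptor_fraction(1000)  # large batched transactions
# small == 0.25, while large is well under 1%
```

Even in the small-transaction case the scan passes remain sequential
reads; only the proportion of blocks they have to parse goes up.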

Cheers,
 Stephen