Re: [EXT4 set 8][PATCH 1/1]Add journal checksums

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jul 11, 2007  17:16 +0530, Girish Shilamkar wrote:
> > > +	if (test_opt(sb, JOURNAL_ASYNC_COMMIT)) {
> > > +		jbd2_journal_set_features(sbi->s_journal,
> > > +				JBD2_FEATURE_COMPAT_CHECKSUM, 0,
> > > +				JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT);
> > > +	} else if (test_opt(sb, JOURNAL_CHECKSUM)) {
> > > +		jbd2_journal_set_features(sbi->s_journal,
> > > +				JBD2_FEATURE_COMPAT_CHECKSUM, 0, 0);
> > > +		jbd2_journal_clear_features(sbi->s_journal, 0, 0,
> > > +				JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT);
> > > +	} else {
> > > +		jbd2_journal_clear_features(sbi->s_journal,
> > > +				JBD2_FEATURE_COMPAT_CHECKSUM, 0,
> > > +				JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT);
> > > +	}
> > 
> > Some discussion of the forward- and backward- compatibility design would be
> > appropriate.  Also a description of whether and how fsck can be used to fix
> > up these feature flags.

It is forward & backward compatible to enable COMPAT_CHECKSUM.  That just
means the commit blocks will have checksums in them, but older kernels
will just ignore them.  Hmm, I suppose there might be an issue with "upgrade,
downgrade, upgrade" in that the commit blocks would not have checksums
even though the superblock says they will...

Does that mean we should accept a checksum == 0 as being valid (which
is not very nice, given that 0 is an oft-hit bad value), or that we need
a flag in every commit block which indicates if it actually has a checksum?

The INCOMPAT_ASYNC_COMMIT can't be handled safely by older kernels, since
they would assume "commit block == complete transaction", which isn't
true if the commit block didn't wait for the rest of the blocks to make
it to the disk.

I don't think e2fsck can be used to individually clean up the feature flags,
but it is always possible to remove and recreate the journal...

> > > -	/* AKPM: buglet - add `i' to tmp! */
> > 
> > Damn.  After, what, seven years, someone actually fixed it?
> > 
> > >  	for (i = 0; i < bh->b_size; i += 512) {
> > > -		journal_header_t *tmp = (journal_header_t*)bh->b_data;
> > > +		struct commit_header *tmp =
> > > +			(struct commit_header *)(bh->b_data + i);
> > >  		tmp->h_magic = cpu_to_be32(JBD2_MAGIC_NUMBER);
> > >  		tmp->h_blocktype = cpu_to_be32(JBD2_COMMIT_BLOCK);
> > >  		tmp->h_sequence = cpu_to_be32(commit_transaction->t_tid);
> > > +
> > > +		if (JBD2_HAS_COMPAT_FEATURE(journal,
> > > +					    JBD2_FEATURE_COMPAT_CHECKSUM)) {
> > > +			tmp->h_chksum_type 	= JBD2_CRC32_CHKSUM;
> > > +			tmp->h_chksum_size 	= JBD2_CRC32_CHKSUM_SIZE;
> > > +			tmp->h_chksum[0] 	= cpu_to_be32(crc32_sum);
> > > +		}
> > >  	}
> > 
> > And in doing so, changed the on-disk format of the journal commit blocks.
> > 
> > Surely this was worth a mention in the changelog, if not a standalone patch?
> > 
> > I don't think this is worth doing, really.  Why not just leave the format
> > as it was, take the loop out and run this code once rather than eight
> > times?

Well, we aren't using the rest of the commit block in any case.  I think
the original intention was that we'd get 8 copies of the commit block so
we would be sure to get a good one.

I don't know whether we'd rather have 8 copies of the commit block, or
more potential to expand the commit block?  I don't personally have any
preference, since the checksum should be a more robust way of checking
validity than having multiple copies, so we may as well remove the loop
and stick with a single copy for now.

> > > @@ -328,6 +360,7 @@ static int do_one_pass(journal_t *journa
> > >  	unsigned int		sequence;
> > >  	int			blocktype;
> > >  	int			tag_bytes = journal_tag_bytes(journal);
> > > +	__u32			crc32_sum = ~0; /* Transactional Checksums */
> > 
> > We normally use __u32 for visible-to-userspace stuff.  Kernel code would
> > use plain old u32.
> Ok.

Since the checksum is saved to disk, it seems more appropriate to use __u32
or maybe even __be32, though I'm not sure if the crc32 functions do that
correctly or not.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux