Re: xfs: Uninitialized memory read at xlog_write

On Fri, Sep 15, 2017 at 08:44:31AM +1000, Dave Chinner wrote:
> On Thu, Sep 14, 2017 at 07:15:38PM +0900, Tetsuo Handa wrote:
> > Dave Chinner wrote:
> > > On Wed, Sep 13, 2017 at 06:59:38PM +0900, Tetsuo Handa wrote:
> > > > Dave Chinner wrote:
> > > > > On Wed, Sep 13, 2017 at 04:14:37PM +0900, Tetsuo Handa wrote:
> > > > > > [  OK  ] Stopped target Switch Root.
> > > > > > 
> > > > > > [  OK  ] Stopped target Initrd File Systems.[ 1054.691505] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880135396660)
> > > > > > [ 1054.691506] 000000000000000093050a200000000000000000000000000000000000000000
> > > > > > [ 1054.691511]  u u u u u u u u i i i i i i i i u u u u u u u u u u u u u u u u
> > > > > > [ 1054.691515]  ^
> > > > > > [ 1054.691519] RIP: 0010:xlog_write+0x344/0x6b0
> > > > > 
> > > > > What line of code does this correspond to?
> > > > > 
> > > > 
> > > >                         /*
> > > >                          * Copy region.
> > > >                          *
> > > >                          * Unmount records just log an opheader, so can have
> > > >                          * empty payloads with no data region to copy. Hence we
> > > >                          * only copy the payload if the vector says it has data
> > > >                          * to copy.
> > > >                          */
> > > >                         ASSERT(copy_len >= 0);
> > > >                         if (copy_len > 0) {
> > > >                                 memcpy(ptr, reg->i_addr + copy_off, copy_len); // <= xlog_write+0x344/0x6b0
> > > >                                 xlog_write_adv_cnt(&ptr, &len, &log_offset,
> > > >                                                    copy_len);
> > > >                         }
> > > > 
> > > 
> > > Ok, that's what I suspected. The region being copied is set up
> > > in xlog_cil_insert_format_items(), so problem is in one of the
> > > ->iop_format methods it calls to format the dirty metadata into the
> > > region.
> > > 
> > > And given that the address is ...6660, it's likely the offset into
> > > the structure being copied is 96 bytes.
> > > 
> > > $ pahole...
> > > .....
> > > struct xfs_log_dinode {
> > > .....
> > >        xfs_agino_t                di_next_unlinked;     /*    96     4 */
> > > .....
> > > 
> > > Try the patch below.
> > 
> > That patch did not help.
> > 
> > I checked values passed to memcpy() using below patch.
> 
> ok....
> 
> > 
> > ----------
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index c5107c7..f91c4c7 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -2476,6 +2476,8 @@
> >  			 */
> >  			ASSERT(copy_len >= 0);
> >  			if (copy_len > 0) {
> > +				printk(KERN_INFO "ptr=%p reg->i_addr=%p copy_off=%u copy_len=%u\n",
> > +				       ptr, reg->i_addr, copy_off, copy_len);
> 
> You need to print out the reg->i_type here. Then we'll know exactly
> where it came from.
> 
> >  				memcpy(ptr, reg->i_addr + copy_off, copy_len);
> >  				xlog_write_adv_cnt(&ptr, &len, &log_offset,
> >  						   copy_len);
> > ----------
> > 
> > The copy_len was not a multiple of sizeof(struct xfs_log_dinode).
> > Thus, I guess we can't assume this is "struct xfs_log_dinode".
> 
> It still could be - we don't always log the entire structure.
> 
> static inline uint xfs_log_dinode_size(int version)
> {
>         if (version == 3)
> 		return sizeof(struct xfs_log_dinode);
> 	return offsetof(struct xfs_log_dinode, di_next_unlinked);
> }
> 
> i.e. if it's a v2 inode, the logged region is 96 bytes in length, not
> 176.
> 
> > 
> > ----------
> >          Starting Load/Save Random Seed...
> > 
> >          Starting Configure read-only root support...
> > 
> > [ 1106.927991] ptr=ffffc90001c08218 reg->i_addr=ffff880134c7fda8 copy_off=0 copy_len=16
> > [ 1106.928022] ptr=ffffc90001c08234 reg->i_addr=ffff88013395f858 copy_off=0 copy_len=56
> > [ 1106.928100] ptr=ffffc90001c08278 reg->i_addr=ffff88013395f890 copy_off=0 copy_len=96
> > [ 1106.932354] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88013395f860)
> 
> Hold on- that warning has come from the /prior/ region copy, not
> the current region!
> 
> i.e. 2nd last copy region was ffff88013395f858 for 0x38 bytes, and
> this spans address that failed ffff88013395f860. i.e. it's 8 bytes
> into that region. It's 48 bytes before the region that we were
> copying when kmemcheck triggered!
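
The arithmetic above can be double-checked with a small userspace sketch. The
addresses and lengths are copied from the printk output quoted earlier; the
struct and function names (demo_region, demo_find_region) are made up for
illustration and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one copied log region (address + length). */
struct demo_region {
	uint64_t	addr;
	uint32_t	len;
};

/* Regions as reported by the debug printk, in the order they were copied. */
static const struct demo_region demo_regions[] = {
	{ 0xffff880134c7fda8ULL, 16 },	/* looks like a log op header */
	{ 0xffff88013395f858ULL, 56 },	/* looks like an inode log format */
	{ 0xffff88013395f890ULL, 96 },	/* looks like a v2 inode core */
};

/* Return the index of the region that contains addr, or -1 if none does. */
static int demo_find_region(const struct demo_region *r, int n, uint64_t addr)
{
	for (int i = 0; i < n; i++) {
		if (addr >= r[i].addr && addr - r[i].addr < r[i].len)
			return i;
	}
	return -1;
}
```

Looking up the faulting address ffff88013395f860 lands in the second region,
8 bytes from its start and 48 bytes before the start of the third one.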
> 
> I'm going to guess that we've got a log op header, followed by
> an inode format header, followed by something like a v2 inode core.
> i.e. log op headers are 16 bytes, xfs_inode_log_format is 56 bytes,
> and a v2 inode core is 96 bytes.
> 
> So, the inode log format header:
> 
> struct xfs_inode_log_format {
>         uint16_t                   ilf_type;             /*     0     2 */
>         uint16_t                   ilf_size;             /*     2     2 */
>         uint32_t                   ilf_fields;           /*     4     4 */
>         uint16_t                   ilf_asize;            /*     8     2 */
>         uint16_t                   ilf_dsize;            /*    10     2 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         uint64_t                   ilf_ino;              /*    16     8 */
>         union {
>                 uint32_t           ilfu_rdev;            /*           4 */
>                 uuid_t             ilfu_uuid;            /*          16 */
>         } ilf_u;                                         /*    24    16 */
>         int64_t                    ilf_blkno;            /*    40     8 */
>         int32_t                    ilf_len;              /*    48     4 */
>         int32_t                    ilf_boffset;          /*    52     4 */
> 
>         /* size: 56, cachelines: 1, members: 10 */
>         /* sum members: 52, holes: 1, sum holes: 4 */
>         /* last cacheline: 56 bytes */
> };
> 
> Oh, that's right. Someone, a long, long time ago, screwed up the on
> disk inode log structure but nobody noticed it due to the Irix MIPS
> compilers padding the structure identically on 32 and 64 bit
> systems. Port to linux, and i386/x86-64 padded it differently. The
> fix at the time was to make it right in recovery and continue to
> use the native structure at runtime. Ok, let's fix that properly
> now.
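
For anyone following along, here is a minimal userspace sketch of the layout
difference being described. It assumes GCC or Clang (for the packed attribute)
and C11 (for static_assert); the demo_* names are invented for illustration,
and demo_uuid is only a 16-byte stand-in for the kernel's uuid_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-byte placeholder for uuid_t (an assumption, not the kernel type). */
struct demo_uuid {
	uint8_t		b[16];
};

/* Explicitly padded layout: the 4-byte hole becomes a named field. */
struct demo_ilf_padded {
	uint16_t	ilf_type;
	uint16_t	ilf_size;
	uint32_t	ilf_fields;
	uint16_t	ilf_asize;
	uint16_t	ilf_dsize;
	uint32_t	ilf_pad;	/* fills the hole at offset 12 */
	uint64_t	ilf_ino;
	union {
		uint32_t		ilfu_rdev;
		struct demo_uuid	ilfu_uuid;
	} ilf_u;
	int64_t		ilf_blkno;
	int32_t		ilf_len;
	int32_t		ilf_boffset;
};

/* Unpadded layout, as an old 32 bit kernel would have written it. */
struct demo_ilf_32 {
	uint16_t	ilf_type;
	uint16_t	ilf_size;
	uint32_t	ilf_fields;
	uint16_t	ilf_asize;
	uint16_t	ilf_dsize;
	uint64_t	ilf_ino;
	union {
		uint32_t		ilfu_rdev;
		struct demo_uuid	ilfu_uuid;
	} ilf_u;
	int64_t		ilf_blkno;
	int32_t		ilf_len;
	int32_t		ilf_boffset;
} __attribute__((packed));

/* Same sizes that XFS_CHECK_STRUCT_SIZE expects for the real structures. */
static_assert(sizeof(struct demo_ilf_padded) == 56, "padded format is 56 bytes");
static_assert(sizeof(struct demo_ilf_32) == 52, "packed format is 52 bytes");
static_assert(offsetof(struct demo_ilf_padded, ilf_ino) == 16, "ino after pad");
static_assert(offsetof(struct demo_ilf_32, ilf_ino) == 12, "ino with no pad");
```

Because the pad is a named field, it can be zeroed before the item is copied
into the log, which is not possible with a hole the compiler inserts on its
own.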
> 
> Try the patch below.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> xfs: Don't log uninitialised fields in inode structures
> 
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Prevent kmemcheck from throwing warnings about reading uninitialised
> memory when formatting inodes into the incore log buffer. There are
> several issues here - we don't always log all the fields in the
> inode log format item, and we never log the inode's
> di_next_unlinked field.
> 
> In the case of the inode log format item, this is aused b cerbated

"is aused b cerbated" ?

> by the old xfs_inode_log_format structure padding issue. Hence make
> the padded, 64 bit aligned version of the structure the one we always
> use for formatting the log and get rid of the 64 bit variant. This
> means we'll always log the 64-bit version and so recovery only needs
> to convert from the unpadded 32 bit version from older 32 bit
> kernels.

And those old 32-bit kernels can read the xfs_inode_log_format{,_64}
structures, right?

> Signed-Off-By: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/libxfs/xfs_log_format.h | 27 +++++----------
>  fs/xfs/xfs_inode_item.c        | 79 ++++++++++++++++++++++--------------------
>  fs/xfs/xfs_ondisk.h            |  2 +-
>  3 files changed, 50 insertions(+), 58 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
> index 8372e9bcd7b6..71de185735e0 100644
> --- a/fs/xfs/libxfs/xfs_log_format.h
> +++ b/fs/xfs/libxfs/xfs_log_format.h
> @@ -270,6 +270,7 @@ typedef struct xfs_inode_log_format {
>  	uint32_t		ilf_fields;	/* flags for fields logged */
>  	uint16_t		ilf_asize;	/* size of attr d/ext/root */
>  	uint16_t		ilf_dsize;	/* size of data/ext/root */
> +	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
>  	uint64_t		ilf_ino;	/* inode number */
>  	union {
>  		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
> @@ -280,29 +281,17 @@ typedef struct xfs_inode_log_format {
>  	int32_t			ilf_boffset;	/* off of inode in buffer */
>  } xfs_inode_log_format_t;
>  
> -typedef struct xfs_inode_log_format_32 {
> -	uint16_t		ilf_type;	/* inode log item type */
> -	uint16_t		ilf_size;	/* size of this item */
> -	uint32_t		ilf_fields;	/* flags for fields logged */
> -	uint16_t		ilf_asize;	/* size of attr d/ext/root */
> -	uint16_t		ilf_dsize;	/* size of data/ext/root */
> -	uint64_t		ilf_ino;	/* inode number */
> -	union {
> -		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
> -		uuid_t		ilfu_uuid;	/* mount point value */
> -	} ilf_u;
> -	int64_t			ilf_blkno;	/* blkno of inode buffer */
> -	int32_t			ilf_len;	/* len of inode buffer */
> -	int32_t			ilf_boffset;	/* off of inode in buffer */
> -} __attribute__((packed)) xfs_inode_log_format_32_t;
> -
> -typedef struct xfs_inode_log_format_64 {
> +/*
> + * Old 32 bit systems will log in this format without the 64 bit
> + * alignment padding. Recovery will detect this and convert it to the
> + * correct format.
> + */
> +struct xfs_inode_log_format_32 {
>  	uint16_t		ilf_type;	/* inode log item type */
>  	uint16_t		ilf_size;	/* size of this item */
>  	uint32_t		ilf_fields;	/* flags for fields logged */
>  	uint16_t		ilf_asize;	/* size of attr d/ext/root */
>  	uint16_t		ilf_dsize;	/* size of data/ext/root */
> -	uint32_t		ilf_pad;	/* pad for 64 bit boundary */
>  	uint64_t		ilf_ino;	/* inode number */
>  	union {
>  		uint32_t	ilfu_rdev;	/* rdev value for dev inode*/
> @@ -311,7 +300,7 @@ typedef struct xfs_inode_log_format_64 {
>  	int64_t			ilf_blkno;	/* blkno of inode buffer */
>  	int32_t			ilf_len;	/* len of inode buffer */
>  	int32_t			ilf_boffset;	/* off of inode in buffer */
> -} xfs_inode_log_format_64_t;
> +} __attribute__((packed));
>  
>  
>  /*
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index 6d0f74ec31e8..aec9cf36b5b7 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -364,6 +364,9 @@ xfs_inode_to_log_dinode(
>  	to->di_dmstate = from->di_dmstate;
>  	to->di_flags = from->di_flags;
>  
> +	/* log a dummy value to ensure log structure is fully initialised */
> +	to->di_next_unlinked = NULLAGINO;
> +
>  	if (from->di_version == 3) {
>  		to->di_changecount = inode->i_version;
>  		to->di_crtime.t_sec = from->di_crtime.t_sec;
> @@ -404,6 +407,11 @@ xfs_inode_item_format_core(
>   * the second with the on-disk inode structure, and a possible third and/or
>   * fourth with the inode data/extents/b-tree root and inode attributes
>   * data/extents/b-tree root.
> + *
> + * Note: Always use the 64 bit inode log format structure so we don't
> + * leave an uninitialised hole in the format item on 64 bit systems. Log
> + * recovery on 32 bit systems handles this just fine, so there's no reason
> + * for not using and initialising the properly padded structure all the time.
>   */
>  STATIC void
>  xfs_inode_item_format(
> @@ -412,8 +420,8 @@ xfs_inode_item_format(
>  {
>  	struct xfs_inode_log_item *iip = INODE_ITEM(lip);
>  	struct xfs_inode	*ip = iip->ili_inode;
> -	struct xfs_inode_log_format *ilf;
>  	struct xfs_log_iovec	*vecp = NULL;
> +	struct xfs_inode_log_format *ilf;
>  
>  	ASSERT(ip->i_d.di_version > 1);
>  
> @@ -425,7 +433,17 @@ xfs_inode_item_format(
>  	ilf->ilf_boffset = ip->i_imap.im_boffset;
>  	ilf->ilf_fields = XFS_ILOG_CORE;
>  	ilf->ilf_size = 2; /* format + core */
> -	xlog_finish_iovec(lv, vecp, sizeof(struct xfs_inode_log_format));
> +
> +	/*
> +	 * make sure we don't leak uninitialised data into the log in the case
> +	 * when we don't log every field in the inode.
> +	 */
> +	ilf->ilf_dsize = 0;
> +	ilf->ilf_asize = 0;
> +	ilf->ilf_pad = 0;
> +	uuid_copy(&ilf->ilf_u.ilfu_uuid, &uuid_null);
> +
> +	xlog_finish_iovec(lv, vecp, sizeof(*ilf));
>  
>  	xfs_inode_item_format_core(ip, lv, &vecp);
>  	xfs_inode_item_format_data_fork(iip, ilf, lv, &vecp);
> @@ -855,44 +873,29 @@ xfs_istale_done(
>  }
>  
>  /*
> - * convert an xfs_inode_log_format struct from either 32 or 64 bit versions
> - * (which can have different field alignments) to the native version
> + * convert an xfs_inode_log_format struct from the old 32 bit version
> + * (which can have different field alignments) to the native 64 bit version
>   */
>  int
>  xfs_inode_item_format_convert(
> -	xfs_log_iovec_t		*buf,
> -	xfs_inode_log_format_t	*in_f)
> +	struct xfs_log_iovec		*buf,
> +	struct xfs_inode_log_format	*in_f)
>  {
> -	if (buf->i_len == sizeof(xfs_inode_log_format_32_t)) {
> -		xfs_inode_log_format_32_t *in_f32 = buf->i_addr;
> -
> -		in_f->ilf_type = in_f32->ilf_type;
> -		in_f->ilf_size = in_f32->ilf_size;
> -		in_f->ilf_fields = in_f32->ilf_fields;
> -		in_f->ilf_asize = in_f32->ilf_asize;
> -		in_f->ilf_dsize = in_f32->ilf_dsize;
> -		in_f->ilf_ino = in_f32->ilf_ino;
> -		/* copy biggest field of ilf_u */
> -		uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f32->ilf_u.ilfu_uuid);
> -		in_f->ilf_blkno = in_f32->ilf_blkno;
> -		in_f->ilf_len = in_f32->ilf_len;
> -		in_f->ilf_boffset = in_f32->ilf_boffset;
> -		return 0;
> -	} else if (buf->i_len == sizeof(xfs_inode_log_format_64_t)){
> -		xfs_inode_log_format_64_t *in_f64 = buf->i_addr;
> -
> -		in_f->ilf_type = in_f64->ilf_type;
> -		in_f->ilf_size = in_f64->ilf_size;
> -		in_f->ilf_fields = in_f64->ilf_fields;
> -		in_f->ilf_asize = in_f64->ilf_asize;
> -		in_f->ilf_dsize = in_f64->ilf_dsize;
> -		in_f->ilf_ino = in_f64->ilf_ino;
> -		/* copy biggest field of ilf_u */
> -		uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f64->ilf_u.ilfu_uuid);
> -		in_f->ilf_blkno = in_f64->ilf_blkno;
> -		in_f->ilf_len = in_f64->ilf_len;
> -		in_f->ilf_boffset = in_f64->ilf_boffset;
> -		return 0;
> -	}
> -	return -EFSCORRUPTED;
> +	struct xfs_inode_log_format_32	*in_f32 = buf->i_addr;
> +
> +	if (buf->i_len != sizeof(*in_f32))
> +		return -EFSCORRUPTED;
> +
> +	in_f->ilf_type = in_f32->ilf_type;
> +	in_f->ilf_size = in_f32->ilf_size;
> +	in_f->ilf_fields = in_f32->ilf_fields;
> +	in_f->ilf_asize = in_f32->ilf_asize;
> +	in_f->ilf_dsize = in_f32->ilf_dsize;
> +	in_f->ilf_ino = in_f32->ilf_ino;
> +	/* copy biggest field of ilf_u */
> +	uuid_copy(&in_f->ilf_u.ilfu_uuid, &in_f32->ilf_u.ilfu_uuid);
> +	in_f->ilf_blkno = in_f32->ilf_blkno;
> +	in_f->ilf_len = in_f32->ilf_len;
> +	in_f->ilf_boffset = in_f32->ilf_boffset;
> +	return 0;
>  }
> diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h
> index 0c381d71b242..0492436a053f 100644
> --- a/fs/xfs/xfs_ondisk.h
> +++ b/fs/xfs/xfs_ondisk.h
> @@ -134,7 +134,7 @@ xfs_check_ondisk_structs(void)
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_icreate_log,		28);
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_ictimestamp,		8);
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format_32,	52);
> -	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format_64,	56);
> +	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format,	56);

Will require a change to xfs/122 as well...

--D

>  	XFS_CHECK_STRUCT_SIZE(struct xfs_qoff_logformat,	20);
>  	XFS_CHECK_STRUCT_SIZE(struct xfs_trans_header,		16);
>  }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


