RE: Regression on linux-next (next-20250120)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Al,

> -----Original Message-----
> From: Al Viro <viro@xxxxxxxxxxxxxxxx> On Behalf Of Al Viro
> Sent: Monday, January 27, 2025 10:34 AM
> To: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@xxxxxxxxx>; intel-
> gfx@xxxxxxxxxxxxxxxxxxxxx; intel-xe@xxxxxxxxxxxxxxxxxxxxx; Kurmi, Suresh
> Kumar <suresh.kumar.kurmi@xxxxxxxxx>; Saarinen, Jani
> <jani.saarinen@xxxxxxxxx>; linux-fsdevel@xxxxxxxxxxxxxxx; Alexander Gordeev
> <agordeev@xxxxxxxxxxxxx>
> Subject: Re: Regression on linux-next (next-20250120)
> 
> On Sun, Jan 26, 2025 at 04:54:55PM +0100, Alexander Gordeev wrote:
> 
> > > > Since the version next-20250120 [2], we are seeing the following
> > > > regression
> > >
> > > Ugh...  To narrow the things down, could you see if replacing
> > >                 fsd = kmalloc(sizeof(*fsd), GFP_KERNEL); with
> > >                 fsd = kzalloc(sizeof(*fsd), GFP_KERNEL); in
> > > fs/debugfs/file.c:__debugfs_file_get() affects the test?
> >
> > This change fixes lots of the below failures in our CI. FWIW:
> >
> > Tested-by: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
> 
> The real fix follows:
> 
> [PATCH] Fix the missing initializations in __debugfs_file_get()
> 
> both method table pointers in debugfs_fsdata need to be initialized,
> obviously...
> 
> Fixes: 41a0ecc0997c "debugfs: get rid of dynamically allocation proxy_ops"
> Fucked-up-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> ---
> diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c index
> e33cc77699cd..212cd8128e1f 100644
> --- a/fs/debugfs/file.c
> +++ b/fs/debugfs/file.c
> @@ -111,6 +111,7 @@ static int __debugfs_file_get(struct dentry *dentry,
> enum dbgfs_get_mode mode)
>  				fsd->methods |= HAS_READ;
>  			if (ops->write)
>  				fsd->methods |= HAS_WRITE;
> +			fsd->real_fops = NULL;
>  		} else {
>  			const struct file_operations *ops;
>  			ops = fsd->real_fops = DEBUGFS_I(inode)->real_fops;
> @@ -124,6 +125,7 @@ static int __debugfs_file_get(struct dentry *dentry,
> enum dbgfs_get_mode mode)
>  				fsd->methods |= HAS_IOCTL;
>  			if (ops->poll)
>  				fsd->methods |= HAS_POLL;
> +			fsd->short_fops = NULL;
>  		}
>  		refcount_set(&fsd->active_users, 1);
>  		init_completion(&fsd->active_users_drained);


Unfortunately this change does not help us. I think it is the methods member that causes the problem. So the following change solves the problem for us.


--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -102,6 +102,8 @@ static int __debugfs_file_get(struct dentry *dentry, enum dbgfs_get_mode mode)
                if (!fsd)
                        return -ENOMEM;

+               fsd->methods = 0;
+
                if (mode == DBGFS_GET_SHORT) {
                        const struct debugfs_short_fops *ops;
                        ops = fsd->short_fops = DEBUGFS_I(inode)->short_fops;

My guess is, since methods have some junk value in it, we are trying to call a read function for a debugfs entry for which it doesn't exist. That leads to the failure.

<4>[   34.240163] Call Trace:
<4>[   34.240164]  <TASK>
<4>[   34.240165]  ? show_regs+0x6c/0x80
<4>[   34.240169]  ? __die+0x24/0x80
<4>[   34.240171]  ? page_fault_oops+0x175/0x5d0
<4>[   34.240176]  ? do_user_addr_fault+0x4c6/0x9d0
<4>[   34.240179]  ? exc_page_fault+0x8a/0x300
<4>[   34.240182]  ? asm_exc_page_fault+0x27/0x30
<4>[   34.240187]  *full_proxy_read*+0x6b/0xb0
<4>[   34.240191]  vfs_read+0xf9/0x390

My take would be to stick with the kzalloc like you suggested.

Regards

Chaitanya




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux