Re: [PATCH v2] fs/ntfs: fix BUG_ON of ntfs_read_block()

Anton Altaparmakov <anton@xxxxxxxxxx> · Fri, 24 Jun 2022 15:26:27 +0000

Hi,

> On 24 Jun 2022, at 15:37, Namjae Jeon <linkinjeon@xxxxxxxxxx> wrote:
> 2022-06-24 21:19 GMT+09:00, Anton Altaparmakov <anton@xxxxxxxxxx>:
>> Hi,
>> 
>> On 24 Jun 2022, at 03:33, Namjae Jeon
>> <linkinjeon@xxxxxxxxxx<mailto:linkinjeon@xxxxxxxxxx>> wrote:
>> 
>> 2022-06-24 2:08 GMT+09:00, Eric Biggers
>> <ebiggers@xxxxxxxxxx<mailto:ebiggers@xxxxxxxxxx>>:
>> On Thu, Jun 23, 2022 at 09:49:56AM +0000,
>> cgel.zte@xxxxxxxxx<mailto:cgel.zte@xxxxxxxxx> wrote:
>> From: xu xin <xu.xin16@xxxxxxxxxx<mailto:xu.xin16@xxxxxxxxxx>>
>> 
>> As the bug description at
>> https://lore.kernel.org/lkml/20220623033635.973929-1-xu.xin16@xxxxxxxxxx/
>> attckers can use this bug to crash the system.
>> 
>> So to avoid panic, remove the BUG_ON, and use ntfs_warning to output a
>> warning to the syslog and return instead until someone really solve
>> the problem.
>> 
>> Cc: stable@xxxxxxxxxxxxxxx
>> Reported-by: Zeal Robot <zealci@xxxxxxxxxx>
>> Reported-by: syzbot+6a5a7672f663cce8b156@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Reviewed-by: Songyi Zhang <zhang.songyi@xxxxxxxxxx>
>> Reviewed-by: Yang Yang <yang.yang29@xxxxxxxxxx>
>> Reviewed-by: Jiang Xuexin<jiang.xuexin@xxxxxxxxxx>
>> Reviewed-by: Zhang wenya<zhang.wenya1@xxxxxxxxxx>
>> Signed-off-by: xu xin <xu.xin16@xxxxxxxxxx>
>> ---
>> 
>> Change for v2:
>> - Use ntfs_warning instead of WARN().
>> - Add the tag Cc: stable@xxxxxxxxxxxxxxx.
>> ---
>> fs/ntfs/aops.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
>> index 5f4fb6ca6f2e..84d68efb4ace 100644
>> --- a/fs/ntfs/aops.c
>> +++ b/fs/ntfs/aops.c
>> @@ -183,7 +183,12 @@ static int ntfs_read_block(struct page *page)
>> vol = ni->vol;
>> 
>> /* $MFT/$DATA must have its complete runlist in memory at all times. */
>> - BUG_ON(!ni->runlist.rl && !ni->mft_no && !NInoAttr(ni));
>> + if (unlikely(!ni->runlist.rl && !ni->mft_no && !NInoAttr(ni))) {
>> + ntfs_warning(vi->i_sb, "Error because ni->runlist.rl, ni->mft_no, "
>> + "and NInoAttr(ni) is null.");
>> + unlock_page(page);
>> + return -EINVAL;
>> + }
>> 
>> A better warning message that doesn't rely on implementation details
>> (struct
>> field and macro names) would be "Runlist of $MFT/$DATA is not cached".
>> Also,
>> why does this situation happen in the first place? Is there a way to
>> prevent
>> this situation in the first place?
>> 
>> ntfs_mapping_pairs_decompress() should return error pointer instead of
>> NULL.
>> 
>> Callers is checking error value using IS_ERR(). and the mapping pairs
>> array of @MFT entry is empty, I think it's corrupted, it should cause
>> mount failure.
>> 
>> NAK
>> 
>> Sorry but this patch is incorrect. It is perfectly valid to have an empty
>> non-resident attribute. E.g. if you truncate a file to zero size this is
>> exactly what you will get on-disk and when you then unmount and mount next
>> time and try to access that file with your patch you will now get an -EIO
>> error trying to access the file and you will not be able to write to the
>> file nor truncate it as you will keep getting the i/o error.
> Sorry, I can't reproduce the issue you described?
> 
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# ls -al
> total 5928
> drwx------ 1 root root 4096 6월 24 23:01 .
> drwxr-xr-x 7 root root 4096 5월 29 12:47 ..
> -rw------- 1 root root 6059409 9월 22 2020 foo
> drwx------ 1 root root 0 6월 24 22:30 'System Volume Information'
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# truncate -s 0 foo
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# ls -al
> total 8
> drwx------ 1 root root 4096 6월 24 23:01 .
> drwxr-xr-x 7 root root 4096 5월 29 12:47 ..
> -rw------- 1 root root 0 6월 24 23:11 foo
> drwx------ 1 root root 0 6월 24 22:30 'System Volume Information'
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# cd ..
> root@linkinjeon-Z10PA-D8-Series:/mnt# sudo umount /mnt/test
> root@linkinjeon-Z10PA-D8-Series:/mnt# sudo mount -t ntfs /dev/sde2 /mnt/test/
> root@linkinjeon-Z10PA-D8-Series:/mnt# cd /mnt/test/
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# cat foo
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# truncate -s 1048576 foo
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# ls -al
> total 1032
> drwx------ 1 root root 4096 6월 24 23:01 .
> drwxr-xr-x 7 root root 4096 5월 29 12:47 ..
> -rw------- 1 root root 1048576 6월 24 23:12 foo
> drwx------ 1 root root 0 6월 24 22:30 'System Volume Information'
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# echo "hello world" > foo
> root@linkinjeon-Z10PA-D8-Series:/mnt/test# cat foo
> hello world

It must be that we never attempt to map the runlist in this case.  generic_file_read_iter() appears to never call into the file system because filemap_read() does:

if (unlikely(iocb->ki_pos >= i_size_read(inode)))
break; 

I imagine that there are similar code paths in the VFS are why it currently works...

In fact for example has:

                if (!rl && !allocated_size)
                        goto first_alloc;
                rl = ntfs_mapping_pairs_decompress(vol, a, ni->runlist.rl);

So it skips calling ntfs_mapping_pairs_decompress() for an empty file but what if there is a runlist but allocated_size is incorrectly zero?  We would now wipe that and lose it so it is not ideal to do it that way but it does make it work.  Anyway it is because of things like that that at least some code paths cope with your patch.

But basically it is perfectly valid to have a non-resident attribute which has zero size and thus an empty runlist.  This is not an error so you cannot start returning an error for this case!

Just because in this case the problem is avoided does not mean your patch is correct.  NTFS also accesses attributes internally and if happens to attempt to map the runlist of an empty non-resident attribute it would now bail out with error instead of continuing correctly.

>> The correct solution is to use IS_ERR_OR_NULL() in places where an empty
>> attribute is not acceptable. Such a case is for example when mounting the
>> $MFT::$DATA::unnamed attribute cannot be empty which should then be
>> addressed inside in fs/ntfs/inode.c::ntfs_read_inode_mount(). There may be
>> more call sites to ntfs_mapping_pairs_decompress() which require similar
>> treatment. Need to go through the code to see...
> I think that it is needed everywhere that calls it. Am I missing something ?

It is not needed everywhere.  It is only needed in code paths that require there be a runlist afterwards because the code paths assume there must be one.  So for example for $MFT it by definition cannot have an empty runlist as we are already reading from the $MFT so it has an allocation so the runlist cannot be empty.  But the fuzzed image that is showing the problem that you are trying to fix has such a corrupt volume that the runlist is in fact empty - note it is NOT corrupt but it is valid but empty.  This then leads to the problem that we try to load the runlist for the $MFT and we do not check whether the result is empty because we assume it cannot be empty.  Clearly when you have a corrupted image that assumption can be wrong so we need to check for both error and for it being empty, hence IS_ERR_OR_NULL().  But if we are accessing an attribute elsewhere in the driver and that happens to be empty then we want a NULL runlist to be obtained so we can then operate with the attribute.  We do not want attempting to map the runlist to fail.  The code can try to map the runlist for all non-resident attributes before working with them and if the attribute is zero length then the runlist is NULL until we write something to the attribute at which point it becomes non-NULL describing the allocated clusters.

Your patch violates the design of the code which is that empty attribute inodes have a NULL runlist and your patch will cause perfectly good attributes to be rejected if their runlist is attempted to be mapped when they are empty.  You would then have to start checking everywhere in the code whether an attribute has zero allocated_size (or compressed_size if compressed and/or sparse) and if so you would need to then not attempt to map the runlist which would be ugly.  Much better to have the code path streamline so it just returns an empty runlist...

It may be possible that the current OSS ntfs driver can indeed function with your patch but it still violates how the code is designed to work which is why I am not happy to take your patch.

> I can not understand why the below code is needed in
> ntfs_mapping_pairs_decompress().
> 
> /* If the mapping pairs array is valid but empty, nothing to do. */
> if (!vcn && !*buf) {
> return old_rl;
> }
> 
> There is no description in patch. and this code is not in
> ntfs_mapping_pairs_decompress() in ntfs-3g. Is there any case the
> caller get NULL runlist pointer from ntfs_mapping_pairs_decompress()
> in current ntfs code?

That fix was put in for a reason.  Forgive me if I cannot remember after 17 years why that was...  The fact that the patch says "fix handling" means that it was required to be put in to fix something.  It is possible that afterwards the code changed so it no longer can get there in this case due to guards as pointed out above but fundamentally the function needs to be able to deal correctly with an empty runlist and if the caller expects to definitely get a runlist (as e.g. the $MFT loading code does at mount time) then it is for that caller to verify that the runlist is in fact not empty rather than to break the function instead to return errors for empty, non-resident attributes.

Best regards,

Anton

> NTFS: Fix handling of valid but empty mapping pairs array in
> fs/ntfs/runlist.c::ntfs_mapping_pairs_decompress().
> 
> Signed-off-by: Anton Altaparmakov <aia21@xxxxxxxxxx>
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/ntfs/runlist.c?h=v5.19-rc3&id=2b0ada2b8e086c267dd116a39ad41ff0a717b665
>> 
>> Best regards,
>> 
>> Anton
>> 
>> 
>> I haven't checked if this patch fix the problem. Xu, Can you check it ?
>> 
>> diff --git a/fs/ntfs/runlist.c b/fs/ntfs/runlist.c
>> index 97932fb5179c..31263fe0772f 100644
>> --- a/fs/ntfs/runlist.c
>> +++ b/fs/ntfs/runlist.c
>> @@ -766,8 +766,11 @@ runlist_element
>> *ntfs_mapping_pairs_decompress(const ntfs_volume *vol,
>> return ERR_PTR(-EIO);
>> }
>> /* If the mapping pairs array is valid but empty, nothing to do. */
>> - if (!vcn && !*buf)
>> + if (!vcn && !*buf) {
>> + if (!old_rl)
>> + return ERR_PTR(-EIO);
>> return old_rl;
>> + }
>> /* Current position in runlist array. */
>> rlpos = 0;
>> /* Allocate first page and set current runlist size to one page. */
>> 
>> 
>> - Eric