RE: [PATCH] iommu/vt-d: Fix kernel NULL pointer dereference in cache_tag_flush_range_np()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Yi,

>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
>Sent: Thursday, December 12, 2024 5:29 PM
>Subject: Re: [PATCH] iommu/vt-d: Fix kernel NULL pointer dereference in
>cache_tag_flush_range_np()
>
>On 2024/12/12 16:19, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Liu, Yi L <yi.l.liu@xxxxxxxxx>
>>> Sent: Thursday, December 12, 2024 3:45 PM
>>> Subject: Re: [PATCH] iommu/vt-d: Fix kernel NULL pointer dereference in
>>> cache_tag_flush_range_np()
>>>
>>> On 2024/12/12 15:24, Zhenzhong Duan wrote:
>>>> When setup mapping on an paging domain before the domain is attached to
>>> any
>>>> device, a NULL pointer dereference triggers as below:
>>>>
>>>>    BUG: kernel NULL pointer dereference, address: 0000000000000200
>>>>    #PF: supervisor read access in kernel mode
>>>>    #PF: error_code(0x0000) - not-present page
>>>>    RIP: 0010:cache_tag_flush_range_np+0x114/0x1f0
>>>>    ...
>>>>    Call Trace:
>>>>     <TASK>
>>>>     ? __die+0x20/0x70
>>>>     ? page_fault_oops+0x80/0x150
>>>>     ? do_user_addr_fault+0x5f/0x670
>>>>     ? pfn_to_dma_pte+0xca/0x280
>>>>     ? exc_page_fault+0x78/0x170
>>>>     ? asm_exc_page_fault+0x22/0x30
>>>>     ? cache_tag_flush_range_np+0x114/0x1f0
>>>>     intel_iommu_iotlb_sync_map+0x16/0x20
>>>>     iommu_map+0x59/0xd0
>>>>     batch_to_domain+0x154/0x1e0
>>>>     iopt_area_fill_domains+0x106/0x300
>>>>     iopt_map_pages+0x1bc/0x290
>>>>     iopt_map_user_pages+0xe8/0x1e0
>>>>     ? xas_load+0x9/0xb0
>>>>     iommufd_ioas_map+0xc9/0x1c0
>>>>     iommufd_fops_ioctl+0xff/0x1b0
>>>>     __x64_sys_ioctl+0x87/0xc0
>>>>     do_syscall_64+0x50/0x110
>>>>     entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>
>>>> qi_batch structure is allocated when domain is attached to a device for the
>>>> first time, when a mapping is setup before that, qi_batch is referenced to
>>>> do batched flush and trigger above issue.
>>>>
>>>> Fix it by checking qi_batch pointer and bypass batched flushing if no
>>>> device is attached.
>>>>
>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>> Fixes: 705c1cdf1e73 ("iommu/vt-d: Introduce batched cache invalidation")
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@xxxxxxxxx>
>>>> ---
>>>>    drivers/iommu/intel/cache.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
>>>> index e5b89f728ad3..bb9dae9a7fba 100644
>>>> --- a/drivers/iommu/intel/cache.c
>>>> +++ b/drivers/iommu/intel/cache.c
>>>> @@ -264,7 +264,7 @@ static unsigned long
>>> calculate_psi_aligned_address(unsigned long start,
>>>>
>>>>    static void qi_batch_flush_descs(struct intel_iommu *iommu, struct
>qi_batch
>>> *batch)
>>>>    {
>>>> -	if (!iommu || !batch->index)
>>>> +	if (!iommu || !batch || !batch->index)
>>>
>>> this is the same issue in the below link. :) And we should fix it by
>>> allocating the qi_batch for parent domains. The nested parent domains is
>>> not going to be attached to device at all. It acts more as a background
>>> domain, so this fix will miss the future cache flushes. It would have
>>> bigger problems. :)
>>>
>>> https://lore.kernel.org/linux-iommu/20241210130322.17175-1-
>>> yi.l.liu@xxxxxxxxx/
>>
>> Ah, just see this😊
>> This patch tries to fix an issue that mapping setup on a paging domain before
>> it's attached to a device, your patch fixed an issue that nested parent
>> domain's qi_batch is not allocated even if nested domain is attached to
>> a device. I think they are different issues?
>
>Oops. I see. I think your case is allocating a hwpt based on an IOAS that
>already has mappings. When the hwpt is added to it, all the mappings of
>this IOAS would be pushing to the hwpt. But the hwpt has not been attached
>yet, so hit this issue. I remember there is a immediate_attach bool to let
>iommufd_hwpt_paging_alloc() do an attach before calling
>iopt_table_add_domain() which pushes the IOAS mappings to domain.
>
>One possible fix is to set the immediate_attach to be true. But I doubt if
>it will be agreed since it was introduced due to some gap on ARM side. If
>that gap has been resolved, this behavior would go away as well.
>
>So back to this issue, with this change, the flush would be skipped. It
>looks ok to me to skip cache flush for map path. And we should not expect
>any unmap on this domain since there is no device attached (parent domain
>is an exception), hence nothing to be flushed even there is unmap in the
>domain's IOAS. So it appears to be a acceptable fix. @Baolu, your opinion?

Hold on, it looks I'm wrong on analyzing related code qi_batch_flush_descs().
The iommu should always be NULL in my suspected case, so 
qi_batch_flush_descs() will return early without issue.

I reproduced the backtrace when playing with iommufd qemu nesting, I think your
previous comment is correct, I misunderstood the root cause of it, it's indeed
due to nesting parent domain not paging domain. Please ignore this patch.

Thanks
Zhenzhong




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux