Re: [bug report] scsi: target: tcmu: Fix possible data corruption

To be honest, tcmu_blocks_release is not performance-critical for tcmu.
Therefore I think we should try to keep the code simple and easy to
understand.

So the simplest, and probably best, solution is to switch to the normal
xarray API. That would make tcmu_blocks_release look like the following
(untested) code:

static u32 tcmu_blocks_release(struct tcmu_dev *udev, unsigned long first,
                               unsigned long last)
{
	struct page *page;
	unsigned long dpi;
	u32 pages_freed = 0;

	/* Convert the data block range into a page index range. */
	first = first * udev->data_pages_per_blk;
	last = (last + 1) * udev->data_pages_per_blk - 1;
	xa_for_each_range(&udev->data_pages, dpi, page, first, last) {
		xa_erase(&udev->data_pages, dpi);
		/*
		 * Wait for possibly concurrent page faults to complete;
		 * see the comment in the current code, quoted below.
		 */
		lock_page(page);
		unlock_page(page);
		__free_page(page);
		pages_freed++;
	}

	atomic_sub(pages_freed, &global_page_count);

	return pages_freed;
}
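
Note that xa_erase() takes and releases the xarray's lock internally
for each entry, so this variant does a bit more locking than the
advanced API. But as said above, performance does not matter here.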

Regards,
Bodo


On 09.05.22 08:05, Dan Carpenter wrote:
> If there is no other way to avoid the Smatch warning,

The Smatch warning is not the issue.  If we're holding a spinlock and
call a function that uses might_sleep(), that generates a stack trace
at runtime if CONFIG_DEBUG_ATOMIC_SLEEP is enabled.  Enabling
CONFIG_DEBUG_ATOMIC_SLEEP should probably just be a standard part of
the QC process.
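
For illustration, the kind of pattern the check catches looks like
this (hypothetical code, not from tcmu):

#include <linux/spinlock.h>
#include <linux/pagemap.h>

/* With CONFIG_DEBUG_ATOMIC_SLEEP=y, the might_sleep() inside
 * lock_page() prints "BUG: sleeping function called from invalid
 * context" plus a stack trace at runtime.
 */
static void sleeping_in_atomic(spinlock_t *lock, struct page *page)
{
	spin_lock(lock);	/* enter atomic context */
	lock_page(page);	/* may sleep -> warning + stack trace */
	unlock_page(page);
	spin_unlock(lock);	/* leave atomic context */
}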

Anyway, it sounds like you're just doing the locking to silence a
warning in xarray.  Let's ask Matthew if he has a hint.

regards,
dan carpenter

On Sun, May 08, 2022 at 08:03:14PM +0200, Bodo Stroesser wrote:
Hi Dan,

Thank you for the report.

I'm quite sure that our code does not cause any problems, because in
tcmu we explicitly or implicitly take the xarray's lock only while
holding the so-called cmdr_lock mutex. Also, we take the lock without
disabling irqs or bhs. So I think there is no problem if lock_page
sleeps while we hold the xarray's lock.
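
As a simplified sketch of that nesting (not actual tcmu source):

	mutex_lock(&udev->cmdr_lock);
	xa_lock(&udev->data_pages);	/* plain spin_lock(), irqs/bhs stay enabled */
	/* ... look up or modify entries in udev->data_pages ... */
	xa_unlock(&udev->data_pages);
	mutex_unlock(&udev->cmdr_lock);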

Of course, we wouldn't need the xarray's lock at all, but the xarray
code forces us to take it to avoid warnings.

In tcmu_blocks_release we use the advanced xarray API to keep the
overhead small. It allows us to lock/unlock only once, before and after
the loop. If there is no other way to avoid the Smatch warning, we
could easily put additional xas_unlock() and xas_lock() calls around
the lock_page/unlock_page block, as sketched below.
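
That would look roughly like the following (untested) sketch; it
relies on cmdr_lock keeping the xarray unchanged while the xas lock
is dropped:

	xas_lock(&xas);
	xas_for_each(&xas, page, (last + 1) * udev->data_pages_per_blk - 1) {
		xas_store(&xas, NULL);
		/* Leave atomic context before the sleeping calls ... */
		xas_unlock(&xas);
		lock_page(page);
		unlock_page(page);
		/* ... and re-enter it for the rest of the iteration. */
		xas_lock(&xas);
		__free_page(page);
		pages_freed++;
	}
	xas_unlock(&xas);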

But if there is a way to avoid the warning without imposing overhead,
I would of course prefer it.

Regards,
Bodo


On 04.05.22 17:12, Dan Carpenter wrote:
Hello Xiaoguang Wang,

The patch bb9b9eb0ae2e: "scsi: target: tcmu: Fix possible data
corruption" from Apr 21, 2022, leads to the following Smatch static
checker warning:

	drivers/target/target_core_user.c:1689 tcmu_blocks_release()
	warn: sleeping in atomic context

drivers/target/target_core_user.c
      1661 static u32 tcmu_blocks_release(struct tcmu_dev *udev, unsigned long first,
      1662                                 unsigned long last)
      1663 {
      1664         XA_STATE(xas, &udev->data_pages, first * udev->data_pages_per_blk);
      1665         struct page *page;
      1666         u32 pages_freed = 0;
      1667
      1668         xas_lock(&xas);
                   ^^^^^^^^^^^^^^
We take a spinlock here.


      1669         xas_for_each(&xas, page, (last + 1) * udev->data_pages_per_blk - 1) {
      1670                 xas_store(&xas, NULL);
      1671                 /*
      1672                  * While reaching here there may be page faults occurring on
      1673                  * the to-be-released pages. A race condition may occur if
      1674                  * unmap_mapping_range() is called before page faults on these
      1675                  * pages have completed; a valid but stale map is created.
      1676                  *
      1677                  * If another command subsequently runs and needs to extend
      1678                  * dbi_thresh, it may reuse the slot corresponding to the
      1679                  * previous page in data_bitmap. Though we will allocate a new
      1680                  * page for the slot in data_area, no page fault will happen
      1681                  * because we have a valid map. Therefore the command's data
      1682                  * will be lost.
      1683                  *
      1684                  * We lock and unlock pages that are to be released to ensure
      1685                  * all page faults have completed. This way
      1686                  * unmap_mapping_range() can ensure stale maps are cleanly
      1687                  * removed.
      1688                  */
--> 1689                 lock_page(page);
                           ^^^^^^^^^^^^^^^
The lock_page() function calls might_sleep() (in its static inline
definition in include/linux/pagemap.h).

      1690                 unlock_page(page);
      1691                 __free_page(page);
      1692                 pages_freed++;
      1693         }
      1694         xas_unlock(&xas);
                   ^^^^^^^^^^^^^^^^^
Unlock

      1695
      1696         atomic_sub(pages_freed, &global_page_count);
      1697
      1698         return pages_freed;
      1699 }

regards,
dan carpenter


