Re: [PATCH 0/1] Possible bug in zram on ppc64le on vfat

Martin Doucha <mdoucha@xxxxxxx> · Thu, 10 Nov 2022 15:29:58 +0100

On 07. 11. 22 22:25, Minchan Kim wrote:
> On Mon, Nov 07, 2022 at 08:11:35PM +0100, Petr Vorel wrote:
>> Hi all,
>>
>> following bug is trying to workaround an error on ppc64le, where
>> zram01.sh LTP test (there is also kernel selftest
>> tools/testing/selftests/zram/zram01.sh, but LTP test got further
>> updates) has often mem_used_total 0 although zram is already filled.
>
> Hi, Petr,
>
> Is it happening on only ppc64le?
>
> Is it a new regression? What kernel version did you use?

Hi,
I've reported the same issue on kernels 4.12.14 and 5.3.18 internally to
our kernel developers at SUSE. The bug report is not public, but I'll copy
the bug description here:

The new version of LTP test zram01 found a sysfs file issue with zram
devices mounted using the VFAT filesystem. When all available space is filled,
e.g. by `dd if=/dev/zero of=/mnt/zram0/file`, the corresponding sysfs file
/sys/block/zram0/mm_stat will report that the compressed data size on 
the device is 0 and total memory usage is also 0. LTP test zram01 uses 
these values to calculate compression ratio, which results in division 
by zero.
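
To illustrate the failure mode, here is a minimal Python sketch of a ratio
calculation that guards against the zero case. The field order follows the
kernel's zram documentation for mm_stat; the function names and the ratio
formula (orig_data_size / compr_data_size) are assumptions made for
illustration and are not the actual LTP code. The two sample lines are the
mm_stat readings quoted further down in this message.

```python
# Field order of /sys/block/zramN/mm_stat per the kernel zram documentation.
# Older kernels print fewer columns and newer ones may print more; zip()
# below simply pairs up as many columns as both sides provide.
MM_STAT_FIELDS = (
    "orig_data_size", "compr_data_size", "mem_used_total", "mem_limit",
    "mem_used_max", "same_pages", "pages_compacted", "huge_pages",
)


def parse_mm_stat(line):
    """Turn one whitespace-separated mm_stat line into a dict of ints."""
    values = [int(v) for v in line.split()]
    return dict(zip(MM_STAT_FIELDS, values))


def compression_ratio(stat):
    """Return orig/compressed size, or None when compressed size is 0.

    Hypothetical helper: the real test may use a different formula.
    """
    if stat["compr_data_size"] == 0:
        return None  # would otherwise be a division by zero
    return stat["orig_data_size"] / stat["compr_data_size"]


# Reading taken right after dd (the buggy case) and after dd + sync.
after_dd = parse_mm_stat("4194304 0 0 26214400 327680 64 0 0")
after_sync = parse_mm_stat("26214400 2404 262144 26214400 327680 399 0 0")

print(compression_ratio(after_dd))    # None instead of a crash
print(compression_ratio(after_sync))
```

Returning None only makes the symptom visible without crashing; it does not
explain why the counters read 0 in the first place.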

The issue is specific to PPC64LE architecture and the VFAT filesystem. 
No other tested filesystem has this issue and I could not reproduce it 
on other archs (s390 not tested). The issue appears randomly, in about one
out of every 3 test runs, on SLE-15SP2 and 15SP3 (kernel 5.3). It appears less
frequently on SLE-12SP5 (kernel 4.12). Other SLE versions were not tested
with the new test version yet. The previous version of the test did not
check the VFAT filesystem on zram devices.

I've tried to debug the issue and collected some interesting data (all 
values come from zram device with 25M size limit and zstd compression 
algorithm):
- mm_stat values are correct after mkfs.vfat:
65536      220    65536 26214400    65536        0        0        0

- mm_stat values stay correct after mount:
65536      220    65536 26214400    65536        0        0        0

- the bug is triggered by filling the filesystem to capacity (using dd):
4194304        0        0 26214400   327680       64        0        0

- adding `sleep 1` between `dd` and reading mm_stat does not help
- adding sync between `dd` and reading mm_stat appears to fix the issue:
26214400     2404   262144 26214400   327680      399        0        0
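
The last observation suggests a possible workaround on the test side, which
could be sketched in Python as below. The helper name and the default path
are hypothetical. The underlying assumption, which the data above does not
prove, is that the written data has not yet reached the zram device when
mm_stat is read, and that a sync flushes it.

```python
import os


def read_mm_stat_synced(path="/sys/block/zram0/mm_stat"):
    """Flush dirty data to disk, then return the mm_stat columns as ints.

    Hypothetical helper: os.sync() asks the kernel to write out all
    pending filesystem data, mirroring the `sync` call between `dd`
    and the mm_stat read described above.
    """
    os.sync()
    with open(path) as stat_file:
        return [int(v) for v in stat_file.read().split()]
```

On a machine with a filled zram0 device, `read_mm_stat_synced()[1]` would be
the compressed data size; per the readings above it should then be non-zero.
Whether hiding the behaviour behind a sync is acceptable, or whether the
kernel should report consistent values without it, remains an open question.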

--
Martin Doucha   mdoucha@xxxxxxx
QA Engineer for Software Maintenance
SUSE LINUX, s.r.o.
CORSO IIa
Krizikova 148/34
186 00 Prague 8
Czech Republic