On Tue, 2019-10-29 at 10:02 +0800, Ian Kent wrote:
> On Tue, 2019-10-29 at 09:11 +0800, Ian Kent wrote:
> > On Mon, 2019-10-28 at 17:52 -0700, Darrick J. Wong wrote:
> > > On Tue, Oct 29, 2019 at 08:29:38AM +0800, Ian Kent wrote:
> > > > On Mon, 2019-10-28 at 16:34 -0700, Darrick J. Wong wrote:
> > > > > On Mon, Oct 28, 2019 at 05:17:05PM +0800, Ian Kent wrote:
> > > > > > Hi Darrick,
> > > > > >
> > > > > > Unfortunately I'm having a bit of trouble with my USB keyboard and
> > > > > > random key repeats, I lost several important messages this morning
> > > > > > due to it.
> > > > > >
> > > > > > Your report of the xfstests generic/361 problem was one of them (as
> > > > > > was Christoph's mail about the mount code location, I'll post on
> > > > > > that a bit later). So I'm going to have to refer to the posts and
> > > > > > hope that I can supply enough context to avoid confusion.
> > > > > >
> > > > > > Sorry about this.
> > > > > >
> > > > > > Anyway, you posted:
> > > > > >
> > > > > > "Dunno what's up with this particular patch, but I see regressions
> > > > > > on generic/361 (and similar asserts on a few others). The patches
> > > > > > leading up to this patch do not generate this error."
> > > > > >
> > > > > > I've reverted back to a point more or less before moving the mount
> > > > > > and super block handling code around and tried to reproduce the
> > > > > > problem on my test VM and I didn't see the problem.
> > > > > >
> > > > > > Is there anything I need to do when running the test, other than
> > > > > > having SCRATCH_MNT and SCRATCH_DEV defined in the local config, and
> > > > > > the mount point and the device existing?
> > > > >
> > > > > Um... here's the kernel branch that I used:
> > > > >
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=mount-api-crash
> > > >
> > > > Ok, I'll see what I can do with that.
> > > >
> > > > > Along with:
> > > > >
> > > > > MKFS_OPTIONS -- -m crc=0
> > > >
> > > > Right.
> > > >
> > > > > MOUNT_OPTIONS -- -o usrquota,grpquota
> > > >
> > > > It looked like generic/361 used only the SCRATCH_DEV so I thought that
> > > > meant making a file system and mounting it within the test.
> > >
> > > Yes. MOUNT_OPTIONS are used to mount the scratch device (and in my case
> > > the test device too).
> > >
> > > > > and both TEST_DEV and SCRATCH_DEV pointed at boring scsi disks.
> > > >
> > > > My VM disks are VirtIO (file based) virtual disks, so that sounds a
> > > > bit different.
> > > >
> > > > Unfortunately I can't use raw disks on the NAS I use for VMs and I've
> > > > migrated away from having a desktop machine with a couple of disks to
> > > > help with testing.
> > > >
> > > > I have other options if I really need to, but it's a little bit harder
> > > > to set up and use company lab machines remotely compared to local
> > > > hardware (requesting additional disks is hard to do), and I'm not sure
> > > > (probably not) if they can/will use raw disks (or partitions) either.
> > >
> > > Sorry, I meant 'boring SCSI disks' in a VM.
> > >
> > > Er let's see what the libvirt config is...
> > >
> > > <disk type='file' device='disk'>
> > >   <driver name='qemu' type='raw' cache='unsafe' discard='unmap'/>
> > >   <source file='/run/mtrdisk/a.img'/>
> > >   <target dev='sda' bus='scsi'/>
> > >   <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> > > </disk>
> > >
> > > Which currently translates to virtio-scsi disks.
> >
> > I could use the scsi driver for the disk I guess, but IO is already a
> > bottleneck for me.
> >
> > For my VM disks I have:
> >
> > <disk type='file' device='disk'>
> >   <driver name='qemu' type='qcow2' cache='writeback'/>
> >   <source file='/share/VS-VM/images/F30 test/F30 test_2.1565610215' startupPolicy='optional'/>
> >   <target dev='vdc' bus='virtio'/>
> >   <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
> > </disk>
> >
> > I'm pretty much restricted to cow type VM disks if I don't do some
> > questionable manual customization to the xml, ;)
> >
> > In any case the back trace you saw looks like it's in the mount/VFS code,
> > so it probably isn't disk driver related.
> >
> > I'll try and reproduce it with a checkout of your branch above.
>
> I guess this is where things get difficult.
>
> I can't reproduce it. I tried creating an additional VM disk that uses a
> SCSI controller as well, but no joy.
>
> I used this config:
>
> [xfs]
> FSTYPE=xfs
> MKFS_OPTIONS="-m crc=0"
> MOUNT_OPTIONS="-o usrquota,grpquota"
> TEST_DIR=/mnt/test
> TEST_DEV=/dev/vdb
> TEST_LOGDEV=/dev/vdd
> SCRATCH_MNT=/mnt/scratch
> SCRATCH_DEV=/dev/sda
> SCRATCH_LOGDEV=/dev/vde
>
> and used:
>
> ./check -s xfs generic/361
>
> Perhaps some of the earlier tests played a part in the problem, I'll try
> running all the tests next ...
>
> Perhaps I'll need to try a different platform ... mmm.

Well, that was rather more painful than I had hoped.

I have been able to reproduce the problem using a libvirt VM on my NUC
desktop.

That raises the question of whether the (older version of) qemu on my NAS
or the newer libvirt is at fault. I don't think it's the raw vs. qcow
virtual disk difference, but I may need to check that in the libvirt setup.

I think a bare metal install should be definitive ... what do you think,
Darrick?

Ian
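
P.S. In case it helps with comparing setups, here's a rough sketch of how
I'm driving the test from the top of an xfstests checkout. The checkout
path below is just an example, the devices and mount points are the ones
from the config quoted above, and note that the variable xfstests actually
reads for the filesystem type is FSTYP:

cd ~/src/xfstests-dev        # example path to the xfstests checkout

cat > local.config <<'EOF'
[xfs]
FSTYP=xfs
MKFS_OPTIONS="-m crc=0"
MOUNT_OPTIONS="-o usrquota,grpquota"
TEST_DIR=/mnt/test
TEST_DEV=/dev/vdb
TEST_LOGDEV=/dev/vdd
SCRATCH_MNT=/mnt/scratch
SCRATCH_DEV=/dev/sda
SCRATCH_LOGDEV=/dev/vde
EOF

# the mount points have to exist before check will run
mkdir -p /mnt/test /mnt/scratch

# -s selects the [xfs] section from local.config
./check -s xfs generic/361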