testcase 011 trips an ASSERT in x86_64 too

Hello,

A while back I reported that test case 011 trips an ASSERT on the POWER
architecture, but not on x86_64.

I started comparing the code and quickly realized that the problem is
_not_ arch specific: I could make test case 011 fail on x86_64 as well,
with a reduced log. Conversely, I could make POWER not fail by simply
increasing the file system size to 100G (from 20G).

After some debugging I found that I get into this racy situation when the
free space drops below the free threshold and we flush the log buffer to
the disk, i.e. in function xlog_grant_push_ail(), if we return at

        if (free_blocks >= free_threshold)
                return;

we do not get into the race that trips the ASSERT.
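
To make the condition concrete, here is a small user-space model of that
check. It is only a sketch: the function and helper names are made up for
illustration, and the threshold rule (the largest of the request size, one
quarter of the log, and 256 basic blocks of 512 bytes) is an assumption
based on a reading of the 2.6.38 source, so please verify it against your
tree.

```c
#include <stdbool.h>

#define BB_SHIFT 9	/* assumed: basic blocks are 512 bytes */

/* Bytes to basic blocks, rounding up (used for the request size). */
static int bytes_to_bb_roundup(int bytes)
{
	return (bytes + (1 << BB_SHIFT) - 1) >> BB_SHIFT;
}

/* Bytes to basic blocks, truncating (used for the free space). */
static int bytes_to_bb_trunc(int bytes)
{
	return bytes >> BB_SHIFT;
}

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical model of the early-return test: returns true when the
 * "free_blocks >= free_threshold" branch would be taken, i.e. when the
 * push path (and hence the race) is avoided.
 */
bool push_ail_returns_early(int free_bytes, int need_bytes, int log_bb_size)
{
	int free_blocks = bytes_to_bb_trunc(free_bytes);
	int free_threshold = bytes_to_bb_roundup(need_bytes);

	free_threshold = max_int(free_threshold, log_bb_size >> 2);
	free_threshold = max_int(free_threshold, 256);

	return free_blocks >= free_threshold;
}
```

Under these assumptions, a 10 MiB log (20480 basic blocks) takes the early
return only while at least 5120 blocks are free; a smaller log has fewer
blocks above that line to consume before the push path is taken.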

Then I started comparing the behavioral difference between the two
architectures, and I found that on POWER I see more threads at a time (a
maximum of 4) in the function xlog_grant_log_space(), whereas on x86_64 I
see a maximum of only two, and mostly only one.
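
One way to take such a measurement is a high-water-mark counter that is
bumped on entry to and exit from the function. The sketch below is a
user-space model using C11 atomics, with hypothetical names; it is not
kernel code and only illustrates the counting technique.

```c
#include <stdatomic.h>

static atomic_int in_flight;	/* callers currently inside the section */
static atomic_int high_water;	/* largest in_flight value seen so far */

/* Call on entry to the function being measured. */
void section_enter(void)
{
	int now = atomic_fetch_add(&in_flight, 1) + 1;
	int seen = atomic_load(&high_water);

	/*
	 * Raise the high-water mark if we exceeded it. On a failed
	 * compare-exchange, 'seen' is reloaded and the test is repeated.
	 */
	while (now > seen &&
	       !atomic_compare_exchange_weak(&high_water, &seen, now))
		;
}

/* Call on every exit path of the function being measured. */
void section_exit(void)
{
	atomic_fetch_sub(&in_flight, 1);
}

/* Maximum number of simultaneous callers observed so far. */
int section_high_water(void)
{
	return atomic_load(&high_water);
}
```

The high-water mark never decreases, so it can be read once at the end of
a test run to compare the two machines.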

I also noted that on POWER test case 011 takes about 8 seconds, whereas
on x86_64 it takes about 165 seconds.

So, I ventured into the core of test case 011, dirstress, and found that
simply creating thousands of files under a directory takes a very long
time on x86_64 compared to POWER (1 min 15 s vs. 2 s).

Note: Attached is the source file (stripped version of dirstress.c) for
the program b.
------------------POWER----------------------------------
[root@test135 chandra]# uname -a
Linux test135.beaverton.ibm.com 2.6.38-rc7 #1 SMP Fri Mar 4 09:36:14 PST
2011 ppc64 ppc64 ppc64 GNU/Linux
[root@test135 chandra]# grep -e xfs -e home /proc/mounts
none /selinux selinuxfs rw,relatime 0 0
/dev/mapper/vg_test135-lv_home /home ext4
rw,seclabel,relatime,barrier=1,data=ordered 0 0
/dev/sda8 /mnt/xfsMntPt xfs rw,seclabel,relatime,attr2,noquota 0 0
[root@test135 chandra]# ###### Run test on XFS filesystem
[root@test135 chandra]# time ./b /mnt/xfsMntPt/dir 10000 1
i 0

real    0m2.055s
user    0m0.011s
sys     0m0.732s
[root@test135 chandra]# ###### Run test of ext4 filesystem
[root@test135 chandra]# time ./b /home/dir 10000 1
i 0

real    0m0.355s
user    0m0.009s
sys     0m0.304s
--------------------x86_64----------------------------------------
[root@test27 chandra]# uname -a
Linux test27 2.6.38-rc7 #4 SMP Wed Mar 9 08:37:32 PST 2011 x86_64 x86_64
x86_64 GNU/Linux
[root@test27 chandra]# grep -e xfs -e home /proc/mounts
none /selinux selinuxfs rw,relatime 0 0
/dev/sdc3 /home ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
/dev/sdb1 /mnt/xfsMntPt xfs rw,seclabel,relatime,attr2,noquota 0 0
[root@test27 chandra]# ###### Run test on XFS filesystem
[root@test27 chandra]# time ./b /mnt/xfsMntPt/dir 10000 1
i 0

real    1m15.700s
user    0m0.030s
sys     0m1.679s
[root@test27 chandra]# ###### Run test of ext4 filesystem
[root@test27 chandra]# time ./b /home/dir 10000 1
i 0

real    0m0.317s
user    0m0.010s
sys     0m0.306s
-------------------------------------------------------------------

After quite an amount of debugging I found that I can make it trip the
ASSERT on x86_64 also, if I start a sufficient number of threads accessing
the file system. Basically, "./b /mnt/xfsMntPt/dir 100 100" trips the
ASSERT.

I have two questions:

1. Does anybody have any explanation why x86_64 is so slow, compared
with POWER ?

2. Any suggestions on how to debug and fix the race condition ? 

Thanks & Regards,

chandra
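#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>	/* mkdir(), mknod(), S_IFCHR */
#include <sys/time.h>
#include <sys/types.h>	/* pid_t */
#include <sys/wait.h>	/* wait() */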

#define MKNOD_DEV 0

int
dirstress(char *dir, int num_files, int prefix)
{
	int i, err, fd;
	char buf[128];

	err = chdir(dir);
	if (err) {
		perror("cannot chdir to directory");
		return 1;
	}

	sprintf(buf, "dir%d", prefix);
	err = mkdir(buf, 0777);
	if (err) {
		perror("mkdir");
		return 1;
	}
	err = chdir(buf);
	if (err) {
		perror("cannot chdir to prefix");
		return 1;
	}

	for (i = 0; i < num_files; i++) {
		sprintf(buf, "XXXXXXXXXXXX.%d.%d", prefix, i);
		switch (i % 4) {
		case 0:
			/*
			 * Create a file
			 */
			fd = creat(buf, 0666);
			if (fd >= 0) {
				close(fd);
			} else {
				fprintf(stderr, "!! creat %s failed\n", buf);
				perror("creat");
			}
			break;
		case 1:
			/*
			 * Make a directory.
			 */
			if (mkdir(buf, 0777)) {
				fprintf(stderr, "!! mkdir %s 0777 failed\n", buf);
				perror("mkdir");
			}
			break;
		case 2:
			/*
			 * Make a symlink
			 */
			if (symlink(buf, buf)) {
				fprintf(stderr, "!! symlink %s %s failed\n", buf, buf);
				perror("symlink");
			}
			break;
		case 3:
			/*
			 * Make a dev node
			 */
			if (mknod(buf, S_IFCHR | 0666, MKNOD_DEV)) {
				fprintf(stderr, "!! mknod %s 0x%x failed\n", buf, MKNOD_DEV);
				perror("mknod");
			}
			break;
		default:
			break;
		}
	}
	return 0;
}

int
main(int argc, char *argv[])
{
	pid_t pid;
	int i, count;

	if (argc != 4) {
		printf("Usage: %s directory num_files numprocs\n", argv[0]);
		exit(1);
	}
#if 0
	exit(dirstress(argv[1], atoi(argv[2]), 0));
#else
	for (i = 0; i < atoi(argv[3]); i++) {
		printf("i %d\n", i);
		pid = fork();
		if (pid < 0) {
			perror("fork");
			goto done;
		}
		if (pid == 0) {
			exit(dirstress(argv[1], atoi(argv[2]), i));
		}
	}
done:
	while (wait(&count) != -1)
		;
	exit(0);
#endif
}