Re: Soft lockups on kerberised NFSv4.0 clients

----- Original Message -----
> From: "Tuomas Räsänen" <tuomasjjrasanen@xxxxxxxxxx>
> 
> The lockup mechanism seems to be as follows: the process (which is always
> firefox) is killed, and it tries to unlock the file (which is always an
> mmapped sqlite3 WAL index) which still has some pending IOs going on. The
> return value of nfs_wait_bit_killable() (-ERESTARTSYS from
> fatal_signal_pending(current)) is ignored and the process just keeps looping
> because io_count seems to be stuck at 1 (I still don't know why).

I wrote a simple program which simulates the behavior described above
and causes soft lockups (see jam.c at the bottom of this mail).

Here's what it does:
- creates and opens jamfile.dat (10M)
- locks the file with flock
- spawns N threads which all:
  - mmap the whole file and write to the map
- unlocks the file after spawning threads

Sometimes the unlocking flock() call blocks for a while, waiting for pending
IOs [*]. If the process is killed during the unlock (it receives SIGINT before
the program has printed 'unlock ok'), it seems to get stuck: the pending IOs
are not finished and the -ERESTARTSYS from nfs_wait_bit_killable() is not
handled, so the task loops inside __nfs_iocounter_wait() indefinitely.
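
To make the suspected mechanism concrete, here is a small user-space model
of the two loop shapes. It is only a sketch of my understanding, not the
kernel code: struct model_state, the model_* functions and
MODEL_ERESTARTSYS are names made up for illustration, and max_iters exists
only so that the demonstration terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ERESTARTSYS is kernel-internal and not visible to user space; this
 * stand-in value is used for illustration only. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical stand-in for the state the kernel loop depends on. */
struct model_state {
	int io_count;       /* pending IO count; stuck at 1 in the report */
	bool fatal_signal;  /* models fatal_signal_pending(current) */
};

/* Models a killable wait: fails at once when a fatal signal is pending. */
static int model_wait_killable(const struct model_state *s)
{
	return s->fatal_signal ? -MODEL_ERESTARTSYS : 0;
}

/* Loop that discards the wait result. Only max_iters bounds it; nothing
 * in the loop body can end it while io_count stays non-zero.
 * Returns the number of iterations spent. */
static long model_wait_ignoring(const struct model_state *s, long max_iters)
{
	long iters = 0;

	assert(s != NULL);
	while (s->io_count != 0 && iters < max_iters) {
		(void) model_wait_killable(s);
		++iters;
	}
	return iters;
}

/* Same loop, but a failed wait ends it. Returns the negative error from
 * the wait, or 0 if no wait failed. Stores the iteration count in
 * *iters_out. */
static int model_wait_checking(const struct model_state *s, long max_iters,
			       long *iters_out)
{
	long iters = 0;
	int ret = 0;

	assert(s != NULL && iters_out != NULL);
	while (s->io_count != 0 && iters < max_iters) {
		++iters;
		ret = model_wait_killable(s);
		if (ret != 0)
			break;
	}
	*iters_out = iters;
	return ret;
}
```

With io_count stuck at 1 and a fatal signal pending, the first loop spends
its whole iteration budget without sleeping, while the second one gives up
after the first failed wait and hands the error back to its caller.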

How to cause soft lockups:

1. Compile: gcc -pthread -o jam jam.c

2. Run ./jam

3. Press C-c shortly after starting the program, after 'unlock' but before
   'unlock ok' is printed

4. You might need to repeat steps 2 and 3 a couple of times

[*]: Sometimes flock() seems to block for a *very* long time (forever?),
     and sometimes only for a short period. For this problem it does not
     matter: whenever the task is killed during the unlock, the process
     freezes.
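
If hitting the window by hand gets tedious, steps 2 and 3 could be
automated along these lines. This is an untested-against-the-bug sketch:
run_and_interrupt() is a hypothetical helper that is not part of jam.c, and
the right delay is a guess that depends on how long the unlock blocks on a
given setup.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of jam.c: starts child_argv[0], sends it
 * SIGINT after delay_ms milliseconds and returns its wait status, or -1
 * on error.
 */
static int run_and_interrupt(char *const child_argv[], long delay_ms)
{
	struct timespec delay;
	int status;
	pid_t pid;

	pid = fork();
	if (pid == -1)
		return -1;

	if (pid == 0) {
		/* Make sure SIGINT terminates the child even if the
		 * parent was started with SIGINT ignored. */
		signal(SIGINT, SIG_DFL);
		execvp(child_argv[0], child_argv);
		_exit(127); /* only reached if exec failed */
	}

	delay.tv_sec = delay_ms / 1000;
	delay.tv_nsec = (delay_ms % 1000) * 1000000L;
	nanosleep(&delay, NULL);

	if (kill(pid, SIGINT) == -1)
		return -1;

	/* Blocks for as long as the child does not exit. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;

	return status;
}
```

Called with an argument vector of { "./jam", NULL } and a delay of a few
hundred milliseconds, a normal run should come back with a status for which
WIFSIGNALED() is true. If the lockup is triggered, I would expect waitpid()
simply not to return, since the child never finishes exiting.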

Applying the patch from my previous mail fixes the soft lockup issue:
the task no longer gets into an infinite (or at least indefinite)
loop, because an interruptible wait_on_bit() is used instead. But what are
its side effects? Is it a completely brain-dead idea?

jam.c:

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_SIZE (sizeof(char) *  1024 * 1024 * 10)
#define THREADS 4

void *work_on_file(void *const arg)
{
	size_t i; /* MAP_SIZE is a size_t, so avoid a signed comparison */
	int fd;
	char *map;

	fd = *((int *) arg);
	map = (char *) mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		perror("failed to mmap");
		return NULL;
	}

	printf("write begins\n");
	for (i = 0; i < MAP_SIZE; ++i) {
		map[i] = 'a';
	}
	printf("write ends\n");

	return NULL;
}

int main(void)
{
	int i;
	pthread_t *threads;
	int fd;

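	/* O_CREAT requires an explicit mode argument. */
	fd = open("jamfile.dat", O_RDWR | O_CREAT, 0644);
	if (fd == -1) {
		perror("failed to open");
		return -1;
	}
	if (ftruncate(fd, MAP_SIZE) == -1) {
		perror("failed to truncate");
		return -1;
	}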

	threads = malloc(sizeof(pthread_t) * THREADS);

	printf("lock\n");
	if (flock(fd, LOCK_EX) == -1) {
		perror("failed to lock");
		return -1;
	}
	printf("lock ok\n");

	for (i = 0; i < THREADS; ++i) {
		pthread_attr_t attr;
		pthread_attr_init(&attr);
		pthread_create(&threads[i], &attr, &work_on_file, &fd);
		pthread_attr_destroy(&attr);
	}

	printf("unlock\n");
	if (flock(fd, LOCK_UN) == -1) {
		perror("failed to unlock");
		return -1;
	}
	printf("unlock ok\n");

	for (i = 0; i < THREADS; ++i) {
		pthread_join(threads[i], NULL);
	}

	free(threads);

	return close(fd);
}

-- 
Tuomas