Re: 2.6.24-rc6 reproducible raid5 hang

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 27 Dec 2007, Justin Piszcz wrote:

> With that high of a stripe size the stripe_cache_size needs to be greater than
> the default to handle it.

i'd argue that any deadlock is a bug...

regardless i'm still seeing deadlocks with the default chunk_size of 64k 
and stripe_cache_size of 256... in this case it's with a workload which is 
untarring 34 copies of the linux kernel at the same time.  it's a variant 
of doug ledford's memtest, and i've attached it.

-dean
#!/usr/bin/perl

# Copyright (c) 2007 dean gaudet <dean@xxxxxxxxxx>
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
# OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
# ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.

# this idea shamelessly stolen from doug ledford

use warnings;
use strict;

# ensure stdout is not buffered
select(STDOUT); $| = 1;

my $usage = "usage: $0 linux.tar.gz /path1 [/path2 ...]\n";
defined(my $tarball = shift) or die $usage;
-f $tarball or die "$tarball does not exist or is not a file\n";

my @paths = @ARGV;
$#paths >= 0 or die "$usage";

# determine size of uncompressed tarball
open(GZIP, "-|") || exec "gzip", "--quiet", "--list", $tarball;
my $line = <GZIP>;
my ($tarball_size) = $line =~ m#^\s*\d+\s*(\d+)#;
defined($tarball_size) or die "unexpected result from gzip --quiet --list $tarball\n";
close(GZIP);

# determine amount of memory
open(MEMINFO, "</proc/meminfo")
        or die "unable to open /proc/meminfo for read: $!\n";
my $total_mem;
while (<MEMINFO>) {
  if (/^MemTotal:\s*(\d+)\s*kB/) {
    $total_mem = $1;
    last;
  }
}
defined($total_mem) or die "did not find MemTotal line in /proc/meminfo\n";
close(MEMINFO);
$total_mem *= 1024;

print "total memory: $total_mem\n";
print "uncompressed tarball: $tarball_size\n";
my $nr_simultaneous = int(1.2 * $total_mem / $tarball_size);
print "nr simultaneous processes: $nr_simultaneous\n";

sub system_or_die {
  my @args = @_;
  system(@args);
  if ($? == 1) {
    my $msg = sprintf("%s failed to exec %s: $!\n", scalar(localtime), $args[0]);
  }
  elsif ($? & 127) {
    my $msg = sprintf("%s %s died with signal %d, %s coredump\n",
        scalar(localtime), $args[0], ($? & 127), ($? & 128) ? "with" : "without");
    die $msg;
  }
  elsif (($? >> 8) != 0) {
    my $msg = sprintf("%s %s exited with non-zero exit code %d\n",
        scalar(localtime), $args[0], $? >> 8);
    die $msg;
  }
}

sub untar($) {
  mkdir($_[0]) or die localtime()." unable to mkdir($_[0]): $!\n";
  system_or_die("tar", "-xzf", $tarball, "-C", $_[0]);
}

print localtime()." untarring golden copy\n";
my $golden = $paths[0]."/dma_tmp.$$.gold";
untar($golden);

my $pass_no = 0;
while (1) {
  print localtime()." pass $pass_no: extracting\n";
  my @outputs;
  foreach my $n (1..$nr_simultaneous) {
    # treat paths in a round-robin manner
    my $dir = shift(@paths);
    push(@paths, $dir);

    $dir .= "/dma_tmp.$$.$n";
    push(@outputs, $dir);

    my $pid = fork;
    defined($pid) or die localtime()." unable to fork: $!\n";
    if ($pid == 0) {
      untar($dir);
      exit(0);
    }
  }

  # wait for the children
  while (wait != -1) {}

  print localtime()." pass $pass_no: diffing\n";
  foreach my $dir (@outputs) {
    my $pid = fork;
    defined($pid) or die localtime()." unable to fork: $!\n";
    if ($pid == 0) {
      system_or_die("diff", "-U", "3", "-rN", $golden, $dir);
      system_or_die("rm", "-fr", $dir);
      exit(0);
    }
  }

  # wait for the children
  while (wait != -1) {}

  ++$pass_no;
}

[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux