Re: Duplicated files in the pristine FC4t2 installation

> But I think the whole problem is silly as well, FWIW.

When Warren brought this up on IRC a while back, I wrote the following
script and ran it on a rawhide everything install.  This fails to take
into account files that are already hardlinked, so its results might
well be significantly inflated.  (Someone who cares could hack it further
to check installed names of a duplicate file for being the same inode.)

Total 408578931 bytes in 43107 inodes

That's a max of < 400M on an install that is something like 8.5-9G.
So the issue is worth at most on the order of 5% of disk space,
and that is probably a very high estimate.
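
For anyone who wants to check that arithmetic, here is a rough sketch.
It assumes "G" above means GiB (2^30 bytes), and dup_percent is just a
name made up for this illustration, not part of the script below.

```shell
# dup_percent BYTES TOTAL_GIB: print BYTES as a percentage of TOTAL_GIB
# gibibytes, to one decimal place.  Hypothetical helper for illustration.
dup_percent() {
  awk -v dup="$1" -v total_gib="$2" \
    'BEGIN { printf "%.1f\n", dup / (total_gib * 1024 * 1024 * 1024) * 100 }'
}

dup_percent 408578931 8.5   # share of an 8.5G install
dup_percent 408578931 9     # share of a 9G install
```

Both figures come out under 5%, and reading "G" as 10^9 bytes instead
does not change that.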


rpm -qa --qf '[%{FILEMD5S}  %{FILENAMES} %{FILESIZES} %{SOURCERPM}\n]' |
awk '
NF < 4 { next } # no checksum field (e.g. a directory)
{
  # $1 = MD5 sum, $2 = file name, $3 = size, $4 = source rpm
  info = $2 " " $4;
  if ($1 in sizes) {
    # same checksum but a different size: flag the mismatch
    if ($3 != sizes[$1]) print "!!!", $1 ":", info, "VS", md5[$1]
  } else {
    sizes[$1] = $3;
  }
  if ($1 in md5) {
    # skip entries already recorded for this checksum
    if (info == md5[$1]) next;
    for (i = 1; i <= dups[$1]; ++i)
      if (dupinfo[$1 "," i] == info)
        next;
    dups[$1]++;
    dupinfo[$1 "," dups[$1]] = info;
  } else {
    md5[$1] = info; # first file seen with this checksum
  }
}
END {
  dupsize = dupcount = 0;
  for (sum in dups) {
    n = dups[sum];
    dupcount += n;
    dupsize += n * sizes[sum];
    print n, "dups:", sum, " ==> ", (n * sizes[sum]);
    print "\t" md5[sum];
    for (i = 1; i <= n; ++i)
      print "\t" dupinfo[sum "," i];
  }
  print "Total", dupsize, "bytes in", dupcount, "inodes";
}
'
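
The hardlink caveat could be handled roughly as sketched here: given
the installed names reported for one checksum, count how many distinct
inodes they really occupy, and charge only the extra ones as
duplicates.  This is an untested-on-rawhide sketch, not part of the
script above.  It assumes GNU stat (for -c), and count_distinct_inodes
is a made-up name.

```shell
# count_distinct_inodes FILE...: print the number of distinct
# (device, inode) pairs among the given paths.  Names that are
# hardlinks of one another are counted once.
# Assumes GNU coreutils stat; hypothetical helper for illustration.
count_distinct_inodes() {
  stat -c '%d:%i' -- "$@" | sort -u | awk 'END { print NR }'
}
```

With that, the wasted space for one checksum would be
(count - 1) * size rather than n * size.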

-- 
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/fedora-devel-list
