Re: Text file manipulation in CentOS?

On Tue, May 11, 2010 at 08:25:43AM +0000, sheraznaz@xxxxxxxxx wrote:
> >> To be more specific, I need to find how many distinct records there are in, say, column #1.
> 
> awk '{print $1}' filename | sort -u | wc -l
> 
> This will show how many unique entries are present in column one (use awk -F to change the delimiter, e.g. awk -F ":" for a : delimiter).
> 
> >> How can I filter out the distinct records with a number of occurrences less than a pre-determined threshold?
> 
> I don't quite understand this part.
> 
> awk '{print $1}' filename | sort | uniq -c | sort -rn
> 
> Will give you the number of occurrences (sorted numerically, highest first) of each unique value in column one.
> 
> Now I think you want to put that through a loop and only show those that are less than threshold?

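If I understand correctly, you can pipe your output to:

awk '{a=$1} {if (a > 3) print a}'

Here `a' is an awk variable and 3 stands in for your threshold. `$1' is
the first column of awk's input (after `uniq -c' that is the count), so
you will probably need to change it to suit what you want printed.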
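
Putting the pieces from this thread together, here is a rough sketch of the whole pipeline. It assumes the goal is to list the column-one values that occur fewer than a threshold number of times; change `<' to `>=' to keep the frequent ones instead. The function name rare_keys, the sample data and the threshold of 3 are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data: column 1 is the value being counted.
tmp=$(mktemp)
printf '%s\n' 'alice 1' 'bob 2' 'alice 3' 'carol 4' 'bob 5' 'alice 6' > "$tmp"

# rare_keys FILE THRESHOLD
# Print "count value" for each distinct column-1 value that occurs
# fewer than THRESHOLD times. (rare_keys is an illustrative name.)
rare_keys() {
    awk '{print $1}' "$1" |   # extract column 1
        sort |                # group identical values together
        uniq -c |             # prefix each distinct value with its count
        awk -v t="$2" '$1 < t {print $1, $2}'   # keep counts below threshold
}

rare_keys "$tmp" 3
rm -f "$tmp"
```

With the sample data above, alice appears three times, bob twice and carol once, so only bob and carol are printed. Add -F to the first awk if your file uses a delimiter other than whitespace.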

-- 
Dominik Zyla

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
