bash/awk equivalent code to gnu datamash

Dear fellow fedora users,

If I have a data file called 15.dat with the following content:

$ cat 15.dat
1
3
1
0
2

I want to find the minimum, first quartile (q1), median, third quartile (q3) and maximum (the five-number summary). We can use datamash like this:

$ cat 15.dat | datamash min 1 q1 1 median 1 q3 1 max 1
0       1       1.5     2.75    6

q3 is reported as 2.75, but if we split the sorted data in half and take the median of the upper half (2, 3, 6), the result is 3.

$ sort 15.dat
0
1
1
2
3
6
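
The gap between 2.75 and 3 may come from two different quartile definitions rather than from a bug. The sketch below is only a check under an assumption: that datamash interpolates linearly between the sorted values at the 0-based position (n-1)*0.75. That rule reproduces the 2.75 shown above, but it is inferred from the output, not taken from the datamash documentation. The function name q3_two_ways is made up for this example.

```shell
# q3_two_ways is a hypothetical helper name: numbers on stdin, one per line.
# Prints q3 by the assumed interpolation rule, then by the split-in-half rule.
q3_two_ways() {
    sort -n | awk '
    NF { a[++n] = $1 }                  # a[1..n] holds the sorted values
    END {
        if (n == 0) exit 1
        # Assumed rule: 0-based position h = (n-1)*0.75, interpolate linearly
        h = (n - 1) * 0.75; k = int(h); f = h - k
        interp = a[k + 1] + f * (a[k + 2] - a[k + 1])
        # By hand: median of the upper half (middle value dropped if n is odd)
        m = int(n / 2); lo = n - m + 1
        if (m == 0)     halves = a[n]
        else if (m % 2) halves = a[lo + (m - 1) / 2]
        else            halves = (a[lo + m / 2 - 1] + a[lo + m / 2]) / 2
        print interp, halves
    }'
}

# The six sorted values of 15.dat listed above; prints: 2.75 3
printf "%s\n" 0 1 1 2 3 6 | q3_two_ways
```

For n = 6 the assumed position is 5 * 0.75 = 3.75, which lies three quarters of the way from the fourth value (2) to the fifth value (3), giving 2.75. The median of the upper half (2, 3, 6) is 3.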

$ cat GF19.dat
14
0
4
0
0
1
1
7
1
0
3
1
2
0
$

$ sort GF19.dat
0
0
0
0
0
1
1
1
1
14
2
3
4
7
$

This order is incorrect: 14 is the largest value (the maximum), but plain sort compares the lines as text, so 14 lands before 2. We use -n for a numeric sort:

$ sort GF19.dat -n
0
0
0
0
0
1
1
1
1
2
3
4
7
14

The numeric sort works, but datamash again reports q3 as 2.75, while by hand it is 3:

$ cat GF19.dat | datamash min 1 q1 1 median 1 q3 1 max 1
0       0       1       2.75    14
$

Using the sort and awk code from

https://unix.stackexchange.com/questions/13731/is-there-a-way-to-get-the-min-max-median-and-average-of-a-list-of-numbers-in

we can get the min, max, median and average:

sort -n | awk '{a[i++]=$0;s+=$0}END{print a[0],a[i-1],(a[int(i/2)]+a[int((i-1)/2)])/2,s/i}'

How can we find q1 and q3 to generate the five-number summary? And would that give 3 for q3 for both files? I would like to use datamash, but why does it output 2.75 and not 3?

$ cat 15.dat | sort -n | awk '{a[i++]=$0;s+=$0}END{print a[0],a[i-1],(a[int(i/2)]+a[int((i-1)/2)])/2,s/i}'
0 6 1.5 2.16667
$

It outputs min, max, median and average. The average is optional; only min, q1, median, q3 and max are needed.
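
One way the sort and awk approach could be extended to print the five-number summary is sketched below. It takes q1 and q3 as the medians of the lower and upper halves of the sorted data, leaving the middle value out of both halves when the count is odd. That is the by-hand rule used in this message, so it is not expected to match the datamash output. The function name five_num is made up for this example.

```shell
# five_num is a hypothetical helper name: numbers on stdin, one per line.
# Prints: min q1 median q3 max, with q1 and q3 as medians of the two halves.
five_num() {
    sort -n | awk '
    # median of the sorted slice a[lo..hi]
    function med(lo, hi,    c) {
        c = hi - lo + 1
        if (c % 2) return a[lo + (c - 1) / 2]
        return (a[lo + c / 2 - 1] + a[lo + c / 2]) / 2
    }
    NF { a[++n] = $1 }                  # skip blank lines, keep sorted values
    END {
        if (n == 0) exit 1
        h = int(n / 2)                  # size of each half
        q1 = h ? med(1, h) : a[1]
        q3 = h ? med(n - h + 1, n) : a[n]
        print a[1], q1, med(1, n), q3, a[n]
    }'
}

# Sorted values of 15.dat as listed by sort; prints: 0 1 1.5 3 6
printf "%s\n" 0 1 1 2 3 6 | five_num
# Values of GF19.dat; prints: 0 0 1 3 14
printf "%s\n" 14 0 4 0 0 1 1 7 1 0 3 1 2 0 | five_num
```

With the files on disk the same function can be called as five_num < 15.dat or five_num < GF19.dat.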

Thank you in advance,


Antonio

