You're replying to a comment by Jimmy.


Hi, I'm wondering if I could have some help (sorry for the double post).
I have a column of numbers sorted in ascending order. I am trying to remove the last 5% of the records and then compute the count, sum, average, etc.
The total number of records is 99183.
I only want to sum the first 94222 (discarding the top 5% as outliers).
Where the value 94222 appears in the command line is where I want to use the variable NFIVE,
but if I put the variable in, awk counts all the records and sums none of them.
Desired output (with the count hard-coded):
cat <file> |tr -s '=' ' '|sort -k5n | awk '{NFIVE=NR*.95}; {if (NR<94222) TOTAL+=$5} END{printf("COUNT:%d, TOTAL:%d,MEAN:%d\n",NFIVE,TOTAL,TOTAL/NFIVE)}'
OUTPUT (the correct values):
COUNT:94222, TOTAL:19079403, MEAN:202
Incorrect values I get when using NFIVE, e.g.:
cat <FILE> |tr -s '=' ' '|sort -k5n | awk '{NFIVE=NR*.95}; {if (NR<NFIVE) TOTAL+=$5} END{printf("COUNT:%d, TOTAL:%d, MEAN:%d\n",NFIVE,TOTAL,TOTAL/NFIVE)}'
OUTPUT:
COUNT:94222, TOTAL:0, MEAN:0
Thanks for any assistance.
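
A possible explanation, offered as a sketch rather than a confirmed fix: the block {NFIVE=NR*.95} runs once for every record, so at each record NFIVE is 95% of the current NR, and the test NR<NFIVE can never be true. TOTAL therefore stays 0. awk only knows the full record count once it reaches END, so one option is to buffer the values and do the summing there. In the sketch below, the helper name trimmed_stats, the sample data, and the use of $1 instead of $5 are all assumptions made for illustration.

```shell
# trimmed_stats is a hypothetical helper: it reads one number per line on stdin,
# already sorted in ascending order, and reports stats for the lowest 95%.
trimmed_stats() {
    awk '
        { val[NR] = $1 }                  # buffer each value; NR is only a running count here
        END {
            keep = int(NR * 0.95)         # cutoff is computable only after all input is read
            for (i = 1; i <= keep; i++) total += val[i]
            mean = (keep > 0) ? total / keep : 0   # guard against very short input
            printf("COUNT:%d, TOTAL:%d, MEAN:%d\n", keep, total, mean)
        }'
}

# Made-up sample: ten sorted values, the last one an outlier.
printf '%s\n' 1 2 3 4 5 6 7 8 9 100 | trimmed_stats
```

With the original pipeline, the same awk program could follow sort -k5n with val[NR] = $5. Buffering holds every value in memory, which should be fine for roughly 100,000 records; for much larger inputs, counting the lines first (for example with wc -l) and passing the cutoff in with awk -v is an alternative.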