If you need to sum the total size of files in a directory or matching a pattern, an easy solution is to use awk.
I needed to calculate this total for a set of JavaScript files, so I used this command line:
$ find App/ -name '*.js' -exec ls -l \{\} \; | awk '{sum+=$5} END {print sum}'
1929403
For a human readable result, you can divide your result and use printf to format it:
$ find App/ -name '*.js' -exec ls -l \{\} \; | awk '{sum+=$5} END {printf("%.2fM\n", sum/1024/1024)}'
1.84M
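The pipeline above relies on the size being the fifth column of ls -l, which can shift if a user or group name contains a space. As a hedged alternative, here is a minimal sketch that takes the byte count from wc -c instead and restricts the match to regular files. The temporary directory and file names are made up for the demonstration.

```shell
# Build a throwaway tree with known sizes (hypothetical file names).
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'aaaa' > "$dir/a.js"        # 4 bytes
printf 'bbbbbb' > "$dir/sub/b.js"  # 6 bytes
printf 'ignored' > "$dir/c.txt"    # not matched by *.js

# wc -c prints "<bytes> <name>" for each file; awk sums the first field.
# "sum + 0" makes awk print 0 rather than an empty line when nothing matches.
total=$(find "$dir" -type f -name '*.js' -exec wc -c {} \; | awk '{sum += $1} END {print sum + 0}')
echo "$total bytes"
rm -rf "$dir"
```

On a real tree, replace "$dir" with the directory to scan (App/ in the example above).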
Posted on April 17, 2009 at 11:12 am, filed under Command line and tagged awk.
While monitoring an HTTP/PHP server, I needed to gather some statistics about php-cgi memory usage.
While tuning memory_limit in PHP, we wanted to know the average memory usage per php-cgi process. This is easily calculated with our best friend awk.
First, get the number of running php-cgi processes:
# ps aux | grep php-cgi | grep -v grep | wc -l
126
Then, use awk to calculate the average memory usage for these processes:
# ps aux | grep php-cgi | grep -v grep | awk 'BEGIN{s=0;}{s=s+$6;}END{print s/126;}'
33987.8
The number used in the calculation is the RSS field (column 6 of ps aux), so the results are in kilobytes. The ps manual page says:
rss: resident set size, the non-swapped physical memory that a task has used (in kiloBytes)
You can also calculate the total memory used by all php-cgi processes:
# ps aux | grep php-cgi | grep -v grep | awk 'BEGIN{s=0;}{s=s+$6;}END{print s;}'
4302028
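The commands above hard-code the process count (126) as the divisor, so the average drifts as soon as processes start or stop. A minimal sketch of one way around that is to let awk count the matching rows itself. The ps output below is invented sample data in ps aux column order, used only so the example is reproducible.

```shell
# Made-up sample in `ps aux` column order; RSS (in kilobytes) is field 6.
sample_ps() {
cat <<'EOF'
www 101 0.0 0.1 1000 30000 ? S 10:00 0:00 php-cgi
www 102 0.0 0.1 1000 34000 ? S 10:00 0:00 php-cgi
www 103 0.0 0.1 1000 38000 ? S 10:00 0:00 php-cgi
root 104 0.0 0.0 500 900 ? S 10:00 0:00 sshd
EOF
}

# The pattern selects the php-cgi rows; n counts them, so no fixed divisor.
stats=$(sample_ps | awk '/[p]hp-cgi/ {s += $6; n++}
  END {if (n) printf "%d processes, avg %.1f KB, total %d KB\n", n, s/n, s}')
echo "$stats"
```

Against a live system, sample_ps would be replaced by ps aux. Writing the pattern as [p]hp-cgi keeps the awk process from matching its own command line, which removes the need for grep -v grep.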
If you need to watch the trend of this average memory usage, a little shell loop does the trick:
# while true; do ps aux | grep php-cgi | grep -v grep | awk 'BEGIN{s=0;}{s=s+$6;}END{print s/126;}'; sleep 2; done
34401.3
34405.1
34408.4
34409.4
34414.2
34417
Posted on March 13, 2009 at 4:14 pm, filed under Benchmarks, Command line, Monitoring and tagged awk, memory, php, shell.
I was surprised when I saw the length of the Chrome user agent string last week:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.X.Y.Z Safari/525.13
And in our logs:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.29 Safari/525.13
That is a user agent string of 119 characters. It looks like quite a waste of space, but is Google Chrome the only culprit? Surprisingly, Chrome is far from the worst.
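To find the worst offenders in one of our log files, print the length of every user agent string, then only those longer than 200 characters:
awk -F\" '{print length($6)" "$6}' access.log
awk -F\" '{if (length($6) > 200) print length($6)" "$6}' access.log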
In these examples, the access.log file uses the log format below, so splitting each line on double quotes puts the user agent in field 6:
xxx.xxx.xxx.xxx www.domain.com - [15/Sep/2008:00:00:00 +0200] "GET / HTTP/1.1" 200 4242 "http://www.domain.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1"
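Printing one line per request gets noisy on a busy log, since the same user agent appears many times. As a rough sketch, the variant below groups identical user agents and lists them longest first. The three log lines are invented and only follow the format shown above.

```shell
log=$(mktemp)
# Hypothetical sample requests in the log format described above.
cat > "$log" <<'EOF'
1.2.3.4 www.example.com - [15/Sep/2008:00:00:00 +0200] "GET / HTTP/1.1" 200 4242 "http://www.example.com/" "ShortAgent/1.0"
1.2.3.5 www.example.com - [15/Sep/2008:00:00:01 +0200] "GET /a HTTP/1.1" 200 10 "-" "A/1.0 (much longer agent string)"
1.2.3.4 www.example.com - [15/Sep/2008:00:00:02 +0200] "GET /b HTTP/1.1" 200 10 "-" "ShortAgent/1.0"
EOF

# Count each distinct user agent, then sort by length, longest first.
# Output columns: length, number of requests, user agent string.
longest=$(awk -F'"' '{hits[$6]++} END {for (ua in hits) print length(ua), hits[ua], ua}' "$log" | sort -rn)
echo "$longest"
rm -f "$log"
```

Piping the result through head -n 10 would keep only the ten longest strings.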
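If you take an average user agent string like the Firefox one, you have a 91-character string. Taking 120 characters as the limit, the first command below counts the requests whose user agent is longer than that, and the second sums the extra characters and prints the total in megabytes:
awk -F\" '{if (length($6) > 120) print length($6)}' access.log | wc -l
awk -F\" '{if (length($6) > 120) SUM += length($6)-120} END {print SUM/1024/1024" MB"}' access.log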
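Both figures can also come out of a single pass over the log. The sketch below is one possible way to do it, with the threshold passed in as an awk variable (max is a name chosen here, not something from the original commands) and a generated log whose user agents are 10, 130 and 150 characters long.

```shell
log=$(mktemp)
# Three hypothetical requests with user agents of 10, 130 and 150 characters.
for n in 10 130 150; do
  ua=$(awk -v n="$n" 'BEGIN {while (n-- > 0) printf "x"}')
  printf '1.2.3.4 www.example.com - [15/Sep/2008:00:00:00 +0200] "GET / HTTP/1.1" 200 42 "-" "%s"\n' "$ua" >> "$log"
done

# n counts the requests over the threshold; extra sums the characters beyond it.
report=$(awk -F'"' -v max=120 '
  length($6) > max { n++; extra += length($6) - max }
  END { printf "%d of %d requests over %d chars, %d extra bytes\n", n, NR, max, extra }
' "$log")
echo "$report"
rm -f "$log"
```

Dividing extra by 1024 twice, as in the command above, gives the same figure in megabytes.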
Posted on September 22, 2008 at 7:48 am, filed under Logs, Web and tagged awk, bandwidth, user agent.