About user agent strings

I was surprised when I saw the length of the Chrome user agent string last week:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.X.Y.Z Safari/525.13

And in our logs:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.29 Safari/525.13

That is a user agent string of 119 characters. It looks like quite a waste of space, but is Google Chrome the only offender? Surprisingly, Chrome is far from the worst.

The longest strings found in one of our log files:

  • 641 characters: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4 GoogleToolbarFF 3.0.20070420 GoogleToolbarFF 3.0.20070420 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525 GoogleToolbarFF 3.0.20070525
  • 337 characters: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; DA4BB049-ADVLOVER|0001|DSL; C:\DOCUME~1\everey\CONFIG~1\Temp\; C:\DOCUME~1\zulcan\CONFIG~1\Temp\; C:\DOCUME~1\nilfer\CONFIG~1\Temp\; C:\DOCUME~1\mirmor\CONFIG~1\Temp\; C:\DOCUME~1\ASTNU~1\CONFIG~1\Temp\; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 2.0.50727)
  • 290 characters: Mozilla/5.0 (Windows; U LupinV2.u2/20080827 LupinV2.u2/20080828 LupinV2.u2/20080829 LupinV2.u2/20080830 LupinV2.u2/20080831 LupinV2.u2/20080902 LupinV2.u2/20080903 LupinV2.u2/20080909 LupinV2.u2/20080911 LupinV2.u2/20080912; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
  • 272 characters: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; FunWebProducts; SU 3.011; User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; http://bsalsa.com) ; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 1.1.4322; .NET CLR 3.5.30428; .NET CLR 3.0.30422)
  • 202 characters: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0; IE7-01NET.COM-1.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; InfoPath.2; IE7-01NET.COM-1.1)

How do you extract user agent strings from an HTTP log file?

  • Print each user agent string with its length:
awk -F\" '{print length($6)" "$6}'  access.log
  • Print only the user agent strings longer than 200 characters:
awk -F\" '{if (length($6) > 200) print length($6)" "$6}'  access.log

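In these examples, each entry in access.log is a single line in the following format (wrapped here for readability). Splitting on double quotes makes the user agent string the sixth field: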

xxx.xxx.xxx.xxx \
www.domain.com - \
[15/Sep/2008:00:00:00 +0200] \
"GET / HTTP/1.1" 200 4242 \
"http://www.domain.com/" \
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1"
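To try the length report without a real log, the sketch below writes two made-up entries in the format above to a temporary file and lists the distinct user agent strings, longest first. The sample entries and the `longest_agents` function name are illustrative assumptions, not taken from our logs.

```shell
#!/bin/sh
# List distinct user agent strings with their length, longest first.
# Splitting on double quotes makes the user agent the 6th field.
longest_agents() {
    awk -F'"' '{ print length($6) " " $6 }' "$1" | sort -rn | uniq
}

# Two made-up entries in the log format shown above (illustrative only).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.1 www.domain.com - [15/Sep/2008:00:00:00 +0200] "GET / HTTP/1.1" 200 4242 "http://www.domain.com/" "ExampleAgent/1.0"
192.0.2.2 www.domain.com - [15/Sep/2008:00:00:01 +0200] "GET / HTTP/1.1" 200 4242 "-" "LongerExampleAgent/2.0 (test)"
EOF

report=$(longest_agents "$sample_log")
echo "$report"
rm -f "$sample_log"
```

On a real log, something like `longest_agents access.log | head -n 5` should give a top five similar to the list earlier in this post.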

About bandwidth

If you take an average user agent string, like the Firefox one above, you have a 91-character string.

  • Number of entries with a user agent string longer than 120 characters: 249586
awk -F\" '{if (length($6) > 120) print length($6)}' access.log | wc -l
  • Space wasted by the characters beyond the first 120: 5.67 MB
awk -F\" '{if (length($6) > 120) SUM += length($6)-120} END {print SUM/1024/1024" MB"}'  access.log
  • Bandwidth wasted per month for this server: 170M…
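The three figures can be gathered in one pass. The sketch below is a rough illustration that assumes the log file covers exactly one day and that a month has 30 days, which seems consistent with the numbers above (5.67 × 30 ≈ 170). The `ua_waste` name and the sample entries are made up for the example.

```shell
#!/bin/sh
# Report how many entries carry a user agent longer than LIMIT characters,
# how many characters exceed the limit, and a 30-day projection.
# Assumption: the log file covers exactly one day.
ua_waste() {
    awk -F'"' -v limit="$2" '
        length($6) > limit { n++; extra += length($6) - limit }
        END {
            printf "entries=%d extra_bytes=%d monthly_bytes=%d\n", n, extra, extra * 30
        }' "$1"
}

# Three made-up entries; a tiny limit of 20 keeps the arithmetic easy to check.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.1 www.domain.com - [15/Sep/2008:00:00:00 +0200] "GET / HTTP/1.1" 200 4242 "-" "ExampleAgent/1.0"
192.0.2.2 www.domain.com - [15/Sep/2008:00:00:01 +0200] "GET / HTTP/1.1" 200 4242 "-" "LongerExampleAgent/2.0 (test)"
192.0.2.3 www.domain.com - [15/Sep/2008:00:00:02 +0200] "GET / HTTP/1.1" 200 4242 "-" "AnotherVeryLongExampleAgent/3.0"
EOF

waste=$(ua_waste "$sample_log" 20)
echo "$waste"
rm -f "$sample_log"
```

On a real log you would call it as `ua_waste access.log 120` and divide the byte counts by 1024 twice to get megabytes.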

This entry was written by CharlyBr, posted on September 22, 2008 at 7:48 am, filed under Logs, Web.

Using Logcheck

Logcheck is a tool that parses system logs and sends summaries by email. It matches each log entry against a database of regular expressions and drops the common, expected entries, so that only the unusual ones are reported.

Do you actually read your log files? Do you have too many servers to keep up? Logcheck helps with this task by eliminating the noise.
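The filtering idea can be illustrated with plain grep: lines matching any pattern in a rule file are dropped, and whatever remains is worth a look. This is only a sketch of the principle, not logcheck's actual code; the log lines, the rule, the `myapp` program and the `filter_noise` name are all made up for illustration.

```shell
#!/bin/sh
# Drop every log line that matches one of the extended regular
# expressions in the rule file; print what is left.
filter_noise() {
    grep -E -v -f "$2" "$1"
}

rules=$(mktemp)
log=$(mktemp)

# One hypothetical rule: routine cron session messages are noise.
cat > "$rules" <<'EOF'
^[A-Z][a-z]{2} +[0-9]+ [0-9:]{8} [a-z0-9.-]+ CRON\[[0-9]+\]: .*session (opened|closed) for user root
EOF

# Made-up syslog lines: two routine entries and one unusual entry.
cat > "$log" <<'EOF'
Sep 16 11:00:01 web1 CRON[4242]: pam_unix(cron:session): session opened for user root
Sep 16 11:00:02 web1 CRON[4242]: pam_unix(cron:session): session closed for user root
Sep 16 11:03:17 web1 myapp: unexpected error: disk quota exceeded
EOF

unusual=$(filter_noise "$log" "$rules")
echo "$unusual"
rm -f "$rules" "$log"
```

Only the `myapp` line survives the filter, which is the kind of entry you would want to land in your mailbox.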

Installing on Debian

# apt-get install logcheck
Reading package lists... Done
Building dependency tree... Done
The following extra packages will be installed:
lockfile-progs logtail
Suggested packages:
syslog-summary
Recommended packages:
logcheck-database
The following NEW packages will be installed:
lockfile-progs logcheck logtail
0 upgraded, 3 newly installed, 0 to remove and 6 not upgraded.
Need to get 110kB of archives.
After unpacking 428kB of additional disk space will be used.
Do you want to continue [Y/n]?

Also install logcheck-database, which contains a large set of rules:

# apt-get install logcheck-database

Config files

  • /etc/logcheck/logcheck.conf
    • SENDMAILTO="root": set this to your email address
  • /etc/logcheck/logcheck.logfiles
    • configure which logfiles to analyze
  • /etc/cron.d/logcheck
    • logcheck cron (by default, logcheck runs every hour)

You can try it by executing the following command:

# su -s /bin/bash -c "/usr/sbin/logcheck" logcheck

If any unusual log entries were found, your mailbox should now contain a report from logcheck.

This entry was written by CharlyBr, posted on September 16, 2008 at 11:22 am, filed under Logs.

Rotate Nginx log files under FreeBSD

To rotate your nginx log files, you can use newsyslog, the log rotation utility that ships with FreeBSD.

Configuring /etc/newsyslog.conf

/var/log/nginx-access.log               644  7     1024 *     JC /var/run/nginx.pid
/var/log/nginx-error.log                644  7     1024 *     JC /var/run/nginx.pid
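For readers unfamiliar with the columns, the sketch below labels each field of the first entry. The field meanings in the comments follow the newsyslog.conf manual page as I understand it, so check them against your FreeBSD version; the `describe_entry` helper is hypothetical and only exists for this illustration.

```shell
#!/bin/sh
# Label the columns of a newsyslog.conf entry that has no owner:group field.
describe_entry() {
    echo "$1" | awk '{
        print "logfile:  " $1    # file to rotate
        print "mode:     " $2    # permissions of the new log file
        print "count:    " $3    # number of archived logs to keep
        print "size:     " $4    # rotate once the file grows past this many kB
        print "when:     " $5    # "*" means no time-based rotation
        print "flags:    " $6    # J = compress with bzip2, C = create if missing
        print "pid file: " $7    # process to signal after rotation
    }'
}

entry='/var/log/nginx-access.log 644 7 1024 * JC /var/run/nginx.pid'
described=$(describe_entry "$entry")
echo "$described"
```

Before relying on a new entry, running `newsyslog -nv` should print what would be rotated without touching the files.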

Before log rotation:

-rw-r--r--  1 root  wheel    104278002 Jul 16 11:35 nginx-access.log
-rw-r--r--  1 root  wheel      1509531 Jul 16 11:17 nginx-error.log

After log rotation:

-rw-r--r--  1 root  wheel        967 Jul 16 12:42 nginx-access.log
-rw-r--r--  1 root  wheel    5310443 Jul 16 12:41 nginx-access.log.0.bz2
-rw-r--r--  1 root  wheel         77 Jul 16 12:41 nginx-error.log
-rw-r--r--  1 root  wheel      37552 Jul 16 12:41 nginx-error.log.0.bz2

This entry was written by CharlyBr, posted on July 17, 2008 at 7:15 am, filed under Logs.

Rotate Apache logs with Cronolog

Cronolog is a log rotation program that offers many options for templating the names of the destination log files. The most common use is to split logs by year, month, and day.

Here is how to configure Apache to send log entries to cronolog:

CustomLog "|/usr/sbin/cronolog /home/log/apache2/%Y-%m-%d_domain.com_access.log" combined

This will create a log file named 2008-06-02_domain.com_access.log for today.

Cronolog reads log entries from standard input and writes them to the output file specified by your template.
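Cronolog templates use strftime-style specifiers, so the `date` command can preview the file name a template would produce right now. This is a minimal sketch, assuming your `date` accepts the same specifiers; `preview_template` is a hypothetical helper and not part of cronolog, and the second template is my own variation that combines the day and the hour.

```shell
#!/bin/sh
# Expand a strftime-style template with the current time.
preview_template() {
    date "+$1"
}

daily=$(preview_template '/home/log/apache2/%Y-%m-%d_domain.com_access.log')
hourly=$(preview_template '/home/log/apache2/%Y-%m-%d_%H_domain.com_access.log')
echo "$daily"
echo "$hourly"
```

Previewing a template this way makes it easy to spot names that would collide, for example a template that contains only the hour.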

More examples

  • Rotate by month:
CustomLog "|/usr/sbin/cronolog /home/log/apache2/%Y-%m_domain.com_access.log" combined
  • Rotate by week number:
CustomLog "|/usr/sbin/cronolog /home/log/apache2/%Y-%W_domain.com_access.log" combined
  • Rotate hourly (this template contains only the hour, so the same 24 file names are reused every day; add %Y-%m-%d to the name if you want to keep days apart):
CustomLog "|/usr/sbin/cronolog /home/log/apache2/%H_domain.com_access.log" combined

This entry was written by CharlyBr, posted on June 4, 2008 at 7:33 am, filed under Logs, http.