Apache/http monitoring: monitor http traffic in realtime using httptop

Server monitoring is a big part of running a solid web site.  As an admin, you must know what is going on on your server.  One of the tools most Linux/Unix admins are used to is “top”, which is a very powerful tool by itself.  Here is a quick guide on how to read output from top:  introduction to load averages under top.  It only makes sense that somebody went and created httptop to monitor http traffic in the same way.

Install the required Perl modules from the CPAN shell (start it with: perl -MCPAN -e shell):

install Term::ReadKey
install File::Tail
install Time::HiRes

Now copy and paste the script below, save it, and set the +x attribute on it so you can execute it.  On my setup, I have the script under /usr/bin/httptop:

#!/usr/bin/perl -w
use Time::HiRes qw( time );
use File::Tail (  );
use Term::ReadKey;
use Getopt::Std;
use strict;
### Defaults you might be interested in adjusting.
my $Update = 2; # update every n secs
my $Backtrack = 250; # backtrack n lines on startup
my @Paths = qw(
%
/title/%/logs/access_log
/var/log/httpd/%/access_log
/usr/local/apache/logs/%/access_log
);
my $Log_Format = "combined";
my %Log_Fields = (
combined => [qw/ Host x x Time URI Response x Referer Client /],
vhost => [qw/ VHost Host x x Time URI Response x Referer Client /]
);
### Constants & other thingies. Nothing to see here. Move along.
my $Version = "0.4.1";
sub by_hits_per (  ) { $b->{Rate} <=> $a->{Rate} }
sub by_total (  ) { $b->{Total} <=> $a->{Total} }
sub by_age (  ) { $a->{Last} <=> $b->{Last} }
my $last_field = "Client";
my $index = "Host";
my $show_help = 0;
my $order = \&by_hits_per;
my $Help = "htlwufd?q";
my %Keys = (
h => [ "Order by hits/second" => sub { $order = \&by_hits_per } ],
t => [ "Order by total recorded hits" => sub { $order = \&by_total } ],
l => [ "Order by most recent hits" => sub { $order = \&by_age } ],
w => [ "Show remote host" => sub { $index = "Host" } ],
u => [ "Show requested URI" => sub { $index = "URI" } ],
f => [ "Show referring URL" => sub { $index = "Referer" } ],
d => [ "Show referring domain" => sub { $index = "Domain" } ],
'?' => [ "Help (this thing here)" => sub { $show_help++ } ],
q => [ "Quit" => sub { exit } ]
);
my @Display_Fields = qw/ Host Date URI Response Client Referer Domain /;
my @Record_Fields = qw/ Host URI Referer Domain /;
my $Max_Index_Width = 50;
my $Initial_TTL = 50;
my @Months = qw/ Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec /; # indexed by localtime's 0-based month
my %Term = (
HOME => "\033[H",
CLS => "\033[2J",
START_TITLE => "\033]0;", # for xterms etc.
END_TITLE => "\007",
START_RV => "\033[7m",
END_RV => "\033[m"
);
my ( %hist, %opt, $spec );
$SIG{INT} = sub { exit };
END { ReadMode 0 };
### Subs.
sub refresh_output
{
my ( $cols, $rows ) = GetTerminalSize;
my $show = $rows - 3;
my $count = $show;
my $now = (shift || time);
for my $type ( values %hist ) {
for my $peer ( values %$type ) {
# if ( --$peer->{_Ttl} > 0 ) {
my $delta = $now - $peer->{Start};
if ( $delta >= 1 ) {
$peer->{ Rate } = $peer->{ Total } / $delta;
} else {
$peer->{ Rate } = 0
}
$peer->{ Last } = int( $now - $peer->{ Date } );
# } else {
# delete $type->{$peer}
# }
}
}
$count = scalar( values %{$hist{$index}} ) - 1 if $show >= scalar values %{$hist{$index}};
my @list = ( sort $order values %{$hist{$index}} )[ 0 .. $count ];
my $first = 0;
$first = ( $first <= $_ ? $_ + 1 : $first ) for map { $_ ? length($_->{$index}) : 0 } @list;
$first = $Max_Index_Width if $Max_Index_Width < $first;
print $Term{START_TITLE}, "Monitoring $spec at: ", scalar localtime, $Term{END_TITLE} if $ENV{TERM} eq "xterm"; # UGLY!!!
my $help = "Help/?";
my $head = sprintf( "%-${first}s %6s %4s %4s %s (%d total)",
$index, qw{ Hits/s Tot Last }, $last_field,
scalar keys %{$hist{$index}}
);
#
# Truncate status line if need be
#
$head = substr($head, 0, ($cols - length($help)));
print @Term{"HOME", "START_RV"}, $head, " " x ($cols - length($head) - length($help)), $help, $Term{END_RV}, "\n";
for ( @list ) {
# $_->{_Ttl}++;
my $line = sprintf( "%-${first}s %6.3f %4d %3d %s",
substr( $_->{$index}, 0, $Max_Index_Width ), @$_{(qw{ Rate Total Last }, $last_field)} );
if ( length($line) > $cols ) {
substr( $line, $cols - 1 ) = "";
} else {
$line .= " " x ($cols - length($line));
}
print $line, "\n";
}
print " " x $cols, "\n" while $count++ < $show;
}
sub process_line
{
my $line = shift;
my $now = ( shift || time );
my %hit;
chomp $line;
@hit{@{$Log_Fields{$Log_Format}}} = grep( $_, split( /"([^"]+)"|\[([^]]+)\]|\s/o, $line ) );
$hit{ URI } =~ s/HTTP\/1\S+//gos;
$hit{ Referer } = "<unknown>" if not $hit{Referer} or $hit{Referer} eq "-";
( $hit{Domain} = $hit{Referer} ) =~ s#^\w+://([^/]+).*$#$1#os;
$hit{ Client } ||= "<none>";
$hit{ Client } =~ s/Mozilla\/[\w.]+ \(compatible; /(/gos;
$hit{ Client } =~ s/[^\x20-\x7f]//gos;
# if $now is negative, try to guess how old the hit is based on the time stamp.
if ( $now < 0 ) {
my @hit_t = ( split( m![:/\s]!o, $hit{ Time } ))[ 0 .. 5 ];
my @now_t = ( localtime )[ 3, 4, 5, 2, 1, 0 ];
my @mag = ( 3600, 60, 1 );
# If the hit didn't parse right, or didn't happen today, the hell with it.
return unless $hit_t[2] == ( $now_t[2] + 1900 )
and $hit_t[1] eq $Months[ $now_t[1] ]
and $hit_t[0] == $now_t[0];
splice( @hit_t, 0, 3 );
splice( @now_t, 0, 3 );
# Work backward to the UNIX time of the hit.
$now = time;
$now -= (shift( @now_t ) - shift( @hit_t )) * $_ for ( 3600, 60, 1 );
}
$hit{ Date } = $now;
for my $field ( @Record_Fields ) {
my $peer = ( $hist{$field}{$hit{$field}} ||= { Start => $now, _Ttl => $Initial_TTL } );
@$peer{ @Display_Fields } = @hit{ @Display_Fields };
$peer->{ Total }++;
}
}
sub display_help {
my $msg = "httptop v.$Version";
print @Term{qw/ HOME CLS START_RV /}, $msg, $Term{END_RV}, "\n\n";
print " " x 4, $_, " " x 8, $Keys{$_}[0], "\n" for ( split "", $Help );
print "\nPress any key to continue.\n";
}
### Init.
getopt( 'frb' => \%opt );
$Backtrack = $opt{b} if $opt{b};
$Update = $opt{r} if $opt{r};
$Log_Format = $opt{f} if $opt{f};
$spec = $ARGV[0];
die <<End unless $spec and $Log_Fields{$Log_Format};
Usage: $0 [-f <format>] [-r <refresh_secs>] [-b <backtrack_lines>] <logdir | path_to_log>
Valid formats are: @{[ join ", ", keys %Log_Fields ]}.
End
for ( @Paths ) {
last if -r $spec;
( $spec = $_ ) =~ s/%/$ARGV[0]/gos;
}
die "No access_log $ARGV[0] found.\n" unless -r $spec;
my $file = File::Tail->new(
name => $spec,
interval => $Update / 2,
maxinterval => $Update,
tail => $Backtrack,
nowait => 1
) or die "$spec: $!";
my $last_update = time;
my ( $line, $now );
# Backtracking.
while ( $Backtrack-- > 0 ) {
last unless $line = $file->read;
process_line( $line, -1 );
}
$file->nowait( 0 );
ReadMode 4; # Echo off.
print @Term{"HOME", "CLS"}; # Home & clear.
refresh_output;
### Main loop.
while (defined( $line = $file->read )) {
$now = time;
process_line( $line, $now );
while ( $line = lc ReadKey(-1) ) {
$show_help = 0 if $show_help;
$Keys{$line}[1]->(  ) if $Keys{$line};
}
if ( $show_help == 1 ) {
display_help;
$show_help++; # Don't display help again.
} elsif ( $now - $last_update > $Update and not $show_help ) {
$last_update = $now;
refresh_output( $now );
}
}

Save and exit, then make the script executable (chmod +x httptop).

Now you can run httptop by typing:  httptop -f combined -r 1 /usr/local/apache2/logs/access_log

NOTE:  Your access_log file might be in a different location, so point to the right one.  The -r 1 option sets the refresh rate to 1 second.  Now you can run httptop any time you want to check out how your http traffic is doing.  Remember to press “?” to get help once you are in.
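To make the script's bookkeeping easier to follow, here is a minimal Python sketch of what it does for each log line: split a combined-format entry into fields, then compute hits per second per remote host as total hits divided by the seconds since that host was first seen (0 if less than one second has passed). The regular expression and the names parse_line, hits_per_second and SAMPLE are illustrative assumptions, not part of httptop itself.

```python
import re
from collections import defaultdict

# Combined Log Format: host ident user [time] "request" status bytes "referer" "agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one combined-format line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def hits_per_second(timed_lines, now):
    """Compute {host: rate} from an iterable of (arrival_time, line) pairs.

    Same rule as the Perl script: rate = total / (now - first_seen),
    or 0 when less than one second has elapsed.
    """
    first_seen = {}
    total = defaultdict(int)
    for seen_at, line in timed_lines:
        hit = parse_line(line)
        if hit is None:
            continue  # skip lines that are not in combined format
        first_seen.setdefault(hit["host"], seen_at)
        total[hit["host"]] += 1

    rates = {}
    for host, count in total.items():
        delta = now - first_seen[host]
        rates[host] = count / delta if delta >= 1 else 0.0
    return rates


# Hypothetical sample entry, made up for illustration.
SAMPLE = ('10.0.0.1 - - [10/Oct/2008:13:55:36 -0700] '
          '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"')
```

Feeding the sample line in four times at one-second intervals and asking for the rates four seconds after the first hit gives that host a rate of 1.0 hits/s, which is the number httptop would show in its Hits/s column.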

—————

DISCLAIMER: As always, if you find any inaccurate information, please comment and let me know. When you do comment, make sure you give me some references to confirm.

MySQL: How do you enable sphinxse (Sphinx Storage Engine) in your mysql installation?

As you may know, mysql fulltext search is not highly scalable.  One of the options to get around this scalability limitation, and the one I prefer, is to use Sphinx.  You can use Sphinx without having to alter your mysql installation.  But if you would like to use it from within mysql and not have to worry about how to pass data between Sphinx and MySQL, you can enable sphinxse (sphinx storage engine).  It is not included with mysql by default, so you will have to compile it yourself.

Here are the instructions for compiling sphinxse into your mysql installation on CentOS x64.  The same instructions will probably work for other flavors, but I have not tested them.  I will be compiling the most current version of sphinx (0.9.8) with the most current stable version of mysql (5.0.51b) at the time of writing.  Let’s get the appropriate packages first:

wget http://www.sphinxsearch.com/downloads/sphinx-0.9.8.tar.gz
wget http://dev.mysql.com/get/Downloads/MySQL-5.0/mysql-5.0.51b.tar.gz/from/http://mysql.he.net/
tar zxpf sphinx*
tar zxpf mysql*

You will also need “bison”, “patch”, “automake” and “libtool” installed.  Let us just do a yum install for it.

yum -y install bison patch automake libtool

NOTE:  if you don’t install bison, you will get the following error:
sed '/^#/ s|y\.tab\.c|sql_yacc.cc|' y.tab.c >sql_yacc.cct && mv sql_yacc.cct sql_yacc.cc
sed: can't read y.tab.c: No such file or directory
make[2]: *** [sql_yacc.cc] Error 2

Let us continue with patching mysql source with sphinx storage engine (sphinxse) code and compile/install our new binaries.

cd mysql*
patch -p1 < ../sphinx-0.9.8/mysqlse/sphinx.5.0.37.diff #Make sure everything succeeded.
BUILD/autorun.sh
mkdir sql/sphinx
cp ../sphinx-0.9.8/mysqlse/* sql/sphinx
./configure --prefix=/usr/local/mysql --with-sphinx-storage-engine
make
make install

Now start your mysql installation and check that engine support is compiled in:

mysql> show engines\G
Engine: SPHINX
Support: YES
Comment: Sphinx storage engine 0.9.8

To read more about how to use Sphinx storage engine, please refer to:  Sphinx documentation for using sphinx storage engine

————————————-
DISCLAIMER: Please be smart and use code found on internet carefully. Make backups often. And yeah.. last but not least.. I am not responsible for any damage caused by this posting. Use at your own risk.

Linux: yum options you may not know exist.

Most users of distributions such as CentOS, Fedora, and Red Hat use yum as their package installer/updater. Most of them know “yum update [packagename]” (to update everything, or only certain packages) and “yum install packagename” (to install certain packages). But yum can do so much more. Here are some options you may find useful:

The following command searches for the string you specify. Generally this gives you all of the packages which have the string in their title or description. Most of the time you will have to look through a lot of output to find what you are looking for.

yum search string

Probably one of the most important options for yum is provides/whatprovides. If you know what command you need, you can find out what package you have to install in order to have that command available to you.

yum provides (or whatprovides) command

The following command is the same as above but with less output.

yum -d 1 provides command

So for example if you are trying to figure out what you need to install to use bunzip2, type:

yum -d 1 provides bunzip2

You will get output similar to the following.

# yum -d 1 provides bunzip2
bzip2.x86_64 1.0.2-13.EL4.3 base
bzip2.x86_64 1.0.2-13.EL4.3 base
man-pages-fr.noarch 0.9.7-13.el4 base
man-pages-ja.noarch 20050215-2.EL4.3 base
man-pages-pl.noarch 0.23-5 base

As you can see, bunzip2 is part of the bzip2 package. So now you can just install bzip2.x86_64 to get bunzip2.
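If you want to pick the candidate packages out of that output in a script, a minimal Python sketch could look like the one below. It assumes the three-column "name.arch version repo" layout of the sample output above; other yum versions may print something different. The function name packages_from_provides is made up for illustration.

```python
def packages_from_provides(output):
    """Return unique 'name.arch' entries, in first-seen order.

    Assumes each result line has exactly three whitespace-separated
    columns (package, version, repository), as in the sample output.
    """
    seen = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the echoed command line, and anything else
        # that does not have the three expected columns.
        if len(fields) != 3 or fields[0].startswith("#"):
            continue
        if fields[0] not in seen:
            seen.append(fields[0])
    return seen


# The sample output from above, pasted in as a string.
SAMPLE_OUTPUT = """\
# yum -d 1 provides bunzip2
bzip2.x86_64 1.0.2-13.EL4.3 base
bzip2.x86_64 1.0.2-13.EL4.3 base
man-pages-fr.noarch 0.9.7-13.el4 base
man-pages-ja.noarch 20050215-2.EL4.3 base
man-pages-pl.noarch 0.23-5 base
"""

if __name__ == "__main__":
    for package in packages_from_provides(SAMPLE_OUTPUT):
        print(package)
```

The man-pages entries still show up because they ship the bunzip2 manual page, so you still have to choose the real package (bzip2) yourself.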

To learn more about what else is available, read man yum.

MySQL: Fix Microsoft Word characters. Shows weird characters on the web page.

As a consultant, I do a lot of content migrations for clients. One issue I run into quite often is that the encoding of databases, tables, and columns differs between source and destination. Most clients do not want me to change their encoding to fix this, since they are afraid of messing with production data. Among other issues, this produces weird characters for data that was copied and pasted from Microsoft Word. You see weird characters like: ’ … – “ †‘

So if you just want to replace these with appropriate symbols, you can do it with a few simple sql queries. Note that the queries below have no where clause, so you may want to test them on one of your rows before making changes to the whole table. Of course, you should always back up your data before you try this out. If you have a dev system, that is even better. I put all my sql queries into a file, e.g. fix.sql, and sourced it with the mysql client.

vi fix.sql

update table_name set fieldname = replace(fieldname, '’', '\'');
update table_name set fieldname = replace(fieldname, '…','...');
update table_name set fieldname = replace(fieldname, '–','-');
update table_name set fieldname = replace(fieldname, '“','"');
update table_name set fieldname = replace(fieldname, '”','"');
update table_name set fieldname = replace(fieldname, '‘','\'');
update table_name set fieldname = replace(fieldname, '•','-');
update table_name set fieldname = replace(fieldname, '‡','c');

Save/exit.

# mysql
mysql> source fix.sql;

I am not sure if I am missing any other chars. If you know of any other chars, please comment with them and I will add on to the script here.
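If you would rather rehearse the substitutions outside the database first, for example on an exported sample of rows, here is a minimal Python sketch. The mapping is copied from the queries above, including the last one. The names REPLACEMENTS and fix_word_chars are made up for illustration.

```python
# Mapping mirrors the replace() calls in fix.sql above.
REPLACEMENTS = {
    "\u2019": "'",    # right single quotation mark
    "\u2026": "...",  # horizontal ellipsis
    "\u2013": "-",    # en dash
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2018": "'",    # left single quotation mark
    "\u2022": "-",    # bullet
    "\u2021": "c",    # double dagger, mapped to 'c' as in the queries above
}

# str.maketrans accepts a dict of single characters to replacement strings.
TABLE = str.maketrans(REPLACEMENTS)


def fix_word_chars(text):
    """Return text with the Word-style punctuation replaced by plain ASCII."""
    return text.translate(TABLE)


if __name__ == "__main__":
    print(fix_word_chars("It\u2019s \u201cfine\u201d \u2013 really\u2026"))
```

Running the function over a handful of real values and eyeballing the result is a cheap way to confirm the mapping does what you expect before you source fix.sql against a whole table.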


Subversion: What to do when your repository server moves to another ip?

This weekend our networking guys decided to change the IPs of all of our servers, including our subversion server. This caused some issues in the subversion world for developers whose checkouts pointed to the IP instead of the hostname, created with a command similar to:

svn co svn+ssh://192.168.1.10/svn/myrepos/ /home/mycheckout/

Now when they do “svn update” inside their /home/mycheckout/ directory, they get an error.

We needed to point the checkout to the new IP. The easiest way to do this is to delete your checkout and check out again. Unfortunately, some of the developers had a lot of modified files which weren’t checked in yet. I fixed it by issuing:

find /home/mycheckout -name "entries" | xargs /usr/bin/perl -w -i -p -e "s/192\.168\.1\.10/10.1.1.10/g"

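The find command locates all the files named “entries”, and xargs takes the filenames and passes them to perl. Perl's -i option edits each file in place, -p runs the expression on every line and prints the result, and -e supplies the expression itself. To understand more about what the perl command is doing, see this post.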
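For anyone who prefers a script that keeps backups, here is a minimal Python sketch of the same idea. It only touches files named entries that sit directly inside a .svn directory, does a literal (non-regex) replacement, and leaves a .bak copy next to each file it changes. The function name rewrite_entries is made up for illustration. Note that this only applies to working copies that store the repository URL in .svn/entries, as older Subversion clients do. As far as I know, Subversion 1.7 and later keep this metadata in .svn/wc.db instead.

```python
import os
import shutil


def rewrite_entries(checkout_root, old_host, new_host):
    """Replace old_host with new_host in every .svn/entries file.

    Returns the list of files that were changed. A copy of each original
    is kept as entries.bak in the same directory.
    """
    old, new = old_host.encode(), new_host.encode()
    changed = []
    for dirpath, _dirnames, filenames in os.walk(checkout_root):
        if os.path.basename(dirpath) != ".svn" or "entries" not in filenames:
            continue
        path = os.path.join(dirpath, "entries")
        with open(path, "rb") as fh:
            data = fh.read()
        if old not in data:
            continue
        shutil.copy2(path, path + ".bak")   # backup of the original
        tmp = path + ".tmp"
        with open(tmp, "wb") as fh:
            fh.write(data.replace(old, new))
        shutil.copymode(path, tmp)          # entries files are often read-only
        os.replace(tmp, path)               # swap the new file into place
        changed.append(path)
    return changed


if __name__ == "__main__":
    for path in rewrite_entries("/home/mycheckout", "192.168.1.10", "10.1.1.10"):
        print("updated", path)
```

Writing to a temporary file and renaming it over the original avoids having to open a read-only entries file for writing, which is roughly what perl -i does behind the scenes.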

Another method, which may be preferred as mentioned in the comments, is svn switch. The only downside I see with this is that you have to remember what you used originally. If you did the checkout as [email protected], you would have to pass that to the command below.

Syntax is:

svn switch --relocate svn+ssh://192.168.1.10 svn+ssh://10.1.1.10

I would suggest that at this point you switch to using the hostname instead of the IP.