Centralized Backup Script

Hello There!

I thought I’d share a backup script that was written to consolidate backups onto one server instead of spreading the backup process across several servers. The advantage of consolidating onto one server is fairly obvious: editing or making changes is much easier when you only have one script to maintain.

This approach is ideal for environments with roughly 15-20 servers or fewer. For anything beyond that number I’d recommend a complete end-to-end backup solution such as Bacula.

The bash shell script pasted below is very straightforward and takes at least two arguments. The first is the hostname or IP address of the server you are backing up. The remaining (and potentially unlimited) arguments are single-quoted folder paths that you want backed up.

This script depends on SSH key based authentication being enabled and implemented for the root user on the servers being backed up. Security can be tightened further with IP based restrictions, either in the SSH configuration, the firewall configuration or elsewhere.
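If key based authentication is not already in place, a minimal sketch of the usual setup, run from the backup server (the destination hostname below is just a placeholder), looks like this :

# Generate a key pair for root on the backup server (no passphrase, for unattended runs)
ssh-keygen -t rsa -b 4096

# Install the public key for root on each destination server
ssh-copy-id root@destination-server.example.com

# Confirm that a password-less login now works
ssh root@destination-server.example.com "hostname"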

#!/bin/sh
# Offsite Backup script
# Written by www.stardothosting.com
# Dynamic backup script

currentmonth=`date "+%Y-%m-%d %H:%M:%S"`
currentdate=`date "+%Y-%m-%d_%H_%M_%S"`
backup_email="backups@youremail.com"
backupserver="origin-backup-server.hostname.com"

# Check User Input
if [ "$#" -lt 2 ]
then
        echo -e "\n\nUsage Syntax :"
        echo -e "./backup.sh [hostname] [folder1] [folder2] [folder3]"
        echo -e "Example : ./backup.sh your-server.com '/etc' '/usr/local/www' '/var/lib/mysql'\n\n"
        exit 1
fi

# get the server's hostname
host_name=`ssh -l root $1 "hostname"`
echo "Host name : $host_name"
if [ "$host_name" == "" ]
then
        host_name="unknown_$currentdate"
fi

echo "$host_name Offsite Backup Report: " $currentmonth > /var/log/backup.log
echo -e "----------------------------------------------------------" >> /var/log/backup.log
echo -e "" >> /var/log/backup.log

# Ensure permissions are correct
chown -R backups:backups /home/fileserver/backups/
ls -d /home/fileserver/backups/* | grep -Ev "\.ssh|\.bash" | xargs -d "\n" chmod -R 775

# iterate over user arguments & set error level to 0
errors=0
for arg in "${@:2}"
do
        # check if receiving directory exists
        if [ ! -d "/home/fileserver/backups/$host_name" ]
        then
                mkdir /home/fileserver/backups/$host_name
        fi
        sanitize=`echo $arg | sed 's/\/*$//'`        # strip any trailing slashes
        sanitize_dir=`echo $arg | awk -F '/' '{printf "%s", $2}'`
        /usr/bin/ssh -o ServerAliveInterval=1 -o TCPKeepAlive=yes -l root $1 "/usr/bin/rsync -ravzp --progress --exclude 'clam_quarantinedir' $sanitize/ backups@$backupserver:/home/fileserver/backups/$host_name/$sanitize_dir; echo $? > /tmp/bu_rlevel.txt" >> /var/log/backup.log 2>&1
        echo "/usr/bin/ssh -o ServerAliveInterval=1 -o TCPKeepAlive=yes -l root $1 \"/usr/bin/rsync -ravzp --progress --exclude 'clam_quarantinedir' $sanitize/ backups@$backupserver:/home/fileserver/backups/$host_name/$sanitize_dir\""

        runlevel=`ssh -l root $1 "cat /tmp/bu_rlevel.txt"`
        echo "Runlevel : $runlevel"

        if [ "$runlevel" -ge 1 ]
        then
                errors=$((errors+1))
        else
                echo -e "Script Backup for $arg Completed Successfully!" >> /var/log/backup.log 2>&1
        fi

done


# Check error level
if [ $errors -ge 1 ]
then
        echo -e "There were some errors in the backup job for $host_name, please investigate" >> /var/log/backup.log 2>&1
        cat /var/log/backup.log | mail -s "$host_name Backup Job failed" $backup_email
else
        cat /var/log/backup.log | mail -s "$host_name Backup Job Completed" $backup_email
fi

It should be explained further that this script connects to the destination server as the root user, using SSH key authentication. It then initiates a remote rsync command on the destination server back to the backup server as a dedicated backup user (called “backups” in the script above). That means that not only does the SSH key need to be installed for root on the destination servers, but the backup user also needs to exist on the backup server itself, with the SSH keys of all the destination servers installed so the remote rsync can authenticate.

Hopefully I did not over complicate this, because it really is quite simple :

Backup Server -> root -> destination server to backup -> backup user rsync -> Backup Server

Once you have implemented the script and done a few dry-run tests, you can set up a scheduled task for each destination server. Here is an example cron entry for one server to be backed up :

01 1 * * * /bin/sh /usr/local/bin/backups.sh destination-server-hostname '/etc' '/usr/local/www' '/home/automysql-backups'

SVN Offsite Backup Script : Secure offsite backup solution for SVN to Amazon S3

Hi there!

Backing up your code repository is important. Backing up your code repository to an off-site location in a secure manner is imperative. Throughout our travels and experience utilizing the SVN code repository system, we have developed a quick bash script to export the entire SVN repository, encrypt it, compress it into an archive, and then ship it (over an encrypted network connection) to Amazon S3 storage.

We will be using the (familiar) s3sync Ruby script to do the actual transport to Amazon S3, which you can find here.

Note that this script also keeps a local copy of the backups, taken each day, with a maximum of 7 days of retention. This might be redundant since all revisions are kept within SVN itself, but I thought it would provide an additional layer of backup redundancy. The script can easily be modified to keep only a single backup file, overwriting the older copy every night.

Here’s the script :

#!/bin/sh
# SVN Off Site Backup script
# www.stardothosting.com

currentmonth=`date "+%Y-%m-%d %H:%M:%S"`
threedays=`date -v-5d "+%Y-%m-%d"`
todaysdate=`date "+%Y-%m-%d"`

export AWS_ACCESS_KEY_ID="YOUR-S3-KEY-ID"
export AWS_SECRET_ACCESS_KEY="YOUR-S3-ACCESS-KEY"


echo "SVN Offsite Backup Log: " $currentmonth > /var/log/offsite-backup.log
echo -e "--------------------------------------------" >> /var/log/offsite-backup.log
echo -e "" >> /var/log/offsite-backup.log

# Archive Repository Dump Files and remove files older than 7 days
/usr/bin/find /subversion/svn_backups -type f -mtime +7 -delete

# Backup SVN and encrypt it
svnadmin dump /subversion/repo_name | /usr/bin/openssl enc -aes-256-cbc -pass pass:YOUR-ENCRYPTION-PASSWORD -e > /subversion/svn_backups/repo_name-backup-$todaysdate.enc

#fyi to decrypt :
#openssl aes-256-cbc -d -pass pass:YOUR-ENCRYPTION-PASSWORD -in repo_name-backup.enc -out decrypted.dump

# Transfer the files to Amazon S3 Storage via HTTPS
/usr/local/bin/ruby /usr/local/bin/s3sync/s3sync.rb --ssl -v --delete -r /subversion/svn_backups S3_BACKUPS:svn/svnbackup >> /var/log/offsite-backup.log 2>&1

if [ "$?" -ne 0 ]
then
        echo -e "***SVN OFFSITE BACKUP JOB, THERE WERE ERRORS***" >> /var/log/offsite-backup.log 2>&1
        cat /var/log/offsite-backup.log | mail -s "SVN Offsite Backup Job failed" your@email.com
        exit 1
else
        echo -e "Script Completed Successfully!" >> /var/log/offsite-backup.log 2>&1
        cat /var/log/offsite-backup.log | mail -s "SVN Offsite Backup Job Completed" your@email.com
        exit 0
fi

Note how I have provided an example, commented out within the script, of how you can go about decrypting the encrypted SVN dump file. You can also modify this script to back up to any offsite location, obviously; just remove the s3sync related entries and replace them with rsync or your preferred transport method, as in the rough sketch below.
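For example, swapping the s3sync call for a plain rsync push over SSH might look roughly like this (the remote host, user and destination path are placeholders) :

# Ship the encrypted dump files to a remote host over SSH instead of S3
/usr/bin/rsync -avz -e ssh --delete /subversion/svn_backups/ backupuser@offsite-host.example.com:/backups/svn/ >> /var/log/offsite-backup.log 2>&1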

I hope this makes your life easier!

SVN Pre Commit Hook : Sanitize your Code!

Hello,

Dealing with several different development environments can be tricky. With SVN specifically, it is ideal to have some “pre-flight” checks in order to make sure some basic standards have been followed.

Some of the things you would want to check might be :

– Does the code generate a fatal PHP error?
– Are there any syntax errors?
– Have valid commit messages been attached to the code commit?

I thought I’d share the pre-commit hook from one of our SVN code repositories so you can use it and perhaps expand it with many more checks; additional checks specific to your code environment might benefit you. Feel free to share if improvements are made!

#!/bin/bash
# pre-commit hooks
# www.stardothosting.com

REPOS="$1"
TXN="$2"

PHP="/usr/bin/php"
SVNLOOK="/usr/bin/svnlook"
AWK="/usr/bin/awk"
GREP="/bin/egrep"
SED="/bin/sed"

# Make sure that the commit message is not empty
SVNLOOKOK=1
$SVNLOOK log -t "$TXN" "$REPOS" | grep "[a-zA-Z0-9]" > /dev/null || SVNLOOKOK=0

if [ $SVNLOOKOK = 0 ]; then
        echo -e "Empty commit messages are not allowed. Please provide a descriptive comment when committing code." 1>&2
        exit 1
fi

# Make sure the commit message is more than 5 characters long.
LOGMSG=$($SVNLOOK log -t "$TXN" "$REPOS" | grep "[a-zA-Z0-9]" | wc -c)

if [ "$LOGMSG" -le 5 ]; then
        echo -e "Please provide a verbose comment when committing changes." 1>&2
        exit 1
fi


# Check for PHP parse errors
CHANGED=`$SVNLOOK changed -t "$TXN" "$REPOS" | $GREP "^[UA]" | $AWK '{print $2}' | $GREP "\.php$"`

for FILE in $CHANGED
do
    MESSAGE=`$SVNLOOK cat -t "$TXN" "$REPOS" "$FILE" | $PHP -l`
    if [ $? -ne 0 ]
    then
        echo 1>&2
        echo "-----------------------------------" 1>&2
        echo "PHP error in: $FILE:" 1>&2
        echo `echo "$MESSAGE" | $SED "s| -| $FILE|g"` 1>&2
        echo "-----------------------------------" 1>&2
        exit 1
    fi
done

exit 0
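To activate the hook, save the script as pre-commit inside the hooks directory of the repository on the SVN server and make it executable (the repository path here is just an example) :

cp pre-commit /var/svn/repos/myproject/hooks/pre-commit
chmod +x /var/svn/repos/myproject/hooks/pre-commit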

Add your Dynamic IPs to Apache HTACCESS files

Hello!

We threw together a quick & simple script that dynamically updates your .htaccess files within Apache, adding your dynamic IP address to the allow / deny fields.

If you’re looking to password protect an admin area (for example) but your office only has a dynamic IP address, then this script might be handy for you.

It’s an extremely simple script that polls your dynamic hostname (if you use no-ip.org or dyndns.org, for example) every 15 minutes as a cron job and, if it has changed, updates the .htaccess file.

Hopefully it will make your life just a little bit easier 🙂

Sample Cron entry :

*/15 * * * * /bin/sh /usr/local/bin/htaccessdynamic.sh yourhostname.dyndns.org /var/www/website.com/public_html/.htaccess > /dev/null 2>&1

And now the script :

#!/bin/bash
# Dynamic IP .htaccess file generator
# Written by Star Dot Hosting
# www.stardothosting.com

dynDomain="$1"
htaccessLoc="$2"

dynIP=$(/usr/bin/dig +short $dynDomain)

echo "dynip: $dynIP"
# verify dynIP resembles an IP
if ! echo -n $dynIP | grep -Eq "^[0-9]{1,3}(\.[0-9]{1,3}){3}$"; then
    exit 1
fi

# if dynIP has changed
if ! cat $htaccessLoc | /bin/grep -q "$dynIP"; then

        # grab the old IP
        oldIP=`cat /usr/local/bin/htold-ip.txt`

        # output .htaccess file
        echo "order deny,allow" > $htaccessLoc 2>&1
        echo "allow from $dynIP" >> $htaccessLoc 2>&1
        echo "allow from x.x.x.x" >> $htaccessLoc 2>&1
        echo "deny from all" >> $htaccessLoc 2>&1

        # save the new ip to remove next time it changes, overwriting previous old IP
        echo $dynIP > /usr/local/bin/htold-ip.txt
fi
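One caveat : the Order / Allow / Deny directives written above are Apache 2.2 syntax. If you are running Apache 2.4, the echo block would need to emit the newer Require directives instead; a rough equivalent would be :

        # Apache 2.4 equivalent of the access block written above
        echo "Require ip $dynIP" > $htaccessLoc 2>&1
        echo "Require ip x.x.x.x" >> $htaccessLoc 2>&1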

Automated Amazon EBS snapshot backup script with 7 day retention

Hello there!

We have recently been implementing several different backup strategies for properties that reside on the Amazon cloud platform.

These strategies include scripts that incorporate s3sync and s3fs for offsite or redundant “limitless” backup storage capabilities. One of the more recent strategies we have implemented for several clients is an automated Amazon EBS volume snapshot script that only keeps 7 day retention on all snapshot backups.

The script itself is fairly straightforward, but it took several dry runs to fine tune it so that it would reliably create the snapshots and, more importantly, reliably clear out snapshots older than 7 days.

You can see the for loop for deleting older snapshots. It works by parsing each snapshot’s date, converting the date to a pure numeric (epoch) value and comparing that value to a “7 days ago” date variable.

Take a look at the script below, hopefully it will be useful to you! There could be more error checking, but that should be fairly easy to do.

#!/bin/sh
# EBS Snapshot volume script
# Written by Star Dot Hosting
# www.stardothosting.com

# Constants
ec2_bin="/opt/aws/bin"
my_cert="/opt/aws/cert.txt"
my_key="/opt/aws/key.txt"
instance_id=`wget -q -O- http://169.254.169.254/latest/meta-data/instance-id`

# Dates
datecheck_7d=`date +%Y-%m-%d --date '7 days ago'`
datecheck_s_7d=`date --date="$datecheck_7d" +%s`

# Get all volume info and copy to temp file
$ec2_bin/ec2-describe-volumes -C $my_cert -K $my_key  --filter "attachment.instance-id=$instance_id" > /tmp/volume_info.txt 2>&1


# Get all snapshot info
$ec2_bin/ec2-describe-snapshots -C $my_cert -K $my_key | grep "$instance_id" > /tmp/snap_info.txt 2>&1

# Loop to remove any snapshots older than 7 days
for obj0 in $(cat /tmp/snap_info.txt)
do

        snapshot_name=`cat /tmp/snap_info.txt | grep "$obj0" | awk '{print $2}'`
        datecheck_old=`cat /tmp/snap_info.txt | grep "$snapshot_name" | awk '{print $5}' | awk -F "T" '{printf "%s\n", $1}'`
        datecheck_s_old=`date "--date=$datecheck_old" +%s`

#       echo "snapshot name: $snapshot_name"
#       echo "datecheck 7d : $datecheck_7d"
#       echo "datecheck 7d s : $datecheck_s_7d"
#       echo "datecheck old : $datecheck_old"
#       echo "datecheck old s: $datecheck_s_old"

        if [ "$datecheck_s_old" -le "$datecheck_s_7d" ]
        then
                echo "deleting snapshot $snapshot_name ..."
                $ec2_bin/ec2-delete-snapshot -C $my_cert -K $my_key $snapshot_name
        else
                echo "not deleting snapshot $snapshot_name ..."

        fi

done


# Create snapshot
for volume in $(cat /tmp/volume_info.txt | grep "VOLUME" | awk '{print $2}')
do
        description="`hostname`_backup-`date +%Y-%m-%d`"
        echo "Creating Snapshot for the volume: $volume with description: $description"
        $ec2_bin/ec2-create-snapshot -C $my_cert -K $my_key -d $description $volume
done
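A nightly cron entry for this script might look something like the following (the path and time are just examples) :

0 2 * * * /bin/sh /usr/local/bin/ebs-snapshot.sh > /dev/null 2>&1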

Massive Amazon Route53 API Bind Zone Import Script

Hello there,

Occasionally some of our managed services work has us dealing directly with other cloud providers such as Amazon. One of our clients set a requirement to migrate over 5,000 domains to Amazon’s Route53 DNS service.

There was little doubt that this could be automated, but since we had never done a deployment of this scale through Amazon’s API directly, we thought it might be interesting to post the process as well as the script through which we managed the import.

Essentially the script utilizes a master domain name list file as the basis for looping through the import. The master list refers to the BIND zone files, which are imported into Amazon’s Route53 via the cli53 tool.
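To illustrate the layout the script expects (the domain names are hypothetical; the paths follow the conventions used in the script below), a master list passed as domains.txt would look like this :

# master list : one domain per line
cat domains.txt
example.com
example.net

# matching BIND zone files, plus a complete/ folder that finished imports are moved into
ls /usr/local/zones/domains.txt-zones/
complete  example.com.txt  example.net.txt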

One final note, the script outputs all completed domain imports into a CSV file with the following format :

domain.com,ns1.nameserver.com,ns2.nameserver.com,ns3.nameserver.com,ns4.nameserver.com

This is because Route53 assigns a randomly generated set of nameservers to each domain when it is imported, so the script has to keep track of these nameserver/domain associations for the eventual nameserver change request at the registrar.

The script isn’t perfect and could benefit from some optimizations and more error checking (it does a lot of error checking already, however), but here it is in its entirety. We hope you will have some use for it!

#!/bin/sh
# Import all zone files into amazon
# Star Dot Hosting 2012
# www.stardothosting.com

currentmonth=`date "+%Y-%m-%d"`

#sanitize input and verify input was given
command=`echo "$1" | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'`

if [ -z "$1" ];
then
        echo "AWS ZONE IMPORT"
        echo "---------------"
        echo ""
        echo "Usage : ./importzone.sh file.txt"
        echo ""
        exit 0
fi


echo "zone import log : $currentmonth" > /var/log/importzone.log 2>&1
echo " " >> /var/log/importzone.log 2>&1



for obj0 in $(cat $1);
do
        echo "checking if $obj0 was already migrated ..."
        ls -la /usr/local/zones/$1-zones/complete | grep -w $obj0 >> /dev/null 2>&1
        if [ "$?" -eq 1 ]
        then
        echo "importing $obj0 ..."

        #check if zone file has NS records
        cat /usr/local/zones/$1-zones/$obj0.txt | grep NS >> /dev/null 2>&1
        if [ "$?" -eq 0 ]
        then
                echo "Nameserver exists, continuing..."
        else
                echo "Adding nameserver to record..."
                echo "$obj0. 43201 IN NS ns1.nameserver.com." >> /usr/local/zones/$1-zones/$obj0.txt
        fi

        #check if zone exists
        /usr/local/zones/cli53/bin/cli53 info $obj0 >> /var/log/importzone.log 2>&1
        if [ "$?" -eq 0 ]
        then
                # grab NAMESERVERS
                nameservers=`/usr/local/zones/cli53/bin/cli53 rrlist $obj0 | grep "NS" | awk -F "NS\t" '{printf "%s\n", $2}' | sed 's/\.$//g' | sed ':a;N;$!ba;s/\n/,/g'`
                # import zone file
                /usr/local/zones/cli53/bin/cli53 import $obj0 -r -f /usr/local/zones/$1-zones/$obj0.txt
                if [ "$?" -eq 0 ]
                then
                        #move to complete folder
                        mv /usr/local/zones/$1-zones/$obj0.txt /usr/local/zones/$1-zones/complete
                else
                        echo "There was an error in importing the zone file!" >> /var/log/importzone.log
                        exit 1
                fi
        else
                #create on route53
                /usr/local/zones/cli53/bin/cli53 create $obj0 >> /var/log/importzone.log 2>&1
                # grab NAMESERVERS
                nameservers=`/usr/local/zones/cli53/bin/cli53 rrlist $obj0 | grep "NS" | awk -F "NS\t" '{printf "%s\n", $2}' | sed 's/\.$//g' | sed ':a;N;$!ba;s/\n/,/g'`
                # import zone file
                /usr/local/zones/cli53/bin/cli53 import $obj0 -r -f /usr/local/zones/$1-zones/$obj0.txt
                if [ "$?" -eq 0 ]
                then
                        #move to complete folder
                        mv /usr/local/zones/$1-zones/$obj0.txt /usr/local/zones/$1-zones/complete
                else
                        echo "There was an error in importing the zone file!" >> /var/log/importzone.log
                        exit 1
                fi
        fi

        # output domain + nameservers in a CSV with format : domain.com,ns1,ns2,ns3,ns4
        echo "$obj0,$nameservers" >> nameserver_registrar_request.txt 2>&1
        else
                echo "Domain already migrated .. !"
        fi
done

Checking and repairing mysql replication automatically

Hello!

MySQL replication has been known to break easily, as a result of a multitude of potential causes.

Sometimes the replication can even break if an erroneous query is executed on the master server.

With all the potential issues that may break replication, we thought it prudent to write an automated check script that can run on a scheduled basis (i.e. every 10-15 minutes), check the slave status, report on any errors if applicable and attempt to repair replication.

We have built this script to exit and send mail alerts if any step of the checking and repairing process fails or generates an error in itself.

The script also generates a lock file to ensure that no more than one check process can run at any given time. We feel this script is best suited to scenarios such as remote MySQL slaves, where this extra layer may help ensure more reliable replication.

The repair process is simply 3 MySQL Commands :

stop slave;
reset slave;
slave start;

The above statements assume that you have a master.info file with the MySQL master server information statically set, so no CHANGE MASTER commands have to be executed. Resetting the slave clears the error and resumes replication, and all the queries missed while replication was down should be queued and applied once it starts again.
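For reference, the same fields the script parses can be checked by hand before automating anything, for example :

mysql -u root -p -Bse "show slave status\G" | egrep "Slave_IO_Running|Slave_SQL_Running|Last_Error"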

Here is the script :

#!/bin/bash
# Slave replication auto recovery and alert
# Star Dot Hosting 2012

currentmonth=`date "+%Y-%m-%d"`
lock_file=/tmp/slave_alert.lck

echo "MySQL Replication Check Script" > /var/log/replication_check.log 2>&1
echo "------------------------------" >> /var/log/replication_check.log 2>&1
echo "$currentmonth" >> /var/log/replication_check.log 2>&1
echo "" >> /var/log/replication_check.log 2>&1


# Alert function (defined before its first use below)
function mail_alert () {
        cat /var/log/replication_check.log | mail -s "Replication check errors!" your@email.com
}

# Check if lock file exists
if [ -f $lock_file ];
then
        echo "Lock file exists! Possible conflict!" >> /var/log/replication_check.log 2>&1
        mail_alert
        exit 1
else
        touch $lock_file
fi

# Fix slave
function fix_replication () {
        mysql -u root --password="XXXXX" -Bse "stop slave" >> /var/log/replication_check.log 2>&1
        if [ "$?" -eq 0 ];
        then
                echo "Stop slave succeeded..." >> /var/log/replication_check.log 2>&1
        else
                echo "Slave recover function failed" >> /var/log/replication_check.log 2>&1
                mail_alert
                exit 1
        fi
        mysql -u root --password="XXXXX" -Bse "reset slave" >> /var/log/replication_check.log 2>&1
        if [ "$?" -eq 0 ];
        then
                echo "Reset slave succeeded..." >> /var/log/replication_check.log 2>&1
        else
                echo "Slave recover function failed" >> /var/log/replication_check.log 2>&1
                mail_alert

                exit 1
        fi
        mysql -u root --password="XXXXX" -Bse "slave start" >> /var/log/replication_check.log 2>&1
        if [ "$?" -eq 0 ];
        then
                echo "Slave start succeeded." >> /var/log/replication_check.log 2>&1
        else
                echo "Slave recover function failed" >> /var/log/replication_check.log 2>&1
                mail_alert
                exit 1
        fi
}


# Check if Slave is running properly
Slave_IO_Running=`mysql -u root --password="XXXXX" -Bse "show slave status\G" | grep Slave_IO_Running | awk '{ print $2 }'`
Slave_SQL_Running=`mysql -u root --password="XXXXX" -Bse "show slave status\G" | grep Slave_SQL_Running | awk '{ print $2 }'`
Last_error=`mysql -u root --password="XXXXX" -Bse "show slave status\G" | grep Last_Error | awk -F : '{ print $2 }'`


# If no values are returned, slave is not running
if [ -z "$Slave_IO_Running" -o -z "$Slave_SQL_Running" ];
then
        echo "Replication is not configured or you do not have the required access to MySQL"
        exit 1
fi

# If everythings running, remove lockfile if it exists and exit
if [ $Slave_IO_Running == 'Yes' ] && [ $Slave_SQL_Running == 'Yes' ];
then
        rm $lock_file
        echo "Replication slave is running" >> /var/log/replication_check.log 2>&1
        echo "Removed Alert Lock" >> /var/log/replication_check.log 2>&1
elif [ $Slave_SQL_Running == 'No' ] || [ $Slave_IO_Running == 'No' ];
then
        echo "SQL thread not running on server `hostname -s`!" >> /var/log/replication_check.log 2>&1
        echo "Last Error:" $Last_error >> /var/log/replication_check.log 2>&1
        fix_replication
        mail_alert
        rm $lock_file
fi

echo "Script complete!" >> /var/log/replication_check.log 2>&1
exit 0
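A sample cron entry to run the check every 15 minutes might look like this (the script path is just an example) :

*/15 * * * * /bin/bash /usr/local/bin/replication_check.sh > /dev/null 2>&1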

Clone a XEN VPS server that resides on a LVM / Logical Volume Manager

Hello!

We thought it would be worth sharing this information, as it might be interesting to someone who wants to replicate the same VPS across many instances in order to create a farm of web servers (for example).

This uses very similar concepts to our LVM XEN backup post a while back.

Step 1: Take a snapshot of the current VPS

This is simple. Use the lvcreate command with the -s option to create a snapshot of the running VPS. We assume your VPS is 5GB in size, so just replace that with however large your VPS is :

lvcreate -s -L 5GB -n snapshot_name /dev/VolGroup00/running_vps_image

Step 2: Create your new VPS

This is important. You want to create a new vps, assign a MAC and IP address first and let the creation process fully finish. Then shut the VPS down.

Step 3: Copy the snapshot to the new VPS

All you have to do is use the dd command to transfer the snapshot image to the newly created VPS image :

dd if=/dev/VolGroup00/snapshot_name of=/dev/VolGroup00/new_vps_image

All done! Don’t forget to remove the snapshot after you’re done with it :

lvremove -f /dev/VolGroup00/snapshot_name

Start up the new VPS and you should have a carbon copy of the previous one!

Varnish Caching with Joomla

Hello There!

One of the exciting new technologies to come out in the last few years is a tremendously efficient and dynamic caching system called Varnish (see : http://www.varnish-cache.org).

We have been employing the use of Varnish for high traffic websites for the purposes of user experience improvements as well as for redundancy and load balancing purposes.

Varnish can do it all – complex load balancing and polling based on many different weighting methodologies for fail over, as well as holding on to a “stale” cache in the event of a back end web server outage, or perhaps for geographic redundancy (holding a stale cache in a secondary data center).

One of the challenges we have faced in the many different implementations of Varnish into web stacks is dealing with dynamic and user session (i.e. “logged in”) content.

If the Internet were full of only static (see 1995) HTML files, Varnish would work beautifully out of the box. Unfortunately the web is a complicated mess of session based authentication, POSTs, GETs and query strings, among other things.

One of our recent accomplishments was getting the Joomla 1.5 content management system to work with Varnish 2.1.

The biggest challenge with Joomla is that it creates a session cookie for all users. This means the session is created and established for any guest visiting the site, and if they decide to log in, that same session is used to establish a logged in session through authentication. This is an apparent effort to deter or avoid session hijacking.

The problem with this is that Varnish ends up caching all the logged in session content, as well as the anonymous front page content.

I spent a significant amount of time fine tuning my VCL (Varnish Configuration Language) to play nice with Joomla. Unfortunately it became apparent that some minor modifications to the Joomla code were necessary in order for it to communicate properly with Varnish.

Step 1 : Move the login form off the front page

I realize this might be a hard decision, and I can’t offer an alternative. If you have an integrated login form on the front page of your site and you wish to cache that page with Varnish, you will likely have to choose one or the other. It would probably be ideal to replace that login form with a button that brings the user to a secondary page off the main page.

For the sake of argument, let’s call our site “example.com”; the login page URL within Joomla should look like the following :

http://www.example.com/index.php?option=com_user&view=login

Take note of the login URI in this string.

The reason we need the login form on a secondary page is that we need an almost “sandboxed” section of the site where the anonymous session cookie can be established and then passed through the authentication process to a logged in session. We will tell Varnish to essentially ignore this page.
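One way to express that in vcl_recv, sketched here in addition to the rules shown further below (the backend name iamloggedin matches the one used in those snippets), is to make sure the login URI is never served from cache so the cookie survives :

# never cache the login page itself; let the session cookie be established
if (req.url ~ "(?i)(com_user|login)") {
set req.backend = iamloggedin;
return(pipe);
}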

Step 2 : Modify Joomla to send HTTP headers for user/guest sessions

This isn’t that hard. In the Joomla code, there is a section where it defines the HTTP headers it sends to the browser for cache variables such as expire times and whatnot. I’m going to assume you have turned off the built-in Joomla caching system.

What you need to do is tell Joomla to send a special HTTP header that carries either a True or False value depending on whether the user is logged in. This is useful information. It will allow Varnish to avoid caching any logged in content, such as “Welcome back, USERNAME”, after the user is passed back to the front page from logging in.

In my joomla installation, I modified the following file :

libraries/joomla/environment/response.php

The path is relative to the public_html / root folder of your Joomla installation. In this file, find the line that determines whether the Joomla caching system is disabled :

if (JResponse::allowCache() === false)

After this line, you will see about 5 HTTP header declarations (Expires, Last-Modified, Cache-Control, Cache-Control again and Pragma). Above those declarations, add the following 6 lines of code :

$user =& JFactory::getUser();
if (!$user->guest) {
JResponse::setHeader( 'X-Logged-In', 'True', true);
} else {
JResponse::setHeader( 'X-Logged-In', 'False', true );
}

If you read the above code, it’s fairly straightforward. I do a check to see if the user is a guest (aka anonymous) or not. If they are logged in I send an HTTP header called “X-Logged-In” with a value of “True”. If the user is not logged in, it is set to “False”.

Pretty easy, right?

This will allow varnish to avoid caching a logged in user’s page.

Step 3 : Configure Varnish

This is the part that took the most time during the entire process. Mind you, patching the Joomla code took some time as well, but this part involved a lot of experimentation and long hours examining session cookies and host headers.

What I will do is break down the generalized configuration directives into two groups : VCL_RECV and VCL_FETCH.

VCL_RECV

In here, I set a bunch of IF statement directives to tell Varnish what it should look up in the cache, what it should pipe to the backend and what it should pass. This could probably be optimized and improved upon, but it works for me :

# If user sends an http POST, pipe to backend
if (req.request == "POST") {
set req.backend = iamloggedin;
return(pipe);
}

# http authenticated sessions are piped
if (req.http.Authenticate || req.http.Authorization) {
set req.backend = iamloggedin;
return(pipe);
}

# if the user is coming FROM the login page, pipe to backend
if (req.http.referer ~ "(?i)(com_user|login)") {
set req.backend = iamloggedin;
return(pipe);
}

VCL_FETCH

The fetch section is a little bit easier; I only have about 5 directives. The first one is the most important one to look at. It “unsets” the cookie from any page on the site EXCEPT the login page, which allows Varnish to properly establish the logged in session. The subsequent rules determine what to deliver and what to pass based on URI or HTTP header checks :

# discard backend setcookie unless it equals the following
if (!req.url ~ "(?i)(login|com_user|user|logout)") {
unset beresp.http.Set-Cookie;
}

if (req.http.referer ~ "(?i)(com_user|login|logout)") {
set req.backend = iamloggedin;
return(pass);
}

if (beresp.http.x-logged-in ~ "False"){
set req.backend = webfarm;
return(deliver);
}

if (beresp.http.x-logged-in ~ "True"){
set req.backend = iamloggedin;
return(pass);
}

if (req.http.Authenticate || req.http.Authorization) {
set req.backend = iamloggedin;
return(pass);
}

That’s it! I just saved you many sleepless nights (I hope!). Hopefully your headers will look something like this after you implement Varnish in front of Joomla :

Set-Cookie example_auth_129bf15asdfasdf52f3afaafawef; path=/
P3P CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM"
X-Logged-In False
Expires Mon, 1 Jan 2001 00:00:00 GMT
Last-Modified Mon, 08 Aug 2011 20:49:37 GMT
Cache-Control post-check=0, pre-check=0
Pragma no-cache
Content-Type text/html; charset=utf-8
Content-Length 85898
Date Mon, 08 Aug 2011 21:01:52 GMT
X-Varnish 761778669 761751685
Age 735
Via 1.1 varnish
Connection keep-alive
X-Cache-Svr cache.example.com
X-Cache HIT
X-Cache-Hits 121

UPDATE : 12/08/2011

I realize I made a mistake and have corrected this post. In vcl_fetch, I had the following :

# discard backend setcookie unless it equals the following
if (!req.url ~ "(?i)(login|com_user|user|logout)") {
unset req.http.Set-Cookie;
}

Well, I realized I should be unsetting the Set-Cookie header on the backend response (beresp), not on the request. For some reason the above (erroneous) directive works only right after you log in; if you start clicking around the site, your logged in session disappears. I suspect this is because either Joomla or Varnish is mistakenly unsetting a logged in session.

This is the correct entry (I have fixed it in my original post as well) :

# discard backend setcookie unless it equals the following
if (!req.url ~ "(?i)(login|com_user|user|logout)") {
unset beresp.http.Set-Cookie;
}

After making the above change, I can login and browse the site and my session stays intact. Mind you, the Joomla site I am testing with is definitely not a vanilla Joomla installation.

I’d love to hear from anyone who has accomplished the above scenario either way!

Centralized remote backup script with SSH key authentication

Greetings,

It has been a while since we posted any useful tidbits for you, so we have decided to share one of our quick & dirty centralized backup scripts.

The script relies on SSH key based authentication, described here on this blog. It essentially parses a configuration file in which each host’s fields are separated by commas and the backup directories within the third field are separated by colons, as in the example config here :

hostname1,192.168.1.1,etc:var:root
hostname2,192.168.1.2,etc:var:root:usr

Note the intended backup directories in the 3rd field, separated by colons. Simply populate the backup-hosts.txt config file (located in the same folder as the script) with all the hosts you want backed up.

The script then SSHes to each host and sends a tar -czf stream (securely) over SSH, writing the output to the destination of your choice. Ideally you should centralize this script on a box that has direct access to a lot of disk space.

Find the script here :

#!/bin/sh
# Centralized Linux Backup Script
# By Star Dot Hosting , www.stardothosting.com
# Uses SSH Key based authentication and remote ssh commands to tar.gz folders to iSCSI storage


todaysdate=`date "+%Y-%m-%d %H:%M:%S"`
backupdest="/backups/linux-backups"

echo "Centralized Linux Backup: " $todaysdate > /var/log/linux-backup.log
echo -e "----------------------------------------------" >> /var/log/linux-backup.log
echo -e >> /var/log/linux-backup.log


for obj0 in $(cat /usr/local/bin/backup-hosts.txt | grep -v "#" | awk -F "," '{printf "%s\n", $2}');
do
        backupname=`cat /usr/local/bin/backup-hosts.txt | grep -v "#" | grep $obj0 | awk -F "," '{printf "%s\n", $1}'`

        for obj1 in $(cat /usr/local/bin/backup-hosts.txt | grep -v "#" | grep $obj0 | awk -F "," '{printf "%s\n", $3}' | awk '{gsub(":","\n"); printf "%s", $0}');
        do
                echo -e "backing up $obj0 with $obj1 directory" >> /var/log/linux-backup.log
                ssh -l root $obj0 "(cd /$obj1/ && tar -czf - .)" > $backupdest/$backupname.$obj1.tar.gz 2>> /var/log/linux-backup.log
                if [ "$?" -ne 0 ]
                then
                        echo -e "There were some errors while backing up $obj0 / $backupname within the $obj1 directory" >> /var/log/linux-backup.log
                        #exit 1
                else
                        echo -e "Backup completed on $obj0 / $backupname while backing up $obj1 directory" >> /var/log/linux-backup.log
                fi
        done
done

echo "Backup Script Completed." >> /var/log/linux-backup.log
cat /var/log/linux-backup.log | mail -s "Centralized Backup Complete" topsoperations@topscms.com

You could modify this script to keep separate daily backups, pruned to retain only a set number of days (e.g. only 7 days’ worth), as in the rough sketch below. There is a lot you can do here.
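For example, if you add a date stamp to the tarball names (e.g. $backupname.$obj1.`date +%Y-%m-%d`.tar.gz), a single find at the top of the script can prune anything older than 7 days :

# prune archives older than 7 days from the backup destination
/usr/bin/find /backups/linux-backups -type f -name "*.tar.gz" -mtime +7 -delete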

If you have a handful of Linux or BSD servers that you would like to back up to a centralized location, without having an individual script to maintain on each server, then perhaps you could use or modify this script to suit your needs.

I hope this helps.
