Checking and repairing mysql replication automatically

Hello!

MySQL replication has been known to easily break, as a result of a large multitude of potential causes.

Sometimes the replication can even break if an erroneous query is executed on the master server.

With all the potential issues that may break replication, we thought it prudent to write an automated check script that can run on a scheduled basis (i.e. every 10-15 minutes), check the Slave status, report on any errors if applicable and attempt to repair replication.

We have built this script to exit and send mail alerts if any step of the checking and repairing process fails or generates an error in itself.

The script also generates a lock file to ensure that no more than one check process can run at any given time. We feel this script could be best used for scenarios for remote MySQL slaves, for example. Adding this extra layer may ensure a more reliable replication.

The repair process is simply 3 MySQL Commands :

stop slave;
reset slave;
slave start;

The above directives assume that you have a master.info with the mysql master server information statically set. No CHANGE MASTER commands have to be executed as a result. Resetting the slave clears the error and resumes replication, and all the queries missed during the time it failed should be queued and applied after it starts again.

Here is the script :

#!/bin/sh
# Slave replication auto recovery and alert
# Star Dot Hosting 2012

currentmonth=`date "+%Y-%m-%d"`
lock_file=/tmp/slave_alert.lck

echo "MySQL Replication Check Script" > /var/log/replication_check.log 2>&1
echo "------------------------------" >> /var/log/replication_check.log 2>&1
echo "$currentmonth" >> /var/log/replication_check.log 2>&1
echo "" >> /var/log/replication_check.log 2>&1


# Check if lock file exists
if [ -f $lock_file ];
then
        echo "Lock file exists! Possible conflict!" >> /var/log/replication_check.log 2>&1
        mail_alert
        exit 1
else
        touch $lock_file
fi

# Fix slave
function fix_replication () {
        mysql -u root --password="XXXXX" -Bse "stop slave" >> /var/log/replication_check.log 2>&1
        if [ "$?" -eq 0 ];
        then
                echo "Stop slave succeeded..." >> /var/log/replication_check.log 2>&1
        else
                echo "Slave recover function failed" >> /var/log/replication_check.log 2>&1
                mail_alert
                exit 1
        fi
        mysql -u root --password="XXXXX" -Bse "reset slave" >> /var/log/replication_check.log 2>&1
        if [ "$?" -eq 0 ];
        then
                echo "Reset slave succeeded..." >> /var/log/replication_check.log 2>&1
        else
                echo "Slave recover function failed" >> /var/log/replication_check.log 2>&1
                mail_alert

                exit 1
        fi
        mysql -u root --password="XXXXX" -Bse "slave start" >> /var/log/replication_check.log 2>&1
        if [ "$?" -eq 0 ];
        then
                echo "Slave start succeeded." >> /var/log/replication_check.log 2>&1
        else
                echo "Slave recover function failed" >> /var/log/replication_check.log 2>&1
                mail_alert
                exit 1
        fi
}


# Alert function
function mail_alert () {
        cat /var/log/replication_check.log | mail -s "Replication check errors!" your@email.com
}


# Check if Slave is running properly
Slave_IO_Running=`mysql -u root --password="XXXXX" -Bse "show slave statusG" | grep Slave_IO_Running | awk '{ print $2 }'`
Slave_SQL_Running=`mysql -u root --password="XXXXX" -Bse "show slave statusG" | grep Slave_SQL_Running | awk '{ print $2 }'`
Last_error=`mysql -u root --password="XXXXX" -Bse "show slave statusG" | grep Last_error | awk -F : '{ print $2 }'`


# If no values are returned, slave is not running
if [ -z $Slave_IO_Running -o -z $Slave_SQL_Running ];
then
        echo "Replication is not configured or you do not have the required access to MySQL"
        exit 1
fi

# If everythings running, remove lockfile if it exists and exit
if [ $Slave_IO_Running == 'Yes' ] && [ $Slave_SQL_Running == 'Yes' ];
then
        rm $lock_file
        echo "Replication slave is running" >> /var/log/replication_check.log 2>&1
        echo "Removed Alert Lock" >> /var/log/replication_check.log 2>&1
elif [ $Slave_SQL_Running == 'No' ] || [ $Slave_IO_Running == 'No' ];
then
        echo "SQL thread not running on server `hostname -s`!" >> /var/log/replication_check.log 2>&1
        echo "Last Error:" $Last_error >> /var/log/replication_check.log 2>&1
        fix_replication
        mail_alert
        rm $lock_file
fi

echo "Script complete!" >> /var/log/replication_check.log 2>&1
exit 0

ProFTPD with MySQL Authentication

Since this setup uses one FTP account to create user home directories and upload files, a compromise to this FTP user would cause the attacker to gain access to all FTP user home directories. I guess it just depends on how much you trust the DefaultRoot directive in Proftpd. I run Proftpd in its own chroot environment in addition to using DefaultRoot, so I’m used to feeling pretty safe with my Proftpd install. Anyway, here’s how I did the install/configuration

1. install proftpd-mysql from the ports with WITH_QUOTA set:

cd /usr/ports/ftp/proftpd-mysql/
env WITH_QUOTA=yes make
env WITH_QUOTA=yes make install

2. Add the global proftpd user & Proftpd group to your system.

I used uid & gid 5500 simply because that’s what was used at one of the sites I was referencing (listed below).

pw groupadd -n Proftpd -g 5500
pw useradd proftpd -u 5500 -g Proftpd -s /sbin/nologin -d /dev/null -c "Proftpd User"

3. Create the mySQL database

create database proftpd;
grant all on proftpd.* to 'proftpd'@'localhost' identified by 'password'

( change ‘password’ to something secret! )

4. Create the mySQL tables for the users & quota

create table proftpdUsers (

sqlUID int unsigned auto_increment not null,
userName varchar(30) not null unique,
passwd varchar(80) not null,
uid int unsigned not null unique,
gid int unsigned not null,
homedir tinytext,
shell tinytext,
primary key(sqlUID)

) ;

create table proftpdGroups (

sqlGID int unsigned auto_increment not null,
groupName varchar(30) not null unique,
gid int unsigned not null unique,
members tinytext,
primary key(sqlGID)
);

CREATE TABLE proftpdQuotaLimits (
name VARCHAR(30),
quota_type ENUM("user", "group", "class", "all") NOT NULL,
per_session ENUM("false", "true") NOT NULL,
limit_type ENUM("soft", "hard") NOT NULL,
bytes_in_avail FLOAT NOT NULL,
bytes_out_avail FLOAT NOT NULL,
bytes_xfer_avail FLOAT NOT NULL,
files_in_avail INT UNSIGNED NOT NULL,
files_out_avail INT UNSIGNED NOT NULL,
files_xfer_avail INT UNSIGNED NOT NULL
);


CREATE TABLE proftpdQuotaTallies (
name VARCHAR(30) NOT NULL,
quota_type ENUM("user", "group", "class", "all") NOT NULL,
bytes_in_used FLOAT NOT NULL,
bytes_out_used FLOAT NOT NULL,
bytes_xfer_used FLOAT NOT NULL,
files_in_used INT UNSIGNED NOT NULL,
files_out_used INT UNSIGNED NOT NULL,
files_xfer_used INT UNSIGNED NOT NULL
);

5. Add a test user to the proftpd database

(assumes /home/ftp is where you keep your ftp users. Otherwise, change the homedir location). This is certainly not a necessary step, but you should probably check to see if your configuration is working. You can delete this user later.

insert into proftpdUsers values ( 0, 'test', 'test', 5500, 5500, '/home/ftp/test', '/sbin/nologin' );

6. Set your proftpd configuration to use the mySQL authentication and quotas:

(NOTE: this is not a complete configuration file, it’s basically just the default config file with mySQL auth & quotas added, but note that the User and Group directives are the user & group we added in step 2. )

MaxInstances 30

# Set the user and group under which the server will run.
User proftpd
Group Proftpd

# To cause every FTP user to be "jailed" (chrooted) into their home
# directory, uncomment this line.
DefaultRoot ~

# Normally, we want files to be overwriteable.
AllowOverwrite on

# Bar use of SITE CHMOD by default

DenyAll


# Log format and location
LogFormat               default "%t %h %a %s %m %f %b %T "%r""
ExtendedLog             /var/log/proftpd.log ALL default
SystemLog               /var/log/proftpd.log ALL default
TransferLog             /var/log/proftpd.log ALL default

# Uncomment this if you have "invalid shell" errors in your proftpd.log
#RequireValidShell       off


# The passwords in MySQL are encrypted using CRYPT
SQLAuthTypes Plaintext Crypt
SQLAuthenticate users* groups*

# used to connect to the database
# databasename@host database_user user_password
SQLConnectInfo proftpd@localhost proftpd yourdatabasepassword

# Here we tell ProFTPd the names of the database columns in the "usertable"
# we want it to interact with. Match the names with those in the db
SQLUserInfo proftpdUsers userName passwd uid gid homedir shell

# Here we tell ProFTPd the names of the database columns in the "grouptable"
# we want it to interact with. Again the names match with those in the db
SQLGroupInfo proftpdGroups groupName gid members

# set min UID and GID - otherwise these are 999 each
SQLMinID 5000

#============
# User quotas
# ===========
QuotaEngine on
QuotaDirectoryTally on
QuotaDisplayUnits Mb
QuotaShowQuotas on

# create a user's home directory on demand if it doesn't exist
SQLHomedirOnDemand on

SQLNamedQuery get-quota-limit SELECT "name, quota_type, per_session, limit_type, bytes_in_avail, bytes_out_avail, bytes_xfer_avail, files_in_avail, files_out_avail, files_xfer_avail FROM proftpdQuotaLimits WHERE name = '%{0}' AND quota_type = '%{1}'"

SQLNamedQuery get-quota-tally SELECT "name, quota_type, bytes_in_used, bytes_out_used, bytes_xfer_used, files_in_used, files_out_used, files_xfer_used FROM proftpdQuotaTallies WHERE name = '%{0}' AND quota_type = '%{1}'"

SQLNamedQuery update-quota-tally UPDATE "bytes_in_used = bytes_in_used + %{0}, bytes_out_used = bytes_out_used + %{1}, bytes_xfer_used = bytes_xfer_used + %{2}, files_in_used = files_in_used + %{3}, files_out_used = files_out_used + %{4}, files_xfer_used = files_xfer_used + %{5} WHERE name = '%{6}' AND quota_type = '%{7}'" proftpdQuotaTallies

SQLNamedQuery insert-quota-tally INSERT "%{0}, %{1}, %{2}, %{3}, %{4}, %{5}, %{6}, %{7}" proftpdQuotaTallies

QuotaLimitTable sql:/get-quota-limit
QuotaTallyTable sql:/get-quota-tally/update-quota-tally/insert-quota-tally 

MySQL Replication : Replicating an existing database

You may remember a previous post about MySQL replication.

I decided to make a revised post detailing the different steps required in order to implement a master / slave replication relationship within two or more MySQL servers.

The steps required are slightly different and I think its important to outline the necessary steps in order to accomplish this task — it may actually save you some troubleshooting! 🙂

    Replication of Existing DBs

If you have existing data on your master that you want to synchronize on your slaves before starting the replication process, then you must stop processing statements on the master, obtain the current position, and then dump the data, before allowing the master to continue executing statements.

If you do not stop the execution of statements, the data dump and the master status information that you use will not match and you will end up with inconsistent or corrupted databases on the slaves.

    PREPARATION OF MASTER SERVER

1. Select a master server. It can be either one.

2. Make sure all databases that you want to replicate to the slave already exist! The easist way is to just copy the database dirs inside your MySQL data directory intact over to your slave, and then recursively chown them to “mysql:mysql”. Remember, the binary structures are file-system dependant, so you can’t do this between MySQL servers on different OS’s. In this instance you will want to use mysqldump most likely.

3. Create /etc/my.cnf if you do not already have one:

[mysqld]
socket=/tmp/mysql.sock [enter YOUR path to mysql.sock here]
server-id=1
log-bin=mysql-bin
binlog-do-db=bossdb     # input the database which should be replicated
binlog-ignore-db=mysql1 # input the database that should be ignored for replication
binlog-ignore-db=mysql2  # input the database that should be ignored for replication

4. Permit your slave server to replicate by issuing the following SQL command (substituting your slave’s IP and preferred password):

mysql> GRANT SUPER,REPLICATION CLIENT,REPLICATION SLAVE,RELOAD ON *.* TO 'replicate'@'192.168.1.1' IDENTIFIED BY 'somepass';

5. Flush all talbes and block write statements :

mysql> FLUSH TABLES WITH READ LOCK;

6. Use the SHOW MASTER STATUS statement to determine the current binary log file name and offset on the master:

mysql > SHOW MASTER STATUS;
+---------------+----------+--------------+------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------+----------+--------------+------------------+
| mysql-bin.003 | 73       | test         | manual,mysql     |
+---------------+----------+--------------+------------------+

Copy the file + position for use in Step 4 of the slave configuration.

7. Create data snapshot to import into slave with mysqldump :

shell> mysqldump -u root -p db_name --lock-all-tables >dbdump.sql

8. Unlock the tables of the database :

mysql> UNLOCK TABLES;

9. Transfer & import the db into the slave

10. Shut down and restart MySQL daemon and verify that all is functional.

PREPARATION OF SLAVE

1. Create /etc/my.cnf if you do not already have one:

[mysqld]
socket=/tmp/mysql.sock [enter YOUR path to mysql.sock here]
server-id=2 [MUST be different to master]
master-host=192.168.1.1
master-user=replicate
master-password=somepass

2. Shut down and restart MySQL on slave.

3. Log into mysql and stop slave :

mysql> stop slave;

4. Set the master configuration on the slave :

mysql> CHANGE MASTER TO
    ->     MASTER_HOST='master_host_name',
    ->     MASTER_USER='replication_user_name',
    ->     MASTER_PASSWORD='replication_password',
    ->     MASTER_LOG_FILE='recorded_log_file_name',
    ->     MASTER_LOG_POS=recorded_log_position;

3. Issue the following SQL command to check status:

mysql> show slave statusG;

Ensure that the following two fields are showing this :

Slave_IO_Running: Yes
Slave_SQL_Running: Yes

If not, try to issue the following command :

mysql> start slave;

This will manually start the slave process. Note that only updated tables and entries after the slave process has started will be sent from the master to the slave — it is not a differential replication.

TESTING

Just update some data on the master, and query that record on the slave. The update should be instantaneous.

Test creating a table on the master MySQL server database :

mysql> use replicateddb;
Database changed

mysql> CREATE TABLE example4( id INT NOT NULL AUTO_INCREMENT,  PRIMARY KEY(id),  name VARCHAR(30),   age INT);
Query OK, 0 rows affected (0.04 sec)

How to repair damaged MySQL tables

Once in a while something will happen to a server and the mysql database will get corrupted.

A specific instance comes to mind on one of our Cacti monitoring servers.

The /var partition filled up due to too many messages being sent to the root user in /var/spool/. This caused MySQL to crash as well since the cacti poller couldnt write to the poller_output table in MySQL.

The result was all graphs being blank within cacti.

In any case, a thorough analysis of the mysql database was in order and I decided to post this quick tutorial for performing quick / lengthy table checks for offline and online MySQL databases.

Stage 1: Checking your tables

Run:

myisamchk *.MYI 

or

myisamchk -e *.MYI 

if you have more time. Use the -s (silent) option to suppress unnecessary information.

If the mysqld server is stopped, you should use the –update-state option to tell myisamchk to mark the table as “checked.”

You have to repair only those tables for which myisamchk announces an error. For such tables, proceed to Stage 2.

If you get unexpected errors when checking (such as out of memory errors), or if myisamchk crashes, go to Stage 3.

Stage 2: Easy safe repair

First, try :

myisamchk -r -q tbl_name 


(-r -q means “quick recovery mode”).

This attempts to repair the index file without touching the data file. If the data file contains everything that it should and the delete links point at the correct locations within the data file, this should work, and the table is fixed. Start repairing the next table. Otherwise, use the following procedure:

1. Make a backup of the data file before continuing.

2. Use

myisamchk -r tbl_name 

(-r means “recovery mode”)

This removes incorrect rows and deleted rows from the data file and reconstructs the index file.

3. If the preceding step fails, use

myisamchk --safe-recover tbl_name

Safe recovery mode uses an old recovery method that handles a few cases that regular recovery mode does not (but is slower).

Note: If you want a repair operation to go much faster, you should set the values of the sort_buffer_size and key_buffer_size variables each to about 25% of your available memory when running myisamchk.

If you get unexpected errors when repairing (such as out of memory errors), or if myisamchk crashes, go to Stage 3.

Stage 3: Difficult repair

You should reach this stage only if the first 16KB block in the index file is destroyed or contains incorrect information, or if the index file is missing. In this case, it is necessary to create a new index file. Do so as follows:

1. Move the data file to a safe place.

2. Use the table description file to create new (empty) data and index files:

   shell> mysql db_name
     mysql> SET AUTOCOMMIT=1;
     mysql> TRUNCATE TABLE tbl_name;
     mysql> quit

3. Copy the old data file back onto the newly created data file. (Do not just move the old file back onto the new file. You want to retain a copy in case something goes wrong.)

Go back to Stage 2. :

myisamchk -r -q should work.

(This should not be an endless loop.)

You can also use the REPAIR TABLE tbl_name USE_FRM SQL statement, which performs the whole procedure automatically. There is also no possibility of unwanted interaction between a utility and the server, because the server does all the work when you use REPAIR TABLE. See Section 12.5.2.6, “REPAIR TABLE Syntax”.

Stage 4: Very difficult repair

You should reach this stage only if the .frm description file has also crashed. That should never happen, because the description file is not changed after the table is created:

1. Restore the description file from a backup and go back to Stage 3. You can also restore the index file and go back to Stage 2. In the latter case, you should start with

myisamchk -r

2. If you do not have a backup but know exactly how the table was created, create a copy of the table in another database. Remove the new data file, and then move the .frm description and .MYI index files from the other database to your crashed database. This gives you new description and index files, but leaves the .MYD data file alone. Go back to Stage 2 and attempt to reconstruct the index file.

Thats it!

MySQL Replication : Setting up a Simple Master / Slave

It is often necessary, when designing high availability environments to implement a database replication scenario with MySQL.

This simple how-to is intended to setup a simple master / slave relationship.

PREPARATION OF MASTER SERVER

1. Select a master server. It can be either one.

2. Make sure all databases that you want to replicate to the slave already exist! The easist way is to just copy the database dirs inside your MySQL data directory intact over to your slave, and then recursively chown them to “mysql:mysql”. Remember, the binary structures are file-system dependant, so you can’t do this between MySQL servers on different OS’s. In this instance you will want to use mysqldump most likely.

3. Create /etc/my.cnf if you do not already have one:

[mysqld]
socket=/tmp/mysql.sock [enter YOUR path to mysql.sock here]
server-id=1
log-bin=mysql-bin
binlog-do-db=bossdb     # input the database which should be replicated
binlog-ignore-db=mysql1 # input the database that should be ignored for replication
binlog-ignore-db=mysql2  # input the database that should be ignored for replication

4. Permit your slave server to replicate by issuing the following SQL command (substituting your slave’s IP and preferred password):

mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicate'@'192.168.1.1' IDENTIFIED BY 'somepass';

5. Shut down and restart MySQL daemon and verify that all is functional.

PREPARATION OF SLAVE

1. Create /etc/my.cnf if you do not already have one:

[mysqld]
socket=/tmp/mysql.sock [enter YOUR path to mysql.sock here]
server-id=2 [MUST be different to master]
master-host=192.168.1.1
master-user=replicate
master-password=somepass

2. Shut down and restart MySQL on slave.

3. Issue the following SQL command to check status:

mysql> show slave statusG;

Ensure that the following two fields are showing this :

Slave_IO_Running: Yes
Slave_SQL_Running: Yes

If not, try to issue the following command :

mysql> start slave;

This will manually start the slave process. Note that only updated tables and entries after the slave process has started will be sent from the master to the slave — it is not a differential replication.

TESTING

Just update some data on the master, and query that record on the slave. The update should be instantaneous.

Test creating a table on the master MySQL server database :

mysql> use replicateddb;
Database changed

mysql> CREATE TABLE example4( id INT NOT NULL AUTO_INCREMENT,  PRIMARY KEY(id),  name VARCHAR(30),   age INT);
Query OK, 0 rows affected (0.04 sec)

And check the database on the slave to ensure that the recently created table on the master was replicated properly.

Menu