Refactored: Poor man's MySQL replication monitoring

This is a reply to the blog post Poor man’s MySQL replication monitoring. Haidong Ji had a few problems using MySQLdb (could use the ‘dict’ cursor) and apparently he doesn’t want to much dependencies. I agree that using the [mysql](http://dev.mysql.com/doc/refman/5.5/en/mysql.html) client tool is a nice alternative if you don’t want to use any 3rd party Python modules. And the MySQL client tools are usually and should be installed with the server.

However, since MySQL Connector/Python only needs itself and Python, dependencies are reduced to a minimum. Here you’ll find a refactored version of Haidong’s version (can of course be made much more sophisticated) using the connector:

import sys
from socket import gethostname
import smtplib
import mysql.connector

emailSubject = "Replication problem on slave %s"
emailTo = "recipient@example.com"
emailFrom = "monitor-tool@example.com"

def runCmd(cmd):
    cnx = mysql.connector.connect(user='root',
                                  unix_socket='/path/to/mysql.sock')
    cur = cnx.cursor(buffered=True)
    cur.execute(cmd)
    columns = tuple( [d[0].decode('utf8') for d in cur.description] )
    row = cur.fetchone()
    if row is None:
        raise StandardError("MySQL Server not configured as Slave")
    result = dict(zip(columns, row))
    cur.close()
    cnx.close()
    return result

try:
    slave_status = runCmd("SHOW SLAVE STATUS")
except mysql.connector.

Error, e:
    print >> sys.stderr, "There was a MySQL error:", e
    sys.exit(1)
except StandardError, e:
    print >> sys.stderr, "There was an error:", e
    sys.exit(1)

if (slave_status['Slave_IO_Running'] == 'Yes' and
    slave_status['Slave_SQL_Running'] == 'Yes' and
    slave_status['Last_Errno'] == 0):
    print "Cool"
else:
    emailBody = [
        "From: %s" % emailFrom,
        "To: %s" % emailTo,
        "Subject: %s" % (emailSubject %  gethostname()),
        "",
        '\n'.join([ k + ' : ' + str(v) for k,v in slave_status.iteritems()]),
        "\r\n",
        ]
    server = smtplib.

SMTP("localhost")
    server.sendmail(emailFrom, [emailTo], '\r\n'.join(emailBody))
    server.quit()

Comments

Giuseppe Maxia
Hi Geert! I also took the liberty of refactoring the attempt using very poor material, like the bash shell. http://datacharmer.blogspot.com/2011/04/refactored-again-poor-mans-mysql.html
Geert Vanderkelen
Looks great Giuseppe! :) (corrected your link in the comment)
Ji Village News
I have never tried mysql.connector, and your response expanded my horizon. I will definitely try that out. Thanks!
Joel Chaney

@Joel Chaney Pushed the wrong code. use this one.

#!/bin/sh

Joel Chaney

joel.chaney@mongoosemetrics.com (look at robots.txt)

2012-02-03

repeat_alert_interval=30 # minutes for lock file life lock_file=/tmp/slave_alert.lck # location of lock file

EMAIL=YOURNAME@YOURCOMPANY.DOM # where to send alerts SSTATUS=/tmp/sstatus # location of sstatus file

Code -- do not edit below

NODE=uname -n

Check if alert is locked

function check_alert_lock () { if [ -f $lock_file ] ; then current_file=find $lock_file -cmin -$repeat_alert_interval if [ -n “$current_file” ] ; then # echo “Current lock file found” return 1 else # echo “Expired lock file found” rm $lock_file return 0 fi else touch $lock_file return 0 fi }

SLAVE=mysql

$SLAVE -e ‘SHOW SLAVE STATUS\G’ > $SSTATUS

function extract_value { FILENAME=$1 VAR=$2 grep -w $VAR $FILENAME | awk ‘{print $2}’ }

Master_Binlog=$(extract_value $SSTATUS Master_Log_File ) Master_Position=$(extract_value $SSTATUS Read_Master_Log_Pos ) Master_Host=$(extract_value $SSTATUS Master_Host) Master_Port=$(extract_value $SSTATUS Master_Port) Master_Log_File=$(extract_value $SSTATUS Master_Log_File) Read_Master_Log_Pos=$(extract_value $SSTATUS Read_Master_Log_Pos) Slave_IO_Running=$(extract_value $SSTATUS Slave_IO_Running) Slave_SQL_Running=$(extract_value $SSTATUS Slave_SQL_Running) Slave_ERROR=$(extract_value $SSTATUS Last_Error)

ERROR_COUNT=0 if [ “$Master_Binlog” != “$Master_Log_File” ] then ERRORS[$ERROR_COUNT]=“master binlog ($Master_Binlog) and Master_Log_File ($Master_Log_File) differ” ERROR_COUNT=$(($ERROR_COUNT+1)) fi

POS_DIFFERENCE=$(echo ${Master_Position}-${Read_Master_Log_Pos}|bc)

if [ $POS_DIFFERENCE -gt 1000 ] then ERRORS[$ERROR_COUNT]=“The slave is lagging behind of $POS_DIFFERENCE” ERROR_COUNT=$(($ERROR_COUNT+1)) fi

if [ “$Slave_IO_Running” == “No” ] then ERRORS[$ERROR_COUNT]=“Replication is stopped” ERROR_COUNT=$(($ERROR_COUNT+1)) fi

if [ “$Slave_SQL_Running” == “No” ] then ERRORS[$ERROR_COUNT]=“Replication (SQL) is stopped” ERROR_COUNT=$(($ERROR_COUNT+1)) fi

if [ $ERROR_COUNT -gt 0 ] then if [ check_alert_lock == 0 ] then SUBJECT=“${NODE}-ERRORS in replication” BODY=” CNT=0 while [ “$CNT” != “$ERROR_COUNT” ] do BODY=“$BODY ${ERRORS[$CNT]}” CNT=$(($CNT+1)) done BODY=$BODY” \n${Slave_ERROR}” echo $BODY | mail -s “$SUBJECT” $EMAIL fi else echo “Replication OK” fi