Home > MySQL, Python, Work > Refactored: Poor man’s MySQL replication monitoring

Refactored: Poor man’s MySQL replication monitoring

This is a reply to the blog post Poor man’s MySQL replication monitoring. Haidong Ji had a few problems using MySQLdb (could use the ‘dict’ cursor) and apparently he doesn’t want to much dependencies. I agree that using the mysql client tool is a nice alternative if you don’t want to use any 3rd party Python modules. And the MySQL client tools are usually and should be installed with the server.

However, since MySQL Connector/Python only needs itself and Python, dependencies are reduced to a minimum. Here you’ll find a refactored version of Haidong’s version (can of course be made much more sophisticated) using the connector:

import sys
from socket import gethostname
import smtplib
import mysql.connector

emailSubject = "Replication problem on slave %s"
emailTo = "recipient@example.com"
emailFrom = "monitor-tool@example.com"

def runCmd(cmd):
    cnx = mysql.connector.connect(user='root',
                                  unix_socket='/path/to/mysql.sock')
    cur = cnx.cursor(buffered=True)
    cur.execute(cmd)
    columns = tuple( [d[0].decode('utf8') for d in cur.description] )
    row = cur.fetchone()
    if row is None:
        raise StandardError("MySQL Server not configured as Slave")
    result = dict(zip(columns, row))
    cur.close()
    cnx.close()
    return result

try:
    slave_status = runCmd("SHOW SLAVE STATUS")
except mysql.connector.Error, e:
    print >> sys.stderr, "There was a MySQL error:", e
    sys.exit(1)
except StandardError, e:
    print >> sys.stderr, "There was an error:", e
    sys.exit(1)
    
if (slave_status['Slave_IO_Running'] == 'Yes' and
    slave_status['Slave_SQL_Running'] == 'Yes' and
    slave_status['Last_Errno'] == 0):
    print "Cool"
else:
    emailBody = [
        "From: %s" % emailFrom,
        "To: %s" % emailTo,
        "Subject: %s" % (emailSubject %  gethostname()),
        "",
        '\n'.join([ k + ' : ' + str(v) for k,v in slave_status.iteritems()]),
        "\r\n",
        ]
    server = smtplib.SMTP("localhost")
    server.sendmail(emailFrom, [emailTo], '\r\n'.join(emailBody))
    server.quit()
  1. April 7th, 2011 at 14:43 | #1

    Hi Geert!
    I also took the liberty of refactoring the attempt using very poor material, like the bash shell.
    http://datacharmer.blogspot.com/2011/04/refactored-again-poor-mans-mysql.html

  2. April 8th, 2011 at 05:09 | #3

    I have never tried mysql.connector, and your response expanded my horizon. I will definitely try that out. Thanks!

  3. February 3rd, 2012 at 22:07 | #4

    @Joel Chaney
    Pushed the wrong code. use this one.

    #!/bin/sh
    ## Joel Chaney##
    ## joel.chaney@mongoosemetrics.com (look at robots.txt) ##
    ## 2012-02-03 ##

    repeat_alert_interval=30 # minutes for lock file life
    lock_file=/tmp/slave_alert.lck # location of lock file

    EMAIL=YOURNAME@YOURCOMPANY.DOM # where to send alerts
    SSTATUS=/tmp/sstatus # location of sstatus file

    ### Code — do not edit below ##
    NODE=`uname -n`
    ## Check if alert is locked ##
    function check_alert_lock () {
    if [ -f $lock_file ] ; then
    current_file=`find $lock_file -cmin -$repeat_alert_interval`
    if [ -n "$current_file" ] ; then
    # echo “Current lock file found”
    return 1
    else
    # echo “Expired lock file found”
    rm $lock_file
    return 0
    fi
    else
    touch $lock_file
    return 0
    fi
    }

    SLAVE=mysql

    $SLAVE -e ‘SHOW SLAVE STATUS\G’ > $SSTATUS

    function extract_value {
    FILENAME=$1
    VAR=$2
    grep -w $VAR $FILENAME | awk ‘{print $2}’
    }

    Master_Binlog=$(extract_value $SSTATUS Master_Log_File )
    Master_Position=$(extract_value $SSTATUS Read_Master_Log_Pos )
    Master_Host=$(extract_value $SSTATUS Master_Host)
    Master_Port=$(extract_value $SSTATUS Master_Port)
    Master_Log_File=$(extract_value $SSTATUS Master_Log_File)
    Read_Master_Log_Pos=$(extract_value $SSTATUS Read_Master_Log_Pos)
    Slave_IO_Running=$(extract_value $SSTATUS Slave_IO_Running)
    Slave_SQL_Running=$(extract_value $SSTATUS Slave_SQL_Running)
    Slave_ERROR=$(extract_value $SSTATUS Last_Error)

    ERROR_COUNT=0
    if [ "$Master_Binlog" != "$Master_Log_File" ]
    then
    ERRORS[$ERROR_COUNT]=”master binlog ($Master_Binlog) and Master_Log_File ($Master_Log_File) differ”
    ERROR_COUNT=$(($ERROR_COUNT+1))
    fi

    POS_DIFFERENCE=$(echo ${Master_Position}-${Read_Master_Log_Pos}|bc)

    if [ $POS_DIFFERENCE -gt 1000 ]
    then
    ERRORS[$ERROR_COUNT]=”The slave is lagging behind of $POS_DIFFERENCE”
    ERROR_COUNT=$(($ERROR_COUNT+1))
    fi

    if [ "$Slave_IO_Running" == "No" ]
    then
    ERRORS[$ERROR_COUNT]=”Replication is stopped”
    ERROR_COUNT=$(($ERROR_COUNT+1))
    fi

    if [ "$Slave_SQL_Running" == "No" ]
    then
    ERRORS[$ERROR_COUNT]=”Replication (SQL) is stopped”
    ERROR_COUNT=$(($ERROR_COUNT+1))
    fi

    if [ $ERROR_COUNT -gt 0 ]
    then
    if [ check_alert_lock == 0 ]
    then
    SUBJECT=”${NODE}-ERRORS in replication”
    BODY=”
    CNT=0
    while [ "$CNT" != "$ERROR_COUNT" ]
    do
    BODY=”$BODY ${ERRORS[$CNT]}”
    CNT=$(($CNT+1))
    done
    BODY=$BODY” \n${Slave_ERROR}”
    echo $BODY | mail -s “$SUBJECT” $EMAIL
    fi
    else
    echo “Replication OK”
    fi