Refactored: Poor man’s MySQL replication monitoring
This is a reply to the blog post Poor man’s MySQL replication monitoring. Haidong Ji had a few problems using MySQLdb (could use the ‘dict’ cursor) and apparently he doesn’t want to much dependencies. I agree that using the mysql client tool is a nice alternative if you don’t want to use any 3rd party Python modules. And the MySQL client tools are usually and should be installed with the server.
However, since MySQL Connector/Python only needs itself and Python, dependencies are reduced to a minimum. Here you’ll find a refactored version of Haidong’s version (can of course be made much more sophisticated) using the connector:
import sys
from socket import gethostname
import smtplib
import mysql.connector
emailSubject = "Replication problem on slave %s"
emailTo = "recipient@example.com"
emailFrom = "monitor-tool@example.com"
def runCmd(cmd):
cnx = mysql.connector.connect(user='root',
unix_socket='/path/to/mysql.sock')
cur = cnx.cursor(buffered=True)
cur.execute(cmd)
columns = tuple( [d[0].decode('utf8') for d in cur.description] )
row = cur.fetchone()
if row is None:
raise StandardError("MySQL Server not configured as Slave")
result = dict(zip(columns, row))
cur.close()
cnx.close()
return result
try:
slave_status = runCmd("SHOW SLAVE STATUS")
except mysql.connector.Error, e:
print >> sys.stderr, "There was a MySQL error:", e
sys.exit(1)
except StandardError, e:
print >> sys.stderr, "There was an error:", e
sys.exit(1)
if (slave_status['Slave_IO_Running'] == 'Yes' and
slave_status['Slave_SQL_Running'] == 'Yes' and
slave_status['Last_Errno'] == 0):
print "Cool"
else:
emailBody = [
"From: %s" % emailFrom,
"To: %s" % emailTo,
"Subject: %s" % (emailSubject % gethostname()),
"",
'\n'.join([ k + ' : ' + str(v) for k,v in slave_status.iteritems()]),
"\r\n",
]
server = smtplib.SMTP("localhost")
server.sendmail(emailFrom, [emailTo], '\r\n'.join(emailBody))
server.quit()
Facebook
LinkedIn
Twitter
Hi Geert!
I also took the liberty of refactoring the attempt using very poor material, like the bash shell.
http://datacharmer.blogspot.com/2011/04/refactored-again-poor-mans-mysql.html
Looks great Giuseppe! :)
(corrected your link in the comment)
I have never tried mysql.connector, and your response expanded my horizon. I will definitely try that out. Thanks!
@Joel Chaney
Pushed the wrong code. use this one.
#!/bin/sh
## Joel Chaney##
## joel.chaney@mongoosemetrics.com (look at robots.txt) ##
## 2012-02-03 ##
repeat_alert_interval=30 # minutes for lock file life
lock_file=/tmp/slave_alert.lck # location of lock file
EMAIL=YOURNAME@YOURCOMPANY.DOM # where to send alerts
SSTATUS=/tmp/sstatus # location of sstatus file
### Code — do not edit below ##
NODE=`uname -n`
## Check if alert is locked ##
function check_alert_lock () {
if [ -f $lock_file ] ; then
current_file=`find $lock_file -cmin -$repeat_alert_interval`
if [ -n "$current_file" ] ; then
# echo “Current lock file found”
return 1
else
# echo “Expired lock file found”
rm $lock_file
return 0
fi
else
touch $lock_file
return 0
fi
}
SLAVE=mysql
$SLAVE -e ‘SHOW SLAVE STATUS\G’ > $SSTATUS
function extract_value {
FILENAME=$1
VAR=$2
grep -w $VAR $FILENAME | awk ‘{print $2}’
}
Master_Binlog=$(extract_value $SSTATUS Master_Log_File )
Master_Position=$(extract_value $SSTATUS Read_Master_Log_Pos )
Master_Host=$(extract_value $SSTATUS Master_Host)
Master_Port=$(extract_value $SSTATUS Master_Port)
Master_Log_File=$(extract_value $SSTATUS Master_Log_File)
Read_Master_Log_Pos=$(extract_value $SSTATUS Read_Master_Log_Pos)
Slave_IO_Running=$(extract_value $SSTATUS Slave_IO_Running)
Slave_SQL_Running=$(extract_value $SSTATUS Slave_SQL_Running)
Slave_ERROR=$(extract_value $SSTATUS Last_Error)
ERROR_COUNT=0
if [ "$Master_Binlog" != "$Master_Log_File" ]
then
ERRORS[$ERROR_COUNT]=”master binlog ($Master_Binlog) and Master_Log_File ($Master_Log_File) differ”
ERROR_COUNT=$(($ERROR_COUNT+1))
fi
POS_DIFFERENCE=$(echo ${Master_Position}-${Read_Master_Log_Pos}|bc)
if [ $POS_DIFFERENCE -gt 1000 ]
then
ERRORS[$ERROR_COUNT]=”The slave is lagging behind of $POS_DIFFERENCE”
ERROR_COUNT=$(($ERROR_COUNT+1))
fi
if [ "$Slave_IO_Running" == "No" ]
then
ERRORS[$ERROR_COUNT]=”Replication is stopped”
ERROR_COUNT=$(($ERROR_COUNT+1))
fi
if [ "$Slave_SQL_Running" == "No" ]
then
ERRORS[$ERROR_COUNT]=”Replication (SQL) is stopped”
ERROR_COUNT=$(($ERROR_COUNT+1))
fi
if [ $ERROR_COUNT -gt 0 ]
then
if [ check_alert_lock == 0 ]
then
SUBJECT=”${NODE}-ERRORS in replication”
BODY=”
CNT=0
while [ "$CNT" != "$ERROR_COUNT" ]
do
BODY=”$BODY ${ERRORS[$CNT]}”
CNT=$(($CNT+1))
done
BODY=$BODY” \n${Slave_ERROR}”
echo $BODY | mail -s “$SUBJECT” $EMAIL
fi
else
echo “Replication OK”
fi
…