Custom logger for your MySQL Cluster data nodes

The MySQL Cluster data node log files can become very big. The best solution is to actually fix the underlying problem, but if you know what you are doing, you can work around it and filter out the annoying log entries.

An example of ‘annoying’ entries: when you run MySQL Cluster on virtual machines (not a good idea!) and the disks and OS can’t keep up, you get output like these few lines from ndb_X_out.log:

2011-04-03 10:52:31 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Scanning Timers elapsed=100
2011-04-03 10:52:31 [ndbd] INFO     -- timerHandlingLab now: 1301820751642 sent: 1301820751395 diff: 247
2011-04-03 10:52:31 [ndbd] INFO     -- Watchdog: User time: 296  System time: 536
2011-04-03 10:52:31 [ndbd] INFO     -- Watchdog: User time: 296  System time: 536
2011-04-03 10:52:31 [ndbd] WARNING  -- Watchdog: Warning overslept 276 ms, expected 100 ms.
2011-04-03 10:53:33 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Performing Receive elapsed=100
2011-04-03 10:53:33 [ndbd] INFO     -- Watchdog: User time: 314  System time: 571
2011-04-03 10:53:33 [ndbd] INFO     -- timerHandlingLab now: 1301820813839 sent: 1301820813476 diff: 363
2011-04-03 10:53:33 [ndbd] INFO     -- Watchdog: User time: 314  System time: 571

You can’t set the log levels the way you can for the cluster log produced by the management node. However, you can run the data nodes so they write their messages to STDOUT and pipe that output through a script:

$ ndbd --nodaemon 2>&1 | ndbd_logger.py /var/log/ndb_3_out.log &
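
For the pipe to work exactly as written, ndbd_logger.py must be executable (with a #!/usr/bin/env python shebang) and on your PATH; the simpler alternative is to call it through the interpreter explicitly (the path below is just a placeholder):

$ ndbd --nodaemon 2>&1 | python /path/to/ndbd_logger.py /var/log/ndb_3_out.log &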

And here’s the ndbd_logger.py script filtering out the ‘annoying’ messages. Extra candy: it fixes lines which do not have a timestamp!

import sys
from time import strftime

# lines containing any of these substrings will be dropped
FILTERED = (
  'Watchdog',
  'timerHandlingLab',
  'time to complete',
)

def main():
  try:
    log_file = sys.argv[1]
  except IndexError:
    print "Need location for log file (preferable absolute path)"
    sys.exit(1)

  try:
    fp = open(log_file,'ab')
  except IOError, e:
    print "Failed openeing file: %s" % e
    sys.exit(2)

  while True:
    line = sys.stdin.readline()
    if line == '':
      # an empty read means EOF: ndbd closed its end of the pipe
      break
    line = line.strip()
    # drop lines containing any of the filtered keywords
    if any(line.find(f) > -1 for f in FILTERED):
      continue
    # prepend a timestamp to lines lacking the usual [ndbd] prefix
    if line.find('[ndbd]') == -1:
      line = strftime('%Y-%m-%d %H:%M:%S [ndbd] NA       -- ') + line
    fp.write(line + '\n')
    fp.flush()
  fp.write(strftime('%Y-%m-%d %H:%M:%S Closing log\n'))
  fp.close()

if __name__ == '__main__':
  main()
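
You can try the filter without a running data node by feeding it a couple of lines by hand; something along these lines (the input lines and the /tmp path are just test material):

$ printf '%s\n' \
    '2011-04-03 10:52:31 [ndbd] INFO     -- Watchdog: User time: 296  System time: 536' \
    'Some message without the usual prefix' \
  | python ndbd_logger.py /tmp/ndb_test.log
$ cat /tmp/ndb_test.log

The Watchdog line is dropped, the second line comes out with a freshly generated timestamp and the NA level, and the file ends with the ‘Closing log’ marker once the pipe is closed.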

The above script can definitely be improved, but it shows the basics. I particularly like the timestamp fixing.
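
One concrete improvement: when logrotate rotates the file, the script keeps writing to the old, renamed file because it never reopens its handle. Python's logging.handlers.WatchedFileHandler takes care of that by reopening the file when the inode changes; a minimal sketch of how the write path could look instead of the plain open()/write() calls (the function name is mine, not part of the script above):

import logging
import logging.handlers

def open_log(log_file):
  # WatchedFileHandler reopens the file when logrotate moves it away
  handler = logging.handlers.WatchedFileHandler(log_file)
  handler.setFormatter(logging.Formatter('%(message)s'))
  logger = logging.getLogger('ndbd_logger')
  logger.addHandler(handler)
  logger.setLevel(logging.INFO)
  return logger

# in the main loop, replace fp.write(line + '\n') and fp.flush() with:
#   logger.info(line)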