Archive

Posts Tagged ‘cluster’

Debugging MySQL Cluster installed using RPMs using gdb

July 19th, 2011 2 comments

This post explains how to debug MySQL Cluster 7.1, installed using the RPM packages, using gdb on a Linux box (Red Hat, Oracle Enterprise Linux, CentOS, ..).

When a data node crashes lots of information goes into the error log, trace files and out log. However, it makes sometimes sense when you can repeat the crash, to run the data node in debug mode, or using gdb.

First, using RPMs and a Linux distribution, make sure you have the ‘debuginfo’ package installed. For example, for Red Hat or Oracle Enterprise Linux on a 64-bit machine, this package would be called: MySQL-Cluster-gpl-debuginfo-7.1.15-1.rhel5.x86_64.rpm .

Create a file with the following commands, we will name it ‘ndbd.gdb’:

set pagination off
set logging overwrite on
set logging file ndbd_gdb_backtrace.txt
set logging on
run --foreground -c <YourMGM:1186> --ndb-nodeid=<YourID>
thread apply all bt
set logging off

Note line 5: pass the options to ‘run’ which you usually pass when starting ndbd, but leave the –foreground option.
Note line 3: you can save of course the logging file wherever you want to.

Then all you need to do is run gdb with the commands file and the ndbd binary you just created:

shell> gdb /usr/sbin/ndbd -x ndbd.gdb

A full backtrace of threads will be available in the logging file when ndbd crashes. This you can then upload when reporting bugs.

There are probably more options and things you can do, please post them in the comments!

Tags: , ,

MySQL Cluster: Rotating the log file of the Data Nodes

May 2nd, 2011 No comments

There is a log file called ndb_<NodeID>_out.log created by the MySQL Cluster data nodes which can become quite big overtime. There is, unlike the cluster logs created by the management nodes, no rotation build in. So you have to revert to the basics and copy the file away, truncating the old one.

For example, if you want to ‘rotate’ the log file of data node with NodeID 3:

shell> mv ndb_3_out.log.1.gz ndb_3_out.log.2.gz
shell> cp ndb_3_out.log ndb_3_out.log.1
shell> cat /dev/null > ndb_3_out.log
shell> gzip ndb_3_out.log.1

It’s not elegant, and you might lose some entries, but it will help you keeping disk usage minimal. If you don’t need the log at all, just line 3 would do the trick.

You can use logrotate‘s copytruncate to achieve something similar.

Tags: , ,

Custom logger for your MySQL Cluster data nodes

April 3rd, 2011 No comments

The MySQL Cluster data node log files can become very big. The best solution is to actually fix the underlying problem. But if you know what you are doing, you can work around it and filter out these annoying log entries.

An example of ‘annoying’ entries is when you run MySQL Cluster on virtual machines (not good!) and disks and OS can’t follow any more; a few lines from the ndb_X_out.log:

2011-04-03 10:52:31 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Scanning Timers elapsed=100
2011-04-03 10:52:31 [ndbd] INFO     -- timerHandlingLab now: 1301820751642 sent: 1301820751395 diff: 247
2011-04-03 10:52:31 [ndbd] INFO     -- Watchdog: User time: 296  System time: 536
2011-04-03 10:52:31 [ndbd] INFO     -- Watchdog: User time: 296  System time: 536
2011-04-03 10:52:31 [ndbd] WARNING  -- Watchdog: Warning overslept 276 ms, expected 100 ms.
2011-04-03 10:53:33 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Performing Receive elapsed=100
2011-04-03 10:53:33 [ndbd] INFO     -- Watchdog: User time: 314  System time: 571
2011-04-03 10:53:33 [ndbd] INFO     -- timerHandlingLab now: 1301820813839 sent: 1301820813476 diff: 363
2011-04-03 10:53:33 [ndbd] INFO     -- Watchdog: User time: 314  System time: 571

You can’t set the log levels like you would do for the cluster logs produced by the management node. However, you can run the data nodes so they put messages to STDOUT and redirect it to a script:

ndbd --nodaemon 2>&1 | ndbd_logger.py /var/log/ndb_3_out.log &

And here’s the ndbd_logger.py script filtering out the ‘annoying’ messages. Extra candy: it fixes lines which do not have a timestamp!

import sys
import os
import socket
from time import strftime

FILTERED = (
  'Watchdog',
  'timerHandlingLab',
  'time to complete',
)      

def main():
  try:
    log_file = sys.argv[1]
  except IndexError:
    print "Need location for log file (preferable absolute path)"
    sys.exit(1)
  
  try:
    fp = open(log_file,'ab')
  except IOError, e:
    print "Failed openeing file: %s" % e
    sys.exit(2)
    
  while True:
    line = sys.stdin.readline().strip()
    if line == '':
      break
    for f in FILTERED:
      if line.find(f) > -1:
        continue
    if line.find('[ndbd]') == -1:
      line = strftime('%Y-%m-%d %H:%M:%S [ndbd] NA       -- ') + line
    fp.write(line + '\n')
    fp.flush()
  fp.write(strftime('%Y-%m-%d %H:%M:%S Closing log\n'))
  fp.close()

if __name__ == '__main__':
  main()

The above script can definitely be improved, but it shows the basics. I particularly like the timestamp fixing.

Tags: , ,

Running MySQL Cluster without Arbitrator: don’t, but if you have to..

June 16th, 2010 1 comment

This post explains how to disable Arbitration when using MySQL Cluster. It gives a case where this could be useful.

First, a piece of advice: you do not want to run MySQL Cluster with arbitration disabled. But if you must, e.g. because of an oversight in your implementation, you can.
Arbitration is very important in MySQL Cluster. It makes sure you don’t end up with a Split Brain situation: 2 halves working independently, continuing changing data, making it impossible for them to work together later on.

However, Arbitration comes with a price: you need an extra machine. “Sure, what’s the big deal?”. It’s not that easy when you lack the money, or more problematic, when you lack the real-estate in your rack.

Everyone running MySQL Cluster should know that you should not run the ndb_mgmd on the same machines on which the data node processes, ndbd or ndbmtd, are running. The Management Nodes need to be on a separate machine so it can act as an Arbitrator.

Here’s an example why: If you have two hosts A and B and both are running a management and data node process. Host A’s ndb_mgmd is currently the Arbitrator. Now unplug host A *BANG*: one data node and the arbitrator down. The other data node on Host B notices this, and tries to figure out if it can continue. So it checks if it can reach the Arbitrator: but it’s gone as well! So, the data node on host B goes faithfully down. This all happens in a few seconds, there is no time to elect a new Arbitrator. “Cluster’s dead, Jim”.

What if you can’t get a 3rd machine? There’s an option for that.. Data nodes can be configured with setting the Arbitration-option to WaitExternal. This means you will have to develop your own arbitration application or script. How cool is that? Well, it might be cool, but it’s a pain in the butt.

[ndbd default]
Arbitration = WaitExternal
ArbitrationTimeout = 3

What happens with our 2 host setup with above changes: When Host A, which has the Arbitrator, goes down, the data node on Host B will wait for 3 seconds, i.e. ArbitrationTimeout. It will block all incoming transactions, refusing changes. An application, the External Arbitrator, running on Host B (actually on all hosts running MySQL Cluster proceses) has 3 seconds to figure out whether Host B can continue running it’s ndbd process(es), or not. In this case, it should find out that Host A is down and that Host B should continue keeping the data available.

“Ah, easy! Problem solved!”, you might joyfully exclaim. No, it isn’t. It’s more complicated than that. What happens when Host A doesn’t go down, but both hosts can’t see each other due to a network issue between them? Both External Arbitrators would figure out that they need to continue: you end up again with a split brain. So you still need someway to handle that.

At this point, I would like to say: “Goodluck!”. Every situation is going to be different. Everyone will have his own External Arbitrator requirements or ways to check if a host or blade chassis is up or not. It’s a great option, and it puts you more in control of your MySQL Cluster, but it adds a lot of complexity.

So, my advice: revise and correct your MySQL Cluster setup when you think you need to disable Arbitration.

Tags: , ,

Signals to freeze a Data Node: simulating trouble

May 3rd, 2010 1 comment

Last week I was struggling to find an easy way to simulate a troubled Data Node (ndbd process) using MySQL Cluster. It’s as simple as pancackes: using the kill command!

To freeze a process you just need to kill the process using the SIGSTOP signal. To let the processes continue, use SIGCONT. Here’s an example shell script showing how you would use these two signals on a data node:

# 2010-05-03 08:11:46 [ndbd] INFO     -- Angel pid: 542 ndb pid: 543
NDBDPID=`grep 'Angel pid' ndb_3_out.log | tail -n1 | awk '{ print $11 }'`
kill -STOP $NDBDPID
sleep 10
kill -CONT $NDBDPID

I’m using the out-log because the file ndb_3.pid contains only the PID of the Angel process. The sleep command is something variable which you can set as low or as high as you want.

In the above example the script sleeps long enough for data node to fail with an Arbitration Error. If you would set options HeartbeatIntervalDbDb and TimeBetweenWatchDogCheck to a lower value than the default, you would only be able to sleep for a few seconds. The result:

 [MgmtSrvr] WARNING  -- Node 2: Node 3 missed heartbeat 2
 [MgmtSrvr] WARNING  -- Node 2: Node 3 missed heartbeat 3
 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
 [MgmtSrvr] WARNING  -- Node 2: Node 3 missed heartbeat 4
 [MgmtSrvr] ALERT    -- Node 2: Node 3 declared dead due to missed heartbeat
 [MgmtSrvr] INFO     -- Node 2: Communication to Node 3 closed
 [MgmtSrvr] ALERT    -- Node 2: Network partitioning - arbitration required
 [MgmtSrvr] INFO     -- Node 2: President restarts arbitration thread [state=7]
 [MgmtSrvr] ALERT    -- Node 2: Arbitration won - positive reply from node 1
 [MgmtSrvr] ALERT    -- Node 2: Node 3 Disconnected
 [MgmtSrvr] INFO     -- Node 2: Started arbitrator node 1 [ticket=019b00025cc8aad8]
 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. 
  Caused by error 2305:  'Node lost connection to other nodes and can not
  form a unpartitioned cluster, please investigate if there are error(s)
  on other node(s)(Arbitration error). Temporary error, restart node'.

How is this useful? Well, for simulating a data node which is having problems while having load for example. Maybe you would like to see what happens if you tune the WatchDog or Hearbeat parameters. Or maybe you want to give a demonstration to your management without going through hassel of overloading a disk or CPU or pulling network cables (e.g. for prove of concept).

In any case, I think it’s a cool use of the kill-command. One I didn’t know of.

Tags: ,

Insert data into a VARCHAR field using NDB API: a solution

April 2nd, 2010 2 comments

You are using MySQL Cluster and crazy enough to digest NDB API? Sick of SQL? Here’s a treat: a function to make C/C++ strings ready for inserting into a VARCHAR field. The special thing about them is that the length is prefixed in the first 2 bytes.

void make_ndb_varchar(char *buffer, char *str)
{
  int len = strlen(str);
  int hlen = (len > 255) ? 2 : 1;
  buffer[0] = len & 0xff;
  if( len > 255 )
    buffer[1] = (len / 256);
  strcpy(buffer+hlen, str);
}

Yes, you can use memcpy. Whatever floats your boat.

Lets use this function for a table t1, defined as follows (note: latin1!):

CREATE TABLE t1 (
  id INT UNSIGNED NOT NULL,
  vc VARCHAR(128),
  vclong VARCHAR(1280),
  PRIMARY KEY (id)
  ) ENGINE=NDB DEFAULT CHARSET=latin1

Here is part of the code, simplified for this post:

 char vc[128+1]; // Size of 'vc', +1 for length info
 char vclong[1280+2]; // Size of 'vclong', +2 for length info
 ..
 make_ndb_varchar(vc, "NDB API kicks ass");
 operation->setValue("vc", vc);
 ..

The above example uses latin1. You could use Unicode, but that would probably mean converting from one encoding to the other using iconv. That’s another story.

This post complements Johan Andersson’s blog entry. Thanks to my colleagues Mats and Roger who helped me with a silly problem today regarding this function.

Tags: , ,

My sessions at the MySQL Conference & Expo 2010

February 10th, 2010 No comments

A preliminary schedule is now available for the MySQL Conference & Expo 2010 (in Santa Clara, California, USA). I have two talks and a tutorial, currently scheduled as follows:

Schedule might change a bit, so I suggest keeping an eye on the conference website.

And please remember: we might have changed colors a few times past 2 years, but we are still the nice and friendly people we were back in MySQL AB!

MySQL Cluster: Listen and know what’s going on

December 16th, 2009 2 comments

You’re not scared of writing clusterious code and eavesdropping is your favorite pastime at work? You want to know what’s going on in your MySQL Cluster but were afraid asking? The MySQL Cluster Management API can help you!

Below you’ll find example C-code that will get you started with MGM API. It’s rather dull at first, but imagine you, instead of printing the event information, taking action. Imagine you starting another thread where you run some procedure which tells a monitoring system: “Hey! Some error happend!”. Imagine you spending countless hours of clusterious fun with this API!

The example listens for events in 4 categories (Statistic, Info, Error and Checkpoint) with the greatest level of 15. The code only prints out what event was received, and it also shows how to interprete the event.category, which proves to be a bit tricky.

#include <stdlib.h>
#include <unistd.h>
#include <mgmapi/mgmapi.h>

int main()
{
  NdbMgmHandle handle = ndb_mgm_create_handle();
  
  NdbLogEventHandle logevent;
  struct ndb_logevent event;
  
  int res, category;
  int timeout = 1000; // in ms
  int filter[] = {
    15, NDB_MGM_EVENT_CATEGORY_STATISTIC,
    15, NDB_MGM_EVENT_CATEGORY_INFO,
    15, NDB_MGM_EVENT_CATEGORY_ERROR,
    15, NDB_MGM_EVENT_CATEGORY_CHECKPOINT,
    0
  };
  
  ndb_mgm_set_connectstring(handle,"localhost");
  ndb_mgm_connect(handle,0,0,0);
  
  logevent = ndb_mgm_create_logevent_handle(handle, filter);
  while(true) {
    res = ndb_logevent_get_next(logevent,&event,timeout);
    category = event.category + CFG_MIN_LOGLEVEL;
    if (category == NDB_MGM_EVENT_CATEGORY_CHECKPOINT) {
      printf("We got a checkpoint event!\n");
    }
    printf("Event type %d category %d\n",
       event.type, event.category);
    usleep(500); // temporize!
  }
  
  ndb_mgm_destroy_handle(&handle);
}

Note that the above is very bare bone without error handling, but it works. Also, run it on the machine which runs your Management Node (ndb_mgmd), or change the connection string in the code.

Compiling can be a bit of a chore, unless you read “Want to compile a MySQL Cluster MGM API application?“.

Tags: , ,

Clusterious defined

December 10th, 2009 1 comment

clus·te·ri·ous |ˈkləstərēəs;|
adjective
highly pleasant to MySQL Support Engineers: A clusterious issue a day, keeps the spouse away.
· clusterful : a hard day.

DERIVATIVES
clusteriously adverb
clusteriousness noun

ORIGIN MySQL Support Team: via Sun Microsystems.

Clusterious
noun
a rare variety of MySQL Support Engineers originally cultivated in Sweden.

Tags: , ,

Running MySQL Cluster on Mac: working around a ndb_mgmd bug

November 25th, 2009 3 comments

A week ago we found a workaround for a bug in MySQL Cluster making it impossible to run a management node on MacOS X. Until the bug is fixed, you should use the --nodaemon option for the ndb_mgmd executable. Both MySQL Cluster v6.3 and v7.0 are affected.

Currently, I’m starting the management node like this:


(
cd /opt/mysql/mysql ;
./libexec/ndb_mgmd -f /opt/mysql/config.ini \
--nodaemon 2>/dev/null 1>&2 </dev/null &
)

Obviously, you’ll want to change the paths.

Eventually, the bug will get fixed, but until then you got no excuse to not try MySQL Cluster on Mac!

Tags: , ,