MySQL Connector/Python is (or should be) compliant with the Python DB-API 2.0 specification. This means that you can use DBUtils' PooledDB module to implement database connection pooling.
Here below you'll find an example which will output the connection ID of each connection requested through the pooling mechanism.
from DBUtils.PooledDB import PooledDB
import mysql.connector
def main():
pool_size = 3
pool = PooledDB(mysql.connector, pool_size,
database='test', user='root', host='127.0.0.1')
cnx = [None,] * pool_size
for i in xrange(0,pool_size):
cnx[i] = pool.connection()
cur = cnx[i].cursor()
cur.execute("SELECT CONNECTION_ID()")
print "Cnx %d has ID %d" % (i+1,cur.fetchone()[0])
cur.close()
for c in cnx:
c.close()
The output will be something like this:
Cnx 1 has ID 42 Cnx 2 has ID 41 Cnx 3 has ID 40
This post explains how to disable Arbitration when using MySQL Cluster. It gives a case where this could be useful.
First, a piece of advice: you do not want to run MySQL Cluster with arbitration disabled. But if you must, e.g. because of an oversight in your implementation, you can.
Arbitration is very important in MySQL Cluster. It makes sure you don't end up with a Split Brain situation: 2 halves working independently, continuing changing data, making it impossible for them to work together later on.
However, Arbitration comes with a price: you need an extra machine. "Sure, what's the big deal?". It's not that easy when you lack the money, or more problematic, when you lack the real-estate in your rack.
Everyone running MySQL Cluster should know that you should not run the ndb_mgmd on the same machines on which the data node processes, ndbd or ndbmtd, are running. The Management Nodes need to be on a separate machine so it can act as an Arbitrator.
Here's an example why: If you have two hosts A and B and both are running a management and data node process. Host A's ndb_mgmd is currently the Arbitrator. Now unplug host A *BANG*: one data node and the arbitrator down. The other data node on Host B notices this, and tries to figure out if it can continue. So it checks if it can reach the Arbitrator: but it's gone as well! So, the data node on host B goes faithfully down. This all happens in a few seconds, there is no time to elect a new Arbitrator. "Cluster's dead, Jim".
What if you can't get a 3rd machine? There's an option for that.. Data nodes can be configured with setting the Arbitration-option to WaitExternal. This means you will have to develop your own arbitration application or script. How cool is that? Well, it might be cool, but it's a pain in the butt.
[ndbd default] Arbitration = WaitExternal ArbitrationTimeout = 3
What happens with our 2 host setup with above changes: When Host A, which has the Arbitrator, goes down, the data node on Host B will wait for 3 seconds, i.e. ArbitrationTimeout. It will block all incoming transactions, refusing changes. An application, the External Arbitrator, running on Host B (actually on all hosts running MySQL Cluster proceses) has 3 seconds to figure out whether Host B can continue running it's ndbd process(es), or not. In this case, it should find out that Host A is down and that Host B should continue keeping the data available.
"Ah, easy! Problem solved!", you might joyfully exclaim. No, it isn't. It's more complicated than that. What happens when Host A doesn't go down, but both hosts can't see each other due to a network issue between them? Both External Arbitrators would figure out that they need to continue: you end up again with a split brain. So you still need someway to handle that.
At this point, I would like to say: "Goodluck!". Every situation is going to be different. Everyone will have his own External Arbitrator requirements or ways to check if a host or blade chassis is up or not. It's a great option, and it puts you more in control of your MySQL Cluster, but it adds a lot of complexity.
So, my advice: revise and correct your MySQL Cluster setup when you think you need to disable Arbitration.
This is a friendly reminder to check the publication date and discussed version you MySQL books before starting out hacking or even posting about limitations. Lots of old books are still going around. Maybe it's good to destroy them rather than giving them to students or newbies.
Few days ago (28 May 2010), for example, we had a word-for-word copy of a book on a blog post (now removed) which was discussing MySQL Cluster limitations from years ago. Well, it was funny at first and we had a good laugh. But it's a bit worrisome. My colleague Matthew posted a rebuttal post.
How would you recycle the old, technical books? It's not worth giving them to public libraries, it's maybe unhealthy to burn them? How would you do it?
We just released MySQL Connector/Python 0.1.5 which includes a critical bug fix. It was impossible to read big result sets. The files for 0.1.4-release have been removed.
You can download MySQL Connector/Python from Launchpad.
Highlights:
- It was impossible to retrieve big result sets. (bug lp:551533 and lp:586003)
- Changing copyright from Sun to Oracle (also fixing silly typo)
A very Big Thanks goes to the reporters of bug lp:551533 and lp:586003. Apologies for not being able to reproduce the bug earlier, before releasing 0.1.4.
About MySQL Connector/Python: MySQL Connector/Python is implementing the MySQL Client/Server protocol completely in Python. No MySQL libraries are needed, and no compilation is necessary to run this Python DB API v2.0 compliant driver. It is compatible with Python v2.5 and later as well as Python v3.1 and later.
Next development release 0.1.4 of MySQL Connector/Python is now available for download. This will be the last in the 0.1-series as we move on to 0.2. The aim is to release more often to get to v1.0. Hurray!
Highlights:
- Reading from network was broken for bigger packages.
- Reimplementing protocol.MySQLProtocol marking simpler and easier to maintain.
- It is now possible to send multiple statements to MySQL using MySQLCursor.execute(). The results are accessible by calling the method next_resultset().
- MySQLCursor.callproc() will now store all result sets as a MySQLCursorBuffered. They are accessible using the next_proc_resultset() method. The result of the stored procedure is returned by callproc() itself as defined by PEP249.
- MySQL DATETIME conversion to Python datetime.datetime is now much faster.
- Some overall performance improvements.
- Copyright notice changes.
Big thanks to everyone using and reporting bugs found in MySQL Connector/Python. Don't hesitate to ask questions and report problems or feature requests using Launchpad.
About MySQL Connector/Python: MySQL Connector/Python is implementing the MySQL Client/Server protocol completely in Python. No MySQL libraries are needed, and no compilation is necessary to run this Python DB API v2.0 compliant driver. It is compatible with Python v2.5 and later as well as Python v3.1 and later.
This blog posts explains how to add a new line in strings in MySQL Stored Procedures and how to output the result using the MySQL client tool.
Today I was fooling around with some stored procedure making it more fancy and stuff. What I wanted was the OUT variable to contain a newline. Easy of course, using CONCAT:
mysql> SELECT CONCAT('foo','\n','bar');
+--------------------------+
| CONCAT('foo','\n','bar') |
+--------------------------+
| foo
bar |
+--------------------------+Now, if youconcat strings in a stored procedure, it doesn't work as expected when you run it through the MySQL client tool mysql:
DELIMITER //
CREATE PROCEDURE sp1(OUT pres VARCHAR(6000))
BEGIN
SET pres = CONCAT('foo','\n','bar');
END;
//
DELIMITER ;
SET @res = 'foo ';
CALL sp1(@res);
SELECT @res;When we execute it, we get this:
shell> mysql -N test < foo.sql foo\nbar
What on earth is wrong? After some looking, we found a not so often used option called --raw. This produces the the desired effect:
shell> mysql -Nr test < foo.sql foo bar
But that's not all! Use \G when selecting the OUT-variable and it is also working. The output is not so useful though.
Ah.. The things you find out while having the day off..
Last week I was struggling to find an easy way to simulate a troubled Data Node (ndbd process) using MySQL Cluster. It's as simple as pancackes: using the kill command!
To freeze a process you just need to kill the process using the SIGSTOP signal. To let the processes continue, use SIGCONT. Here's an example shell script showing how you would use these two signals on a data node:
# 2010-05-03 08:11:46 [ndbd] INFO -- Angel pid: 542 ndb pid: 543
NDBDPID=`grep 'Angel pid' ndb_3_out.log | tail -n1 | awk '{ print $11 }'`
kill -STOP $NDBDPID
sleep 10
kill -CONT $NDBDPID
I'm using the out-log because the file ndb_3.pid contains only the PID of the Angel process. The sleep command is something variable which you can set as low or as high as you want.
In the above example the script sleeps long enough for data node to fail with an Arbitration Error. If you would set options HeartbeatIntervalDbDb and TimeBetweenWatchDogCheck to a lower value than the default, you would only be able to sleep for a few seconds. The result:
[MgmtSrvr] WARNING -- Node 2: Node 3 missed heartbeat 2 [MgmtSrvr] WARNING -- Node 2: Node 3 missed heartbeat 3 [MgmtSrvr] ALERT -- Node 1: Node 3 Disconnected [MgmtSrvr] ALERT -- Node 1: Node 3 Disconnected [MgmtSrvr] WARNING -- Node 2: Node 3 missed heartbeat 4 [MgmtSrvr] ALERT -- Node 2: Node 3 declared dead due to missed heartbeat [MgmtSrvr] INFO -- Node 2: Communication to Node 3 closed [MgmtSrvr] ALERT -- Node 2: Network partitioning - arbitration required [MgmtSrvr] INFO -- Node 2: President restarts arbitration thread [state=7] [MgmtSrvr] ALERT -- Node 2: Arbitration won - positive reply from node 1 [MgmtSrvr] ALERT -- Node 2: Node 3 Disconnected [MgmtSrvr] INFO -- Node 2: Started arbitrator node 1 [ticket=019b00025cc8aad8] [MgmtSrvr] ALERT -- Node 3: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
How is this useful? Well, for simulating a data node which is having problems while having load for example. Maybe you would like to see what happens if you tune the WatchDog or Hearbeat parameters. Or maybe you want to give a demonstration to your management without going through hassel of overloading a disk or CPU or pulling network cables (e.g. for prove of concept).
In any case, I think it's a cool use of the kill-command. One I didn't know of.
Last week, my colleague Massimo and I discussed how to handle big result sets coming from MySQL in Python. The problem is that MySQL doesn't support server-side cursors, so you need to select everything and then read it. You can do it either buffered or not. MySQL Connector/Python defaults to non-buffered, meaning that you need to fetch all rows after issuing a SELECT statement. You can also turn on the buffering, mimicking what MySQL for Python (MySQLdb) does.
For big result sets, it's better to limit your search. You can do this using an integer primary key or some temporal field for example. Or you can use the LIMIT keyword. The latter solution is what is used in the MySQLCursorServerSide cursor-class. Using the SELECT it creates a temporary table from which the fetch-methods will get the information. It is something people have probably implemented in their applications, but I hope this new class will make it easier since it's done transparently.
The code is not pushed yet, but expect it to be available in next release. Here is an example how you could use it. This code selects cities staring with Z, loops over the result getting the country (yes, this is a simple join made difficult):
cnx = db.connect(user='root',db='world')
cur = cnx.cursor()
curCity = cnx.cursor(db.cursor.MySQLCursorServerSide)
curCity.execute("SELECT ID,Name,CountryCode FROM City "\
"WHERE NAME LIKE 'Z%' ORDER BY ID")
for city in curCity:
cur.execute("SELECT Code,Name FROM Country WHERE CODE = %s",
(city[2],))
country = cur.fetchone()
print "%s (%s)" % (city[1], country[1])
cur.close()
cnx.close()
I guess the main advantage is that you can use two or more cursor objects with the same connection without the need of buffering everything in Python. On the MySQL side, the temporary table could go to disk when to big. It's maybe slower, but keeping big result sets in memory ain't good either.
Comments are welcome!
Latest code of MySQL Connector/Python on launchpad has support for multiple result sets when you execute a stored procedure. We also changed the way the actual result of the routine is returned to conform to PEP249.
Here is some example code: it creates a stored procedure which generates 2 result sets. You can get the result by calling next_resultset(), which is returning a MySQLCursorBuffered.
cur = cnx.cursor()
cur.execute("DROP PROCEDURE IF EXISTS multi")
proc = """
CREATE PROCEDURE multi(IN pFac1 INT, IN pFac2 INT, OUT pProd INT)
BEGIN
SELECT 1,'a' as FooBar;
SELECT 2;
SET pProd := pFac1 * pFac2;
END"""
cur.execute(proc)
result = cur.callproc("multi", (5, 6, 0))
print "Result:", result
extcur = cur.next_resultset()
i = 1
while extcur:
rows = extcur.fetchall()
print "Result set #%d:" % i
print rows
extcur = cur.next_resultset()
i += 1
cur.close()
The output:
Result: ('5', '6', 30)
Result set #1:
[(1, u'a')]
Result set #2:
[(2,)]
As mentioned above: this will be part of 0.1.4-devel release due next week. Comments are welcome through the associated bug report.
You are using MySQL Cluster and crazy enough to digest NDB API? Sick of SQL? Here's a treat: a function to make C/C++ strings ready for inserting into a VARCHAR field. The special thing about them is that the length is prefixed in the first 2 bytes.
void make_ndb_varchar(char *buffer, char *str)
{
int len = strlen(str);
int hlen = (len > 255) ? 2 : 1;
buffer[0] = len & 0xff;
if( len > 255 )
buffer[1] = (len / 256);
strcpy(buffer+hlen, str);
}
Yes, you can use memcpy. Whatever floats your boat.
Lets use this function for a table t1, defined as follows (note: latin1!):
CREATE TABLE t1 ( id INT UNSIGNED NOT NULL, vc VARCHAR(128), vclong VARCHAR(1280), PRIMARY KEY (id) ) ENGINE=NDB DEFAULT CHARSET=latin1
Here is part of the code, simplified for this post:
char vc[128+1]; // Size of 'vc', +1 for length info
char vclong[1280+2]; // Size of 'vclong', +2 for length info
..
make_ndb_varchar(vc, "NDB API kicks ass");
operation->setValue("vc", vc);
..
The above example uses latin1. You could use Unicode, but that would probably mean converting from one encoding to the other using iconv. That's another story.
This post complements Johan Andersson's blog entry. Thanks to my colleagues Mats and Roger who helped me with a silly problem today regarding this function.
Few days ago, the folks at SQLAlchemy pushed some proposed modification to the MySQL Connector/Python dialect. Before this patch, previous report yielded 72 errors and 11 failures. Now we got down to 9 errors, but the failures are still lingering. Is this an improvement? Yes and no, failures should go down, but there are some SQLAlchemy tests I just can't figure out, yet.. clues are welcome!
Here are some detailed results which also included MySQLdb and oursql. I used SQLAlchemy revision 6788 (i.e. from svn trunk) which has now 2143 unittests:
- mysql.connector rev216: SKIP=1, errors=9, failures=11 (355.707s)
- MySQLdb 1.2.3c1: SKIP=2, errors=8, failures=1 (315.884s)
- oursql 0.9.1: SKIP=1, errors=8, failures=2 (322.318s)
Software used: MacOSX v10.6.2, MySQL v5.1.42 and Python v2.6.1.
Using SQLAlchemy might not be the best way to messure how mature MySQL Connector/Python is, but it sure helps lots.
To far fetched (pun inteded), some might think.. Below you'll find a procedure to get a list of MySQL supported character sets and their collations. The output is Python and can be used to build a (big) tuple.
The problem is that character set IDs in MySQL have 'gaps'. For example hebrew has ID 16, and the next character set in the COLLATIONS-table, tis620, has ID 18. Not a big problem, just a bit annoying. This list is hardcoded in MySQL Connector/Python and I needed something to easily maintain it.
Solutions were using text editing skills, parsing it in Python to produce a list, etc.. But why not a Stored Routine? The following stored procedure is looping over a cursor, calculating the number of rows the gap has, and inserts blanks as needed.
DROP PROCEDURE IF EXISTS python_getcharsets;
DELIMITER //
CREATE PROCEDURE python_getcharsets()
BEGIN
DECLARE i,diff,cid,done INT DEFAULT 0;
DECLARE chname,coname VARCHAR(32);
DECLARE cur CURSOR FOR SELECT ID,CHARACTER_SET_NAME,
COLLATION_NAME FROM INFORMATION_SCHEMA.COLLATIONS
ORDER BY ID;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN cur;
loop_cur: LOOP
FETCH cur INTO cid,chname,coname;
IF done THEN LEAVE loop_cur; END IF;
SET i = i + 1; SET diff = cid - i;
WHILE diff DO
SELECT "None,";
SET diff = diff - 1; SET i = i + 1;
END WHILE;
SELECT CONCAT('("',chname,'","',coname,'"), # ',cid);
END LOOP loop_cur;
CLOSE cur;
END//
DELIMITER ;
CALL python_getcharsets();
DROP PROCEDURE IF EXISTS python_getcharsets;
You would save the above to a file called getcharsets.sql for example, and execute it like this:
shell> mysql -N test < getcharsets.sql
..
("cp1251","cp1251_bulgarian_ci"), # 14
("latin1","latin1_danish_ci"), # 15
("hebrew","hebrew_general_ci"), # 16
None,
("tis620","tis620_thai_ci"), # 18
("euckr","euckr_korean_ci"), # 19
..
I'll spare you the complete output, but as you can see from above sample: the gap has been stuffed with a None-Python value. Taking this output, you'll inserted it your code:
desc = (
None,
("big5","big5_chinese_ci"), # 1
("latin2","latin2_czech_cs"), # 2
("dec8","dec8_swedish_ci"), # 3
("cp850","cp850_general_ci"), # 4
..
Silly? Definitely basic stuff, but I got my code a bit faster, and cleaner!
Is there a performance penalty using a pure Python database interface? Yes there is. But how much? .. also depends on who wrote it.
I started implementing some benchmarking for MySQL Connector/Python. Main reason is to identified bottlenecks or just plain bad coded. Another reasons: it's cool and the question was raised during my talk at FOSDEM and also online.
Oh, yes, MySQL Connector/Python ain't fast right now. Just compare a script spawning 10 threads opening 1000 connections. It indeed just connects.
mysql.connector 0.1.3-devel average: 0.0091820 MySQLdb 1.2.3c1 average: 0.0026477 oursql 0.9.1 average: 0.0007394 (MacOSX 10.6.2, MySQL 5.1.42, Python 2.6.1)
I'm not surprised with how bad Connector/Python did. I'm more intrigued by how oursql is performing compared to MySQLdb. Nice!
Well, it's a start. Now off for a walk!
A preliminary schedule is now available for the MySQL Conference & Expo 2010 (in Santa Clara, California, USA). I have two talks and a tutorial, currently scheduled as follows:
- MySQL Cluster Tutorial, Mon 08:30 on 12 Apr 2010, together with Andrew Hutchings and Andrew Morgan
- MySQL Cluster: An Introduction, Tue 11:55 on 13 Apr 2010
- Connecting MySQL and Python, Tue 15:05 on 13 Apr 2010
Schedule might change a bit, so I suggest keeping an eye on the conference website.
And please remember: we might have changed colors a few times past 2 years, but we are still the nice and friendly people we were back in MySQL AB!
Yes, MySQL has transactions if you use InnoDB or NDB Cluster for example. Using these transactional storage engines, you'll have to commit (or roll back) your inserts, deletes or updates.
I've seen it a few times now with people being surprised that no data is going into the tables. It's not so a silly problem in the end. If you are used to the defaults in MySQL you don't have to commit anything since it is automatically done for you.
Take the Python Database Interfaces for MySQL. PEP-249 says that, by default, auto-commit should be turned off. You could turn it back on, but it's good practice to be explicit and commit in your code. Remember the Zen of Python!
Here is just a small example to show it. Uses MySQL Connector/Python, but it should also work with the other MySQL database interfaces:
import mysql.connector
cnx = mysql.connector.connect(db='test')
cur = cnx.cursor()
cur.execute("""CREATE TABLE innodb_t1 (
id INT UNSIGNED NOT NULL,
c1 VARCHAR(128),
PRIMARY KEY (id)
) ENGINE=InnoDB""")
ins = "INSERT INTO innodb_t1 (id,c1) VALUES (%s,%s)"
cur.execute(ins,
(1,'MySQL Support Team _is_ already the best',))
cnx.commit()
cur.close()
cnx.close()
This post explains how to compile oursql and install it on MacOS 10.6. oursql is a Python database interface for MySQL, an alternative to MySQL for Python (i.e. MySQLdb) and MySQL Connector/Python.
First, find out which MySQL you installed. This can be either the 32-bit or the 64-bit version. To make sure, find the mysqld (e.g. in /usr/local/mysql/bin) and do the following in a Terminal window:
shell> file /usr/local/mysql/bin/mysqld .../mysqld: Mach-O 64-bit executable x86_64
If you see x86_64, you got 64-bit, otherwise 32-bit. If you see both, then you have a universal build. This is important for specifying the ARGSFLAG when building.
Download oursql from Launchpad and unpack it into some directory. Using the information from above, you'll have to do following for 64-bit platform (or universal build) in a Terminal window:
shell> ARCHFLAGS="-arch x86_64" python setup.py build shell> sudo python setup.py install
For 32-bit, you'll have to do:
shell> ARCHFLAGS="-arch i386" python setup.py build shell> sudo python setup.py install
Following error will be reported when you don't specify the correct ARCHFLAGS:
ld: warning: in .../lib/libmysqlclient.dylib, file is not of required architecture
Tips:
- When building failed, it is good to remove oursql, unpack it and try again.
- If you don't want to compile anything, or run into more troubles, give MySQL Connector/Python a try (alpha releases). It's a pure Python implementation of the MySQL Client/Server protocol and doesn't need compiling or a MySQL installation.
- You can download MySQL from either www.mysql.com or dev.mysql.com.
Apparently, my talk at FOSDEM 2010 about Connecting MySQL and Python was the only one about Python? There should be more, or?
I have a hand-out ready in PDF. The slides are not usable without my chatter. It contains a few examples and links. Any comments, corrections, criticism.. are welcome!
The longer version of this talk will be given at the O'Reilly MySQL Conference&Expo 2010 in Santa Clara, California (USA).
Today we made a change in the schedule of talks held in the MySQL Developer Room at FOSDEM 2010, swapping two talks. Change is:
- 12:15 CET - Giuseppe's 'Sharding for the Masses
- 14:25 CET - Geert's Connecting MySQL and Python (title also changed)
The printed booklets found at the conference will not reflect the change, but the printable schedule has already been updated.
MySQL Connector/Python v0.1.3-devel is now available for download from Launchpad.org. Please note that this is a development (i.e. alpha, unstable, ..) release and we welcome everyone to test and report problems.
Highlights for this v0.1.3-devel:
- Important memory leak fixed when closing cursors.
- Warnings can now be raised as exceptions.
- Fixing unicode usage and broken error message when MySQL chops them
- Client flags can now be set correctly when connecting
- Conversion fixes for BIT/YEARSET and Python to DATE/DATETIME
- Adding MySQL Client Errors and raising better exceptions based on errno.
Enjoy!
Using SQLAlchemy unit test cases to further develop MySQL Connector/Python. It's probably debatable whether that's a good method or just lame. But it sure helps lots!
We've been pushing some code past days that makes Connector/Python almost pass all tests. Well, 4% is still failing, but I'm confident that in a few most problems will be dealt with. I had to make some changes to the SQLAlchemy v0.6 dialect as well, and some test cases had to be corrected. Hopefully those corrections will also go in the SQLAlchemy trunk later on.
shell> nosetests --dburi=mysql+mysqlconnector://root:@localhost/sqlalchemy
..
----------------------------------------------------------------------
Ran 2092 tests in 314.656s
FAILED (errors=72, failures=11)
If you run the same tests against MySQLdb, you'll see (of course) less failures (9 to be exact).
