Connector/Python 2.1.1 Alpha released with C Extension

MySQL Connector/Python 2.1.1 took a while to release and that was because we had to add some more packages which contains the optional C Extension. Note that this is still Alpha and we want you guys to report any problems and requests.

The Connector/Python C Extension was added because in certain situations, for example reading a huge result set, can take a long time with pure Python. That’s why we choose to interface with Connector/C (libmysqlclient).

Note: Pure Python is still default and it will be kept that way!

Installing Connector/Python 2.1 didn’t change much:

$ sudo python setup.py install

If you’d like the C Extension, you have to first install MySQL Connector/C or have the MySQL Server development packages available. Careful with mixing 32 and 64-bit: make sure Python matches your MySQL libraries. Connector/Python will try to detect the mismatch and notify you.

For example, on OS X with development tools installed, I would do the following:

$ virtualenv CPYENV
$ source CPYENV/bin/activate
$ tar xzf ~/Downloads/mysql-connector-c-6.1.5-osx10.7-x86_64.tar.gz
$ tar xzf ~/Downloads/mysql-connector-python-2.1.1.tar.gz
$ cd mysql-connector-2.1.1
$ python setup.py install --with-mysql-capi=../mysql-connector-c-6.1.5-osx10.7-x86_64

If all goes well, the above would have compiled and install the C Extension together with the pure Python code inside a virtual environment. Here is how you can check if the C Extension is available:

import mysql.connector
mysql.connector.HAVE_CEXT

# Output == True

If you want to see the speed improvements, you can load up the employees sample database and do the following in the Python interpreter:

import mysql.connector
cnx = mysql.connector.connect(user='root', database='employees')
cnxc = mysql.connector.connect(use_pure=False, user='root', database='employees')
cur = cnx.cursor()
q = "SELECT * FROM salaries"
s=time(); cur.execute(q); r=cur.fetchall(); print("%.2f" % (time()-s))

# Output ~= 65.57

cur = cnxc.cursor()
s=time(); cur.execute(q); r=cur.fetchall(); print("%.2f" % (time()-s))

# Output ~= 13.09

That’s 66 seconds vs. 13 seconds using the C Extension.

If that is not fast enough, and it is not, you can directly load the C Extension and use the wrapper around the MySQL C API (see manual). Here’s an example:

import _mysql_connector
cnx = _mysql_connector.MySQL()
cnx.connect(user='root', database='employees')
cnx.query("SELECT emp_no, last_name, hire_date FROM employees")
print(cnx.fetch_row())
cnx.free_result()
cnx.close()

# Output == (10001, 'Facello', datetime.date(1986, 6, 26))

It is a bit different than using mysql.connector, but notice that result coming from the C Extension is also converted to Python data types.

How fast is using _mysql_connector? Lets say we want the raw data, save the following to a Python script file and execute:

from time import time

import _mysql_connector

cnx = _mysql_connector.MySQL(raw=True)
cnx.connect(user='root', database='employees')
cnx.query("SELECT * FROM salaries")
s = time()
row = cnx.fetch_row()

while row:
  row = cnx.fetch_row()

cnx.free_result()
print("All fetched in %.2fs" % (time() - s))

cnx.close()

# Output ~= All fetched in 2.25s

If you put it all together, and this is not scientific, just on my OS X MacBook, SELECT * FORM slaries:

SELECT * FORM salaries Duration
Pure Python, PEP-249 66s
Pure + C Extension, PEP-249 13s
C Extension, direct, with conversion 12s
C Extension, direct, raw 3s

If you want to dump big sets of data, and you want to do it the Python way, you can use the C Extension to get it faster.

Yes, the C Extension works and compiles on Windows!

Comments

Wlad
What it is exactly that is taking so much in pure python connector? An explanation would be nice
Geert Vanderkelen
The issue is with parsing the (text) MySQL packets coming from the server. For relatively small result sets it wouldn’t show. For big results how ever, it starts to matter. There are a few solutions, but to make it really fast, we had to put it in a C Extension.
Luke

I think the examples used to illastrate the difference of the speed between 2 versions are not meaningfull the second is fast than the first maybe because the data is already in the buffer pool of MySQL (so actually the second fetch with Connector with C Extension just get data from memory and no need for physical I/Os)

I think you should restart the mysql server(so the buffer pool is empty) to run the second test

Geert Vanderkelen

I actually ran the SELECT * FROM salaries first, then re-run while timing it. So the data was warmed up. Real benchmarking would evolve more work, but that I think it’s an exercise for the users to publish that :)

The point is: it is faster, and it will solve a few things for people who want to stick with Connector/Python and need to process huge result sets.

Darragh McCurragh
“66 seconds vs. 13 seconds using the C Extension” then down to 3 seconds is really a twelve-fold improvement. Several hundred calls per day that is the difference between “life and death”. thanks.
Jacek
Is the upcoming version comatible with Django 1.8? Current stable no longer works with Django 1.8. Update would be very nice :)
Geert Vanderkelen
@Jacek: Django is releasing fast! Please open a report on bugs.mysql.com and ask for it. It is always good to have community pushing :)