This blog post shows how to create a cursor class for MySQL Connector/Python which will allow you to cache queries. It will hold the query itself and the result in a global variable.
Note: this is a proof of concept and is only meant as a demonstration on how to extend MySQL Connector/Python.
Why query caching?
You are doing lots of queries that have the same result. It would be expensive to always run the same exact query. MySQL has already a query cache, and there is also memcached. But you like MySQL Connector/Python so much you’d like to do it yourself.
A cursor caching queries and their result
To demonstrate a simple implementation of a query cache, we inherit from an existing class: MySQLCursorBuffered
. It will save the executed operation with their results in a ‘global’ variable. We call this cursor MySQLCursorQueryCache
.
We take the buffered cursor because we’d like to save the result right away. Below you see we only changed two methods for MySQLCursorBuffered
:
.execute()
: it will now first check using an md5 checksum whether we executed the query before. If we did, we set the make the cached result active. If not, we simply executed.._handle_resultset()
: called when.execute()
did an operation which has a result set. The result we know save in theQUERY_CACHE
globaldict
.
from hashlib import md5
import mysql.connector
QUERY_CACHE = dict()
class MySQLCursorQueryCache(mysql.connector.cursor.MySQLCursorBuffered):
def execute(self, operation, params=None):
self._qmd5 = md5(operation).digest()
if QUERY_CACHE.has_key(self._qmd5):
(self._rows, eof) = QUERY_CACHE[self._qmd5]
self.rowcount = len(self._rows)
self._handle_eof(eof)
self._next_row = 0
else:
super(MySQLCursorQCache, self).execute(operation, params)
def _handle_resultset(self):
(self._rows, eof) = self.db().protocol.get_rows()
self.rowcount = len(self._rows)
self._handle_eof(eof)
self._next_row = 0
QUERY_CACHE[self._qmd5] = (self._rows, eof)
try:
self.db().unread_result = False
except:
pass
self._qmd5 = None
The above code is a proof of concept, there is lots of room for improvement. For example, you need something to invalidate entries in the query cache.
How to use MySQLCursorQueryCache
def main():
cnx = mysql.connector.connect(database='test')
cur = cnx.cursor(cursor_class=MySQLCursorQueryCache)
cur.execute("SELECT NOW()")
print cur.fetchone()
time.sleep(2)
cur.execute("SELECT NOW()")
print cur.fetchone()
cur.close()
cnx.close()
When you would use the default cursor, both executed SQL statements would produce a different result. The above produces the following output:
(datetime.datetime(2010, 11, 22, 21, 20, 4),)
(datetime.datetime(2010, 11, 22, 21, 20, 4),)
When you’d like to have some statements cached, and some not, just create a second cursor cursor_class=MySQLCursorBuffered
(see Buffering results with MySQL Connector/Python).
Comments
@Marius Gedminas Good point Marius! You’ll need to store the statement after substituting the parameters. Or, you could add the params-argument to the operation string when hashing it:
self._qmd5 = md5(operation + repr(params)).digest()