Query caching with MySQL Connector/Python
This blog post shows how to create a cursor class for MySQL Connector/Python which will allow you to cache queries. It will hold the query itself and the result in a global variable.
Note: this is a proof of concept and is only meant as a demonstration on how to extend MySQL Connector/Python.
Why query caching?
You are doing lots of queries that have the same result. It would be expensive to always run the same exact query. MySQL has already a query cache, and there is also memcached. But you like MySQL Connector/Python so much you’d like to do it yourself.
A cursor caching queries and their result
To demonstrate a simple implementation of a query cache, we inherit from an existing class: MySQLCursorBuffered. It will save the executed operation with their results in a ‘global’ variable. We call this cursor MySQLCursorQueryCache.
We take the buffered cursor because we’d like to save the result right away. Below you see we only changed two methods for MySQLCursorBuffered:
- .execute(): it will now first check using an md5 checksum whether we executed the query before. If we did, we set the make the cached result active. If not, we simply executed.
- ._handle_resultset(): called when .execute() did an operation which has a result set. The result we know save in the QUERY_CACHE global dict.
from hashlib import md5 import mysql.connector QUERY_CACHE = dict() class MySQLCursorQueryCache(mysql.connector.cursor.MySQLCursorBuffered): def execute(self, operation, params=None): self._qmd5 = md5(operation).digest() if QUERY_CACHE.has_key(self._qmd5): (self._rows, eof) = QUERY_CACHE[self._qmd5] self.rowcount = len(self._rows) self._handle_eof(eof) self._next_row = 0 else: super(MySQLCursorQCache, self).execute(operation, params) def _handle_resultset(self): (self._rows, eof) = self.db().protocol.get_rows() self.rowcount = len(self._rows) self._handle_eof(eof) self._next_row = 0 QUERY_CACHE[self._qmd5] = (self._rows, eof) try: self.db().unread_result = False except: pass self._qmd5 = None
The above code is a proof of concept, there is lots of room for improvement. For example, you need something to invalidate entries in the query cache.
How to use MySQLCursorQueryCache
def main(): cnx = mysql.connector.connect(database='test') cur = cnx.cursor(cursor_class=MySQLCursorQueryCache) cur.execute("SELECT NOW()") print cur.fetchone() time.sleep(2) cur.execute("SELECT NOW()") print cur.fetchone() cur.close() cnx.close()
When you would use the default cursor, both executed SQL statements would produce a different result. The above produces the following output:
(datetime.datetime(2010, 11, 22, 21, 20, 4),) (datetime.datetime(2010, 11, 22, 21, 20, 4),)
When you’d like to have some statements cached, and some not, just create a second cursor cursor_class=MySQLCursorBuffered (see Buffering results with MySQL Connector/Python).