Home > Python > Stuffing the gaps in the COLLATIONS table using a stored procedure

Stuffing the gaps in the COLLATIONS table using a stored procedure

February 17th, 2010 Leave a comment Go to comments

To far fetched (pun inteded), some might think.. Below you’ll find a procedure to get a list of MySQL supported character sets and their collations. The output is Python and can be used to build a (big) tuple.

The problem is that character set IDs in MySQL have ‘gaps’. For example hebrew has ID 16, and the next character set in the COLLATIONS-table, tis620, has ID 18. Not a big problem, just a bit annoying. This list is hardcoded in MySQL Connector/Python and I needed something to easily maintain it.

Solutions were using text editing skills, parsing it in Python to produce a list, etc.. But why not a Stored Routine? The following stored procedure is looping over a cursor, calculating the number of rows the gap has, and inserts blanks as needed.

DROP PROCEDURE IF EXISTS python_getcharsets;
DELIMITER //
CREATE PROCEDURE python_getcharsets()
BEGIN
  DECLARE i,diff,cid,done INT DEFAULT 0;
  DECLARE chname,coname VARCHAR(32);
  DECLARE cur CURSOR FOR SELECT ID,CHARACTER_SET_NAME,
    COLLATION_NAME FROM INFORMATION_SCHEMA.COLLATIONS
    ORDER BY ID;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  OPEN cur;
  loop_cur: LOOP
    FETCH cur INTO cid,chname,coname;
    IF done THEN LEAVE loop_cur; END IF;
    SET i = i + 1; SET diff = cid - i;
    WHILE diff DO
      SELECT "None,";
      SET diff = diff - 1; SET i = i + 1;
    END WHILE;
    SELECT CONCAT('("',chname,'","',coname,'"), # ',cid);
  END LOOP loop_cur;
  CLOSE cur;
END//
DELIMITER ;

CALL python_getcharsets();
DROP PROCEDURE IF EXISTS python_getcharsets;

You would save the above to a file called getcharsets.sql for example, and execute it like this:

shell> mysql -N test < getcharsets.sql 
..
("cp1251","cp1251_bulgarian_ci"), # 14
("latin1","latin1_danish_ci"), # 15
("hebrew","hebrew_general_ci"), # 16
None,
("tis620","tis620_thai_ci"), # 18
("euckr","euckr_korean_ci"), # 19
..

I’ll spare you the complete output, but as you can see from above sample: the gap has been stuffed with a None-Python value. Taking this output, you’ll inserted it your code:

desc = (
    None,
    ("big5","big5_chinese_ci"), # 1
    ("latin2","latin2_czech_cs"), # 2
    ("dec8","dec8_swedish_ci"), # 3
    ("cp850","cp850_general_ci"), # 4
..

Silly? Definitely basic stuff, but I got my code a bit faster, and cleaner!

  1. Roland Bouman
    February 17th, 2010 at 19:25 | #1

    Nice, but what about this:

    select concat(
    'desc = ('
    , group_concat(
    '\n '
    , if( collations.id is null, 'None'
    , concat(
    '('
    , '"', character_set_name, '"'
    , ','
    , '"', collation_name, '"'
    , ')'
    )
    )
    , if(ids.id=255, '', ',')
    , ' #', ids.id
    order by ids.id
    separator ''
    )
    , '\n)'
    )
    from (
    select (t0.id << 0)
    + (t1.id << 1)
    + (t2.id << 2)
    + (t3.id << 3)
    + (t4.id << 4)
    + (t5.id << 5)
    + (t6.id << 6)
    + (t7.id << 7) id
    from (select 0 id union all select 1) t0
    , (select 0 id union all select 1) t1
    , (select 0 id union all select 1) t2
    , (select 0 id union all select 1) t3
    , (select 0 id union all select 1) t4
    , (select 0 id union all select 1) t5
    , (select 0 id union all select 1) t6
    , (select 0 id union all select 1) t7
    ) ids
    left join information_schema.collations
    on ids.id = collations.id

  2. Geert JM Vanderkelen
    February 17th, 2010 at 19:39 | #2

    @Roland: crazyness! But seems to work! :)

  3. Roland Bouman
    February 18th, 2010 at 00:50 | #3

    Geert, here's some background to the craziness :)

    http://rpbouman.blogspot.com/2010/02/mysql-best-stored-routine-is-one-you.html

  1. No trackbacks yet.