Tuesday, March 01, 2011

Red Deadlock Redemption - MySQL Edition

Howdy there. It has been over a year since I last put finger to keyboard and brought to you my tales of woe. Well here comes another one but this is about MySQL and deadlocks which are fast becoming my nemesis. MySQL seems to deadlock at the slightest of things. Too many processes, high contention tables, flies farting in Greenland; all seem to be a cause.

My latest deadlock problem was from trying to emulate Oracle sequences in MySQL.Sequences are most commonly used to emulate MySQL's auto-incrementing key except they have the advantage that they are not linked to a single table. They can also increment at a defined step increase and are atomic. The work I have been doing required this feature; the ability to atomically reserve a range of identifiers. So a table structure was defined (auto increment identifier, name of the sequence and the last assigned value) & the code was written but this left one very big problem. How do you seed the table? A naive approach was formulated
  1. Select for update the sequence in question
  2. If it existed then update the row with the new value
  3. If not then insert a seed row
All well & good until a user reported back saying this procedure was deadlocking. How on Earth? I checked the row existed, if it did not then I inserted the data. All of this was carried out in one transaction so what was going wrong. You can find the condensed version of the logic from Gist which also causes a deadlock.

Well +1 for being able to replicate the issue; it always helps. So why do we get this failure? Well it is all because of InnoDB locking model and this little comment nestled in the depths of MySQL's documentation (thank you Tim Starling).
Note that if you use SELECT FOR UPDATE to perform a uniqueness check before an insert, you will get a deadlock for every race condition unless you enable the innodb_locks_unsafe_for_binlog option. A deadlock-free method to check uniqueness is to blindly insert a row into a table with a unique index using INSERT IGNORE, then to check the affected row count.
What what what?!? So what does this locking disabling do? Well it turns off next-key locking which combines index locking with gap locking in an attempt to avoid phantom reads (rows which appear in a table when you should have locked them normally caused by inserting new rows). So in the example process 1 and 2 locked the sequence table for rows which did not exist and to make sure no new rows appear. This stops both processes from inserting into the table so one has to fail.

Thankfully there are three options to resolve this. The first is to perform an insert ignore into the table before selecting for update. This will block the other process and the code cannot deadlock. The second way is to pre-populate the table with rows which we can then select for update. Number three is to retry the transaction & hope the deadlock does not re-appear.

All are valid but option two is my preferred solution since it cuts out have to perform an insert I know will fail every-time I want a new identifier range. If the schema & data was a bit more dynamic then option one would be the best solution. Option 3 has the feeling of being a band-aid but I concede sometimes thats all you can do.

Anyway one deadlock down. Another to go but that will have to wait for another time

No comments: