Efficient queue polling on a database table

I know that database tables are supposed to be bad for queuing. Common advice is to use a messaging system, or asynchronous notifications, or both. But suppose for a moment that you have 24/7 queues that are rarely empty, so you mostly don’t need to worry about unneeded polling, and you just need a lot of things to happen constantly. Let’s play Devil’s Advocate for a moment and see if using a database for the task can be made less atrocious than it is supposed to be.

The standard features of a table used for polling are:

  • a sequential ID (or timestamp)
  • a “Status” column to distinguish processed from unprocessed entries
  • the payload data

The properties I seek for polling are:

  • cacheable and lightweight execution plan
  • “locking”, i.e. to avoid retrieving the same row twice
  • mutual exclusion between different pollers, if they exist
  • minimal blocking between different pollers

An approach like the following (T-SQL) satisfies the aforementioned properties. A new “Status” is introduced, to denote entries under processing. It is supposed that the “Status” of entries will be set to “processed” as the processing of each entry finishes, or reverted to “unprocessed” otherwise. An index on the “Status” column and the ID column (in that order) is also needed.

with T as ( 
	SELECT Q.*, ROW_NUMBER() over (order by id) RN FROM TheQueue Q
	WHERE TheStatus = 0
) 
UPDATE TheQueue with (rowlock)  
SET TheStatus=9, TmProcess=getdate() 
OUTPUT deleted.id, deleted.payload
WHERE id in (select id from T where RN<=@N)

If the processing of entries itself involves the database, don’t forget to limit the number of concurrent operations to as low as can be measured to perform efficiently. I’ve seen queues with 900 concurrent threads (I’m not making this up…) which sped up many orders of magnitude when the number of threads was reduced to… five. Notice also the setting of the “TmProcess” column, supposed to hold the timestamp the entry was dequeued. Add a column to register when the entry was enqueued and another one for when its processing was finished, and you can keep track of very useful analytics for your queues. I may return with another post on this particular subject.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s