Tuesday, December 11, 2012

The surprising cost of doing nothing with Celery and SQS

I'm a big fan of leveraging open source solutions, so Celery seemed the obvious choice when I needed an asynchronous task scheduler for a Django project. Since I was hosting on Amazon EC2 and on a tight schedule, I opted to use Amazon's Simple Queue Service (SQS) rather than spend time setting up RabbitMQ on my server. My use case involved a trivially small number of messages, so I naively assumed the advertised "First 100,000 Amazon SQS Requests per month are free" would more than cover my needs, and quickly moved on to more pressing tasks.

I had a billing alert set at a single cent, so Amazon emailed me as soon as I went over 100,000 requests and started being billed for my usage. This happened quite quickly, while I was still in the early stages of development. A quick check of the AWS console, the terms of SQS and the Celery documentation made my folly clear: by default Celery spawns one worker process per core, with a default polling interval of 1 second. My quad-core development machine, running a local Django server instance but accessing SQS, had been polling an empty queue 4 times every second. At that rate the 100,000 free requests last only about seven hours, so I burned through them in a single day.
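To make the arithmetic concrete, here is a rough sketch of the numbers. It assumes, as described above, one request per worker per polling interval and nothing else hitting the queue; the function names are mine and purely illustrative, and the price is the 1 cent per 10,000 requests quoted in this post.

```python
SECONDS_PER_DAY = 24 * 60 * 60
FREE_REQUESTS_PER_MONTH = 100000       # free tier quoted in this post
USD_PER_10K_REQUESTS = 0.01            # price quoted in this post

def empty_polls_per_day(workers, polling_interval_s):
    """Requests per day from idle workers polling an empty queue."""
    return workers * SECONDS_PER_DAY / polling_interval_s

def hours_to_exhaust_free_tier(workers, polling_interval_s):
    """Hours of continuous polling before the free requests run out."""
    requests_per_hour = workers * 3600 / polling_interval_s
    return FREE_REQUESTS_PER_MONTH / requests_per_hour

def idle_cost_per_month_usd(workers, polling_interval_s, days=30):
    """Approximate monthly bill for doing nothing but polling."""
    total = empty_polls_per_day(workers, polling_interval_s) * days
    billable = max(0, total - FREE_REQUESTS_PER_MONTH)
    return billable / 10000 * USD_PER_10K_REQUESTS

if __name__ == "__main__":
    # Quad-core machine, 1 second polling interval (the defaults described above)
    print(empty_polls_per_day(4, 1))
    print(hours_to_exhaust_free_tier(4, 1))
    print(idle_cost_per_month_usd(4, 1))
```

With 4 workers and a 1 second interval this works out to 345,600 requests a day, and a little over ten dollars a month if left running around the clock.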

Celery can easily be configured to change the poll rate and number of workers, and SQS costs only 1 cent per 10,000 requests as of this writing, but it's still a silly way to waste money. The experience did make me take a moment to reconsider my design: once the site was sending real messages and using Celery's periodic tasks as well, my SQS usage was going to cost more than I had originally expected.
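For reference, a minimal sketch of the kind of settings involved, assuming the old-style uppercase setting names used by Celery 3.x in a Django settings module (newer Celery releases spell these `broker_transport_options` and `worker_concurrency`). The values are illustrative only, and AWS credentials are assumed to come from the environment.

```python
# settings.py (fragment) -- illustrative values, not a recommendation

# Use SQS as the broker; credentials are assumed to be supplied
# through the environment rather than embedded in the URL.
BROKER_URL = "sqs://"

# Poll the queue once a minute instead of once a second.
BROKER_TRANSPORT_OPTIONS = {
    "polling_interval": 60,  # seconds between polls of an empty queue
}

# Run a single worker process instead of one per CPU core.
CELERYD_CONCURRENCY = 1
```

The trade-off is latency: with a 60 second polling interval a task may wait up to a minute before a worker notices it, which is fine for a development machine but may not suit production.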
