It's all spinning wheels and self-doubt until the first pot of coffee.

Queue everything and delight everyone

This is a blog post I've had simmering in my brainmeats for well over a year or two. I'm suddenly inspired to break blog-radio-silence and get it out of my head.

From Let the microblogs bloom - RussellBeattie.com:

Once this is widely accepted (and I'm sure there are many that would argue with me), the thing that will separate these types of services won't be whether they stay up (ala Twitter), but how fast your subscription messages are updated. Some services might be smaller or offer more features but not update as quickly whereas others will pride themselves on being as close to real-time as possible. The key is that it's all about messaging, not publishing. (Oh, and this also facilitates federation as well, but that's another topic).

See also: Rearchitecting Twitter: Brought to You By the 17th Letter of the Alphabet - random($foo)

One of the problems it seems most modern web apps face is the tendency to want to do everything all at once, and all in the same code that responds directly to a user. Because, while you're in there building a user interface, it's easy to implement everything else that needs to happen in that same UI module or library.

Someone wants to post a bookmark? Someone wants to post a message? Well, of course you want the system to cross-reference and deliver that new piece of User Generated Content through every permutation of tag, recipient, keyword, and notification channel supported by your system.

But, do you really have to do everything all at once—while the person who generated that content is tapping his or her foot, waiting for the web interface to respond with feedback? Are all of these things immediately vital to the person watching the browser spin, right now?

No. Your user wants to get on with things. He or she wants to see the submitted content get accepted and, as feedback and confirmation, see it reflected in a personal view immediately. Does it matter—to this person, at this moment—whether it shows up simultaneously in a friend's inbox, the public timeline, a global tag page, or even an RSS or Atom feed?

Again, no, simultaneity doesn't really matter—because no human beings actually appreciate it. Instead, imagine a ripple effect of concentric social and attention contexts with associated people spreading out from the original submission. (This probably rates the creation of a diagram someday.)

  • To make the person who's submitting something happy, offer feedback visible in their own personal context within 50-200 milliseconds. (That is, less than half a second at worst, in people terms.)

  • The next person to delight is someone following the first person's published content, and humanly speaking, delays of thousands of milliseconds can be acceptable here. (That is, 1-10 seconds at worst, in people terms.)

  • Finally, you can start worrying about strangers, allowing the content to propagate to tag pages, keyword tracking pages, and other public views. I'd assert that delays on the order of a hundred thousand milliseconds are acceptable here. (That is, 1-2 minutes at worst, in people terms.)

The idea here is that the social structure can help you scale, while still delighting people. Even with these delays, the system is still better at getting the word out than the original content creator would be at notifying all the others involved with an out-of-band system like IM or email. And that's at worst—on most good days, all the delays should tend to be on the order of seconds or less.
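To make the ripple a little more concrete, here's a minimal sketch of treating the two outer rings as delay budgets on a single work queue. Everything in it is an assumption for illustration: the tier names, the `DeadlineQueue` class, and the `schedule_post` helper are hypothetical, and the budgets are just the "at worst" figures from the list above. The submitter's own view is left out on purpose, since that part happens directly in the request.

```python
import heapq
import itertools

# Hypothetical tiers; budgets are the "at worst" figures above, in seconds.
TIERS = {
    "followers": 10.0,  # people following the submitter
    "public": 120.0,    # tag pages, keyword tracking, other public views
}


class DeadlineQueue:
    """Orders jobs by the time they are due, earliest first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal due times

    def put(self, submitted_at, tier, payload):
        due = submitted_at + TIERS[tier]
        heapq.heappush(self._heap, (due, next(self._counter), tier, payload))

    def pop(self):
        due, _, tier, payload = heapq.heappop(self._heap)
        return due, tier, payload

    def __len__(self):
        return len(self._heap)


def schedule_post(queue, submitted_at, post_id):
    """Queue one delivery job per tier for a freshly accepted post."""
    for tier in TIERS:
        queue.put(submitted_at, tier, post_id)
```

With this ordering, a follower delivery for a later post comes out ahead of a public-page update for an earlier one, which is the point: the people closest to the author get served first.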

And how do you do all of this? Use queues. Sure, the original submission of content can and should be done all at once—just enough to get the content into the user's collection. Then, queue a job for further processing and get out of the way. In fact, just queue one job from the user interface—the processor of that queue can then queue further jobs for all the other individual processing tasks that are likely susceptible to plenty of parallel processing and horizontal scaling.
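Here's a rough sketch of that shape, using an in-process queue and threads from the Python standard library as a stand-in for whatever real queue service you'd run. The names (`submit`, `worker`, `FANOUT_TASKS`, `user_collections`, `delivered`) are all made up for the example, and "delivery" just appends to a list where a real system would do actual work.

```python
import queue
import threading

jobs = queue.Queue()
user_collections = {}  # user -> list of posts (the submitter's own view)
delivered = []         # (task, user, text) records standing in for side effects
delivered_lock = threading.Lock()

FANOUT_TASKS = ("followers", "tags", "feeds")  # hypothetical task names


def submit(user, text):
    """Runs in the web request: store the post, queue ONE job, return."""
    user_collections.setdefault(user, []).append(text)
    jobs.put(("fanout", user, text))
    return "thanks"


def worker():
    """Pulls jobs forever; a None job tells the worker to stop."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            kind, user, text = job
            if kind == "fanout":
                # The first job only queues the smaller, parallelizable jobs.
                for task in FANOUT_TASKS:
                    jobs.put((task, user, text))
            else:
                with delivered_lock:
                    delivered.append((kind, user, text))
        finally:
            jobs.task_done()


def start_workers(count=2):
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads
```

Calling `start_workers()` once and then `submit("alice", "hello")` returns right away; `jobs.join()` blocks until the fan-out has finished, which is handy in tests but is exactly what the web request should never do.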

Meanwhile, the original user creating content has been thanked for their submission and life goes on. In fact, their life may include going on to submit many more pieces of content in rapid succession, thanks to your delightfully responsive web user interface.

And, in the end, that's really the purpose of a web-based content creation interface—accepting something as quickly as possible to make the user happy enough to continue submitting more. The other part of the user interface, retrieval, serves simply to get the original content distributed as fast as can be reasonably expected.

Now, preparing for fast retrieval is another story. The flip side of processing queues is message inboxes: expect content duplicated everywhere and fetched simply, rather than assembled with cleverly expressed SQL joins that bring a system to its knees. But, that's another post altogether. :)
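Just as a teaser, the inbox idea might look something like the sketch below. It's an in-memory toy under my own assumptions: `follow`, `deliver`, and `read_inbox` are hypothetical names, and plain dictionaries stand in for whatever storage you'd really use. The post gets copied into every follower's inbox at delivery time, so reading is a simple slice with no joins.

```python
from collections import defaultdict

followers = defaultdict(set)  # author -> set of readers following them
inboxes = defaultdict(list)   # reader -> list of (author, text), newest last


def follow(reader, author):
    followers[author].add(reader)


def deliver(author, text):
    """Run from a queue worker: copy the post into each follower's inbox."""
    for reader in followers[author]:
        inboxes[reader].append((author, text))


def read_inbox(reader, limit=20):
    """Retrieval is a plain slice of precomputed data, newest first."""
    return inboxes[reader][-limit:][::-1]
```

The trade is storage for speed: each post is stored once per follower, but reading an inbox never has to ask who follows whom.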

Archived Comments

  • Interesting bits of conversation happening all over the place regarding queues and the irony of it all - the much maligned Java has had workqueues since the early days.

    What everyone will learn, rather painfully in cases like Twitter, is that all data is not created, consumed or processed equally. If you write a system that treats all data equally, you'll wind up with many Twitters all over the place.

  • Oh, definitely. Work queues are not a new thing at all. It's just that I think there're a lot of modern web app builders who skipped Java "enterprise" software—skipped, or hoped to run away—and are rediscovering the whole set of problems. Maybe the solutions will be less over-engineered this time.

  • The thing that drives me nuts about twitter is that the core data rate is only about 30k/second... yet it kept going down. It's easy to spit out a broadcast to a subnet and never even miss a packet if there are only 100 of them per second or so. There's no reason on god's green earth that twitter should be anywhere near overloaded.

    Bad architecture, on the other hand, is the work of Satan. ;-)

  • Thank you! I've been trying to put my thumb on why queues are so interesting for months; this expresses it perfectly.

  • Usually I would have something serious to say in agreement with you, because I do so much agree with you.

    But I have just one comment:


  • Good suggestions. For social services like Twitter, I would also add one more item:

    Prioritize by Relationship

    For example, two-way Twitter relationships (mutual follows, or a recent @ reply or direct message exchange) should be refreshed before one-way ones. One can go further by placing higher priority on users the poster has sent messages to, or received messages from, within the past X hours.

  • er... eventually consistent social graphs anyone?

  • I agree with everything you've said. Especially the last part, duplicating data in the format it will be retrieved in rather than using complicated and CPU-intensive SQL queries. This is especially true for any sort of statistics or reporting. I learned this by watching my website's statistics grow slower and slower to retrieve as more and more traffic made the database larger and larger, until queries that once ran nearly instantly were taking several seconds to return, even with good indexing.

  • Web apps doing things while the user waits unnecessarily is an old phenomenon. I think it's often a matter of developers not wanting to (and/or being politically unable to) venture into what they consider the sysadmin's domain. Take the way-too-common case of apps that make the client wait while they do housekeeping. Why isn't this in a cron job? One reason is maybe this is KewlOSSBlogWikiPackage and it's simpler to say "just untar the package under htdocs and you're done" instead of saying "also, unpack these scripts in a non-servable area and set them up to run hourly, but not all at the same time; stagger them a little. And run them with the same UID your web server is running as". But we end up with a lot of apps that (badly) reimplement basic tools their OS ships with in the first place.

  • I wonder if you're still setting the bar too high for low-priority connections. I mean, microblogging isn't really messaging, and maybe isn't (shouldn't-be?) conversation.

    So why wouldn't 10-15min be good enough?

    What % of "messages" are read instantly after they hit an inbox?

  • I had to solve a similar problem. I needed the fastest possible response, so I had to rule out interacting with the database directly from the web app. I used the PHP message queue Dropr to defer all DB work. It is very fast, easily over 1000 messages/second.

  • Wow those comments make me feel old. I remember when these java queues everyone's referring to reminded me of how I had a VAX dedicated to queuing and scheduling batch jobs for a Cray.

  • Handling load is probably one of the biggest problems facing websites today. Queueing is definitely the way to go, but like you said, sites need the type of architecture where it's easy to deploy services to different machines. Usually by the time the site is under load, it's too late...

  • I use Inotify as my queue messaging system


    Inotify can wait on MOVED_TO or CLOSE_WRITE events so that you can add them to the queue when the upload has finished.

    It should also be noted that this is a method of load balancing too. Instead of 1000 thumbnails being produced in parallel, all context switching away, you can determine how many processes get spawned, use the OS's resource management features, etc.