Queue everything and delight everyone

This is a blog post I’ve had simmering in my brainmeats for well over a year or two. I’m suddenly inspired to break blog-radio-silence and get it out of my head.

From Let the microblogs bloom - RussellBeattie.com:

Once this is widely accepted (and I’m sure there are many that would argue with me), the thing that will separate these types of services won’t be whether they stay up (ala Twitter), but how fast your subscription messages are updated. Some services might be smaller or offer more features but not update as quickly whereas others will pride themselves on being as close to real-time as possible. The key is that it’s all about messaging, not publishing. (Oh, and this also facilitates federation as well, but that’s another topic).

See also: Rearchitecting Twitter: Brought to You By the 17th Letter of the Alphabet - random($foo)

One problem most modern web apps seem to face is the tendency to do everything at once, all in the same code that responds directly to a user. While you’re in there building a user interface, it’s easy to implement everything else that needs to happen in that same UI module or library.

Someone wants to post a bookmark? Someone wants to post a message? Well, of course you want the system to cross-reference and deliver that new piece of User Generated Content through every permutation of tag, recipient, keyword, and notification channel supported by your system.

But, do you really have to do everything all at once—while the person who generated that content is tapping his or her foot, waiting for the web interface to respond with feedback? Are all of these things immediately vital to the person watching the browser spin, right now?

No. Your user wants to get on with things. He or she wants to see the submitted content get accepted and, as feedback and confirmation, see it reflected in a personal view immediately. Does it matter—to this person, at this moment—whether it shows up simultaneously in a friend’s inbox, the public timeline, a global tag page, or even an RSS or Atom feed?

Again, no, simultaneity doesn’t really matter—because no human beings actually appreciate it. Instead, imagine a ripple effect of concentric social and attention contexts with associated people spreading out from the original submission. (This probably rates the creation of a diagram someday.)

  • To make the person who’s submitting something happy, offer feedback visible in their own personal context within 50-200 milliseconds. (That is, well under half a second at worst, in people terms.)

  • The next person to delight is someone following the first person’s published content. Humanly speaking, delays of thousands of milliseconds can be acceptable here. (That is, 1-10 seconds at worst, in people terms.)

  • Finally, you can start worrying about strangers, allowing the content to propagate to tag pages, keyword tracking pages, and other public views. I’d assert that delays of a hundred thousand milliseconds or so are acceptable here. (That is, 1-2 minutes at worst, in people terms.)

The idea here is that the social structure can help you scale, while still delighting people. Even with these delays, the system is still better at getting the word out than the original content creator would be at notifying all the others involved with an out-of-band system like IM or email. And that’s at worst—on most good days, all the delays should tend to be on the order of seconds or less.
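As a rough sketch of those three tiers, here is one way to express the delay budgets as data and let them decide the order of delivery work. The tier names, the `DeadlineQueue` class, and the exact budget values are my own assumptions for illustration, taken from the worst-case numbers above; nothing here is tied to a particular queueing product.

```python
import heapq
import itertools

# Hypothetical delay budgets in seconds, one per tier described above.
DELAY_BUDGET = {"self": 0.2, "followers": 10.0, "public": 120.0}


class DeadlineQueue:
    """Orders delivery jobs by the time each one should be finished by."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def push(self, tier, payload, submitted_at):
        # A job's deadline is its submission time plus its tier's budget.
        deadline = submitted_at + DELAY_BUDGET[tier]
        heapq.heappush(self._heap, (deadline, next(self._counter), tier, payload))

    def pop(self):
        # Always hand back the job whose deadline comes soonest.
        deadline, _, tier, payload = heapq.heappop(self._heap)
        return tier, payload, deadline

    def __len__(self):
        return len(self._heap)
```

Ordering by deadline rather than by a fixed tier rank means a public-view job that has been waiting a long time eventually gets ahead of fresh follower deliveries, so the strangers are delayed but never starved.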

And how do you do all of this? Use queues. Sure, the original submission of content can and should be done all at once—just enough to get the content into the user’s collection. Then, queue a job for further processing and get out of the way. In fact, just queue one job from the user interface—the processor of that queue can then queue further jobs for all the other individual processing tasks that are likely susceptible to plenty of parallel processing and horizontal scaling.
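Here is a minimal, in-process sketch of that shape: the UI handler stores the post and queues exactly one job, and the processor of that job queues the rest. The `deque` is only a stand-in for a real message queue, and every name here (`submit`, `drain`, the task names) is hypothetical. A real deployment would run the worker loop in separate processes or machines rather than in the web request.

```python
from collections import deque

jobs = deque()        # stand-in for a real message queue
personal_view = {}    # user -> list of posts, written synchronously
log = []              # records which background tasks ran, for illustration


def submit(user, text):
    """Called from the web UI: store the post, queue ONE job, return at once."""
    personal_view.setdefault(user, []).append(text)
    jobs.append(("new_post", {"user": user, "text": text}))
    return "thanks"


def handle_new_post(payload):
    """Fan out: queue one job per independent downstream task."""
    for task in ("deliver_to_followers", "update_tag_pages", "rebuild_feed"):
        jobs.append((task, payload))


def handle_task(name):
    """Build a placeholder handler that just records that the task ran."""
    def run(payload):
        log.append((name, payload["user"]))
    return run


HANDLERS = {
    "new_post": handle_new_post,
    "deliver_to_followers": handle_task("deliver_to_followers"),
    "update_tag_pages": handle_task("update_tag_pages"),
    "rebuild_feed": handle_task("rebuild_feed"),
}


def drain():
    """Worker loop: take jobs off the queue until none are left."""
    while jobs:
        name, payload = jobs.popleft()
        HANDLERS[name](payload)
```

Calling `submit("alice", "hi")` returns immediately with the post already in Alice’s personal view and a single job waiting; none of the downstream work happens until a worker calls `drain()`.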

Meanwhile, the original user creating content has been thanked for their submission and life goes on. In fact, their life may include going on to submit many more pieces of content in rapid succession, thanks to your delightfully responsive web user interface.

And, in the end, that’s really the purpose of a web-based content creation interface—accepting something as quickly as possible to make the user happy enough to continue submitting more. The other part of the user interface, retrieval, serves simply to get the original content distributed as fast as can be reasonably expected.

Now, preparing for fast retrieval is another story. The flip side to processing queues is message inboxes: expect content duplicated everywhere and fetched simply, rather than using cleverly expressed SQL joins that bring a system to its knees. But, that’s another post altogether. :)
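Until that post exists, here is a tiny sketch of what I mean by inboxes, assuming in-memory dictionaries in place of whatever storage you actually use. The function names and data layout are made up for illustration. The point is that a background job copies each post into every follower’s inbox at write time, so reading is a plain lookup.

```python
from collections import defaultdict

followers = defaultdict(set)   # author -> users following them
inboxes = defaultdict(list)    # user -> posts, stored in arrival order


def follow(user, author):
    followers[author].add(user)


def deliver(author, text):
    """Background job: copy the post into every follower's inbox."""
    post = {"author": author, "text": text}
    for user in followers[author]:
        inboxes[user].append(post)


def read_inbox(user, limit=20):
    """Retrieval is a simple slice, newest first; no joins at read time."""
    return inboxes[user][-limit:][::-1]
```

The cost is duplicated data and more work per write, which is exactly the kind of work a queue processor can do at its own pace.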

12 Comments

  1. Posted July 4, 2008 at 4:12 pm | Permalink

    Interesting bits of conversation happening all over the place regarding queues and the irony of it all - the much maligned Java has had workqueues since the early days.

    What everyone will learn, rather painfully in cases like Twitter, is that all data is not created, consumed or processed equally. If you write a system that treats all data equally you’ll wind up with many Twitters all over the place.

  2. Posted July 4, 2008 at 4:37 pm | Permalink

    Oh, definitely. Work queues are not a new thing at all. It’s just that I think there’re a lot of modern web app builders who skipped Java “enterprise” software—skipped, or hoped to run away—and are rediscovering the whole set of problems. Maybe the solutions will be less over-engineered this time.

  3. Posted July 4, 2008 at 4:40 pm | Permalink

    The thing that drives me nuts about twitter is that the core data rate is only about 30k/second… yet it kept going down. It’s easy to spit out a broadcast to a subnet and never even miss a packet if there are only 100 of them per second or so. There’s no reason on god’s green earth that twitter should be anywhere near overloaded.

    Bad architecture, on the other hand, is the work of Satan. ;-)

  4. Posted July 4, 2008 at 4:40 pm | Permalink

    Thank you! I’ve been trying to put my thumb on why queues are so interesting for months; this expresses it perfectly.

  5. Posted July 4, 2008 at 5:54 pm | Permalink

    Usually I would have something serious to say in agreement with you, because I do so much agree with you.

    But I have just one comment:

    DUH!!!

  6. Posted July 6, 2008 at 12:16 am | Permalink

    Good suggestions. For social services like Twitter, I would also add one more item:

    Prioritize by Relationship

    For example, two-way Twitter relationships (mutual-follow or recent @ or direct message exchange) should be refreshed before one-way ones. One can go further by placing higher priority on users whom the poster sent messages to or received messages from within the past X hours.

  7. Posted July 10, 2008 at 7:35 am | Permalink

    er… eventually consistent social graphs anyone?

  8. Posted July 10, 2008 at 8:36 am | Permalink

    I agree with everything you’ve said. Especially the last part, duplicating data in the format it will be retrieved in rather than using complicated and CPU-intensive SQL queries. This is especially true for any sort of statistics or reporting. I learned this by watching my website’s statistics grow slower and slower to retrieve as more and more traffic made the database larger and larger. All of a sudden, queries that used to run nearly instantly, even with good indexing, were taking several seconds to return.

  9. citric
    Posted July 10, 2008 at 4:49 pm | Permalink

    Web apps making the user wait unnecessarily while they do things is an old phenomenon. I think it’s often a matter of developers not wanting to (and/or being politically unable to) venture into what they consider the sysadmin’s domain. Take the way-too-common case of apps that make the client wait while they do housekeeping. Why isn’t this in a cron job? One reason may be that this is KewlOSSBlogWikiPackage and it’s simpler to say “just untar the package under htdocs and you’re done” instead of saying “also, unpack these scripts in a non-servable area and set them up to run hourly, but not all at the same time; stagger them a little. And run them with the same UID your web server is running as”. But we end up with a lot of apps that (badly) reimplement basic tools their OS ships with in the first place.

  10. Posted July 28, 2008 at 4:23 pm | Permalink

    I wonder if you’re still setting the bar too high for low-priority connections. I mean, microblogging isn’t really messaging, and maybe isn’t (shouldn’t-be?) conversation.

    So why wouldn’t 10-15min be good enough?

    What % of “messages” are read instantly after they hit an inbox?

  11. Posted November 23, 2008 at 5:35 am | Permalink

    I had to solve a similar problem. Needed the fastest possible response, so had to rule out interacting with the database directly from the web app. Used the PHP message queue Dropr to defer all DB work. It is very fast, easily over 1000 messages/second.

  12. jmxz
    Posted February 13, 2009 at 3:00 pm | Permalink

    Wow those comments make me feel old. I remember when these java queues everyone’s referring to reminded me of how I had a VAX dedicated to queuing and scheduling batch jobs for a Cray.

7 Trackbacks

  1. [...] In response to Twitter’s recent issues, many have suggested we need some sort of open, decentralized Twitter-like microblogging network. Dave Winer, for one, has written extensively on the subject. Steve Gillmor, Marc Canter, et al have also discussed some sort of “Plan B”. That conversation has fragmented out over the web and generated some very interesting technical discussions. [...]

  2. By Infovore » links for 2008-07-05 on July 5, 2008 at 6:30 pm

    [...] 0xDECAFBAD » Queue everything and delight everyone “…that’s really the purpose of a web-based content creation interface—accepting something as quickly as possible to make the user happy enough to continue submitting more.” Leslie Orchard on message-queue-based design. (tags: queue messaging development software programming architecture) [...]

  3. By text/plain » Blog Archive » 3,737,844,653 on July 7, 2008 at 2:47 am

    [...] L. M. Orchard writes the following in an otherwise quite sensible article on queue-based architectures for broadcast messaging systems: “Even with these delays, the system is still better at getting the word out than the original content creator would be at notifying all the others involved with an out-of-band system like IM or email.” [...]

  4. [...] “Delight Everyone” post is latest greatest addition to the 17th letter of the alphabet for savior [...]

  5. [...] 0xDECAFBAD » Queue everything and delight everyone interesting blog post on architectures for scalability of microblogging and messaging/publishing. argues for queue systems over sql queries, which is reasonable. consider extensibility of the queue processors as the hook for new apps/innovations. article blog microblogging scalability twitter web [...]

  6. By People Over Process » links for 2008-07-12 on July 12, 2008 at 2:31 am

    [...] Queue everything and delight everyone “[D]o you really have to do everything all at once?” (tags: queue messaging architecture programming twitter) [...]

  7. [...] 0xDECAFBAD » Queue everything and delight everyone Yes. (tags: queues asynchronous userinterface responsiveness decafbad blog) [...]
