Mongo query performance

More difficulty today in my efforts to find a scalable way to do the scoring in Exquisite Haiku. I thought that I had this figured out over the weekend when I totally rewrote the models so that all of the data for a given poem – the current round, the set of unique words in play, and the votes for the words – were structured as one, massive, three-levels-of-nesting object. In other words, the only first-class model was Poem, and the other three (Round, Word, and Vote) were all just tacked on as embedded documents.

This results in fantastic benchmarks – since you only have to hit the database once at the start of the process to pop out the master Poem record, you can then just walk downwards on the object and retrieve all of the related word and vote data directly off of the object. So, in other words, what would have required two more “sets” of queries (one to get the words, and then another, for each word, to get the votes) is effectively free, and speeds up the whole process by about an order or magnitude. When everything is chunked out as separate models that just reference each other by ObjectId (the original approach), a benchmark that creates 100 words each with 100 votes runs in about 1000ms; with the single, deeply-nested document, it falls to about 125ms.

After lunch on Monday I ran this idea by Wayne and Eric, and they both immediately pointed out that this opens the door to concurrency problems that might not show up in the benchmark – multiple clients open up the document, make changes, try to commit conflicting versions, etc. This morning I tried to implement a middle ground – completely get rid of the “word” abstraction and push that information onto the vote schema. So, you have poem, round (as a set of embedded documents on poem), and then a collection of votes with ObjectId references to the current round, each with the word to which they pertain, the quantity, and the time that they were applied. This way, we avoid the concurrency problems with the poem super-object, and pare down the total number of queries that have to happen in the scoring process to just two – one findById() to pop out the poem record, and then a costly query for all of the votes in the active round. Then, we score each of the votes, tally up the per-word rank and churn scores, sort, slice, and return.

Unfortunately, this doesn’t really work. It’s faster than the original, totally-decomposed-models approach, but only by a bit. With 100 words, each with 100 votes, it’s still over 1000ms. Wayne pointed out that there would probably be a not-insignificant speed boost when the application is deployed on real hardware with faster disk access. But even if there’s something like a 20-30% speed increase on the query, I really don’t like the idea that a 10,000-vote round would take a full second to score on the server – that means that the slicing interval for a poem of that size would probably have to be at least 4-5 seconds, which is getting pretty glacial. Really, a 500ms slicing interval is the goal, which means that the per-slice scoring can’t take more than about 100ms at the 100-words-100-votes-per-word level.

The frustrating thing is that this speed is algorithmically possible in v8, as evidenced by the really high performance of the implementation that crammed all of the data onto a single document. But the whole thing is shut down by the massive disk I/O cost of the query. As much as I don’t like to admit it after about a month of work, I think Mongo might be the wrong medicine – I need an in-memory store that totally eliminates any touches on the hard disk during the scoring routine. Redis?