Today, user RobB left a comment on my recent blog post, asking this:
I have some doubts on how valid an M/M/m model is in a typical database performance scenario. Taking the example from the wait chapter of your book where you have a 0.49 second (service_time) query that you want to perform in less than a second, 95% percent of the time. The most important point here is the assumption of an exponential distribution for service_time immediately states that about 13% of the queries will take more than 2X(Average Service Time), and going the other way most likely to take 0 seconds. From just this assumption only, it is immediately clear that it is impossible to meet the design criteria without looking at anything else. From your article and link to the Kendall notation, wouldn’t an M/D/m model be more appropriate when looking at something like SQL query response time?? Something like M/M/m seems more suited to queueing at the supermarket, for example, and probably many other ‘human interactive’ scenarios compared to a single sub-component of an IT system.
Here’s my answer, part 1.
First, I believe your observation about the book example is correct. It is correct that if service times are exponentially distributed, then about 13% (13.5335%, more precisely) of those times will be 2S, where S is the mean service time. So in the problem I stated, it would be impossible to achieve subsecond response time in more than about 86% of executions, even if there were no competing workload at all. You’re right: you don't need a complicated model to figure that out. You could get that straight from the CDF of the exponential distribution.
However, I think the end of the example provides significant value, where it demonstrates how to use the M/M/m model to prove that you're not going to be able to meet your design criteria unless you can work the value of S down to .103 seconds or less (Optimizing Oracle Performance Fig 9-26, p277). I’ve seen lots of people argue, “You need to tune that task,” but until M/M/m, I had never seen anyone be able to say what the necessary service time goal was, which of course varies as a function of the anticipated arrival rate. A numerical goal is what you need when you have developers who want to know when they’re finished writing code.
With regard to whether real-life service times are really exponentially distributed, you’ve got me wondering now, myself. If service times are exponentially distributed, then for any mean service time S, there’s a 9.5% probability that a randomly selected service time will be less than .1S (in Mathematica, CDF[ExponentialDistribution[1/s],.1s] is 0.0951626 if 0.1s > 0). I’ve got to admit that at the moment, I’m baffled as to how this kind of distribution would model any real-life service process, human, IT, or otherwise.
On its face, it seems like a distribution that prohibits service times smaller than a certain minimum value would be a better model (or perhaps, as you suggest, even fixed service times, as in M/D/m). I think I’m missing something right now that I used to know, because I remember thinking about this previously.
I have two anecdotal pieces of evidence to consider.
One, nowhere in my library of books dedicated to the application of queueing theory to modeling computer software performance (that’s more than 6,000 pages, over 14 inches of material) does Kleinrock, Allen, Jain, Gunther, Menascé, et al mention an M/D/m queueing system. That’s no proof that M/D/m is not the right answer, but it’s information that implies that an awful lot of thinking has gone into the application of queueing theory to software applications without anyone deciding that M/D/m is important enough to write about.
Two, I’ve used M/M/m before in modeling a trading system for a huge investment management company. The response time predictions that M/M/m produced were spectacularly accurate. We did macro-level testing only, comparing response times predicted by M/M/m to actual response times measured by Tuxedo. We didn’t check to see whether service times were exponentially distributed, because the model results were consistently within 5% of perfect accuracy.
Neither of these is proof, of course, that M/M/m is superior in routine applicability to M/D/m. One question I want to answer is whether an M/D/m system would provide better or worse performance than a similar M/M/m system. My intuition is leaning in favor of believing that the M/M/m system would give better performance. If that’s true, then M/M/m is an optimistic model compared to M/D/m, which means that if a real-life system is M/D/m and an M/M/m model says it’s not going to meet requirements, then it assuredly won’t.
I did find a paper online by G. J. Franx about M/D/m queueing. Maybe that paper contains an R=f(λ,μ) function that I can use to model an M/D/m system, which would enable me to do the comparison. I’ll look into it.
Then there’s the issue of whether M/M/m or M/D/m is a more appropriate model for a given real circumstance. The answer to that is simple: test your service times to see if they’re exponentially distributed. The Perl code in Optimizing Oracle Performance, pages 248–254 will do that for you.