Tuesday, June 19, 2012

How Many Rooms Should This Hotel Overbook?

The following example is taken from A Second Course in Business Statistics: Regression Analysis (4th edn) by William Mendenhall and Terry Sincich:
Often, travellers who have no intention of showing up fail to cancel their hotel reservations in a timely manner. These travellers are known, in the parlance of the hospitality trade, as "no-shows".

The no-shows for a 500-room hotel for a sample of 30 days are as follows:

18, 16, 16, 16, 14, 18, 16, 18, 14, 19, 15, 19, 9, 20, 10, 10, 12, 14, 18, 12, 14, 14, 17, 12, 18, 13, 15, 13, 15, 19

Based on this sample, what is the minimum number of rooms that the hotel should overbook?

The mean number of no-shows for the sample = 15.133

The standard deviation of no-shows for the sample = 2.945
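As a quick cross-check of the two figures above, here is a minimal sketch that assumes Python 3 and its standard-library statistics module. The variable names are only illustrative.

```python
import statistics

# The 30 daily no-show counts from the example.
noshows = [18, 16, 16, 16, 14, 18, 16, 18, 14, 19,
           15, 19, 9, 20, 10, 10, 12, 14, 18, 12,
           14, 14, 17, 12, 18, 13, 15, 13, 15, 19]

sample_mean = statistics.mean(noshows)   # arithmetic mean
sample_sd = statistics.stdev(noshows)    # sample standard deviation (n - 1 denominator)

print(round(sample_mean, 3))  # 15.133
print(round(sample_sd, 3))    # 2.945
```

Note that statistics.stdev divides by n - 1, which is the sample standard deviation quoted above. statistics.pstdev would divide by n and give a slightly smaller value.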

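When the sample size is 30 or more, as is the case in this example, the distribution of sample means is approximately normal as per the Central Limit Theorem, irrespective of the distribution of the sampled population. Strictly speaking, that theorem describes sample means rather than individual daily counts, so the step below also assumes that the daily no-show counts themselves are roughly mound-shaped. In a normal distribution, about 95% of data points lie within 2 standard deviations of the mean. For our sample,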

mean ± 2 * standard deviation = 15.133 ± 2 * 2.945 = 15.133 ± 5.890


In other words, about 95% of the time the number of no-shows lies between 9.243 and 21.023 (the red region in the figure above). Hence, the hotel can overbook at least 9.243 rooms, or 10 after rounding up, each day and still be highly confident of honouring all reservations.
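To see how the interval behaves on the sample itself, the sketch below counts how many of the 30 days fall inside mean ± 2 standard deviations, and on how many days an overbooking level of 10 would have exceeded the no-shows. It assumes Python 3. The names overbook and short_days are hypothetical, introduced only for this illustration.

```python
import math
import statistics

noshows = [18, 16, 16, 16, 14, 18, 16, 18, 14, 19,
           15, 19, 9, 20, 10, 10, 12, 14, 18, 12,
           14, 14, 17, 12, 18, 13, 15, 13, 15, 19]

mu = statistics.mean(noshows)
sigma = statistics.stdev(noshows)
lower, upper = mu - 2 * sigma, mu + 2 * sigma

# Number and share of sampled days inside mean ± 2 standard deviations.
inside = sum(1 for x in noshows if lower <= x <= upper)
coverage = inside / len(noshows)

# Hypothetical overbooking level: the lower bound rounded up, as in the text.
overbook = math.ceil(lower)

# Sampled days on which there were fewer no-shows than overbooked rooms.
short_days = sum(1 for x in noshows if x < overbook)

print(inside, 'of', len(noshows), 'days inside the interval')  # 29 of 30
print('coverage:', round(coverage, 3))                         # 0.967
print('overbook:', overbook, 'short days:', short_days)        # 10 and 1
```

In this sample, 29 of the 30 days fall inside the interval, and overbooking 10 rooms would have come up one room short on a single day (the day with 9 no-shows).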

Here is my Python script to calculate the mean and standard deviation of the example dataset:
#-------------------------------------------------------------------------------
# By Ram Limbu @ ramlimbu.com
# Copyright 2012 Ram Limbu
# License: GNU GPLv3 http://www.gnu.org/licenses/gpl.html
#-------------------------------------------------------------------------------

import math

def mean(t):
    ''' Returns the mean of the measurements

    args:
        t: list of measurements
    '''
    return float(sum(t)) / len(t)

def var(t):
    ''' Returns sample variance

    args:
        t: list of measurements
    '''
    mu = mean(t)
    # squared deviations from the mean
    devsq = [(x - mu) ** 2 for x in t]
    # divide by n - 1 (not n) to get the sample variance
    sample_var = sum(devsq) / (len(t) - 1)
    return sample_var

def stddev(t):
    ''' Returns the standard deviation of the sample

    args:
        t: list of measurements
    '''
    return math.sqrt(var(t))

def main():
    noshows = [18, 16, 16, 16, 14, 18, 16, 18, 14, 19,
               15, 19, 9, 20, 10, 10, 12, 14, 18, 12,
               14, 14, 17, 12, 18, 13, 15, 13, 15, 19]
    mu = mean(noshows)
    sigma = stddev(noshows)
    print('mean of no-shows is', mu)
    print('sample variance of no-shows is', var(noshows))
    print('standard deviation of no-shows is', sigma)
    print('mean - 2 * standard deviations is', mu - 2 * sigma)
    print('mean + 2 * standard deviations is', mu + 2 * sigma)

if __name__ == '__main__':
    main()
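
The multiplier of 2 used above is a convenient rounding. As a hedged aside, the sketch below assumes that daily no-shows follow a normal distribution with the sample's mean and standard deviation, and uses statistics.NormalDist (standard library, Python 3.8 or later) to show how much of a normal distribution lies within 2 standard deviations and where the exact central 95% band would sit. This is an illustration under that assumption, not part of the textbook's solution.

```python
from statistics import NormalDist, mean, stdev

noshows = [18, 16, 16, 16, 14, 18, 16, 18, 14, 19,
           15, 19, 9, 20, 10, 10, 12, 14, 18, 12,
           14, 14, 17, 12, 18, 13, 15, 13, 15, 19]

mu = mean(noshows)
sigma = stdev(noshows)

# Share of a standard normal distribution within 2 standard deviations of the mean.
std_normal = NormalDist()
two_sd_coverage = std_normal.cdf(2) - std_normal.cdf(-2)

# Assumption: no-shows are normal with the sample mean and standard deviation.
fitted = NormalDist(mu, sigma)
lower_95 = fitted.inv_cdf(0.025)   # 2.5th percentile
upper_95 = fitted.inv_cdf(0.975)   # 97.5th percentile

print('coverage of mean ± 2 sd:', round(two_sd_coverage, 4))  # 0.9545
print('central 95% band:', round(lower_95, 2), 'to', round(upper_95, 2))
```

Under this assumption the 2-standard-deviation band covers slightly more than 95%, so the exact 95% band is a little narrower. Its lower end still rounds up to 10 rooms.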