The Multiple Birthday Problem

by Louie Beuschlein

We’re interested in the probability that there are two or more matching birthdays in a group of people (at least two distinct dates with multiple birthdays). You may want to try using this Java applet to simulate the problem before or after reading this explanation. Let M = number of matching birthdays. In other words, if we compile a list by asking each person in a group to write down his or her birthday, M is the number of dates repeated at least once. We want to know P(M >= 2).

P(M >= 2) = 1 - P(M < 2) = 1 - P(M = 0 or M = 1 ) = 1 - P(M = 0) - P(M = 1)

We already know how to find P(M = 0) from the original birthday problem: Person 1 obviously has some birthday. In order to be different from person 1, person 2 can have 364 different birthdays out of 365 total possible. Person 3 can have 363 different birthdays out of 365 total possible and be different from both person 1 and person 2. This trend yields the following formula for N people in the group.

P(M = 0) = (364/365) (363/365) (362/365) (361/365) ... (365 - N + 1)/365
= (364) (363) (362) (361) ... (365 - N + 1)/365N-1

Side note: In the original birthday problem, we want to know P(M >= 1). This is 1 - P(M = 0) = 1 - (364) (363) (362) (361) ... (365 - N + 1)/365N-1.

In order to find P(M >= 2), we also need P(M = 1). To develop this let’s suppose there are five people in the room (N = 5). Let’s also suppose that the first four people we ask have different birthdays and that the fifth person matches one of the first one. This is one way of attaining exactly one match. The first person can have 365 different birthdays, the second 364, the third 363, and the fourth 362. But the fifth person can only have one birthday, whatever the first person’s birthday is. So, for these conditions, there are (365) (364) (363) (362) different lists of five birthdays, out of a total of 3655 possible lists. The matching birthdays, however, do not have to be the first and fifth. They could be any pair chosen from the five people. There are "5 choose 2" such pairs, which I’ll write like this: C(5, 2). This suggests that

P(M = 1) = C(N, 2) (365) (364) (363) (362) ... (365 - N + 2) / 365N

This approximate formula yields results that are very similar to those produced by the simulation program written by my student, Luiz Felipe Mendes. Technically, the above formula gives the probability for there being exactly one date in which exactly two people have the same birthday. Let Q = the number of people who have the same who share a common birthday with at least one other person. Then, in the analysis above, Q = 2, and we can write

P(M = 1 and Q = 2) = C(N, 2) (365) (364) (363) (362) ... (365 - N + 2) / 365N

There are other cases to consider when M = 1, namely cases in which Q > 2. The formula above does not account for cases in which three or more birthdays occur on the same day. The analysis above assumed that there were four distinct birthdays among five people. It’s possible, though, to have exactly one date with a match but fewer than four distinct birthdays, e.g., three people with the same birthday and two others that don’t match, such as the following set. {Oct 2, Jan 19, Jan 19, Nov 4, Jan 19} In this set, M = 1 and Q = 3.

Let’s figure the odds of three of our five people having the same birthday but with the remaining two people having birthdays different from each other and different from the matching three. That is, let’s find P(M = 1 and Q = 3). There are C(5, 3) groups of three people. These people can have 365 different birthdays, leaving 364 and 363 birthdays for the remaining two. So the probability of this occurring among N people is

P(M = 1 and Q = 3) = C(N, 3) (365) (364) (363) (362) ... (365 - N + 3) / 365N

To find P(M = 1), we must take into account all the different possible values of Q, from 2 to N. To make sure my notation is clear to you, consider the following sets.

Birthday List
N
M
Q
{May 4, Jun 7, Feb 20, Jun 25, May 4, Apr 10}
6
1
2
{Dec 17, Jul 18, Feb 20, Jul 18, Jul 18, Aug 30, Feb 20}
7
2
5
{Sep 1, Sep 1, Jun 15, Oct 8, Sep 1}
5
1
3

Fortunately, our goal of finding P(M >= 2) does not require any analysis of cases in which M > 1. In general, we can write

P(M = 1 and Q = q) = C(N, q) (365) (364) (363) (362) ... (365 - N + q) / 365N

Since the cases for different values of Q are all mutually exclusive,

P(M = 1) = ∑ P(M = 1 and Q = q)
= ∑ C(N, q) (365) (364) (363) (362) ... (365 - N + q) / 365N
where q begins at 2 and stops at N.

Since P(M >= 2) = 1 - P(M = 0) - P(M = 1), we get

P(M >= 2) = 1 - (365) (364) (363) (362) ... (365 - N + 1)/365N
- ∑ C(N, q) (365) (364) (363) (362) ... (365 - N + q) / 365N
where q begins at 2 and stops at N in the summation.

In practice, replacing the summation on the second line with the approximation given on page one for P(M = 1) makes the calculation much simpler and yields a good approximation, since the odds of there being three or more birthdays on the same date are usually pretty small.

Go to the Multiple Birthdays Java Applet to simulate what is happening in this problem.


Last Revised: 6/18/02

Send comments to mste@uiuc.edu