Recently, I wrote about this question at length in an article for MEI:

It struck me, though, that the method the question guides you through is not the only, and perhaps not even the easiest, method.
So here are my further thoughts.
Often, problems like these are couched in terms of numbers rather than probabilities, so we might ask:
“In how many ways can \(n\) cards be shuffled so that no card ends up in its original position?”
or
“If \(n\) people throw their hats in the air, and each hat comes down on a random head (one hat per head), what is the probability that no head ends up with the correct hat?”
or even
“How many permutations of \(n\) distinct objects leave no object in its original position?”
This is the question we are going to tackle: “If people throw their hats in the air, and each hat, miraculously, comes down on a random head (one hat per head), what is the probability that no head ends up with the correct hat?”
What is wrong with the following argument:
If they all land on the wrong head, the first hat must land on one of \(n-1\) heads. The second hat can then land on one of \(n-2\) heads (neither the second head nor the one that the first hat landed on). And so on, so the probability of no hats landing on the right head is:
\(\frac{1}{n-1}\times \frac{1}{n-2}\times \frac{1}{n-3}\times \dots \, \frac{1}{2}\times\frac{1}{1}=\frac{1}{(n-1)!}\)
It’s easy to see what’s wrong with this: if the first hat lands on the second head, then the second hat still has \(n-1\) heads it could land on. Only if the first hat lands on any other head will it have \(n-2\) options.
What about this argument:
If they all land on the wrong head, the first hat must land on one of \(n-1\) heads. Call this head number 2. Then hat number 2 can land on any of \(n-1\) heads. Call this head number 3. Hat number 3 can then land on any head except number 2 or number 3, so it can land on \(n-2\) possible heads. Call this head number 4. And so on, so that the probability of no hats landing on the correct head is
\(\frac{1}{n-1}\times \frac{1}{n-1}\times \frac{1}{n-2}\times \frac{1}{n-3}\dots \, \frac{1}{2}\times\frac{1}{1}=\frac{1}{(n-1)!(n-1)}\)
The problem here is that hat number 2 might land on head number 1, so we can’t call it head number 3, which the argument tries to do.
Firstly, I’m going to count the number of ways something can happen rather than the probability of its happening. Since there are \(n!\) possible ways that hats can fall on heads, then we can convert between counting and probabilities by multiplying or dividing by \(n!\).
Now we want to see how many ways there are in which no one gets the right hat. If there are \(n\) people, we’ll call this number \(D_n\), the number of derangements of \(n\) objects.
Imagine you are the first person, and you get the second person’s hat. If the second person gets your hat, how many other people must still get the wrong hat?
What if your hat doesn't land on the second person?
If they get your hat, then the other \(n-2\) people all have to get the wrong hat, which can happen in \(D_{n-2}\) ways.
To get an arrangement where the second person does not get your hat, imagine, temporarily, the second person wearing your hat, and have everyone apart from you throw their hats in the air. Then everyone else gets the wrong hat in \(D_{n-1}\) ways.
In how many permutations do you get the second person’s hat AND everyone get the wrong hat?
Person 2 gets your hat in \(D_{n-2}\) ways, but does not get your hat in \(D_{n-1}\) ways, making \(D_{n-1}+D_{n-2}\) ways in total
In how many permutations do you get the third person’s hat AND everyone get the wrong hat?
\(D_{n-1}+D_{n-2}\)
In how many permutations does everyone get the wrong hat?
\(D_n=(n-1)\left(D_{n-1}+D_{n-2}\right)\) because you can get one of \(n-1\) hats.
This gives us a recurrence relation, but only when \(n>2\) (or maybe when \(n\ge 2\), as we’ll see).
What are \(D_0\) and \(D_1\)?
\(D_1=0\quad D_2=1\)
Use these results and the recurrence relation to find the first 7 or so terms of the sequence \(D_n\).
\(\begin{align*}
D_1&=0\\[4pt]
D_2&=1(D_1+D_0)=1\quad \left[\Rightarrow D_0=1\right]\\[4pt]
D_3&=2(D_2+D_1)=2\\[4pt]
D_4&=3(D_3+D_2)=9\\[4pt]
D_5&=4(D_4+D_3)=44
\end{align*}\)
and so on, to get the sequence
\(1,\;0,\;1,\;2,\;9,\;44,\;265,\,1854,\,14833\dots\)
Here’s another way of looking at it.
Let’s take \(n=5\) to make things a bit more definite.
Either no hat is on the right head, or 1 hat is on the right head, or 2, or 3, or 4, or 5. And the total number of ways that any of these happens is \(5!\).
In how many ways can exactly one hat land on the right head?
Suppose exactly one hat is on the right head. If this is hat A, then the other 4 must be on the wrong heads, which happens in \(D_4\) ways. The same is true if B is the only hat on the right head, or C, or D, or E.
So the total number of ways that exactly one hat lands on the right head is \(5\times D_4\).
In how many ways can exactly two hats land on the right head?
For exactly two good hats, we need to choose those two in one of \(\binom{5}{2}\) ways. The three bad hats must then be arranged in one of \(D_3\) ways. So the number of ways that exactly two hats land on the right heads is \(\binom{5}{2}\times D_3\). And so on.
So
\(5!=D_5+5D_4+10D_3+10D_2+5D_1+D_0\)
Generalise this to any positive integer \(n\).
\(n!=\binom{n}{n}D_0+\binom{n}{n-1}D_1+\binom{n}{n-2}D_2+\;\dots\;\binom{n}{0}D_n\)
which of course is the same as
\(n!=\binom{n}{0}D_0+\binom{n}{1}D_1+\binom{n}{2}D_2+\;\dots\;\binom{n}{n}D_n\)
Use this formula to find the first 7 terms of the sequence .
\(\begin{align*} &D_0=0!=1\\[4pt] &D_1=1!-D_0=0\\[4pt] &D_2=2!-2D_1-D_0=1\\[4pt] &D_3=3!-3D_2-3D_1-D_0=6-3-0-1=2\\[4pt] &D_4=4!-4D_3-6D_2-4D_1-D_0=24-8-6-0-1=9\\[4pt] &D_5=5!-5D_4-10D_3-10D_2-5D_1-D_0=44\\[4pt] &D_6=6!-6D_5-15D_4-20D_3-15D_2-6D_1-D_0=265 \end{align*}\)
So far, we have found two different recurrence relations for the sequence \(D_n\):
\(D_n=(n-1)\left(D_{n-1}+D_{n-2}\right)\)
and
\(\displaystyle D_n=n!-\sum_{r=0}^{n-1}\binom{n}{r}D_r\)
These give us perfectly reasonable ways to calculate \(D_n\) for any \(n\), though the first is obviously far easier to use than the second.
Compare the sequences
\(1D_0,\;2D_1, \; 3D_2, \; 4D_3,\; 5D_4,\,\dots\) and
\(D_0,\;D_1,\;D_2,\;D_3,\;D_4,\;D_5\,\dots\)
These are:
\(\begin{matrix}
1,&0,&3,&8,&45,&264,&\dots\\
1,&0,&1,&2,&9,&44,&265&\dots
\end{matrix}\)
Use this to suggest a relationship between \(D_n\) and \(D_{n-1}\)
\(D_n=nD_{n-1}+(-1)^n\)
Why could that be? Well, it would be nice to come up with a way of counting derangements that explained this. It turns out, though, that this is harder than it sounds. Fortunately, we can at least derive this first-order recurrence from the first second-order recurrence
\(D_n=(n-1)\left(D_{n-1}+D_{n-2}\right)\):
The original recurrence tells us that \(D_6=5D_5+5D_4\), which means that
\(\begin{align*}
D_6-6D_5&=5D_5+5D_4-6D_5\\[4pt]
&=5D_4-D_5\\
\end{align*}\)
Also, \(D_5=4D_4+4D_3\), which means that
\(\begin{align*}
5D_4-D_5&=5D_4-4D_4-4D_3\\[4pt]
&=D_4-4D_3\\
\end{align*}\)
and so on:
\(\begin{align*}
D_6-6D_5&=5D_4-D_5\\[4pt]
&=D_4-4D_3\\[4pt]
&=3D_2-D_3\\[4pt]
&=D_2-2D_1\\[4pt]
&=1
\end{align*}\)
It’s easy now to see that:
\(\begin{align*}
D_6-6D_5&=1\\[4pt]
D_5-5D_4&=-1\\[4pt]
D_4-4D_3&=1\\[4pt]
D_3-3D_2&=-1
\end{align*}\)
and in fact that:
\(D_n-nD_{n-1}=1\) when \(n\) is even, and
\(D_n-nD_{n-1}=-1\) when \(n\) is odd.
This we can now easily extend to see that
\(D_n-nD_{n-1}=(-1)^n\)
This is really a proof by induction in disguise. In fact, we may as well write out the proof by induction in a formal way:
Suppose \(D_0=1,\;D_1=0,\;D_n=(n-1)\left(D_{n-1}+D_{n-2} \right)\) when \(n>2\).
We want to prove that \(D_{n}=nD_{n-1}+(-1)^n\) when \(n\ge 1\).
Clearly this is true when \(n=1\), because \(D_1=1\times D_0+(-1)^1=0\).
Now suppose it’s true when \(n=k\), which means that
\(D_{k}=kD_{k-1}+(-1)^k\) and therefore \(kD_{k-1}=D_k-(-1)^k\)
Then
\(\begin{align*}
D_{k+1}&=k\left(D_k+D_{k-1}\right)\\[4pt]
&=kD_k+kD_{k-1}\\[4pt]
&=kD_k+D_k-(-1)^k\\[4pt]
&=(k+1)D_k+(-1)^{k+1}
\end{align*}\)
Therefore it’s true when \(n=k+1\), and the induction is complete.
Look at the result we are asked to prove in the original question. A striking feature is, of course, the alternating pluses and minuses. Now look a this new recurrence, which also features alternating pluses and minuses. Could it be that we can get the result from the recurrence?
Yes, it could!
\(\begin{align*}
D_6&=6D_5+1\\[4pt]
&=6(5D_4-1)+1\\[4pt]
&=6\times 5D_4-6+1\\[4pt]
&=6\times 5\times (4D_3+1)-6+1\\[4pt]
&=6\times 5\times 4D_3+6\times 5-6+1\\[4pt]
&=6\times 5\times 4(3D_2-1)+6\times 5-6+1\\[4pt]
&=6\times 5\times 4\times 3D_2-6\times 5\times 4+6\times 5-6+1\\[4pt]
&=6\times 5\times 4\times 3-6\times 5\times 4+6\times 5-6+1\\[4pt]
&=\frac{6!}{2!}-\frac{6!}{3!}+\frac{6!}{4!}-\frac{6!}{5!}+\frac{6!}{6!}
\end{align*}\)
which easily extends to any \(n\).
Again, this is a proof by induction in disguise, but I’ll leave this one to you to write out formally.
Well, maybe I was overstating the case when I said earlier that the use of the Principle of Inclusion and Exclusion might not be the easiest way to get the result. It turns out that this recurrence method has its own significant complications. Interesting, though, and you could so easily have missed all this!