We derive this decomposition as follows, starting with the sample variance S computed by the usual formula:
Replacing the mean vector by the sum which defines it,
Adding an appropriate form of zero,
which is
Notice that the second sum is exactly S, so finally
or
where
is the total number of distinct pairs of data positions, of which
there are
.
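
The steps above can be written out as a worked sketch. The notation here is an assumption, since the original symbols are not shown: n data vectors x_1, ..., x_n, their mean \bar{x}, the sample variance S in its matrix form, and N for the number of distinct pairs. With the terms arranged this way, the second double sum reduces to S up to sign, which is what allows S to be collected on one side.

```latex
% Assumed notation: x_1,...,x_n are the data vectors,
% \bar{x} = (1/n) \sum_j x_j is the mean vector, N = n(n-1)/2.
\begin{align*}
S &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^{T} \\
% Replace the first \bar{x} by its defining sum:
% x_i - \bar{x} = (1/n) \sum_j (x_i - x_j)
  &= \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)(x_i-\bar{x})^{T} \\
% Add zero in the second factor:
% x_i - \bar{x} = (x_i - x_j) + (x_j - \bar{x})
  &= \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)(x_i-x_j)^{T}
   + \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)(x_j-\bar{x})^{T} \\
% In the second term, \sum_i (x_i - x_j) = n(\bar{x} - x_j)
  &= \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)(x_i-x_j)^{T}
   - \frac{1}{n-1}\sum_{j=1}^{n}(x_j-\bar{x})(x_j-\bar{x})^{T} \\
  &= \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)(x_i-x_j)^{T} - S .
\end{align*}
% Collect S on the left; the i = j terms vanish and each unordered
% pair appears twice in the double sum.
\begin{align*}
S &= \frac{1}{2n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)(x_i-x_j)^{T}
   = \frac{1}{2N}\sum_{i<j}(x_i-x_j)(x_i-x_j)^{T},
\qquad N=\frac{n(n-1)}{2}.
\end{align*}
```

As a quick check in the scalar case with n = 2, x_1 = 0 and x_2 = 2: the usual formula gives S = (1 + 1)/1 = 2, and the pairwise form gives (2 - 0)^2 / (2 * 1) = 2.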