Definition (Index of Coincidence)
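Let $N \ge 2$ be the length of some text and, for each letter $\ell$, let $n_\ell$ be the number of times $\ell$ occurs in this text. The index of coincidence of the text, denoted $\mathrm{IC}$, is the number
$$\mathrm{IC} = \frac{\sum_{\ell} n_\ell (n_\ell - 1)}{N(N-1)}.$$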
We can use this to break the Vigenère cipher.
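As a rough illustration, the index of coincidence (the sum of $n_\ell(n_\ell - 1)$ over all letters, divided by $N(N-1)$) can be computed in a few lines of Python. The function names `index_of_coincidence` and `average_coset_ic` are hypothetical, and restricting to the letters A-Z while ignoring case is an assumption of this sketch. The second function shows one common way the statistic is applied to the Vigenère cipher: for a guessed key length, split the ciphertext into every k-th letter and average the IC of the pieces. The notes do not spell out their own method, so this is only a sketch of that general idea.

```python
from collections import Counter
import string


def index_of_coincidence(text: str) -> float:
    """IC of the letters A-Z in `text` (case-insensitive, other characters ignored)."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    n = len(letters)
    if n < 2:
        raise ValueError("the index of coincidence needs at least two letters")
    counts = Counter(letters)
    # Sum of n_l * (n_l - 1) over all letters, divided by N * (N - 1).
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def average_coset_ic(ciphertext: str, key_length: int) -> float:
    """Average IC of the `key_length` subsequences formed by taking every
    key_length-th letter (assumed approach for guessing a Vigenere key length)."""
    letters = [c for c in ciphertext.upper() if c in string.ascii_uppercase]
    cosets = ["".join(letters[i::key_length]) for i in range(key_length)]
    return sum(index_of_coincidence(c) for c in cosets) / key_length
```

A guessed key length whose average is noticeably higher than the others is a plausible candidate, because each subsequence is then shifted by a single key letter and keeps the letter statistics of the plaintext.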
Lemma (Largest Possible IC)
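The largest possible $\mathrm{IC}$ is $1$. This occurs if $n_\ell = N$ for some letter $\ell$, but $n_{\ell'} = 0$ for any $\ell' \neq \ell$, where $N$ is the length of the text and $n_\ell$ is the number of occurrences of $\ell$; the numerator of the $\mathrm{IC}$ is then $N(N-1)$. Such a text must be monographic (composed entirely of a single letter).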
Theorem (Sufficient Length for Text)
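Let $N$ be the length of some text and, for each letter $\ell$, let $n_\ell$ be the number of times $\ell$ occurs in this text. Let
$$p_\ell = \frac{n_\ell}{N},$$
and if $N$ is very large, then
$$\mathrm{IC} \approx \sum_{\ell} p_\ell^2.$$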
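Proof: It suffices to show that if $N$ is very large, then
$$\frac{n_\ell (n_\ell - 1)}{N(N-1)} \approx \left(\frac{n_\ell}{N}\right)^2$$
for each letter $\ell$, but this is immediate: replacing $n_\ell - 1$ by $n_\ell$ and $N - 1$ by $N$ changes each term only negligibly when $N$ is large.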
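One way to make this step explicit is the following exact computation, writing $p_\ell = n_\ell / N$ for the frequency of the letter $\ell$ in a text of length $N$ (this notation is assumed here). For each letter,
$$\frac{n_\ell (n_\ell - 1)}{N(N-1)} - p_\ell^2 = \frac{p_\ell (N p_\ell - 1) - p_\ell^2 (N - 1)}{N - 1} = -\frac{p_\ell (1 - p_\ell)}{N - 1}.$$
Summing over all letters and using $\sum_{\ell} p_\ell = 1$ gives
$$\mathrm{IC} = \sum_{\ell} p_\ell^2 - \frac{1 - \sum_{\ell} p_\ell^2}{N - 1},$$
so the approximation error is at most $\frac{1}{N-1}$ in absolute value and vanishes as $N \to \infty$.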
Theorem (Equal Occurrence)
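If $n_\ell = n_{\ell'}$ for any two letters $\ell, \ell'$ of a $26$-letter alphabet, so that every letter occurs $N/26$ times in a text of length $N$, then
$$\mathrm{IC} = \frac{N - 26}{26(N - 1)},$$
and hence $\mathrm{IC} \to \frac{1}{26} \approx 0.0385$ as $N \to \infty$.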
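The equal-occurrence case can be checked numerically. The sketch below assumes the 26-letter alphabet A-Z and uses exact rational arithmetic; `index_of_coincidence` and `uniform_text` are hypothetical names introduced only for this illustration. It builds texts in which every letter occurs the same number of times and prints their IC, which should move toward $1/26$ as the text grows.

```python
from collections import Counter
from fractions import Fraction
import string


def index_of_coincidence(text: str) -> Fraction:
    """Exact IC of the letters A-Z in `text` (case-insensitive)."""
    letters = [c for c in text.upper() if c in string.ascii_uppercase]
    n = len(letters)
    if n < 2:
        raise ValueError("the index of coincidence needs at least two letters")
    counts = Counter(letters)
    return Fraction(sum(k * (k - 1) for k in counts.values()), n * (n - 1))


def uniform_text(repeats: int) -> str:
    """Hypothetical helper: a text in which each letter A-Z occurs `repeats` times."""
    return string.ascii_uppercase * repeats


if __name__ == "__main__":
    for repeats in (1, 10, 1000):
        ic = index_of_coincidence(uniform_text(repeats))
        # Print the text length N and the IC as a decimal for comparison with 1/26.
        print(26 * repeats, float(ic), 1 / 26)
```

Using `Fraction` keeps the values exact, so they can be compared directly with a closed-form expression without rounding concerns.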
Heuristic (English IC)
The IC of long English plaintext
- typically around
- almost always between and .