Trigrams are a special case of the n-gram, where n is 3. They are often used in natural language processing for statistical analysis of texts.
The 16 most common character-level trigrams in English are:[1]
Rank | Trigram |
---|---|
1 | the |
2 | and |
3 | tha |
4 | ent |
5 | ing |
6 | ion |
7 | tio |
8 | for |
9 | nde |
10 | has |
11 | nce |
12 | edt |
13 | tis |
14 | oft |
15 | sth |
16 | men |
The sentence "the quick red fox jumps over the lazy brown dog" has the following word-level trigrams:
"the quick red", "quick red fox", "red fox jumps", "fox jumps over", "jumps over the", "over the lazy", "the lazy brown", "lazy brown dog"
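The word-level trigrams of a sentence can be obtained by sliding a window of three words across it. The following Python sketch illustrates this; the function name `word_trigrams` is hypothetical, and splitting on whitespace is a simplifying assumption (real tokenizers also handle punctuation and case).

```python
def word_trigrams(sentence):
    """Return every run of three consecutive words as a tuple."""
    # Assumption: words are separated by whitespace only.
    words = sentence.split()
    # A sentence of n words yields n - 2 trigrams (none if n < 3).
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


trigrams = word_trigrams("the quick red fox jumps over the lazy brown dog")
for trigram in trigrams:
    print(" ".join(trigram))
```

For the ten-word sentence above this produces eight trigrams, from "the quick red" to "lazy brown dog".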
And the word-level trigram "the quick red" has the following character-level trigrams (where an underscore "_" marks a space):
the, he_, e_q, _qu, qui, uic, ick, ck_, k_r, _re, red
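Character-level trigrams follow the same sliding-window idea, applied to characters instead of words. The sketch below is illustrative only: the names `char_trigrams` and `space_marker` are hypothetical, and it uses the underscore convention from the example above. Counting the results with `collections.Counter` gives the kind of frequency statistics that a ranking of common trigrams is built from, although a single sentence is far too small a sample to reproduce such a ranking.

```python
from collections import Counter


def char_trigrams(text, space_marker="_"):
    """Return all overlapping three-character substrings of text."""
    # Replace spaces with a visible marker, as in the example above.
    marked = text.replace(" ", space_marker)
    # A string of n characters yields n - 2 trigrams (none if n < 3).
    return [marked[i:i + 3] for i in range(len(marked) - 2)]


grams = char_trigrams("the quick red")
print(grams)

# Count how often each trigram occurs in a longer text.
counts = Counter(char_trigrams("the quick red fox jumps over the lazy brown dog"))
print(counts["the"])
```

The first call returns the eleven trigrams listed above, in the order in which they occur in the text. In the full sentence the trigram "the" occurs twice.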