Talk:Mean Opinion Score
From Wikipedia, the free encyclopedia
Don't quantise individual scores
It is implied that individual listeners must report their opinions as integers: 1, 2, 3, 4 or 5. A consequence, visible in MOS values collected from small groups of listeners and listening sessions, is that the results are coarsely quantised. For example, where 8 opinion scores are averaged together, the possible results in the middle range are MOS = 3, 3 1/8, 3 1/4, 3 3/8, 3 1/2, 3 5/8, etc. However, an important use of MOS listening tests is to evaluate differences in MOS, possibly small ones, between different audio processes such as codecs, telephone links, etc. After going to the expense of organising human listening tests, it is unskilful to insist that each listener report only 5 grades (4 steps of difference) between the audio sequences (s)he hears. In my work evaluating the subjective impact of transmission errors in digital telephone links, I have found that individual listeners can usually report their order of preference among 8 or more differently impaired sentences with confidence (i.e. repeatably). I therefore contend that listeners should be encouraged to report their scores to any precision they wish. In practice, steps of 0.1 can be allowed, i.e. 41 possible scores in 40 steps = [1.0 - 1.1 - 1.2 ... 4.9 - 5.0]. Cuddlyable3 (talk) 12:00, 20 February 2008 (UTC)
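To make the quantisation argument above concrete, here is a minimal Python sketch. It assumes the MOS is a plain arithmetic mean of the individual scores; the helper name `mean_grid` is hypothetical and chosen only for illustration. It lists the mean values that a panel of 8 listeners can produce, first with integer scores and then with scores reported in steps of 0.1.

```python
from fractions import Fraction


def mean_grid(lo, hi, step, n_listeners):
    """Return every mean that n_listeners can produce when each one
    reports a score from lo to hi in increments of step.

    Assumption: the MOS is the plain arithmetic mean of the scores.
    The sum of the scores moves in increments of `step`, so the mean
    moves in increments of step / n_listeners.
    """
    lo, hi, step = Fraction(lo), Fraction(hi), Fraction(step)
    steps_per_listener = int((hi - lo) / step)
    count = steps_per_listener * n_listeners + 1
    return [lo + k * step / n_listeners for k in range(count)]


# Integer scores 1..5, 8 listeners: means are spaced 1/8 apart.
integer_means = mean_grid(1, 5, 1, 8)

# Scores in steps of 0.1, 8 listeners: means are spaced 1/80 apart.
fine_means = mean_grid(1, 5, "0.1", 8)

# Middle-range values quoted in the comment above: 3, 3 1/8, ... 3 5/8.
middle = [m for m in integer_means if 3 <= m <= Fraction(29, 8)]
print([str(m) for m in middle])

print(len(integer_means), integer_means[1] - integer_means[0])
print(len(fine_means), fine_means[1] - fine_means[0])
```

With integer scores the 8-listener mean can take only 33 values, spaced 1/8 apart. Allowing steps of 0.1 gives 321 possible values spaced 1/80 (0.0125) apart, so small differences between two processes are less likely to be hidden by the grid itself.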