Talk:Speech synthesis

Speech synthesis is within the scope of WikiProject Robotics, an attempt to standardise coverage of Robotics. If you would like to participate, you can edit the article attached to this notice, or visit the project page, where you can join the project or contribute to the discussion.
This article has been rated as A-Class on the Project's quality scale.
(If you rated the article please give a short summary at comments to explain the ratings and/or to identify the strengths and weaknesses.)
This article has been rated as mid-importance on the importance scale.

This is the talk page for discussing improvements to the Speech synthesis article.

Speech synthesis is a former featured article. Please see the links under Article milestones below for its original nomination page (for older articles, check the nomination archive) and why it was removed.
This article appeared on Wikipedia's Main Page as Today's featured article on June 3, 2004.
This Engtech article has been selected for Version 0.5 and subsequent release versions of Wikipedia. It has been rated A-Class on the assessment scale (comments).

Older comments

I think perhaps the different synthesis techniques are long enough to warrant their own pages Nohat 05:20 19 Jun 2003 (UTC)

Could somebody expand on 'Formant synthesis', both on the technical side and the terminology? Is any system that applies filtering techniques to basic waveforms plus noise considered 'formant' synthesis? Is it a specific technique, or just a general term for synthesising phenomena?

Trillium Sound Research Inc (now defunct) offered unlimited vocabulary articulatory speech synthesis on the NeXT computer in 1994, so it is not accurate to say that articulatory speech synthesis is only of academic interest and not far enough advanced for commercial application. It was NeXT Computer that failed, not the synthesis, which was rated the best synthesis available at the time. That software is now the basis of the GnuSpeech project, a port of the original NeXT software to Linux. It is licensed under the GPL. The basis is an acoustic tube model, so it is low-level articulatory synthesis with the necessary databases for varying the tube cross-sections, using the Fant/Carre research on formant sensitivity analysis and control regions. Provision is made for adding higher-level parameters such as tongue height, jaw opening, etc., but this extension is still undeveloped, and would rely on deriving relationships between these higher-level parameters and the low-level tube cross-section parameters. Other ports are possible/likely.

Use in Weatheradio

Not sure where to put it, but the National Weather Service in the U.S. uses it on all Weatheradio stations now. The new voice sounds excellent, and I think it uses a hybrid of patched voice and true synthesis. The Weather Channel may also use this for their Vocal Local announcements during the local forecast (but not on their Weatherscan channel). –radiojon 02:47, 2004 Jun 4 (UTC)

The NWS "Tom" and "Donna" AKA "Mara" voices are the SpeechWorks Speechify (now merged with Realspeak) American English voices "Tom" and "Mara" (no longer available), which use a purely concatenative system. [1] Nohat 06:57, 14 Apr 2005 (UTC)

External links

I think this article has too many (16) external links in the section, "Examples of current systems". I don't know much about the different systems we link to, but either A) each system we list is important in the topic of speech synthesis and needs to be mentioned in the article, or B) not all of these systems are important. In case A), we should use internal links: Don't use external links where we'll want Wikipedia links. In case B), we should just pick one or two examples, or link to a page which contains these links (Wikipedia is not a link repository.) — Matt 14:08, 17 Jun 2004 (UTC)

  • I'm not sure there's such a thing as "too many external links". "Wikipedia is not a link repository" applies to articles which consist of nothing but links, which this article is clearly not. I don't see what the point of removing all or some of the links would be. Nohat 18:59, 2004 Jun 23 (UTC)
  • I agree that once an article starts "collecting" external links, it's hard to stop ("If website A is listed, why not website B"). Nohat, I don't agree with your assessment that the "Wikipedia is not a link repository" statement only applies to a particular type of article. Link spamming and excessive linking are becoming a major problem (WP:WPSPAM). There is a major update and serious discussion underway at WP:EL, with more defined do's and don'ts. The trend, I believe, will be toward fewer external links. One recommendation now appearing in the guidelines is to eliminate all but a few highly relevant links and place a link to DMOZ that points to a directory of websites that relate to the article's topic. Speech synthesis in particular had many links to commercial websites promoting products and services, but these were removed per guidelines. I went ahead and removed several more today because they were promotional (one selling a book, free for a limited time, required registration, etc.). Any discussion on these edits would be welcome. Calltech 15:06, 5 December 2006 (UTC)

Request for references

Hi, I am working to encourage implementation of the goals of the Wikipedia:Verifiability policy. Part of that is to make sure articles cite their sources. This is particularly important for featured articles, since they are a prominent part of Wikipedia. The Fact and Reference Check Project has more information. If some of the external links are reliable sources and were used as references, they can be placed in a References section too. See the cite sources link for how to format them. Thank you, and please leave me a message when a few references have been added to the article. - Taxman 19:43, Apr 22, 2005 (UTC)

Early Voices Described as "Robotic" Seems Circular

Primitive speech synthesis devices sound robotic. A robotic voice is produced by a primitive speech synthesis device. This is circular. The popular idea of what a robot's voice sounds like comes from early attempts at speech synthesis. Film and television makers must have imitated what had been produced by early efforts at synthesis when creating robotic characters. It would be more accurate, I think, to say that the idea of a robotic voice came from efforts to produce speech synthesis. Saying that early speech synthesizers were robotic gets it backwards.

I'm not sure it's quite so simple as that. Interestingly, there has only ever been one speech synthesis system that spoke in a monotone (and not a very popular or often-used one at that), yet the most common feature of "robotic" voices when imitated by humans is a monotone. Clearly this notion of what a robot sounds like was not based on listening to actual synthesized speech. It is more likely that the idea of "robotic" voices came from what people imagined a synthetic voice would sound like, rather than what actual synthetic voices sounded like.
Regardless of all this, to the contemporary reader, the idea of the voice sounding "robotic" is probably a fairly safe, if perhaps literally preposterous, reference point for explaining what old speech synthesis systems sounded like. Nohat 06:50, 25 October 2005 (UTC)

Open source software

Are there any open source speech synthesis projects? It would be great to summarize how the best few are doing or note the lack if there are none. — Hippietrail 17:36, 15 April 2006 (UTC)

Possible copyvio

A possible copyvio concern has arisen in the Featured Article review. User:Marskell wrote "I believe the Concatenative Synthesis section may be a text dump from here". This is a serious concern that should be addressed immediately. Joelito (talk) 19:30, 7 November 2006 (UTC)

External links cleanup

The external links section was getting filled with lots of links to similar websites. WP is not a directory of links (WP:NOT):

"Wikipedia articles are not mere collections of external links or internet directories. There is nothing wrong with adding one or more useful content-relevant links to an article; however, excessive lists can dwarf articles and detract from the purpose of Wikipedia"

I went ahead and removed most of the external links and added the DMOZ category for speech synthesis (per WP recommendation). If you feel that any of the deleted links contribute substantially more than the others, please feel free to leave a comment here and we can all discuss it. Thanks! Calltech 18:43, 20 December 2006 (UTC)

Fair use rationale for Image:MS Sam.ogg

Image:MS Sam.ogg is being used on this article. I notice the image page specifies that the image is being used under fair use but there is no explanation or rationale as to why its use in this Wikipedia article constitutes fair use. In addition to the boilerplate fair use template, you must also write out on the image description page a specific explanation or rationale for why using this image in each article is consistent with fair use.

Please go to the image description page and edit it to include a fair use rationale. Using one of the templates at Wikipedia:Fair use rationale guideline is an easy way to ensure that your image is in compliance with Wikipedia policy, but remember that you must complete the template. Do not simply insert a blank template on an image page.

If there is other fair use media, consider checking that you have specified the fair use rationale on the other images used on this page. Note that any fair use images lacking such an explanation can be deleted one week after being tagged, as described on criteria for speedy deletion. If you have any questions please ask them at the Media copyright questions page. Thank you.

BetacommandBot (talk) 13:25, 8 March 2008 (UTC)

Text to speech based on Festival in Unix

www.wordtosound.com is installed on a Unix box. Type any text (English only) and the output is a downloadable WAV or MP3 file. The voice has a British accent and is kind of croaky, but understandable. It is clearer in the WAV format. - Preceding unsigned comment added by 69.85.110.110 (talk) 21:04, 27 May 2008 (UTC)