Abstract:
A method for performing speech recognition can include determining a recognition result for received user speech. The recognition result can include recognized text and a corresponding confidence score. The confidence score of the recognition result can be compared to a predetermined minimum threshold. If the confidence score does not exceed the predetermined minimum threshold, the user can be presented with at least one empirically determined alternate word candidate corresponding to the recognition result.
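The threshold check described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `RecognitionResult` type, the `ALTERNATES` table, the default threshold of 0.6, and the function name `resolve` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecognitionResult:
    text: str          # recognized text
    confidence: float  # confidence score reported by the recognizer


# Hypothetical table of empirically determined alternate word candidates.
ALTERNATES = {"dime": ["time", "dine"]}


def resolve(result, min_threshold=0.6, alternates=ALTERNATES):
    """Return (text, candidates); candidates is empty when the result is accepted."""
    if result.confidence > min_threshold:
        return result.text, []
    # The score does not exceed the threshold: offer alternates to the user.
    return result.text, alternates.get(result.text, [])
```

A score equal to the threshold does not exceed it, so in this sketch it also triggers the alternate candidates.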
Abstract:
A method for performing speech recognition can include the steps of providing a grammar including entries comprising a parent word and a pseudo word being substantially phonetically equivalent to the parent word. The grammar can provide a translation from the pseudo word to the parent word. The parent word can be received as speech and the speech can be compared to the grammar entries. Additionally, the speech can be matched to the pseudo word and the pseudo word can be translated to the parent word.
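One way to picture the pseudo-word translation is a grammar that maps every entry, parent or pseudo word, back to its parent word. The sketch below assumes the recognizer has already produced a text token for the best-matching grammar entry; the `GRAMMAR` contents and the `recognize` function are hypothetical.

```python
# Hypothetical grammar: each entry translates to its parent word.
GRAMMAR = {
    "route": "route",  # parent word maps to itself
    "root": "route",   # pseudo word, substantially phonetically equivalent
    "rowt": "route",   # another pseudo word for the same parent
}


def recognize(matched_entry, grammar=GRAMMAR):
    """Translate the matched grammar entry to its parent word, or None if unmatched."""
    return grammar.get(matched_entry.lower())
```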
Abstract:
A system for automatically trimming an audio file based upon textual content associated with the audio file is provided. The source of the textual content may be an electronic document or written language text. The textual content may include predefined hints, a text mark, or an end-of-phrase punctuation mark. The system generates a trimming instruction based upon textual content corresponding to the audio file, and the audio file is trimmed based upon the trimming instruction.
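The abstract does not say what a trimming instruction contains, so the following is only a sketch under an assumed rule: if the text ends with end-of-phrase punctuation, a short trailing pause is kept; otherwise all trailing silence is removed. The audio is modeled as a list of integer samples, and every name and constant here is hypothetical.

```python
END_OF_PHRASE = ".!?;:,"


def trimming_instruction(text, pause_samples=3):
    """Return how many trailing silent samples to keep (assumed rule)."""
    stripped = text.rstrip()
    return pause_samples if stripped and stripped[-1] in END_OF_PHRASE else 0


def trim(samples, keep_trailing, silence_level=0):
    """Drop trailing silence, keeping at most `keep_trailing` silent samples."""
    end = len(samples)
    while end > 0 and abs(samples[end - 1]) <= silence_level:
        end -= 1
    return samples[: min(len(samples), end + keep_trailing)]
```

For example, `trim(samples, trimming_instruction("Hello."))` keeps a short pause after the phrase, while the same audio paired with `"Hello"` is cut right after the last non-silent sample.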
Abstract:
A method for voice data entry availability in a voice response system can include receiving speech input specifying data through an audio user interface to a data information system, the data information system processing data in a data store. Subsequently, speech-to-text conversion of the speech input can be performed using a speech recognition engine with reference to a corresponding speech grammar. In particular, the speech grammar can contain a data set of words relating to the data information system. Notably, the data store can contain a subset of the data set: the subset includes the words which can be processed by the data information system and excludes the words which cannot. If the specified data is included in the speech grammar and is in the data store, the speech data in the speech query can be processed. However, if the specified data is in the speech grammar but not in the data store, it can be reported that the specified data cannot be processed. Finally, if the specified data is not included in the speech grammar, an Out-Of-Grammar (OOG) condition can be reported, and the speech data in the speech query is not processed.
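The three outcomes described in this abstract can be sketched as a simple decision function. The grammar and data store are modeled as plain sets of words; the `Outcome` enum and the `handle` function are hypothetical names used for illustration only.

```python
from enum import Enum


class Outcome(Enum):
    PROCESSED = "processed"
    NOT_AVAILABLE = "cannot be processed"
    OUT_OF_GRAMMAR = "out of grammar"


def handle(recognized, grammar, data_store):
    """Classify recognized data against the speech grammar and the data store."""
    if recognized not in grammar:
        return Outcome.OUT_OF_GRAMMAR   # OOG condition: nothing is processed
    if recognized not in data_store:
        return Outcome.NOT_AVAILABLE    # known word, but the system cannot process it
    return Outcome.PROCESSED
```

Because the grammar is a superset of the data store, a word the recognizer understands but the system cannot serve yields a specific report instead of a generic OOG error.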
Abstract:
Methods and apparatus for voice-enabling a web application, wherein the web application includes one or more web pages rendered by a web browser on a computer. At least one information source external to the web application is queried to determine whether information describing a set of one or more supported voice interactions for the web application is available, and in response to determining that the information is available, the information is retrieved from the at least one information source. Voice input for the web application is then enabled based on the retrieved information.
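The lookup flow in this abstract can be sketched as below. Each external information source is modeled as a callable that returns a mapping of spoken phrases to actions, or `None` when it has nothing for the application; this interface and the names `find_voice_info` and `enable_voice` are assumptions made for illustration.

```python
def find_voice_info(app_url, sources):
    """Query each external source in turn; return the first available description."""
    for source in sources:
        info = source(app_url)
        if info is not None:
            return info
    return None


def enable_voice(app_url, sources):
    """Return a phrase-to-action table if voice information is available, else None."""
    info = find_voice_info(app_url, sources)
    if info is None:
        return None  # no supported voice interactions were found
    return {phrase.lower(): action for phrase, action in info.items()}
```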
Abstract:
A method and system for reformatting data. The method can include identifying a template which corresponds to a specified document. The specified document can contain formatted content. The template can be applied to the specified document to extract data from the formatted content. The extracted data can then be formatted using a different markup language.
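As a rough illustration of template-based extraction and reformatting, the sketch below models a template as a set of named regular expressions applied to an HTML fragment, and emits the extracted data as XML. The `TEMPLATES` table, the `stock_quote` document type, and the `reformat` function are hypothetical; the abstract does not specify how templates are represented.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical templates: document type -> field name -> extraction pattern.
TEMPLATES = {
    "stock_quote": {
        "symbol": r'<td class="sym">(.*?)</td>',
        "price": r'<td class="px">(.*?)</td>',
    }
}


def reformat(doc_type, html):
    """Extract fields from formatted HTML using a template and emit them as XML."""
    template = TEMPLATES[doc_type]  # identify the template for this document
    root = ET.Element(doc_type)
    for name, pattern in template.items():
        match = re.search(pattern, html)
        if match:
            ET.SubElement(root, name).text = match.group(1)
    return ET.tostring(root, encoding="unicode")
```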
Abstract:
A method and apparatus for concurrently accessing network-based electronic content in a Voice Browser and a Visual Browser can include the steps of retrieving a network-based document formatted for display in the Visual Browser; identifying in the retrieved document a reference to the Voice Browser, the reference specifying electronic content formatted for audible presentation in the Voice Browser; and, transmitting the reference to the Voice Browser. The Voice Browser can retrieve the specified electronic content and audibly present the electronic content. Concurrently, the Visual Browser can visually present the network-based document formatted for visual presentation in the Visual Browser. Likewise, the method of the invention can include the steps of retrieving a network-based document formatted for audible presentation in the Voice Browser; identifying in the retrieved document a reference to the Visual Browser, the reference specifying electronic content formatted for visual presentation in the Visual Browser; and, transmitting the reference to the Visual Browser. The Visual Browser can retrieve the specified electronic content and visually present the specified electronic content. Concurrently, the Voice Browser can audibly present the network-based document formatted for audible presentation in the Voice Browser.
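The step of identifying a reference to the Voice Browser inside a visually formatted document can be sketched with Python's standard `html.parser`. The abstract does not say how the reference is encoded, so the `<link rel="voice" href="...">` convention used here is purely an assumption, as are the class and function names.

```python
from html.parser import HTMLParser


class VoiceRefFinder(HTMLParser):
    """Collect href values of <link rel="voice"> tags (assumed convention)."""

    def __init__(self):
        super().__init__()
        self.voice_refs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "voice" and "href" in attributes:
            self.voice_refs.append(attributes["href"])


def find_voice_refs(html):
    """Return the references that would be transmitted to the Voice Browser."""
    finder = VoiceRefFinder()
    finder.feed(html)
    return finder.voice_refs
```

The Visual Browser would render the document as usual while passing each returned reference to the Voice Browser; the reverse direction is symmetric.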
Abstract:
A VoIP-enabled speech server can include a speech application which can be configured to communicate with a VoIP telephony gateway server over a VoIP communications path. The VoIP-enabled speech server can also include a VoIP-compliant call control interface to the VoIP telephony gateway server, the VoIP-compliant call control interface establishing the VoIP communications path. In operation, the speech application can receive VoIP-compliant packets from the VoIP telephony gateway server over the VoIP communications path. Subsequently, digitized audio data can be reconstructed from the VoIP-compliant packets, and the digitized audio data can be speech-to-text converted. Additionally, text can be synthesized into digitized audio data and the digitized audio data can be encapsulated in VoIP-compliant packets which can be transmitted over the VoIP communications path to the VoIP telephony gateway server.
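The encapsulation and reconstruction of digitized audio can be sketched in a heavily simplified form. Real VoIP packets carry protocol headers, timestamps, and codec-specific payloads; the `Packet` type below, with only a sequence number and a payload, is a hypothetical stand-in, as are the function names and the payload size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int        # sequence number used to restore ordering
    payload: bytes  # slice of the digitized audio


def packetize(audio, size=4):
    """Split digitized audio into sequence-numbered packets."""
    return [
        Packet(seq, audio[offset:offset + size])
        for seq, offset in enumerate(range(0, len(audio), size))
    ]


def reconstruct(packets):
    """Rebuild digitized audio from packets that may arrive out of order."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))
```

In the server described above, the output of `reconstruct` would feed the speech-to-text step, and synthesized audio would pass through `packetize` before transmission.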