Abstract:
Systems and methods providing computer-implemented content propagation for enhanced document retrieval are described. In one aspect, reference information directed to one or more documents is identified. The reference information is identified from one or more sources of data that are independent of a data source that includes the one or more documents. Metadata that is proximally located to the reference information is extracted from the one or more sources of data. The relevance of respective features of the metadata to the content of the associated documents is calculated. For each document of the one or more documents, the associated portions of the metadata are indexed, together with the relevance of the features from those portions, into the original content of the document. The indexing generates one or more enhanced documents.
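The propagation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the fixed token window used for "proximally located" metadata, and the relative-frequency relevance score are all hypothetical choices made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_nearby_metadata(source_text, doc_id, window=5):
    """Return the words within `window` tokens of each reference to doc_id.

    Assumption: "proximally located" is modeled as a fixed token window.
    """
    tokens = source_text.split()
    snippets = []
    for i, tok in enumerate(tokens):
        if doc_id in tok:
            lo, hi = max(0, i - window), i + window + 1
            snippets.append(" ".join(tokens[lo:i] + tokens[i + 1:hi]))
    return snippets


def enhance(document, sources, doc_id):
    """Attach metadata terms, weighted by a toy relevance score, to a document.

    Assumption: relevance of a term is its relative frequency in the
    document's original content.
    """
    doc_tokens = tokenize(document)
    counts = Counter(doc_tokens)
    weighted = {}
    for src in sources:
        for snippet in extract_nearby_metadata(src, doc_id):
            for term in tokenize(snippet):
                score = counts[term] / len(doc_tokens) if doc_tokens else 0.0
                weighted[term] = max(weighted.get(term, 0.0), score)
    return {"content": document, "propagated_terms": weighted}


enhanced = enhance(
    "Install the printer driver before connecting the printer",
    ["See article KB123 for printer driver setup on laptops"],
    "KB123",
)
```

Here the independent source mentions the hypothetical identifier `KB123`, and the words around that mention are carried into the enhanced document along with their scores.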
Abstract:
Systems and methods for enhanced document retrieval are described. In one aspect, a search query from an end-user is received. Responsive to receiving the search query, search results are retrieved. The search results include an enhanced document and a set of non-enhanced documents. The enhanced document and the non-enhanced documents include term(s) of the search query. The enhanced document is derived from a base document. The base document was modified with metadata mined from one or more different documents. The metadata is associated with one or more respective references to the base document. The one or more different documents are independent of the base document.
Abstract:
A method for inputting ideograms into a computer system includes receiving phonetic information related to a desired ideogram to be entered and forming a candidate list of possible ideograms as a function of the phonetic information received. Stroke information, comprising one or more strokes in the desired ideogram, is received in order to obtain the desired ideogram from the candidate list.
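The two-stage selection described here (phonetic candidates first, then stroke input) can be sketched as below. The lexicon, the five-letter stroke notation (`h`, `s`, `p`, `n`, `z`), and the choice to match strokes as a prefix of the writing order are illustrative assumptions, not details taken from the abstract.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    ideogram: str
    reading: str   # phonetic input, e.g. toneless pinyin
    strokes: str   # illustrative stroke-class codes in writing order


# Hypothetical toy lexicon; the stroke codes are for illustration only.
LEXICON = [
    Entry("马", "ma", "zzh"),
    Entry("妈", "ma", "zphzzh"),
    Entry("吗", "ma", "szhzzh"),
    Entry("骂", "ma", "szhszhzzh"),
    Entry("米", "mi", "nphspn"),
]


def candidates_by_sound(phonetic):
    """Form the candidate list as a function of the phonetic information."""
    return [e for e in LEXICON if e.reading == phonetic]


def narrow_by_strokes(candidates, strokes):
    """Keep candidates whose stroke sequence begins with the entered strokes."""
    return [c for c in candidates if c.strokes.startswith(strokes)]


ma_candidates = candidates_by_sound("ma")
after_one_stroke = narrow_by_strokes(ma_candidates, "s")
after_four_strokes = narrow_by_strokes(ma_candidates, "szhz")
```

Each additional stroke shrinks the candidate list until the desired ideogram is the only one left.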
Abstract:
A method of constructing a language model for a phrase-based search in a speech recognition system, and an apparatus for constructing and/or searching through the language model, are described. The method includes the step of separating a plurality of phrases into a plurality of words in a prefix word, body word, and suffix word structure. Each of the phrases has a body word and optionally a prefix word and a suffix word. The words are grouped into a plurality of prefix word classes, a plurality of body word classes, and a plurality of suffix word classes in accordance with a set of predetermined linguistic rules. Each of the respective prefix, body, and suffix word classes includes a number of prefix words of the same category, a number of body words of the same category, and a number of suffix words of the same category, respectively. The prefix, body, and suffix word classes are then interconnected according to the predetermined linguistic rules. A method of organizing a phrase search based on the above-described prefix/body/suffix language model is also described. The words in each of the prefix, body, and suffix classes are organized into a lexical tree structure. A phrase start lexical tree structure is then created for the words of all the prefix classes and of those body classes having a word that can start one of the plurality of phrases, while still maintaining the connections of these prefix and body classes within the language model.
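As a rough sketch of the phrase start lexical tree, the example below builds a character-level trie over prefix and body words and stores each word's class at its leaf, so the class connections of the language model remain reachable. The class names, word lists, and the use of letters in place of phonetic units are assumptions made for illustration.

```python
# Hypothetical word classes and class-to-class connections.
PREFIX_CLASSES = {"polite": ["please", "kindly"]}
BODY_CLASSES = {"command": ["open", "close", "play"]}
SUFFIX_CLASSES = {"when": ["now", "later"]}
CONNECTIONS = {
    ("prefix", "polite"): [("body", "command")],
    ("body", "command"): [("suffix", "when")],
}

LEAF = "#"  # marker key holding the class of the word that ends at a node


def build_lexical_tree(labeled_words):
    """Build a trie; words sharing leading letters share tree nodes."""
    root = {}
    for word, word_class in labeled_words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[LEAF] = word_class
    return root


def lookup(tree, word):
    """Return the class stored for `word`, or None if it is not in the tree."""
    node = tree
    for ch in word:
        if ch not in node:
            return None
        node = node[ch]
    return node.get(LEAF)


# Words that can start a phrase: every prefix word, plus body words,
# since the prefix is optional.
start_words = [
    (w, ("prefix", name)) for name, ws in PREFIX_CLASSES.items() for w in ws
] + [
    (w, ("body", name)) for name, ws in BODY_CLASSES.items() for w in ws
]
start_tree = build_lexical_tree(start_words)
```

Looking up a word in `start_tree` yields its class, and `CONNECTIONS` then gives the classes that may follow it in a phrase.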
Abstract:
A method and system for editing words that have been misrecognized. The system allows a speaker to specify a number of alternative words to be displayed in a correction window by resizing the correction window. The system also displays the words in the correction window in alphabetical order. A preferred system eliminates the possibility, when a misrecognized word is respoken, that the respoken utterance will be again recognized as the same misrecognized word. This elimination occurs based on the probabilities of alternative words associated with both the misrecognized utterance and the respoken utterance. The system, when operating with a word processor, allows the speaker to specify the amount of speech that is buffered before transferring to the word processor. The system also uses a word correction metaphor or a phrase correction metaphor.
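One way to picture the respeaking step is sketched below. The abstract says only that the elimination is based on the probabilities of the alternative words for both utterances; multiplying the two probabilities, the `floor` value for unseen words, and the function name are assumptions of this example.

```python
def rerecognize(first_alts, respoken_alts, rejected, floor=1e-6):
    """Pick the best word for a respoken utterance.

    first_alts / respoken_alts map candidate words to recognizer
    probabilities for the original and the respoken utterance.
    `rejected` is the misrecognized word, which is never returned.
    Assumption: the two probabilities are combined by multiplication.
    """
    combined = {}
    for word, p in respoken_alts.items():
        if word == rejected:
            continue  # never offer the same misrecognized word again
        combined[word] = p * first_alts.get(word, floor)
    if not combined:
        return None
    return max(combined, key=combined.get)


first = {"wreck": 0.5, "recognize": 0.3, "reckon": 0.2}
respoken = {"wreck": 0.45, "recognize": 0.4, "reckon": 0.15}
best = rerecognize(first, respoken, rejected="wreck")
```

Even though "wreck" scores highest for both utterances, it is excluded, and the remaining alternatives are ranked using evidence from both.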
Abstract:
A speech recognition system for Mandarin Chinese comprises a preprocessor, HMM storage, a speech identifier, and a speech determinator. The speech identifier includes pseudo initials for representing glottal stops that precede syllables of lone finals. The HMM storage stores context-dependent models of the initials, finals, and pseudo initials that make up the syllables of Mandarin Chinese speech. The models may be dependent on associated initials or finals and on the tone of the syllable. The speech determinator joins initials with finals, and pseudo initials with finals, according to the syllables of the speech identifier. The speech determinator then compares input signals of syllables to the joined models to determine the phonetic structure of the syllable and the tone of the syllable. The system also includes a smoother for smoothing models to make recognition more robust. The smoother comprises an LDM generator and a detailed model modifier. The LDM generator generates less detailed models from the detailed models, and the detailed model modifier smooths the detailed models with the less detailed models. A method for recognizing Mandarin Chinese speech includes the steps of arranging context-dependent, sub-syllable models; comparing an input signal to the arranged models; and selecting the arrangement of models that best matches the input signal to recognize the phonetic structure and tone of the input signal.
Abstract:
Architecture that implicitly uses web search to help users improve their writing and associated productivity. The architecture extends the authoring experience of office suite applications, which can draw on a web search engine to offer contextual suggestions for revision, word auto-complete, and text prediction. Web-based research and reference are made available to users as they write or revise text. Suggestions are made as to how to complete a phrase or sentence using data from networks such as the Internet or an intranet, as to how a user might revise a word or phrase in an already-written sentence using data from the network, and as to problems in writing style or writing rules. Paragraph analysis is performed to find improper language usage or errors. Prediction and revision suggestions are extracted from web search or enterprise search document summaries, and the intent of the user to obtain word completion, revision assistance, and prediction suggestions is identified.
Abstract:
A content object indexing process includes creating a content object knowledge index, calculating a description vector of a target content object, and indexing the target content object by searching for the description vector in the content object knowledge database. It may be difficult to search for an exact content object, such as a music file or an academic researcher, because a conventional search index may not include related hierarchical information. A content object indexing process may take hierarchical information from a content object knowledge index and incorporate it into the index entry for a specific content object. One application of such a content object indexing process is a World Wide Web search engine.
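The indexing step can be sketched as a nearest-neighbor lookup: the target's description vector is compared against entries in a knowledge index, and the hierarchy of the closest entry is copied into the target's index entry. The knowledge entries, the three-dimensional vectors, and the use of cosine similarity are hypothetical choices for this example; the abstract does not specify how the search is performed.

```python
import math

# Hypothetical knowledge index: each entry has a description vector
# and a hierarchical path.
KNOWLEDGE_INDEX = [
    {"name": "jazz", "vector": [1.0, 0.0, 0.2], "path": ["music", "jazz"]},
    {"name": "physics", "vector": [0.0, 1.0, 0.1], "path": ["science", "physics"]},
]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def index_object(name, description_vector):
    """Create an index entry enriched with the closest entry's hierarchy."""
    best = max(
        KNOWLEDGE_INDEX,
        key=lambda entry: cosine(description_vector, entry["vector"]),
    )
    return {"name": name, "hierarchy": list(best["path"])}


entry = index_object("song.mp3", [0.9, 0.1, 0.3])
```

A search engine could then match a query such as "jazz" against the `hierarchy` field even when the object's own content never mentions the word.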
Abstract:
A server/client system for processing data includes a network having a web server with information accessible remotely. A client device includes a microphone and a rendering component such as a speaker or display. The client device is configured to obtain the information from the web server and record input data associated with fields contained in the information. The client device is adapted to send the input data to a remote location with an indication of a grammar to use for recognition. A recognition server receives the input data and the indication of the grammar. The recognition server returns data indicative of what was recognized to at least one of the client and the web server.