Content propagation for enhanced document retrieval
    21.
    Granted invention patent
    Content propagation for enhanced document retrieval (status: expired)

    Publication No.: US07305389B2

    Publication date: 2007-12-04

    Application No.: US10826161

    Filing date: 2004-04-15

    IPC classification: G06F17/30

    Abstract: Systems and methods providing computer-implemented content propagation for enhanced document retrieval are described. In one aspect, reference information directed to one or more documents is identified. The reference information is identified from one or more sources of data that are independent of a data source that includes the one or more documents. Metadata that is proximally located to the reference information is extracted from the one or more sources of data. Relevance between respective features of the metadata and content of associated ones of the one or more documents is calculated. For each document of the one or more documents, associated portions of the metadata are indexed with the relevance of features from the respective portions into original content of the document. The indexing generates one or more enhanced documents.
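
    The propagation step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the whitespace tokenizer, the term-overlap relevance measure, and the names `tokenize`, `relevance`, and `enhance` are all assumptions, since the abstract does not say how relevance is calculated.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def relevance(snippet_terms, doc_terms):
    """Share of snippet terms that also occur in the document.

    This overlap measure is an assumption; the abstract does not say
    how relevance is calculated.
    """
    if not snippet_terms:
        return 0.0
    doc_vocab = set(doc_terms)
    return sum(t in doc_vocab for t in snippet_terms) / len(snippet_terms)


def enhance(doc_text, metadata_snippets):
    """Return term weights for an 'enhanced document'.

    Original terms get weight 1.0 per occurrence; terms from each
    metadata snippet are added with that snippet's relevance score.
    """
    doc_terms = tokenize(doc_text)
    weights = Counter()
    for term in doc_terms:
        weights[term] += 1.0
    for snippet in metadata_snippets:
        terms = tokenize(snippet)
        score = relevance(terms, doc_terms)
        if score == 0.0:
            continue  # snippet shares nothing with the document; skip it
        for term in terms:
            weights[term] += score
    return dict(weights)


# Hypothetical document and metadata found near references to it.
enhanced = enhance("install printer driver",
                   ["printer driver setup", "unrelated words"])
```

    In this toy run the term "setup" enters the index with a weight below that of the original terms, while the unrelated snippet contributes nothing.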

    Multi-modal entry of ideogrammatic languages
    23.
    Granted invention patent
    Multi-modal entry of ideogrammatic languages (status: expired)

    Publication No.: US07174288B2

    Publication date: 2007-02-06

    Application No.: US10142572

    Filing date: 2002-05-08

    IPC classification: G06F17/28

    Abstract: A method for inputting ideograms into a computer system includes receiving phonetic information related to a desired ideogram to be entered and forming a candidate list of possible ideograms as a function of the phonetic information received. Stroke information, comprising one or more strokes in the desired ideogram, is received in order to obtain the desired ideogram from the candidate list.
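
    As a rough illustration of the two-stage selection in this abstract, the sketch below forms a candidate list from a phonetic key and then narrows it with stroke input. The toy lexicon, the made-up stroke codes, the prefix-matching rule, and all function names are hypothetical and not taken from the patent.

```python
# Hypothetical toy lexicon: phonetic key -> list of (ideogram, stroke code).
# The stroke codes are made-up letters used for illustration only.
LEXICON = {
    "ma": [("马", "zzh"), ("妈", "zph"), ("吗", "szh")],
}


def candidates_from_phonetics(phonetic):
    """Form the candidate list as a function of the phonetic input."""
    return list(LEXICON.get(phonetic, []))


def filter_by_strokes(candidates, strokes):
    """Keep candidates whose stroke code starts with the entered strokes.

    Prefix matching is an assumption; the abstract only says that one
    or more strokes of the desired ideogram are received.
    """
    return [char for char, code in candidates if code.startswith(strokes)]


def select_ideogram(phonetic, strokes):
    """Combine both input modes to narrow the candidate list."""
    return filter_by_strokes(candidates_from_phonetics(phonetic), strokes)


narrowed = select_ideogram("ma", "s")
```

    With this lexicon, the phonetic input "ma" alone leaves three candidates, and adding the single stroke "s" reduces the list to one.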

    Search engine for phrase recognition based on prefix/body/suffix architecture
    25.
    Granted invention patent
    Search engine for phrase recognition based on prefix/body/suffix architecture (status: expired)

    Publication No.: US5832428A

    Publication date: 1998-11-03

    Application No.: US538828

    Filing date: 1995-10-04

    Abstract: A method of constructing a language model for a phrase-based search in a speech recognition system and an apparatus for constructing and/or searching through the language model. The method includes the step of separating a plurality of phrases into a plurality of words in a prefix word, body word, and suffix word structure. Each of the phrases has a body word and optionally a prefix word and a suffix word. The words are grouped into a plurality of prefix word classes, a plurality of body word classes, and a plurality of suffix word classes in accordance with a set of predetermined linguistic rules. Each of the respective prefix, body, and suffix word classes includes a number of prefix words of the same category, a number of body words of the same category, and a number of suffix words of the same category, respectively. The prefix, body, and suffix word classes are then interconnected according to the predetermined linguistic rules. A method of organizing a phrase search based on the above-described prefix/body/suffix language model is also described. The words in each of the prefix, body, and suffix classes are organized into a lexical tree structure. A phrase start lexical tree structure is then created for the words of all the prefix classes and the body classes having a word which can start one of the plurality of phrases while still maintaining connections of these prefix and body classes within the language model.
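
    The class structure in this abstract (a required body word, an optional prefix word and suffix word, and classes connected by linguistic rules) can be sketched as below. The English word classes, the `LINKS` rule set, and the `accepts` function are hypothetical stand-ins. The sketch covers only the class interconnection, not the lexical tree search.

```python
# Hypothetical word classes; the patent's actual classes are not given.
PREFIX_CLASSES = {"title": {"mr", "dr"}}
BODY_CLASSES = {"surname": {"smith", "jones"}}
SUFFIX_CLASSES = {"generation": {"junior", "senior"}}

# Hypothetical "linguistic rules": which classes may follow which.
LINKS = {("title", "surname"), ("surname", "generation")}


def class_of(word, classes):
    """Return the name of the class containing word, or None."""
    for name, words in classes.items():
        if word in words:
            return name
    return None


def accepts(phrase):
    """Check a phrase against the prefix/body/suffix structure."""
    words = phrase.lower().split()
    if not words:
        return False
    idx = 0
    prefix = class_of(words[0], PREFIX_CLASSES)
    if prefix is not None:
        idx = 1  # optional prefix word consumed
    if idx >= len(words):
        return False  # a body word is required
    body = class_of(words[idx], BODY_CLASSES)
    if body is None:
        return False
    if prefix is not None and (prefix, body) not in LINKS:
        return False
    idx += 1
    if idx == len(words):
        return True  # no suffix word present
    suffix = class_of(words[idx], SUFFIX_CLASSES)
    return (suffix is not None
            and (body, suffix) in LINKS
            and idx + 1 == len(words))
```

    For example, `accepts("Dr Smith Junior")` and `accepts("Jones")` both hold under these rules, while `accepts("Dr")` does not because the body word is missing.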

    Method and system for correcting misrecognized spoken words or phrases
    26.
    Granted invention patent
    Method and system for correcting misrecognized spoken words or phrases (status: expired)

    Publication No.: US5829000A

    Publication date: 1998-10-27

    Application No.: US741696

    Filing date: 1996-10-31

    IPC classification: G10L15/06 G10L15/22 G01L5/06

    CPC classification: G10L15/22

    Abstract: A method and system for editing words that have been misrecognized. The system allows a speaker to specify a number of alternative words to be displayed in a correction window by resizing the correction window. The system also displays the words in the correction window in alphabetical order. A preferred system eliminates the possibility, when a misrecognized word is respoken, that the respoken utterance will be again recognized as the same misrecognized word. This elimination occurs based on the probabilities of alternative words associated with both the misrecognized utterance and the respoken utterance. The system, when operating with a word processor, allows the speaker to specify the amount of speech that is buffered before transferring to the word processor. The system also uses a word correction metaphor or a phrase correction metaphor.
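
    The respeak behavior in this abstract can be sketched as follows: the rejected word is excluded, and the remaining alternatives are scored using probabilities from both utterances. Multiplying the two probabilities, the `FLOOR` value for missing words, and the function name are assumptions; the abstract only states that both sets of probabilities are used.

```python
FLOOR = 1e-6  # assumed probability for a word missing from one list


def correct_respoken(first_alts, respoken_alts, misrecognized):
    """Pick the best word after a respeak, never the rejected word.

    first_alts and respoken_alts map candidate words to probabilities
    for the original and the respoken utterance. Combining them by
    product is an assumption made for this sketch.
    """
    scores = {}
    for word in sorted(set(first_alts) | set(respoken_alts)):
        if word == misrecognized:
            continue  # the speaker already rejected this word
        scores[word] = (first_alts.get(word, FLOOR)
                        * respoken_alts.get(word, FLOOR))
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical alternatives for an utterance first recognized as "their".
first = {"their": 0.5, "there": 0.3, "they're": 0.2}
second = {"their": 0.45, "there": 0.4, "they're": 0.15}
best = correct_respoken(first, second, "their")
```

    Although "their" scores highest in both lists, it is skipped, so the respoken utterance resolves to the next-best combined candidate.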

    System and method for generating and using context dependent sub-syllable models to recognize a tonal language
    27.
    Granted invention patent
    System and method for generating and using context dependent sub-syllable models to recognize a tonal language (status: expired)

    Publication No.: US5680510A

    Publication date: 1997-10-21

    Application No.: US378963

    Filing date: 1995-01-26

    Abstract: A speech recognition system for Mandarin Chinese comprises a preprocessor, HMM storage, speech identifier, and speech determinator. The speech identifier includes pseudo initials for representing glottal stops that precede syllables of lone finals. The HMM storage stores context dependent models of the initials, finals, and pseudo initials that make up the syllables of Mandarin Chinese speech. The models may be dependent on associated initials or finals and on the tone of the syllable. The speech determinator joins the initials and finals and pseudo initials and finals according to the syllables of the speech identifier. The speech determinator then compares input signals of syllables to the joined models to determine the phonetic structure of the syllable and the tone of the syllable. The system also includes a smoother for smoothing models to make recognition more robust. The smoother comprises an LDM generator and a detailed model modifier. The LDM generator generates less detailed models from the detailed models, and the detailed model modifier smooths the models with the less detailed models. A method for recognizing Mandarin Chinese speech includes the steps of arranging context dependent, sub-syllable models; comparing an input signal to the arranged models; and selecting the arrangement of models that best matches the input signal to recognize the phonetic structure and tone of the input signal.

    NETWORK SEARCH FOR WRITING ASSISTANCE
    28.
    Invention patent application
    NETWORK SEARCH FOR WRITING ASSISTANCE (status: under examination, published)

    Publication No.: US20120297294A1

    Publication date: 2012-11-22

    Application No.: US13109021

    Filing date: 2011-05-17

    IPC classification: G06F17/21 G06F17/30

    Abstract: Architecture that utilizes web search implicitly to assist users in improving writing and associated productivity. The architecture extends the authoring experience of office suite applications, which can draw on a web search engine to offer contextual suggestions for revision, word auto-complete, and text prediction. Web-based research and reference is made available to users as they write or revise text. Suggestions are made as to how to complete a phrase or sentence using data from networks such as the Internet or an intranet, how a user revises a word or phrase in an already-written sentence using data from the network, and problems in writing style/writing rules. Paragraph analysis is performed to find improper language usage or errors. Prediction and revision suggestions are extracted from web search or enterprise search document summaries, and intent of the user to obtain word completion, revision assistance, and prediction suggestions is identified.

    Content object indexing using domain knowledge
    29.
    Granted invention patent
    Content object indexing using domain knowledge (status: in force)

    Publication No.: US07698294B2

    Publication date: 2010-04-13

    Application No.: US11275509

    Filing date: 2006-01-11

    IPC classification: G06F17/30

    CPC classification: G06F17/30613

    Abstract: A content object indexing process including creating a content object knowledge index, calculating a description vector of a target content object, and indexing the target content object by searching for the description vector in the content object knowledge database. It may be difficult to search for an exact content object such as a music file or academic researcher, as a conventional search index may not include related hierarchical information. A content object indexing process may add hierarchical information taken from a content object knowledge index and incorporate the hierarchical information into the index entry for a specific content object. An application of such a content object indexing process may be a world wide web search engine.

    Web enabled recognition architecture
    30.
    Granted invention patent
    Web enabled recognition architecture (status: in force)

    Publication No.: US07506022B2

    Publication date: 2009-03-17

    Application No.: US09960232

    Filing date: 2001-09-20

    IPC classification: G06F15/16 G10L11/00

    Abstract: A server/client system for processing data includes a network having a web server with information accessible remotely. A client device includes a microphone and a rendering component such as a speaker or display. The client device is configured to obtain the information from the web server and record input data associated with fields contained in the information. The client device is adapted to send the input data to a remote location with an indication of a grammar to use for recognition. A recognition server receives the input data and the indication of the grammar. The recognition server returns data indicative of what was recognized to at least one of the client and the web server.
