Web-based collocation error proofing
    1.
    Patent application
    Web-based collocation error proofing (status: active)

    Publication number: US20080133444A1

    Publication date: 2008-06-05

    Application number: US11633788

    Filing date: 2006-12-05

    IPC classification: G06N7/02 G06F17/30 G06F3/048

    Abstract: Collocation errors can be automatically proofed using local and network-based corpora, including the Web. For example, according to one illustrative method, one or more collocations from a text sample are compared with a corpus such as the content of the Web. Each collocation is checked for whether it is disfavored in the corpus. Indications of whether the collocations are disfavored in the corpus are provided via an output device. Additional steps may then be taken, such as searching for potentially proper word collocations and providing them via a user output.
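The check described in this abstract can be sketched as follows. This is a minimal illustration only, not the patented method: a small in-memory list of sentences stands in for the Web corpus, and the names `bigram_counts`, `is_disfavored`, `suggest` and the `min_count` threshold are assumptions introduced here.

```python
from collections import Counter


def bigram_counts(corpus_sentences):
    """Count adjacent word pairs in a toy corpus (a stand-in for the Web)."""
    counts = Counter()
    for sentence in corpus_sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def is_disfavored(collocation, counts, min_count=1):
    """Treat a collocation as disfavored if it is seen fewer than min_count times.

    min_count is a hypothetical threshold; the abstract does not specify one.
    """
    return counts[collocation] < min_count


def suggest(collocation, counts, top_n=3):
    """Propose pairs from the corpus that keep the second word, most frequent first."""
    _, head = collocation
    candidates = [
        (pair, count)
        for pair, count in counts.items()
        if pair[1] == head and pair != collocation
    ]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [pair for pair, _ in candidates[:top_n]]
```

With a real network-based corpus, the counts would presumably come from search hit counts rather than a local `Counter`; that substitution is outside this sketch.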

    Proofing of word collocation errors based on a comparison with collocations in a corpus
    2.
    Granted patent
    Proofing of word collocation errors based on a comparison with collocations in a corpus (status: active)

    Publication number: US07774193B2

    Publication date: 2010-08-10

    Application number: US11633788

    Filing date: 2006-12-05

    IPC classification: G06F17/28 G06F17/21

    Abstract: Collocation errors can be automatically proofed using local and network-based corpora, including the Web. For example, according to one illustrative method, one or more collocations from a text sample are compared with a corpus such as the content of the Web. Each collocation is checked for whether it is disfavored in the corpus. Indications of whether the collocations are disfavored in the corpus are provided via an output device. Additional steps may then be taken, such as searching for potentially proper word collocations and providing them via a user output.

    Processing collocation mistakes in documents
    3.
    Granted patent
    Processing collocation mistakes in documents (status: active)

    Publication number: US07574348B2

    Publication date: 2009-08-11

    Application number: US11177136

    Filing date: 2005-07-08

    IPC classification: G06F17/27

    Abstract: A sentence is accessed, and at least one query is generated based on the sentence. The at least one query can be compared to text within a collection of documents, for example using a web search engine. Collocation errors in the sentence can be detected and/or corrected based on the comparison of the at least one query and the text within the collection of documents.
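A rough sketch of the query-and-compare idea in this abstract is given below. It assumes word n-grams as the queries and a small list of strings as the document collection; the abstract does not say how queries are formed, and `generate_queries`, `hit_count`, and `flag_queries` are hypothetical names. A web search engine would replace `hit_count` in practice.

```python
import re


def tokens(text):
    """Lowercase a string and keep only alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def generate_queries(sentence, n=2):
    """Build phrase queries from every run of n consecutive words (an assumption)."""
    words = tokens(sentence)
    if len(words) <= n:
        return [" ".join(words)]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def hit_count(query, documents):
    """Count documents containing the query as a whole-word phrase."""
    padded_query = f" {query} "
    return sum(padded_query in f" {' '.join(tokens(doc))} " for doc in documents)


def flag_queries(sentence, documents, n=2):
    """Return the queries with no support in the collection (possible errors)."""
    return [q for q in generate_queries(sentence, n) if hit_count(q, documents) == 0]
```

Queries with zero hits only mark where a collocation error might be; choosing a correction would need a further step, such as querying with alternative words.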

    Processing collocation mistakes in documents
    4.
    Patent application
    Processing collocation mistakes in documents (status: active)

    Publication number: US20070010992A1

    Publication date: 2007-01-11

    Application number: US11177136

    Filing date: 2005-07-08

    IPC classification: G06F17/27

    Abstract: A sentence is accessed, and at least one query is generated based on the sentence. The at least one query can be compared to text within a collection of documents, for example using a web search engine. Collocation errors in the sentence can be detected and/or corrected based on the comparison of the at least one query and the text within the collection of documents.

    Search query and document-related data translation
    5.
    Granted patent
    Search query and document-related data translation (status: active)

    Publication number: US09501759B2

    Publication date: 2016-11-22

    Application number: US13328924

    Filing date: 2011-12-16

    Abstract: The subject disclosure is directed towards developing a translation model for mapping search query terms to document-related data. By processing user logs comprising search histories into word-aligned query-document pairs, the translation model may be trained using data, such as probabilities, corresponding to the word-aligned query-document pairs. After the translation model is incorporated into model data for a search engine, it may be used as features for producing relevance scores for current search queries and for ranking documents/advertisements according to relevance.
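One way to picture the training step in this abstract is a relative-frequency estimate over word-aligned pairs, sketched below. This is an illustration under stated assumptions, not the disclosed model: the alignments are taken as given, and `train_translation_table`, `translation_feature`, and the `floor` smoothing value are names and choices introduced here.

```python
import math
from collections import Counter


def train_translation_table(aligned_pairs):
    """Estimate P(query_word | doc_word) by relative frequency.

    aligned_pairs is a list of (query_word, doc_word) tuples, assumed to have
    been extracted from search logs by some earlier alignment step.
    """
    pair_counts = Counter(aligned_pairs)
    doc_counts = Counter(doc_word for _, doc_word in aligned_pairs)
    return {
        (q, d): count / doc_counts[d]
        for (q, d), count in pair_counts.items()
    }


def translation_feature(query, doc_text, table, floor=1e-6):
    """Log-probability that the document text 'translates' into the query.

    Each query word is scored by its average translation probability over the
    document words; floor is a hypothetical smoothing constant.
    """
    q_tokens = query.lower().split()
    d_tokens = doc_text.lower().split()
    if not q_tokens or not d_tokens:
        return float("-inf")
    log_score = 0.0
    for q in q_tokens:
        p = sum(table.get((q, d), 0.0) for d in d_tokens) / len(d_tokens)
        log_score += math.log(max(p, floor))
    return log_score
```

The returned value would be one feature among many in a ranking model; how the search engine combines it with other features is not covered here.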

    Search Query and Document-Related Data Translation
    6.
    Patent application
    Search Query and Document-Related Data Translation (status: active)

    Publication number: US20130103493A1

    Publication date: 2013-04-25

    Application number: US13328924

    Filing date: 2011-12-16

    IPC classification: G06Q30/02 G06F17/30

    Abstract: The subject disclosure is directed towards developing a translation model for mapping search query terms to document-related data. By processing user logs comprising search histories into word-aligned query-document pairs, the translation model may be trained using data, such as probabilities, corresponding to the word-aligned query-document pairs. After the translation model is incorporated into model data for a search engine, it may be used as features for producing relevance scores for current search queries and for ranking documents/advertisements according to relevance.

    Method and system for retrieving confirming sentences
    7.
    Granted patent
    Method and system for retrieving confirming sentences (status: active)

    Publication number: US07974963B2

    Publication date: 2011-07-05

    Application number: US11187567

    Filing date: 2005-07-22

    IPC classification: G06F17/00

    CPC classification: G06F17/3069 Y10S707/99933

    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemmas from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
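The retrieval pipeline in this abstract (indexing units, retrieval, weighted similarity, ranking) can be sketched roughly as below. Several pieces are assumptions made for illustration: the tiny `LEMMAS` table stands in for a real lemmatizer, adjacent lemma pairs stand in for the "extended indexing units", and the `weights` dictionary stands in for linguistic weights, none of which the abstract defines.

```python
import re

# Hypothetical lemma table; a real system would use a proper lemmatizer.
LEMMAS = {"made": "make", "makes": "make", "decisions": "decision"}


def lemmatize(text):
    """Lowercase, tokenize, and map each token to its lemma if known."""
    return [LEMMAS.get(t, t) for t in re.findall(r"[a-z]+", text.lower())]


def indexing_units(text):
    """Lemmas plus adjacent lemma pairs (the pairs are an assumed extension)."""
    lemmas = lemmatize(text)
    units = set(lemmas)
    units.update(zip(lemmas, lemmas[1:]))
    return units


def retrieve(query, sentences, weights, default_weight=1.0):
    """Return sentences sharing a unit with the query, best match first.

    Similarity is the sum of the weights of the shared indexing units.
    """
    query_units = indexing_units(query)
    scored = []
    for sentence in sentences:
        shared = query_units & indexing_units(sentence)
        if shared:
            score = sum(weights.get(unit, default_weight) for unit in shared)
            scored.append((score, sentence))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, sentence in scored]
```

Giving function words such as "a" a low weight keeps them from dominating the ranking, which is the intuition behind weighting terms linguistically.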

    Language classification with random feature clustering
    8.
    Patent application
    Language classification with random feature clustering (status: pending, published)

    Publication number: US20060287848A1

    Publication date: 2006-12-21

    Application number: US11157091

    Filing date: 2005-06-20

    IPC classification: G06F17/27

    CPC classification: G06F16/355

    Abstract: An ensemble of random feature clusters is built from training data using a clustering algorithm into which some randomness has been introduced. For each clustered feature space, a classifier, such as a Naïve Bayesian Classifier, is trained, realizing a classifier ensemble. The final classification decision is made by the resulting classifier ensemble.
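The ensemble structure in this abstract can be illustrated with the sketch below. It makes a strong simplification that should be kept in mind: a seeded random partition of the vocabulary stands in for the randomized clustering algorithm, which the abstract does not specify. The class `NaiveBayes`, the functions `train_ensemble` and `classify`, and the parameters `n_members`, `k`, and `seed` are all introduced here for illustration, and majority voting is an assumed way of combining the members.

```python
import math
import random
from collections import Counter, defaultdict


def to_clusters(tokens, mapping):
    """Replace each known token by the id of the feature cluster it belongs to."""
    return [mapping[t] for t in tokens if t in mapping]


class NaiveBayes:
    """Multinomial Naive Bayes over cluster ids with add-one smoothing."""

    def fit(self, docs, labels, k):
        self.k = k
        self.label_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(doc)
        return self

    def predict(self, doc):
        total = sum(self.label_counts.values())
        best_label, best_lp = None, float("-inf")
        for label in sorted(self.label_counts):
            lp = math.log(self.label_counts[label] / total)
            denom = sum(self.feature_counts[label].values()) + self.k
            for feature in doc:
                lp += math.log((self.feature_counts[label][feature] + 1) / denom)
            if lp > best_lp:
                best_label, best_lp = label, lp
        return best_label


def train_ensemble(texts, labels, n_members=5, k=4, seed=0):
    """Train one classifier per random feature partition (stand-in clustering)."""
    rng = random.Random(seed)
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for doc in tokenized for w in doc})
    ensemble = []
    for _ in range(n_members):
        mapping = {w: rng.randrange(k) for w in vocab}
        docs = [to_clusters(doc, mapping) for doc in tokenized]
        ensemble.append((mapping, NaiveBayes().fit(docs, labels, k)))
    return ensemble


def classify(text, ensemble):
    """Majority vote over the ensemble members."""
    tokens = text.lower().split()
    votes = Counter(
        model.predict(to_clusters(tokens, mapping)) for mapping, model in ensemble
    )
    return votes.most_common(1)[0][0]
```

Because each member sees a different coarse feature space, their errors are less likely to coincide, which is the usual motivation for this kind of ensemble.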

    Query speller
    9.
    Granted patent
    Query speller (status: active)

    Publication number: US07818332B2

    Publication date: 2010-10-19

    Application number: US11465023

    Filing date: 2006-08-16

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/3064

    Abstract: Candidate suggestions for correcting misspelled query terms input into a search application are automatically generated. A score for each candidate suggestion can be generated using a first decoding pass, and paths through the suggestions can be ranked in a second decoding pass. Candidate suggestions can be generated based on typographical errors, phonetic mistakes and/or compounding mistakes. Furthermore, a ranking model can be developed to rank candidate suggestions to be presented to a user.
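Candidate generation for typographical and compounding mistakes, as mentioned in this abstract, can be sketched as follows. This is a simplified illustration: it ranks candidates by lexicon frequency instead of the two decoding passes and ranking model the abstract describes, it ignores phonetic mistakes, and `edits1`, `candidates`, `rank`, and the frequency lexicon are assumptions introduced here.

```python
def edits1(word):
    """All strings one edit away (delete, transpose, replace, insert)."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def candidates(term, lexicon):
    """Lexicon words one edit away, plus two-word splits of a run-together term."""
    found = {w for w in edits1(term) if w in lexicon}
    for i in range(1, len(term)):
        left, right = term[:i], term[i:]
        if left in lexicon and right in lexicon:
            found.add(f"{left} {right}")
    found.discard(term)
    return found


def rank(term, lexicon):
    """Order candidates by the frequency of their rarest word, highest first."""
    def score(candidate):
        return min(lexicon[part] for part in candidate.split())

    return sorted(candidates(term, lexicon), key=lambda c: (-score(c), c))
```

Here `lexicon` maps each known word to a frequency count; in a search setting those counts might come from query logs, though that source is an assumption.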

    Method and system for retrieving confirming sentences
    10.
    Patent application
    Method and system for retrieving confirming sentences (status: active)

    Publication number: US20050273318A1

    Publication date: 2005-12-08

    Application number: US11187567

    Filing date: 2005-07-22

    CPC classification: G06F17/3069 Y10S707/99933

    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemmas from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.