Transitive Synonym Creation
    1.
    Invention application
    Transitive Synonym Creation (pending, published)

    Publication number: US20150006563A1

    Publication date: 2015-01-01

    Application number: US12856522

    Filing date: 2010-08-13

    CPC classification number: G06F16/24534

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying synonyms. One method includes receiving a query containing a first phrase, identifying one or more first synonym phrases that are synonyms for the first phrase, identifying a new synonym phrase that is a synonym for one of the first synonym phrases, determining that the new phrase is a synonym for the first phrase, and augmenting the query with the new phrase. Another method includes receiving a query including a first compound term having a first subterm, identifying a first synonym for a first subterm, generating a second compound term, wherein the second compound term is the first compound term modified by replacing the first subterm with the first synonym, and augmenting the query with the second compound term.
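
    The two methods in this abstract can be illustrated with a minimal sketch. The synonym table, the function names, and the one-hop limit are assumptions made for illustration; the abstract does not say how the "determining" step validates a transitive synonym, so this sketch simply accepts every second-hop synonym.

```python
# Hypothetical synonym table; a real system would derive this from its own data.
SYNONYMS = {
    "car": {"auto"},
    "auto": {"automobile"},
}


def transitive_synonyms(phrase, table):
    """Return direct synonyms of `phrase` plus synonyms of those synonyms."""
    first = set(table.get(phrase, ()))
    new = set()
    for syn in first:
        new |= set(table.get(syn, ()))
    # Assumption: every second-hop synonym is accepted without further checks.
    return (first | new) - {phrase}


def augment_query(query_terms, table):
    """Append direct and transitive synonyms of each term to the query."""
    out = list(query_terms)
    for term in query_terms:
        for syn in sorted(transitive_synonyms(term, table)):
            if syn not in out:
                out.append(syn)
    return out


def replace_subterm(compound, subterm, synonym):
    """Build a second compound term by swapping one subterm for its synonym."""
    return compound.replace(subterm, synonym, 1)
```

    Here `augment_query(["car", "rental"], SYNONYMS)` adds both "auto" (direct) and "automobile" (transitive), and `replace_subterm("carwash", "car", "auto")` produces the second compound term "autowash".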

    INDEX-SIDE SYNONYM GENERATION
    2.
    Invention application
    INDEX-SIDE SYNONYM GENERATION (in force)

    Publication number: US20130151501A1

    Publication date: 2013-06-13

    Application number: US13761920

    Filing date: 2013-02-07

    CPC classification number: G06F17/3087 G06F17/30631

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for index-side synonym expansion are disclosed. Some implementations include actions of obtaining a token sequence for a resource, wherein each token in the token sequence comprises one or more characters. The actions also include selecting a token from the token sequence, wherein the selected token comprises at least one numeric portion having one or more contiguous numeric characters, and at least one non-numeric portion having one or more non-numeric characters. Further actions include generating a new token corresponding to each of the at least one numeric portions of the selected token and storing data associating the selected token and each of the new tokens corresponding to the at least one numeric portion of the selected token as index terms for the resource, wherein the search engine index is accessed to augment search queries.
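
    As a rough sketch of the token handling described above: a token that mixes digits and non-digits yields one extra index term per contiguous digit run. The function names and the use of a regular expression are assumptions, not details taken from the application.

```python
import re


def numeric_portions(token):
    """Return the digit runs of a token that mixes digits and non-digits.

    Purely numeric or purely non-numeric tokens yield an empty list.
    """
    runs = re.findall(r"\d+", token)
    has_non_numeric = re.search(r"\D", token) is not None
    return runs if runs and has_non_numeric else []


def expand_index_terms(tokens):
    """Return index terms: each token followed by its numeric portions."""
    terms = []
    for tok in tokens:
        terms.append(tok)
        terms.extend(numeric_portions(tok))
    return terms
```

    With this sketch, a resource containing the token "dx500" would also be indexed under "500", so a query for the bare number can still match it.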

    Synonym verification
    3.
    Granted patent
    Synonym verification (in force)

    Publication number: US08515731B1

    Publication date: 2013-08-20

    Application number: US12568435

    Filing date: 2009-09-28

    CPC classification number: G06F17/2795 G06F17/2854

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for synonym verification. In one aspect, a method includes receiving a term and a candidate synonym for the term. The method further includes generating a term group of one or more text strings and a synonym group of one or more text strings. Each text string in the term group corresponding to a translation of the term into a language, and each text string in the synonym group corresponding to a translation of the synonym into the language. The method further includes determining whether the candidate synonym is a valid synonym for the term from an amount of overlap between the term group of text strings and the synonym group of text strings.
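
    The overlap test can be sketched as follows. The translation table, the overlap measure (shared strings divided by the size of the smaller group), and the 0.5 threshold are all assumptions; the abstract only says the decision depends on an "amount of overlap".

```python
# Hypothetical translations of each term into several languages.
TRANSLATIONS = {
    "car": ["voiture", "Auto", "coche"],
    "automobile": ["voiture", "Auto", "automovil"],
    "cart": ["chariot", "Wagen", "carro"],
}


def is_valid_synonym(term, candidate, translations, min_overlap=0.5):
    """Accept `candidate` if its translations overlap enough with those of `term`."""
    term_group = set(translations.get(term, ()))
    synonym_group = set(translations.get(candidate, ()))
    if not term_group or not synonym_group:
        return False
    shared = len(term_group & synonym_group)
    # Assumed measure: shared strings relative to the smaller group.
    overlap = shared / min(len(term_group), len(synonym_group))
    return overlap >= min_overlap
```

    In this toy table "car" and "automobile" share two of three translations and pass, while "car" and "cart" share none and fail.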

    Online de-compounding of query terms
    4.
    Granted patent
    Online de-compounding of query terms (in force)

    Publication number: US08392440B1

    Publication date: 2013-03-05

    Application number: US12856495

    Filing date: 2010-08-13

    CPC classification number: G06F17/3064 G06F17/30646

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for query synonym expansion. One method includes receiving a query including a first compound term, and in response to receiving the query, performing the following operations before search results responsive to the query are identified: generating one or more splits of the first compound term, wherein each split divides the compound term into two or more subterms, assigning a score to each subterm of each split, determining an overall score for each split from the scores for the subterms of the split, selecting one or more of the one or more splits according to the overall score for each split, and augmenting the query with the subterms of each selected split.
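
    A minimal sketch of the split, score, and select steps follows. The frequency table, the minimum subterm length, the choice of the weakest subterm's frequency as the overall score, and the threshold are assumptions; the abstract does not specify any of them.

```python
# Hypothetical subterm frequencies (for example, counts from a corpus).
FREQ = {"house": 900, "boat": 800, "use": 700, "ho": 5}


def generate_splits(term, min_len=2):
    """Return every split of `term` into two or more subterms of >= min_len chars."""
    def segment(rest):
        # Yield all segmentations of `rest` into one or more pieces.
        if len(rest) >= min_len:
            yield [rest]
        for i in range(min_len, len(rest) - min_len + 1):
            for tail in segment(rest[i:]):
                yield [rest[:i]] + tail
    return [s for s in segment(term) if len(s) >= 2]


def select_splits(term, freq, threshold=100):
    """Score each split and keep those whose overall score meets the threshold."""
    scored = []
    for split in generate_splits(term):
        subterm_scores = [freq.get(sub, 0) for sub in split]
        overall = min(subterm_scores)  # assumed: a split is as good as its weakest part
        if overall >= threshold:
            scored.append((overall, split))
    scored.sort(reverse=True)
    return [split for _, split in scored]


def augment_query(query_terms, compound, freq):
    """Append the subterms of every selected split of `compound` to the query."""
    out = list(query_terms)
    for split in select_splits(compound, freq):
        out.extend(sub for sub in split if sub not in out)
    return out
```

    For "houseboat", the split "house" + "boat" scores 800 and is kept, while "ho" + "use" + "boat" scores 5 and is dropped.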

    Longest-common-subsequence detection for common synonyms
    5.
    Granted patent
    Longest-common-subsequence detection for common synonyms (in force)

    Publication number: US08001136B1

    Publication date: 2011-08-16

    Application number: US12166700

    Filing date: 2008-07-02

    CPC classification number: G06F17/30731

    Abstract: One embodiment of the present invention provides a system for identifying synonym candidates. During operation, the system receives a first term and a second term. The system then determines a length of the longer one of the first and second terms, and determines a longest common subsequence of the two terms. The system further produces a result to indicate whether the two terms are synonym candidates based on the length of the longer term and a length of the longest common subsequence of the two terms.
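
    The comparison described above can be sketched with the standard dynamic-programming computation of the longest common subsequence (LCS). Turning the two lengths into a ratio and comparing it with 0.8 is an assumption; the abstract only states that the result is based on both lengths.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, 1):
            if ch == bj:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def are_synonym_candidates(term1, term2, min_ratio=0.8):
    """Compare the LCS length with the length of the longer term."""
    longer = max(len(term1), len(term2))
    if longer == 0:
        return False
    # Assumed decision rule: LCS must cover at least min_ratio of the longer term.
    return lcs_length(term1, term2) / longer >= min_ratio
```

    "color" and "colour" share an LCS of length 5 against a longer term of length 6, so they qualify as candidates under this assumed threshold.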

    INDEX-SIDE DIACRITICAL CANONICALIZATION
    6.
    Invention application
    INDEX-SIDE DIACRITICAL CANONICALIZATION (pending, published)

    Publication number: US20160307000A1

    Publication date: 2016-10-20

    Application number: US12942967

    Filing date: 2010-11-09

    CPC classification number: G06F21/64 G06F16/3337 G06F17/273 G06F17/2795

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for index-side synonym expansion. One method includes obtaining a token sequence for a resource and indexing a particular token in the token sequence. The indexing includes obtaining a diacritically canonicalized form of the particular token; determining that the diacritically canonicalized form of the particular token is different from the particular token; and storing data associating the resource with both the particular token and the different diacritically canonicalized form of the particular token as index terms for the resource in a search engine.
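
    One plausible way to obtain a diacritically canonicalized form is Unicode decomposition followed by removal of combining marks. That choice, and the function names, are assumptions for this sketch; the application may use different or language-specific rules.

```python
import unicodedata


def strip_diacritics(token):
    """Decompose the token (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def diacritic_index_terms(token):
    """Return the token, plus its canonical form when the two differ."""
    canonical = strip_diacritics(token)
    return [token, canonical] if canonical != token else [token]
```

    A resource containing "café" would then be stored under both "café" and "cafe", while a token without diacritics produces a single index term.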

    Synonym generation using online decompounding and transitivity
    7.
    Granted patent
    Synonym generation using online decompounding and transitivity (in force)

    Publication number: US09361362B1

    Publication date: 2016-06-07

    Application number: US13247958

    Filing date: 2011-09-28

    CPC classification number: G06F17/3064 G06F17/30646

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for query synonym expansion. One method includes receiving a query including a first compound term, and in response to receiving the query, performing the following operations before search results responsive to the query are identified: generating one or more splits of the first compound term, wherein each split divides the compound term into two or more subterms, assigning a score to each subterm of each split, determining an overall score for each split from the scores for the subterms of the split, selecting one or more of the one or more splits according to the overall score for each split, and augmenting the query with the subterms of each selected split.

    Index-side synonym generation
    8.
    Granted patent
    Index-side synonym generation (in force)

    Publication number: US08375042B1

    Publication date: 2013-02-12

    Application number: US12942965

    Filing date: 2010-11-09

    CPC classification number: G06F17/3087 G06F17/30631

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for index-side synonym expansion. One method includes indexing a token from a resource, including determining that the token comprises a numeric portion and storing data associating the resource with both the particular token and the numeric portion in a search engine index. Another method includes indexing a token from a resource, including normalizing the token by removing a prefix matching a stopword prefix and storing data associating the resource with both the token and the normalized form of the token in a search engine index. Another method includes creating a token blacklist.
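
    The stopword-prefix method can be sketched as below. The prefix list and blacklist contents are hypothetical, and using the blacklist to skip normalization is an assumption: the abstract mentions creating a token blacklist but does not say how it is applied.

```python
# Hypothetical stopword prefixes (French elided articles, for illustration).
STOPWORD_PREFIXES = ("l'", "d'")
# Hypothetical blacklist of tokens that should not be normalized.
BLACKLIST = {"d'accord"}


def remove_stopword_prefix(token):
    """Strip the first matching stopword prefix, if any remains after stripping."""
    for prefix in STOPWORD_PREFIXES:
        if token.startswith(prefix) and len(token) > len(prefix):
            return token[len(prefix):]
    return token


def prefix_index_terms(token):
    """Return the token, plus its normalized form when the two differ."""
    if token in BLACKLIST:  # assumed use of the blacklist
        return [token]
    normalized = remove_stopword_prefix(token)
    return [token] if normalized == token else [token, normalized]
```

    Under these assumptions "l'avion" is indexed as both "l'avion" and "avion", while the blacklisted "d'accord" is indexed only as written.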

    Storing term substitution information in an index
    10.
    Granted patent
    Storing term substitution information in an index (in force)

    Publication number: US09037591B1

    Publication date: 2015-05-19

    Application number: US13460582

    Filing date: 2012-04-30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for storing, in an index associated with a document, a particular term that occurs in the document, wherein the particular term comprises n words, and wherein n is greater than 1; identifying a substitute term of the particular term; and in response to identifying the substitute term of the particular term, storing, in the index associated with the document, (i) the substitute term of the particular term, and (ii) data indicating that the substitute term spans the n words of the particular term.
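
    A rough sketch of storing a substitute term together with its span: the `Posting` record, its field names, and the substitute table are hypothetical, and this sketch indexes each word individually rather than storing the multi-word term as one entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    term: str
    position: int            # index of the first word covered
    span: int = 1            # number of document words covered
    is_substitute: bool = False


def index_with_substitutes(words, substitutes):
    """Index each word, then add substitutes for multi-word terms (n > 1)."""
    postings = [Posting(w, i) for i, w in enumerate(words)]
    for n in range(2, len(words) + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            substitute = substitutes.get(phrase)
            if substitute:
                # Record that the substitute spans the n words of the original term.
                postings.append(Posting(substitute, i, span=n, is_substitute=True))
    return postings
```

    Indexing "new york city hotels" with the substitute "nyc" for "new york city" yields four word postings plus one substitute posting at position 0 with a span of 3.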
