INDEX-SIDE DIACRITICAL CANONICALIZATION
    1.
    Invention application
    Status: Under examination (published)

    Publication number: US20160307000A1

    Publication date: 2016-10-20

    Application number: US12942967

    Filing date: 2010-11-09

    CPC classification number: G06F21/64 G06F16/3337 G06F17/273 G06F17/2795

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for index-side synonym expansion. One method includes obtaining a token sequence for a resource and indexing a particular token in the token sequence. The indexing includes obtaining a diacritically canonicalized form of the particular token; determining that the diacritically canonicalized form of the particular token is different from the particular token; and storing data associating the resource with both the particular token and the different diacritically canonicalized form of the particular token as index terms for the resource in a search engine.
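    The indexing step in this abstract can be illustrated with a minimal sketch. It assumes that "diacritical canonicalization" means stripping combining marks after Unicode decomposition, and that the index is a simple in-memory mapping from term to resource IDs; neither detail is stated in the abstract, and the names `canonicalize` and `index_resource` are hypothetical.

```python
import unicodedata
from collections import defaultdict


def canonicalize(token: str) -> str:
    """Assumed canonicalization: decompose, drop combining marks, recompose."""
    decomposed = unicodedata.normalize("NFD", token)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def index_resource(index, resource_id, tokens):
    """Index each token and, when it differs, its canonicalized form as well."""
    for token in tokens:
        index[token].add(resource_id)
        canonical = canonicalize(token)
        # Store the second index term only if canonicalization changed the token.
        if canonical != token:
            index[canonical].add(resource_id)


# Hypothetical in-memory index: term -> set of resource IDs.
index = defaultdict(set)
index_resource(index, "doc1", ["caf\u00e9", "menu"])
```

    Under these assumptions, the resource is reachable through both the accented token and its unaccented form, while a token without diacritics produces a single index term.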


    Storing term substitution information in an index
    2.
    Granted invention patent
    Status: In force

    Publication number: US09037591B1

    Publication date: 2015-05-19

    Application number: US13460582

    Filing date: 2012-04-30

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for storing, in an index associated with a document, a particular term that occurs in the document, wherein the particular term comprises n words, and wherein n is greater than 1; identifying a substitute term of the particular term; and in response to identifying the substitute term of the particular term, storing, in the index associated with the document, (i) the substitute term of the particular term, and (ii) data indicating that the substitute term spans the n words of the particular term.
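    A minimal sketch of the storage scheme described in this abstract follows. It assumes postings are (position, span) pairs and that substitute terms come from a fixed lookup table; the `Posting` type, the `SUBSTITUTES` table, and `build_index` are hypothetical names used for illustration, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    position: int  # index of the first word of the term in the document
    span: int      # number of document words the term covers


# Hypothetical substitution table: multi-word term -> substitute term.
SUBSTITUTES = {("new", "york", "city"): "nyc"}


def build_index(tokens, substitutes):
    """Build a per-document index: term -> list of postings."""
    index = {}
    for pos, tok in enumerate(tokens):
        index.setdefault(tok, []).append(Posting(pos, 1))
    for phrase, substitute in substitutes.items():
        n = len(phrase)
        if n < 2:
            continue  # the abstract only covers terms with n greater than 1
        for pos in range(len(tokens) - n + 1):
            if tuple(tokens[pos:pos + n]) == phrase:
                # Store the n-word term itself ...
                index.setdefault(" ".join(phrase), []).append(Posting(pos, n))
                # ... and its substitute, recording that it spans the same n words.
                index.setdefault(substitute, []).append(Posting(pos, n))
    return index


doc_index = build_index(["new", "york", "city", "hotels"], SUBSTITUTES)
```

    Recording the span alongside the substitute lets a later phrase or proximity match treat the single substitute word as occupying all n original word positions.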

