Systems and methods for searching and indexing documents comprising chemical information

    Publication No.: US11301518B2

    Publication date: 2022-04-12

    Application No.: US16739799

    Filing date: 2020-01-10

    Abstract: Described herein are systems and methods for indexing document data to facilitate chemical structure searching. The document data may include chemical structure data corresponding to a chemical structure. Bit-screening data and connection data may be identified in the chemical structure data. The bit-screening data may correspond to one or more constituent elements of the chemical structure, and the connection data may correspond to connections between those constituent elements. A string tag may be generated based on a portion of the identified bit-screening data. The string tag may include an alphanumeric value describing the chemical structure to which the chemical structure data corresponds. The document data may be indexed based on the string tag. The chemical structure data in the document may then be searched by correlating at least a portion of the text data of a query with the indexed document data.
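As an illustration only, the indexing idea in this abstract can be sketched in Python. The abstract does not specify the screening features, the tag format, or the index layout, so everything here is an assumption: the `SCREEN_ELEMENTS` list, the `bs###` tag format, and the `TagIndex` class are hypothetical names chosen for the example, and the bit screen only records which elements are present.

```python
from collections import defaultdict

# Assumed screening features: one bit per element symbol (hypothetical choice).
SCREEN_ELEMENTS = ["C", "N", "O", "S", "Cl"]


def screen_bits(atoms):
    """Return the bit positions set for a structure given its element symbols."""
    present = set(atoms)
    return [i for i, el in enumerate(SCREEN_ELEMENTS) if el in present]


def string_tags(bits):
    """Turn set bits into alphanumeric tokens that a text index can store."""
    return [f"bs{b:03d}" for b in bits]


class TagIndex:
    """Minimal inverted index from string tag to document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, atoms):
        for tag in string_tags(screen_bits(atoms)):
            self.postings[tag].add(doc_id)

    def lookup(self, tags):
        """Return ids of documents that carry every requested tag."""
        sets = [self.postings.get(t, set()) for t in tags]
        return set.intersection(*sets) if sets else set()


index = TagIndex()
index.add("doc-ethanol", ["C", "C", "O"])
index.add("doc-methylamine", ["C", "N"])

# A query structure containing carbon and oxygen becomes plain text tokens.
query_tags = string_tags(screen_bits(["C", "O"]))
hits = index.lookup(query_tags)
```

Because the tags are ordinary strings, they could in principle be stored in the same text index as the document's words. The connection data (bonds) is not used in this sketch; it would be kept alongside the document for a later structure comparison.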

    SYSTEMS AND METHODS FOR SEARCHING AND INDEXING DOCUMENTS COMPRISING CHEMICAL INFORMATION

    Publication No.: US20180253426A1

    Publication date: 2018-09-06

    Application No.: US15474865

    Filing date: 2017-03-30

    IPC classification: G06F17/30

    Abstract: Described herein are systems and methods that efficiently search for documents related to chemical structures of interest to a user. In certain embodiments, the text data and the chemical structure data provided in a user query are searched simultaneously with a text-based search method to efficiently produce search results. Subsequent structure-based searching on the results of the text-based search produces precise results for the particular user query. This approach speeds up the structure-based search by reducing the amount of data it must examine. Also described herein are systems and methods for indexing document data to facilitate this efficient searching.
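The two-stage search described in this abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented method: the document layout, the `el_` tag format, and the functions `element_tags`, `doc_tokens`, `is_substructure`, and `search` are hypothetical. The structure-based stage is a naive backtracking subgraph match that ignores bond orders, used here as a stand-in for whatever structure comparison a real system would apply.

```python
def element_tags(atoms):
    """Assumed tag format: one text token per element present."""
    return {f"el_{a.lower()}" for a in atoms}


def doc_tokens(doc):
    """Words and structure tags share one token set, so one text search covers both."""
    return set(doc["text"].lower().split()) | element_tags(doc["atoms"])


def is_substructure(q_atoms, q_bonds, t_atoms, t_bonds):
    """Naive backtracking test: can the query graph be embedded in the target?"""
    q_adj = {frozenset(b) for b in q_bonds}
    t_adj = {frozenset(b) for b in t_bonds}

    def extend(mapping):
        k = len(mapping)  # index of the next query atom to place
        if k == len(q_atoms):
            return True
        for cand in range(len(t_atoms)):
            if cand in mapping or t_atoms[cand] != q_atoms[k]:
                continue
            # Every query bond to an already placed atom must exist in the target.
            bonds_ok = all(
                frozenset((mapping[j], cand)) in t_adj
                for j in range(k)
                if frozenset((j, k)) in q_adj
            )
            if bonds_ok and extend(mapping + [cand]):
                return True
        return False

    return extend([])


def search(docs, text, q_atoms, q_bonds):
    """Stage 1: cheap token filter. Stage 2: structure check on survivors only."""
    wanted = set(text.lower().split()) | element_tags(q_atoms)
    candidates = [d for d in docs if wanted <= doc_tokens(d)]
    matches = [
        d["id"] for d in candidates
        if is_substructure(q_atoms, q_bonds, d["atoms"], d["bonds"])
    ]
    return [d["id"] for d in candidates], matches


docs = [
    {"id": "d1", "text": "Synthesis of ethanol", "atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]},
    {"id": "d2", "text": "Synthesis of dimethyl ether", "atoms": ["C", "O", "C"], "bonds": [(0, 1), (1, 2)]},
    {"id": "d3", "text": "Storage of methylamine", "atoms": ["C", "N"], "bonds": [(0, 1)]},
]

# Query: the word "synthesis" plus a C-C-O fragment.
candidates, matches = search(docs, "synthesis", ["C", "C", "O"], [(0, 1), (1, 2)])
```

In this toy data the token filter keeps `d1` and `d2`, since both mention "synthesis" and contain carbon and oxygen, and the structure check then rejects `d2` because its atoms are connected as C-O-C. The more expensive comparison runs on two documents instead of three, which is the kind of reduction the abstract attributes to the approach.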