CROSS-CONTEXT NATURAL LANGUAGE MODEL GENERATION

    Publication No.: US20210295822A1

    Publication date: 2021-09-23

    Application No.: US17210311

    Application date: 2021-03-23

    Applicant: Sorcero, Inc.

    Abstract: Provided is a method including obtaining a corpus and an associated set of domain indicators. The method includes learning a set of vectors in an embedding space based on n-grams of the corpus. The method includes updating ontology graphs comprising a set of vertices and edges associating the set of vertices with each other. The method also includes determining a vector cluster using hierarchical clustering based on distances of the set of vectors with respect to each other in the embedding space and determining a hierarchy of the ontology graphs based on a set of domain indicators of a respective set of vertices corresponding to vectors of the vector cluster. The method also includes updating an index based on the ontology graphs.
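The clustering step in this abstract can be illustrated with a minimal sketch. It assumes single-linkage agglomerative clustering with a distance threshold, which is only one possible form of hierarchical clustering; the abstract does not say which variant is used. The embeddings, domain labels, function names, and threshold are all hypothetical toy values, not taken from the application.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings (n-gram -> vector) and domain indicators.
EMBEDDINGS = {
    "heart attack": (0.0, 0.0),
    "myocardial infarction": (0.1, 0.0),
    "stock option": (5.0, 5.0),
    "equity grant": (5.0, 5.2),
}
DOMAINS = {
    "heart attack": "general",
    "myocardial infarction": "cardiology",
    "stock option": "finance",
    "equity grant": "finance",
}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage_clusters(vectors, threshold):
    """Merge the two closest clusters until the closest pair exceeds threshold."""
    clusters = [[name] for name in vectors]

    def cluster_distance(c1, c2):
        # Single linkage: distance between the nearest members of two clusters.
        return min(euclidean(vectors[a], vectors[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        if cluster_distance(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


def cluster_domains(cluster, domain_of):
    """Collect the distinct domain indicators of the n-grams in one cluster."""
    return sorted({domain_of[name] for name in cluster})


CLUSTERS = single_linkage_clusters(EMBEDDINGS, threshold=1.0)
```

Here `cluster_domains` only gathers the domain indicators found in a cluster. How those indicators determine a hierarchy among ontology graphs is left to the application itself.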

    ONTOLOGY-AUGMENTED INTERFACE

    Publication No.: US20210294828A1

    Publication date: 2021-09-23

    Application No.: US17210379

    Application date: 2021-03-23

    Applicant: Sorcero, Inc.

    Abstract: Provided is a process including obtaining a set of natural-language text documents that discuss a topic, the set of documents containing different states of knowledge about the topic at different times. The process includes selecting an ontology from among a plurality of ontologies that correspond to different domains of knowledge, the selection being based on the ontology corresponding to a domain of knowledge including the topic. The process includes identifying concepts discussed in the documents using the ontology and detecting changes in at least some of the concepts over time based on differences between discussion of the concepts in documents authored at different times. The process includes updating natural language instructions on the topic based on the detected changes in the concepts and storing the updated natural language instructions in memory.
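The change-detection step described in this abstract might be sketched as follows. This is a rough illustration that assumes an ontology is a plain term-to-concept dictionary and that "change" means a concept appearing or disappearing across a cutoff date. The ontology, documents, and function names are hypothetical toy values, and the application may detect changes quite differently.

```python
from datetime import date

# Hypothetical toy ontology: surface term -> concept.
ONTOLOGY = {
    "aspirin": "antiplatelet therapy",
    "clopidogrel": "antiplatelet therapy",
    "stent": "coronary stenting",
}

# Hypothetical toy documents: (authoring date, text).
DOCUMENTS = [
    (date(2019, 1, 1), "Aspirin is given after the procedure."),
    (date(2021, 1, 1), "A stent is placed and clopidogrel is given."),
]


def concepts_in(text, ontology):
    """Return the concepts whose terms occur in the text (naive substring match)."""
    lowered = text.lower()
    return {concept for term, concept in ontology.items() if term in lowered}


def concept_changes(documents, ontology, cutoff):
    """Compare concepts discussed before the cutoff with those discussed after it."""
    older, newer = set(), set()
    for authored, text in documents:
        target = older if authored < cutoff else newer
        target.update(concepts_in(text, ontology))
    return {"added": sorted(newer - older), "dropped": sorted(older - newer)}


CHANGES = concept_changes(DOCUMENTS, ONTOLOGY, cutoff=date(2020, 1, 1))
```

With the toy data, `CHANGES` reports "coronary stenting" as added and nothing as dropped. A downstream step could use such a report to decide which instructions need updating.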

    ONTOLOGY INTEGRATION FOR DOCUMENT SUMMARIZATION

    Publication No.: US20210294829A1

    Publication date: 2021-09-23

    Application No.: US17210318

    Application date: 2021-03-23

    Applicant: Sorcero, Inc.

    Abstract: Provided is a method including obtaining parameters and a document, determining a domain based on the parameters, where the domain maps to a first ontology, and where ontologies map n-grams onto a set of concepts. The method includes scoring a first set of n-grams of the document using a scoring model based on relations between members of the first set of n-grams, selecting sections of the text based on n-gram scores provided by the scoring model, and determining an initial n-gram set, where each respective n-gram of the initial n-gram set maps to a respective concept of the set of concepts, and where each respective n-gram is identified by an ontology other than the first ontology. The method includes determining related n-grams mapped to the set of concepts associated with the domain and generating a text summary for the document based on the sections and the related n-grams.
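The scoring and section-selection steps in this abstract can be sketched roughly as below. The sketch assumes unigrams (n-grams with n = 1), sentences as sections, and a co-occurrence count as the "relation" between n-grams. These are illustrative assumptions, and the stopword list, sample sentences, and function names are hypothetical. The ontology lookup and summary generation steps are not covered.

```python
import re
from collections import defaultdict

# Hypothetical minimal stopword list for the toy example.
STOPWORDS = {"the", "a", "of", "is", "and", "to", "in", "was", "at", "for"}

SENTENCES = [
    "Ontology graphs link concepts.",
    "Ontology graphs guide summary generation for documents.",
    "Lunch was served at noon.",
]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def cooccurrence_scores(sentences):
    """Score each word by how many distinct words it shares a sentence with."""
    neighbours = defaultdict(set)
    for sentence in sentences:
        words = set(tokenize(sentence))
        for word in words:
            neighbours[word] |= words - {word}
    return {word: len(others) for word, others in neighbours.items()}


def top_sections(sentences, k=1):
    """Return the k sentences with the highest summed word scores."""
    scores = cooccurrence_scores(sentences)
    ranked = sorted(
        sentences,
        key=lambda s: sum(scores[w] for w in set(tokenize(s))),
        reverse=True,
    )
    return ranked[:k]


BEST = top_sections(SENTENCES, k=1)
```

Words that co-occur with many others, such as "ontology" and "graphs" in the toy data, pull their sentences to the top of the ranking.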

    FEATURE ENGINEERING WITH QUESTION GENERATION

    Publication No.: US20210294781A1

    Publication date: 2021-09-23

    Application No.: US17210320

    Application date: 2021-03-23

    Applicant: Sorcero, Inc.

    Abstract: Provided is a computer-implemented process including obtaining a corpus of natural-language text documents, automatically generating questions about information in corresponding portions of the documents, and associating the questions with the corresponding portions of the documents. The process further includes storing the questions and the associations with the corresponding portions of the documents in memory to form an index of automatically-generated questions to corresponding portions of documents that answer the questions.
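The question index described in this abstract could look something like the sketch below. The question generator is a deliberately naive template rule ("X is Y." becomes "What is X?") standing in for whatever model the application actually uses. The portion identifiers, sample passages, and function names are hypothetical.

```python
import re

# Hypothetical toy corpus: portion identifier -> passage text.
PORTIONS = {
    "doc1#p1": "An ontology is a structured vocabulary of concepts.",
    "doc2#p3": "Hierarchical clustering is a method that merges nearby groups.",
}


def generate_questions(passage):
    """Naive stand-in for a question generator: 'X is Y.' -> 'What is x?'."""
    questions = []
    for sentence in re.split(r"(?<=[.!?])\s+", passage.strip()):
        match = re.match(r"(?P<subject>.+?) is (?P<rest>.+?)\.?$", sentence)
        if match:
            questions.append(f"What is {match.group('subject').lower()}?")
    return questions


def build_question_index(portions):
    """Map each generated question to the portions that answer it."""
    index = {}
    for portion_id, passage in portions.items():
        for question in generate_questions(passage):
            index.setdefault(question, []).append(portion_id)
    return index


QUESTION_INDEX = build_question_index(PORTIONS)
```

Looking up `QUESTION_INDEX["What is an ontology?"]` then returns the identifiers of the portions from which that question was generated.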