Continual neural network learning via explicit structure learning

    Publication Number: US11645509B2

    Publication Date: 2023-05-09

    Application Number: US16176419

    Application Date: 2018-10-31

    CPC classification number: G06N3/08 G06N3/04

    Abstract: Embodiments for training a neural network using sequential tasks are provided. A plurality of sequential tasks are received. For each task in the plurality of tasks, a copy of the neural network that includes a plurality of layers is generated. From the copy of the neural network, a task-specific neural network is generated by performing an architectural search on the plurality of layers in the copy of the neural network. The architectural search identifies a plurality of candidate choices in the layers of the task-specific neural network. Parameters in the task-specific neural network that correspond to the plurality of candidate choices and that maximize the architectural weights at each layer are identified. The parameters are retrained and merged with the neural network. The neural network trained on the plurality of sequential tasks is the trained neural network.
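
    The copy-search-retrain-merge loop described in this abstract can be illustrated with a small sketch, assuming PyTorch, a toy set of per-layer candidate choices ("reuse", "adapt", "new") scored by softmax architectural weights, and random data; these choices, the hyper-parameters, and the decision to fold the retraining step into the search pass are illustrative assumptions rather than details from the patent.

```python
import copy

import torch
import torch.nn as nn

# Assumed candidate operations per layer; the patent does not fix this set.
CHOICES = ["reuse", "adapt", "new"]


class SearchableLayer(nn.Module):
    """A layer holding one weight set per candidate choice plus
    architectural weights (alpha) that score the choices."""

    def __init__(self, base: nn.Linear):
        super().__init__()
        # All candidates start from the base layer here; a fuller version
        # would, e.g., freeze "reuse" and re-initialize "new".
        self.candidates = nn.ModuleList([copy.deepcopy(base) for _ in CHOICES])
        self.alpha = nn.Parameter(torch.zeros(len(CHOICES)))

    def forward(self, x):
        # Soft mixture of candidates during the architectural search.
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * cand(x) for w, cand in zip(weights, self.candidates))

    def best_choice(self) -> int:
        # Candidate whose architectural weight is maximal at this layer.
        return int(self.alpha.argmax())


def train_task(base_net: nn.Sequential, loader, epochs: int = 1) -> nn.Sequential:
    """Copy the network, search over per-layer candidate choices, keep the
    parameters of the argmax choices, and merge them back (retraining of the
    selected parameters is folded into the search pass for brevity)."""
    searchable = nn.Sequential(
        *[SearchableLayer(copy.deepcopy(layer)) for layer in base_net]
    )
    opt = torch.optim.Adam(searchable.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(searchable(x), y).backward()
            opt.step()

    # Merge: keep, per layer, the candidate that maximizes the architectural
    # weights, yielding the network used for the next sequential task.
    return nn.Sequential(*[l.candidates[l.best_choice()] for l in searchable])


if __name__ == "__main__":
    net = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))
    for _task in range(3):  # a plurality of sequential (toy) tasks
        data = [(torch.randn(16, 4), torch.randn(16, 1)) for _ in range(10)]
        net = train_task(net, data)
    print(net)
```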

    SYSTEMS AND METHODS FOR OPEN DOMAIN MULTI-HOP QUESTION ANSWERING

    Publication Number: US20220383159A1

    Publication Date: 2022-12-01

    Application Number: US17534085

    Application Date: 2021-11-23

    Abstract: Embodiments described herein provide a fusion-in-decoder (FID) based model (referred to as “PATHID”) for open-domain multi-hop question answering. Specifically, PATHID addresses the gap in the FID model's behavior between single-hop and multi-hop question answering and provides more transparency into the reasoning path. In addition to answer generation, PATHID explicitly models the full reasoning path to resolve the answer with a generative sequence-to-sequence model.
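
    A minimal sketch of the formats this abstract implies, assuming a fusion-in-decoder setup in which every (question, passage) pair is encoded separately and the decoder emits the full reasoning path followed by the answer. The Passage dataclass, the "path: ... answer: ..." linearization, and the toy example are illustrative assumptions, not the patent's exact encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Passage:
    title: str
    text: str


def encode_inputs(question: str, passages: List[Passage]) -> List[str]:
    """Fusion-in-decoder style: each passage is paired with the question and
    encoded independently; the decoder later attends over all encodings."""
    return [
        f"question: {question} title: {p.title} context: {p.text}"
        for p in passages
    ]


def encode_target(path_titles: List[str], answer: str) -> str:
    """Unlike a vanilla FID target (answer only), the target sequence also
    spells out the reasoning path, i.e. the passages visited in order."""
    return "path: " + " -> ".join(path_titles) + f" answer: {answer}"


def decode_prediction(generated: str) -> Tuple[List[str], str]:
    """Split a generated sequence back into its reasoning path and answer."""
    path_part, answer = generated.split(" answer: ", 1)
    titles = [t.strip() for t in path_part.removeprefix("path: ").split("->")]
    return titles, answer.strip()


if __name__ == "__main__":
    question = "Which city is the author of 'Example Book' from?"
    passages = [
        Passage("Example Book", "Example Book was written by Jane Roe."),
        Passage("Jane Roe", "Jane Roe was born in Springfield."),
    ]
    print(encode_inputs(question, passages))
    target = encode_target(["Example Book", "Jane Roe"], "Springfield")
    print(target)               # the sequence a seq2seq model is trained to emit
    print(decode_prediction(target))
```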

    SYSTEMS AND METHODS FOR MUTUAL INFORMATION BASED SELF-SUPERVISED LEARNING

    Publication Number: US20220067534A1

    Publication Date: 2022-03-03

    Application Number: US17006570

    Application Date: 2020-08-28

    Abstract: Embodiments described herein combine both masked reconstruction and predictive coding. Specifically, unlike contrastive learning, the mutual information between past states and future states is directly estimated. The context information can also be directly captured via shifted masked reconstruction: unlike standard masked reconstruction, the target reconstructed observations are shifted slightly towards the future to incorporate more predictability. The estimated mutual information and the shifted masked reconstruction loss can then be combined as the loss function to update the neural model.
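
    A minimal sketch of the combined objective, assuming PyTorch, a MINE-style (Donsker-Varadhan) critic as a stand-in for the direct mutual-information estimate, and a small encoder/decoder pair. The shift size, the mask rate, and the particular time steps taken as past and future states are illustrative assumptions, not the patent's choices.

```python
import torch
import torch.nn as nn


class Critic(nn.Module):
    """Scores (past, future) state pairs for a Donsker-Varadhan MI estimate."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, past, future):
        return self.net(torch.cat([past, future], dim=-1))


def mi_lower_bound(critic, past, future):
    """Direct MI estimate: joint pairs versus shuffled (marginal) pairs."""
    joint = critic(past, future).mean()
    marginal = critic(past, future[torch.randperm(future.size(0))])
    log_mean_exp = torch.logsumexp(marginal, dim=0).squeeze() - torch.log(
        torch.tensor(float(future.size(0)))
    )
    return joint - log_mean_exp


def shifted_masked_reconstruction(encoder, decoder, x, shift=2, mask_rate=0.3):
    """Mask random time steps, then reconstruct observations `shift` steps
    towards the future instead of the masked steps themselves."""
    b, t, _ = x.shape
    mask = (torch.rand(b, t, 1) < mask_rate).float()
    states = encoder(x * (1 - mask))           # (batch, time, hidden)
    target = x[:, shift:, :]                   # future-shifted targets
    recon = decoder(states[:, :-shift, :])
    loss = ((recon - target) ** 2 * mask[:, :-shift, :]).mean()
    return loss, states


if __name__ == "__main__":
    dim, hidden = 8, 16
    encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
    decoder = nn.Linear(hidden, dim)
    critic = Critic(hidden)

    x = torch.randn(4, 20, dim)                # (batch, time, feature)
    recon_loss, states = shifted_masked_reconstruction(encoder, decoder, x)
    past, future = states[:, 5, :], states[:, 15, :]   # assumed past/future states
    # Combined loss: minimize reconstruction error, maximize estimated MI.
    loss = recon_loss - mi_lower_bound(critic, past, future)
    loss.backward()
    print(float(loss))
```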

    Systems and methods for mutual information based self-supervised learning

    Publication Number: US12198060B2

    Publication Date: 2025-01-14

    Application Number: US17006570

    Application Date: 2020-08-28

    Abstract: Embodiments described herein combine both masked reconstruction and predictive coding. Specifically, unlike contrastive learning, the mutual information between past states and future states is directly estimated. The context information can also be directly captured via shifted masked reconstruction: unlike standard masked reconstruction, the target reconstructed observations are shifted slightly towards the future to incorporate more predictability. The estimated mutual information and the shifted masked reconstruction loss can then be combined as the loss function to update the neural model.

    SYSTEMS AND METHODS FOR KNOWLEDGE BASE QUESTION ANSWERING USING GENERATION AUGMENTED RANKING

    Publication Number: US20230055188A1

    Publication Date: 2023-02-23

    Application Number: US17565215

    Application Date: 2021-12-29

    Abstract: Embodiments described herein provide a question answering approach that answers a question by generating an executable logical form. First, a ranking model is used to select a set of good logical forms from a pool of logical forms obtained by searching over a knowledge graph. The selected logical forms are good in the sense that they are close to (or, in some cases, exactly match) the intents in the question and the final desired logical form. Next, a generation model conditioned on the question as well as the selected logical forms is adopted to generate the target logical form, which is executed to obtain the final answer. For example, at the inference stage, when a question is received, a matching logical form is identified from the question, and the final answer is then generated based on the node associated with the matching logical form in the knowledge base.
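
    A minimal sketch of the two-stage rank-then-generate pipeline, assuming Python. The token-overlap ranker, the prompt format, and the generate/execute stubs are hypothetical stand-ins for the trained ranking model, the generation model, and the knowledge-base executor.

```python
from typing import Callable, List


def rank_logical_forms(question: str, candidates: List[str], top_k: int = 3) -> List[str]:
    """Stand-in ranking model: score each candidate logical form (obtained by
    searching the knowledge graph) by token overlap with the question."""
    q_tokens = set(question.lower().replace("?", " ").split())

    def score(lf: str) -> float:
        lf_tokens = set(
            lf.lower().replace("(", " ").replace(")", " ").replace("_", " ").split()
        )
        return len(q_tokens & lf_tokens) / max(len(lf_tokens), 1)

    return sorted(candidates, key=score, reverse=True)[:top_k]


def build_generator_input(question: str, ranked: List[str]) -> str:
    """The generation model is conditioned on the question plus the
    top-ranked logical forms."""
    return question + " ; candidates: " + " | ".join(ranked)


def answer(question: str,
           candidate_pool: List[str],
           generate: Callable[[str], str],
           execute: Callable[[str], str]) -> str:
    ranked = rank_logical_forms(question, candidate_pool)
    target_lf = generate(build_generator_input(question, ranked))
    return execute(target_lf)   # run the generated logical form against the KB


if __name__ == "__main__":
    pool = [
        "(capital_of France)",
        "(population_of France)",
        "(capital_of Germany)",
    ]
    # Toy stand-ins for the generation model and the knowledge-base executor.
    gen = lambda prompt: prompt.split("candidates: ")[1].split(" | ")[0]
    exe = lambda lf: {"(capital_of France)": "Paris"}.get(lf, "unknown")
    print(answer("What is the capital of France?", pool, gen, exe))
```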

    SYSTEMS AND METHODS FOR HIERARCHICAL RETRIEVAL OF SEMANTIC-BASED PASSAGES IN DEEP LEARNING

    Publication Number: US20220374459A1

    Publication Date: 2022-11-24

    Application Number: US17533613

    Application Date: 2021-11-23

    Abstract: Embodiments described herein provide dense hierarchical retrieval for open-domain question answering over a corpus of documents using document-level and passage-level dense retrieval models. Specifically, each document is viewed as a structured collection of sections, subsections, and paragraphs. Each document may be split into short passages, and a document-level retrieval model and a passage-level retrieval model may be applied to return a smaller set of filtered texts. Top documents may be identified after encoding the question and the documents and determining document relevance scores with respect to the encoded question. Thereafter, a set of top passages is further identified based on encodings of the passages and their passage relevance scores with respect to the encoded question. The document and passage relevance scores may be used in combination to determine a final retrieval ranking for the documents containing the set of top passages.
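
    A minimal sketch of the document-then-passage retrieval flow, assuming NumPy and a fixed random projection as a stand-in for the trained dense encoders. The score combination (a simple sum) and the top-k sizes are illustrative assumptions.

```python
import zlib

import numpy as np

DIM = 32


def encode(text: str) -> np.ndarray:
    """Stand-in for a trained dense encoder: a fixed random unit vector per string."""
    vec = np.random.default_rng(zlib.crc32(text.encode())).normal(size=DIM)
    return vec / np.linalg.norm(vec)


def hierarchical_retrieve(question, documents, top_docs=2, top_passages=3):
    """documents: dict mapping a document title to its list of passage strings."""
    q = encode(question)

    # Document-level retrieval: relevance of each document to the encoded question.
    doc_scores = {
        title: float(q @ encode(title + " " + " ".join(passages)))
        for title, passages in documents.items()
    }
    kept = sorted(doc_scores, key=doc_scores.get, reverse=True)[:top_docs]

    # Passage-level retrieval within the kept documents; the document and
    # passage relevance scores are combined for the final ranking.
    ranked = []
    for title in kept:
        for passage in documents[title]:
            combined = doc_scores[title] + float(q @ encode(passage))
            ranked.append((combined, title, passage))
    ranked.sort(reverse=True)
    return ranked[:top_passages]


if __name__ == "__main__":
    docs = {
        "Doc A": ["Passage about topic one.", "Passage about topic two."],
        "Doc B": ["An unrelated passage.", "Another unrelated passage."],
        "Doc C": ["A passage mentioning the question topic."],
    }
    for score, title, passage in hierarchical_retrieve("question topic", docs):
        print(f"{score:+.3f}  {title}: {passage}")
```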

    Phone-based sub-word units for end-to-end speech recognition

    Publication Number: US11328731B2

    Publication Date: 2022-05-10

    Application Number: US16903964

    Application Date: 2020-06-17

    Abstract: Systems and methods for identifying a text word from a spoken utterance are provided. An ensemble BPE system that includes a phone BPE system and a character BPE system receives a spoken utterance. Both BPE systems include a multi-level language model (LM) and an acoustic model. The phone BPE system identifies first words from the spoken utterance and determines a first score for each first word. The first words are converted into character sequences. The character BPE system converts the character sequences into second words and determines a second score for each second word. For each word from the first words that matches a word in the second words, the first and second scores are combined. The text word is the word with the highest score.
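
    A minimal sketch of the ensemble scoring step this abstract describes, assuming Python. The hypothesis lists, their scores, and the equal-weight score combination are illustrative stand-ins for the phone BPE and character BPE systems and their multi-level LMs and acoustic models.

```python
from typing import Dict, List, Tuple


def phone_bpe_decode(utterance) -> Dict[str, float]:
    """Stand-in for the phone BPE system: first words and their first scores
    (log-probability-like, higher is better)."""
    return {"speech": -1.2, "speeches": -2.8, "peach": -3.5}


def char_bpe_rescore(char_sequences: List[List[str]]) -> Dict[str, float]:
    """Stand-in for the character BPE system: second words and second scores
    obtained from the character sequences of the phone hypotheses."""
    return {"speech": -1.0, "peach": -2.9, "preach": -3.1}


def ensemble_decode(utterance) -> Tuple[str, float]:
    first = phone_bpe_decode(utterance)

    # Convert the first words into character sequences for the character system.
    char_sequences = [list(word) for word in first]
    second = char_bpe_rescore(char_sequences)

    # Combine scores only for words proposed by both systems, then keep the
    # word with the highest combined score.
    combined = {w: first[w] + second[w] for w in first if w in second}
    best = max(combined, key=combined.get)
    return best, combined[best]


if __name__ == "__main__":
    print(ensemble_decode("raw audio features would go here"))
```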

    Phone-Based Sub-Word Units for End-to-End Speech Recognition

    Publication Number: US20210319796A1

    Publication Date: 2021-10-14

    Application Number: US16903964

    Application Date: 2020-06-17

    Abstract: Systems and methods for identifying a text word from a spoken utterance are provided. An ensemble BPE system that includes a phone BPE system and a character BPE system receives a spoken utterance. Both BPE systems include a multi-level language model (LM) and an acoustic model. The phone BPE system identifies first words from the spoken utterance and determines a first score for each first word. The first words are converted into character sequences. The character BPE system converts the character sequences into second words and determines a second score for each second word. For each word from the first words that matches a word in the second words, the first and second scores are combined. The text word is the word with the highest score.
