System and method for automatic task-oriented dialog system

    Publication Number: US11568000B2

    Publication Date: 2023-01-31

    Application Number: US16736268

    Filing Date: 2020-01-07

    Abstract: A method for dialog state tracking includes decoding, by a fertility decoder, encoded dialog information associated with a dialog to generate fertilities for generating dialog states of the dialog. Each dialog state includes one or more domains. Each domain includes one or more slots. Each slot includes one or more slot tokens. The method further includes generating an input sequence to a state decoder based on the fertilities. A total number of each slot token in the input sequence is based on a corresponding fertility. The method further includes encoding, by a state encoder, the input sequence to the state decoder, and decoding, by the state decoder, the encoded input sequence to generate a complete sequence of the dialog states.
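
    The key non-autoregressive step is expanding predicted fertilities into the state decoder's input sequence. Below is a minimal, illustrative Python sketch of that expansion; the function name, the (domain, slot) keying, and the example values are assumptions for illustration, not the patent's exact interface.

```python
# A minimal sketch of the fertility-expansion step; names and values
# are illustrative assumptions, not the patent's exact interface.

def build_state_decoder_input(fertilities):
    """Repeat each (domain, slot) token by its predicted fertility, so the
    state decoder can emit exactly one value token per repeated position
    in a single non-autoregressive pass."""
    input_seq = []
    for (domain, slot), fertility in fertilities.items():
        input_seq.extend([(domain, slot)] * fertility)
    return input_seq

# Example: fertility 2 for (hotel, area) reserves two positions for that
# slot's value tokens in the decoded dialog state.
fertilities = {("restaurant", "pricerange"): 1, ("hotel", "area"): 2}
print(build_state_decoder_input(fertilities))
# [('restaurant', 'pricerange'), ('hotel', 'area'), ('hotel', 'area')]
```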

    SYSTEMS AND METHODS FOR SEMI-SUPERVISED LEARNING WITH CONTRASTIVE GRAPH REGULARIZATION

    Publication Number: US20220156591A1

    Publication Date: 2022-05-19

    Application Number: US17160896

    Filing Date: 2021-01-28

    Abstract: Embodiments described herein provide an approach (referred to as the “Co-training” mechanism throughout this disclosure) that jointly learns two representations of the training data: their class probabilities and low-dimensional embeddings. Specifically, two representations of each image sample are generated: a class probability produced by the classification head and a low-dimensional embedding produced by the projection head. The classification head is trained using memory-smoothed pseudo-labels, where pseudo-labels are smoothed by aggregating information from nearby samples in the embedding space. The projection head is trained using contrastive learning on a pseudo-label graph, where samples with similar pseudo-labels are encouraged to have similar embeddings.
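
    A minimal PyTorch sketch of the two training signals described above: memory-smoothed pseudo-labels for the classification head and a contrastive loss over a pseudo-label graph for the projection head. All names, the smoothing coefficient, the temperatures, and the edge threshold are illustrative assumptions.

```python
# Assumes classifier probabilities `probs` (B x C), normalized projection
# embeddings `z` (B x D), and a memory bank of past probabilities and
# embeddings (`mem_probs`, `mem_z`); not the patent's exact formulation.
import torch
import torch.nn.functional as F

def memory_smoothed_pseudo_labels(probs, z, mem_probs, mem_z, alpha=0.9, t=0.1):
    # Smooth each sample's class probabilities by aggregating information
    # from nearby samples in the embedding space (soft nearest neighbors).
    affinity = F.softmax(z @ mem_z.T / t, dim=1)        # B x M
    return alpha * probs + (1 - alpha) * (affinity @ mem_probs)

def pseudo_label_graph_loss(z, pseudo, threshold=0.8, t=0.1):
    # Connect samples whose pseudo-labels are similar, then encourage
    # connected samples to have similar embeddings.
    w = pseudo @ pseudo.T                               # B x B label similarity
    graph = (w >= threshold).float()
    graph.fill_diagonal_(1.0)
    graph = graph / graph.sum(dim=1, keepdim=True)      # row-normalized targets
    log_p = F.log_softmax(z @ z.T / t, dim=1)           # embedding similarities
    return -(graph * log_p).sum(dim=1).mean()
```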

    SYSTEMS AND METHODS FOR INTERPOLATIVE CENTROID CONTRASTIVE LEARNING

    Publication Number: US20220156530A1

    Publication Date: 2022-05-19

    Application Number: US17188232

    Filing Date: 2021-03-01

    Abstract: An interpolative centroid contrastive learning (ICCL) framework is disclosed for learning a more discriminative representation for tail classes. Specifically, data samples, such as natural images, are projected into a low-dimensional embedding space, and class centroids for respective classes are created as average embeddings of the samples that belong to each class. Virtual training samples are then created by interpolating two images from two samplers: a class-agnostic sampler, which returns all images from both the head class and the tail class with equal probability, and a class-aware sampler, which focuses more on tail-class images by sampling them with a higher probability than head-class images. The sampled images, e.g., one image from the class-agnostic sampler and one from the class-aware sampler, may be interpolated to generate the virtual training samples.
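
    The sketch below illustrates the two samplers and a mixup-style blend as one plausible reading of "interpolating two images"; the class-probability formulas and function names are assumptions, not the patent's exact scheme.

```python
# Assumes per-class image counts `counts` and image tensors `img_a`,
# `img_b`; the Beta-distributed mixing coefficient is an illustrative choice.
import torch

def class_agnostic_probs(counts):
    # Every image is equally likely, so each class's probability is
    # proportional to its size (head classes dominate).
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def class_aware_probs(counts):
    # Inverse-frequency sampling: tail classes are drawn with higher
    # probability than head classes.
    inv = {c: 1.0 / n for c, n in counts.items()}
    norm = sum(inv.values())
    return {c: v / norm for c, v in inv.items()}

def interpolate(img_a, img_b, lam=None):
    # Blend one class-agnostic sample with one class-aware sample to
    # create a virtual training image.
    if lam is None:
        lam = torch.distributions.Beta(1.0, 1.0).sample().item()
    return lam * img_a + (1 - lam) * img_b

# Example: with counts {"head": 900, "tail": 100}, the class-aware sampler
# picks "tail" with probability 0.9 instead of 0.1.
```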

    UNSUPERVISED REPRESENTATION LEARNING WITH CONTRASTIVE PROTOTYPES

    Publication Number: US20220156507A1

    Publication Date: 2022-05-19

    Application Number: US17591121

    Filing Date: 2022-02-02

    Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework, iteratively performing an E-step that finds the prototypes via clustering and an M-step that optimizes the network on a contrastive loss.
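
    A minimal sketch of the M-step objective implied by the abstract: an instance-wise contrastive term plus a prototype term scaled by per-prototype concentrations. The ProtoNCE-style form, tensor names, and temperature are illustrative assumptions.

```python
# Assumes embeddings `v` (B x D), embeddings of augmented views `v_pos`
# (B x D), prototypes `c` (K x D), cluster assignments `assign` (B,), and
# per-prototype concentrations `phi` (K,); all names are illustrative.
import torch
import torch.nn.functional as F

def m_step_loss(v, v_pos, c, assign, phi, t=0.07):
    v, v_pos, c = F.normalize(v), F.normalize(v_pos), F.normalize(c)
    # Instance term: each sample should match its own augmented view
    # against the other samples in the batch.
    inst_logits = v @ v_pos.T / t                  # B x B
    inst_loss = F.cross_entropy(inst_logits, torch.arange(len(v)))
    # Prototype term: each sample should match its assigned prototype,
    # scaled by a per-prototype concentration (tighter cluster -> sharper).
    proto_logits = (v @ c.T) / phi                 # B x K
    proto_loss = F.cross_entropy(proto_logits, assign)
    return inst_loss + proto_loss
```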

    SYSTEM AND METHOD FOR DIFFERENTIAL ARCHITECTURE SEARCH FOR NEURAL NETWORKS

    Publication Number: US20210383188A1

    Publication Date: 2021-12-09

    Application Number: US17072485

    Filing Date: 2020-10-16

    Abstract: A method for generating a neural network includes initializing the neural network with a plurality of cells, each cell corresponding to a graph of one or more nodes, and each node corresponding to a latent representation of a dataset. A plurality of gates is generated, where each gate independently determines whether an operation between two nodes is used. A first regularization is performed using the plurality of gates; the first regularization is either a group-structured sparsity regularization or a path-depth-wise regularization. The neural network is then optimized by adjusting its network parameters and gate parameters based on the sparsity regularization.
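
    A minimal PyTorch sketch of gated candidate operations on one edge between two nodes, with a group-sparsity penalty over each edge's gates; the sigmoid gating, module structure, and group-lasso penalty are illustrative assumptions, not the patent's exact construction.

```python
import torch
import torch.nn as nn

class GatedEdge(nn.Module):
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        # One learnable gate per candidate operation on this edge.
        self.gates = nn.Parameter(torch.zeros(len(ops)))

    def forward(self, x):
        # Each gate independently scales whether its operation is used.
        g = torch.sigmoid(self.gates)
        return sum(gi * op(x) for gi, op in zip(g, self.ops))

def group_sparsity_penalty(edges):
    # Group-structured sparsity: penalize each edge's gate vector as a
    # group (L2 norm), pushing whole groups of gates toward zero.
    return sum(edge.gates.norm(p=2) for edge in edges)

# Example usage with trivial candidate operations:
edge = GatedEdge([nn.Identity(), nn.Tanh()])
y = edge(torch.randn(2, 4))
reg = group_sparsity_penalty([edge])
```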

    UNSUPERVISED REPRESENTATION LEARNING WITH CONTRASTIVE PROTOTYPES

    Publication Number: US20210295091A1

    Publication Date: 2021-09-23

    Application Number: US16870621

    Filing Date: 2020-05-08

    Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework, iteratively performing an E-step that finds the prototypes via clustering and an M-step that optimizes the network on a contrastive loss.
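
    Complementing the M-step loss sketched for the related publication above, here is a minimal E-step: cluster the current embeddings to obtain prototypes, assignments, and a simple per-prototype concentration estimate. The use of scikit-learn k-means and the concentration proxy are assumptions for illustration, not necessarily the patent's clusterer.

```python
import numpy as np
from sklearn.cluster import KMeans

def e_step(embeddings, k):
    """Return prototypes (k x D cluster centers), per-sample assignments,
    and a per-prototype concentration estimate (the average distance of a
    cluster's members to its center, a simple proxy for phi)."""
    km = KMeans(n_clusters=k, n_init=10).fit(embeddings)
    prototypes = km.cluster_centers_
    dists = np.linalg.norm(embeddings - prototypes[km.labels_], axis=1)
    phi = np.array([dists[km.labels_ == j].mean() + 1e-8 for j in range(k)])
    return prototypes, km.labels_, phi
```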

    Parameter utilization for language pre-training

    Publication Number: US12072955B2

    Publication Date: 2024-08-27

    Application Number: US17532851

    Filing Date: 2021-11-22

    CPC classification numbers: G06F18/2148; G06F18/2163; G06F40/00

    Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines the self-attention hidden states of the held-out model and the backward pass determines the loss of the held-out model. A forward pass on the main model is performed to determine the self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine the loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.
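
    A minimal PyTorch sketch of one training step under the held-out/main split described above; the sub-model interfaces (each returning self-attention hidden states plus logits) and the head over the fused states are hypothetical assumptions.

```python
import torch
import torch.nn.functional as F

def training_step(held_out, main, batch, labels, opt_h, opt_m):
    # Forward and backward pass on the held-out model.
    h_hidden, h_logits = held_out(batch)       # hypothetical interface
    h_loss = F.cross_entropy(h_logits, labels)
    opt_h.zero_grad()
    h_loss.backward()
    opt_h.step()  # held-out parameters reflect the held-out loss

    # Forward pass on the main model; concatenate its self-attention hidden
    # states with the held-out model's (detached, so the main loss only
    # updates main-model parameters).
    m_hidden, _ = main(batch)
    fused = torch.cat([m_hidden, h_hidden.detach()], dim=-1)
    m_logits = main.head(fused)  # hypothetical head over the fused states
    m_loss = F.cross_entropy(m_logits, labels)
    opt_m.zero_grad()
    m_loss.backward()
    opt_m.step()  # main parameters reflect the main loss
    return h_loss.item(), m_loss.item()
```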
