-
Publication No.: US20210150366A1
Publication Date: 2021-05-20
Application No.: US16877333
Filing Date: 2020-05-18
Applicant: salesforce.com, inc.
Inventor: Govardana Sachithanandam Ramachandran, Ka Chun Au, Shashank Harinath, Wenhao Liu, Alexis Roos, Caiming Xiong
Abstract: An embodiment proposed herein uses sparsification techniques to train a neural network with a high feature dimension, which may yield desirable in-domain detection accuracy, while pruning away the less important dimensions in the output. Specifically, a sparsification vector is generated based on a Gaussian distribution (or another probability distribution) and is multiplied with the higher-dimension output to reduce the number of feature dimensions. The pruned output may then be used for the neural network to learn the sparsification vector. In this way, out-of-distribution detection accuracy can be improved.
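The multiplication step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the Gaussian sample becomes a pruning decision, so the thresholding rule, the per-dimension means, and both function names below are hypothetical choices made only for illustration.

```python
import random

def sample_sparsification_vector(means, std=1.0, threshold=0.0, rng=None):
    """Draw one Gaussian sample per feature dimension and binarize it.

    `means` stands in for learnable per-dimension parameters (an assumption);
    a dimension is kept (1.0) when its sample exceeds `threshold`, else pruned (0.0).
    """
    rng = rng or random.Random(0)
    return [1.0 if rng.gauss(m, std) > threshold else 0.0 for m in means]

def apply_sparsification(features, mask):
    """Element-wise product of the high-dimensional output with the mask."""
    return [f * m for f, m in zip(features, mask)]

# Usage: with std=0.0 the sample equals the mean, so the mask is deterministic.
mask = sample_sparsification_vector([2.0, -2.0, 2.0], std=0.0)
pruned = apply_sparsification([0.5, 0.7, -0.3], mask)
```

Dimensions whose mask entry is zero contribute nothing downstream, which is one simple way to read "reduce the number of feature dimensions".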
-
Publication No.: US20210150340A1
Publication Date: 2021-05-20
Application No.: US16877339
Filing Date: 2020-05-18
Applicant: salesforce.com, inc.
Inventor: Wenhao Liu, Ka Chun Au, Shashank Harinath, Bryan McCann, Govardana Sachithanandam Ramachandran, Alexis Roos, Caiming Xiong
Abstract: Embodiments described herein provide a training mechanism that transfers the knowledge from a trained BERT model into a much smaller model that approximates the behavior of BERT. Specifically, the BERT model may be treated as a teacher model, and a much smaller student model may be trained using the same inputs as the teacher model together with the teacher model's outputs. In this way, the student model can be trained in much less time than the BERT teacher model while achieving performance comparable to BERT.
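A teacher-student objective of this kind is often written as a cross-entropy between the teacher's and the student's softened output distributions. The sketch below assumes that common form; the temperature parameter and the function names are illustrative and are not taken from the abstract.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's distribution against the teacher's soft labels."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

# Usage: the loss is lowest when the student reproduces the teacher's logits.
teacher = [2.0, 0.5, -1.0]
matched = distillation_loss([2.0, 0.5, -1.0], teacher)
mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher)
```

Minimizing this loss over the same inputs the teacher saw pushes the smaller model toward the teacher's output behavior.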
-
Publication No.: US20210150282A1
Publication Date: 2021-05-20
Application No.: US16686051
Filing Date: 2019-11-15
Applicant: salesforce.com, inc.
Inventor: Ankit Chadha, Caiming Xiong, Ran Xu
Abstract: Computing systems may support image classification and image detection services, and these services may utilize object detection/image classification machine learning models. The described techniques provide for normalization of confidence scores corresponding to manipulated target images and for non-max suppression within the range of confidence scores for manipulated images. In one example, the techniques provide for generating different scales of a test image, and the system performs normalization of confidence scores corresponding to each scaled image and non-max suppression per scaled image. These techniques may be used to provide more accurate image detection (e.g., object detection and/or image classification) and may be used with models that are not trained on modified image sets. The model may be trained on a standard (e.g., non-manipulated) image set but used with manipulated target images and the described techniques to provide accurate object detection.
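The per-scale processing described here can be sketched as follows. The abstract does not specify the normalization, so min-max scaling is an assumption, as are the greedy IoU-based suppression rule, the 0.5 threshold, and all function names.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def normalize_scores(scores):
    """Min-max normalize confidence scores to [0, 1] (assumed scheme)."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

def detect_per_scale(detections_by_scale, iou_threshold=0.5):
    """Normalize scores and run NMS independently for each scaled image."""
    results = {}
    for scale, (boxes, scores) in detections_by_scale.items():
        norm = normalize_scores(scores)
        keep = non_max_suppression(boxes, norm, iou_threshold)
        results[scale] = [(boxes[i], norm[i]) for i in keep]
    return results

# Usage: two heavily overlapping boxes and one separate box at a single scale.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
out = detect_per_scale({1.0: (boxes, [0.9, 0.8, 0.3])})
```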
-
Publication No.: US10929607B2
Publication Date: 2021-02-23
Application No.: US15978445
Filing Date: 2018-05-14
Applicant: salesforce.com, inc.
Inventor: Victor Zhong, Caiming Xiong
Abstract: A method for maintaining a dialogue state associated with a dialogue between a user and a digital system includes receiving, by a dialogue state tracker associated with the digital system, a representation of a user communication, updating, by the dialogue state tracker, the dialogue state and providing a system response based on the updated dialogue state. The dialogue state is updated by evaluating, based on the representation of the user communication, a plurality of member scores corresponding to a plurality of ontology members of an ontology set, and selecting, based on the plurality of member scores, zero or more of the plurality of ontology members to add to or remove from the dialogue state. The dialogue state tracker includes a global-local encoder that includes a global branch and a local branch, the global branch having global trained parameters that are shared among the plurality of ontology members and the local branch having local trained parameters that are determined separately for each of the plurality of ontology members.
-
Publication No.: US10902289B2
Publication Date: 2021-01-26
Application No.: US16394992
Filing Date: 2019-04-25
Applicant: salesforce.com, inc.
Inventor: Mingfei Gao, Richard Socher, Caiming Xiong
Abstract: Embodiments described herein provide a two-stage online detection of action start system including a classification module and a localization module. The classification module generates a set of action scores corresponding to a first video frame from the video, based on the first video frame and video frames before the first video frame in the video. Each action score indicates a respective probability that the first video frame contains a respective action class. The localization module is coupled to the classification module for receiving the set of action scores from the classification module and generating an action-agnostic start probability that the first video frame contains an action start. A fusion component is coupled to the classification module and the localization module for generating, based on the set of action scores and the action-agnostic start probability, a set of action-specific start probabilities, each action-specific start probability corresponding to a start of an action belonging to the respective action class.
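The fusion step can be pictured with a very small sketch. The abstract does not state the fusion rule, so multiplying each class score by the action-agnostic start probability is only one plausible assumption, and the function name is hypothetical.

```python
def fuse_start_probabilities(action_scores, start_probability):
    """Combine per-class action scores with an action-agnostic start probability.

    Assumed rule: P(start of class c) = P(class c) * P(any action starts).
    """
    return {action: score * start_probability
            for action, score in action_scores.items()}

# Usage: one frame, two candidate action classes.
fused = fuse_start_probabilities({"run": 0.5, "jump": 0.25}, 0.5)
```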
-
Publication No.: US20200302236A1
Publication Date: 2020-09-24
Application No.: US16394992
Filing Date: 2019-04-25
Applicant: Salesforce.com, Inc.
Inventor: Mingfei Gao, Richard Socher, Caiming Xiong
Abstract: Embodiments described herein provide a two-stage online detection of action start system including a classification module and a localization module. The classification module generates a set of action scores corresponding to a first video frame from the video, based on the first video frame and video frames before the first video frame in the video. Each action score indicates a respective probability that the first video frame contains a respective action class. The localization module is coupled to the classification module for receiving the set of action scores from the classification module and generating an action-agnostic start probability that the first video frame contains an action start. A fusion component is coupled to the classification module and the localization module for generating, based on the set of action scores and the action-agnostic start probability, a set of action-specific start probabilities, each action-specific start probability corresponding to a start of an action belonging to the respective action class.
-
Publication No.: US10783875B2
Publication Date: 2020-09-22
Application No.: US16027111
Filing Date: 2018-07-03
Applicant: salesforce.com, inc.
Inventor: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
Abstract: A system for domain adaptation includes a domain adaptation model configured to adapt a representation of a signal in a first domain to a second domain to generate an adapted representation, and a plurality of discriminators corresponding to a plurality of bands of values of a domain variable. Each of the plurality of discriminators is configured to discriminate between the adapted representation and representations of one or more other signals in the second domain.
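One way to picture "a plurality of discriminators corresponding to a plurality of bands" is a lookup that routes a value of the domain variable to the discriminator for its band. The sketch below assumes contiguous bands defined by sorted boundaries; the boundaries, the discriminator labels, and the function names are hypothetical.

```python
import bisect

def band_index(value, boundaries):
    """Return the index of the band containing `value`.

    `boundaries` is a sorted list of cut points; n cut points define n + 1 bands.
    """
    return bisect.bisect_right(boundaries, value)

def select_discriminator(value, boundaries, discriminators):
    """Pick the discriminator assigned to the band that `value` falls into."""
    return discriminators[band_index(value, boundaries)]

# Usage: three bands (below 10, from 10 up to 20, 20 and above).
boundaries = [10, 20]
discriminators = ["low_band_d", "mid_band_d", "high_band_d"]
chosen = select_discriminator(15, boundaries, discriminators)
```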
-
Publication No.: US10747761B2
Publication Date: 2020-08-18
Application No.: US15885613
Filing Date: 2018-01-31
Applicant: salesforce.com, inc.
Inventor: Victor Zhong, Caiming Xiong, Richard Socher
IPC: G06F16/2452, G06N3/08, G06N7/00, G06N3/04, G06N3/00, G06F16/13, G06F16/2457
Abstract: A computing system uses neural networks to translate natural language queries to database queries. The computing system uses a plurality of machine learning based models, each machine learning model for generating a portion of the database query. The machine learning models use an input representation generated based on terms of the input natural language query, a set of columns of the database schema, and the vocabulary of a database query language, for example, Structured Query Language (SQL). The plurality of machine learning based models may include an aggregation classifier model for determining an aggregation operator in the database query, a result column predictor model for determining the result columns of the database query, and a condition clause predictor model for determining the condition clause of the database query. The condition clause predictor is based on reinforcement learning.
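To show how three separately predicted portions could be combined into one query, here is a minimal sketch. It covers only the assembly step, not the models themselves, and handles a single result column; the function name, the argument layout, and the use of `?` placeholders are assumptions.

```python
def assemble_query(table, aggregation, result_column, conditions):
    """Build a parameterized SQL string from three predicted portions.

    aggregation: operator name such as "COUNT", or None for no aggregation.
    conditions: list of (column, operator, value) triples joined with AND.
    Returns (sql, params) so values are passed separately from the SQL text.
    """
    select = f"{aggregation}({result_column})" if aggregation else result_column
    sql = f"SELECT {select} FROM {table}"
    params = []
    if conditions:
        clauses = []
        for column, operator, value in conditions:
            clauses.append(f"{column} {operator} ?")
            params.append(value)
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

# Usage: "How many players from the USA are older than 30?"
sql, params = assemble_query(
    "players", "COUNT", "name", [("country", "=", "USA"), ("age", ">", 30)]
)
```

Returning the values as separate parameters keeps predicted literals out of the SQL text; column and table names would still need validating against the schema.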
-
Publication No.: US10573295B2
Publication Date: 2020-02-25
Application No.: US15878113
Filing Date: 2018-01-23
Applicant: salesforce.com, inc.
Inventor: Yingbo Zhou, Caiming Xiong
Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train a deep end-to-end speech recognition model on training data comprising speech samples temporally labeled with ground truth transcriptions. The multi-objective learning criteria update model parameters of the model over one thousand to millions of backpropagation iterations by combining, at each iteration, a maximum likelihood objective function that modifies the model parameters to maximize a probability of outputting a correct transcription and a policy gradient function that modifies the model parameters to maximize a positive reward defined based on a non-differentiable performance metric which penalizes incorrect transcriptions in accordance with their conformity to corresponding ground truth transcriptions; and upon convergence after a final backpropagation iteration, persisting the modified model parameters learned by using the multi-objective learning criteria with the model to be applied to further end-to-end speech recognition.
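The combination of a maximum likelihood term and a policy gradient term can be sketched as a single scalar loss. The abstract does not name the performance metric, so the reward below (one minus normalized character edit distance) is an assumption, as are the weighting parameter and the function names; the log-probabilities would come from the model and are passed in as plain numbers here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        cur = [i]
        for j, item_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                       # deletion
                           cur[j - 1] + 1,                    # insertion
                           prev[j - 1] + (item_a != item_b))) # substitution
        prev = cur
    return prev[-1]

def reward(sampled, truth):
    """Non-negative reward from a non-differentiable metric (assumed form)."""
    if not truth:
        return 1.0 if not sampled else 0.0
    return max(0.0, 1.0 - edit_distance(sampled, truth) / len(truth))

def combined_loss(log_prob_truth, log_prob_sample, sampled, truth, weight=1.0):
    """Maximum likelihood term plus a REINFORCE-style surrogate term."""
    ml_term = -log_prob_truth
    pg_term = -reward(sampled, truth) * log_prob_sample
    return ml_term + weight * pg_term

# Usage: a sampled transcription that matches the ground truth exactly.
loss = combined_loss(-1.0, -2.0, "cat", "cat")
```

Because the reward only scales the log-probability of the sampled transcription, the metric itself never needs to be differentiated.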
-
Publication No.: US10558750B2
Publication Date: 2020-02-11
Application No.: US15817153
Filing Date: 2017-11-17
Applicant: salesforce.com, inc.
Inventor: Jiasen Lu, Caiming Xiong, Richard Socher
Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.
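The idea of deciding "how heavily to rely on the image, as opposed to the linguistic model" can be illustrated as a convex mix of two vectors. The abstract gives no formula, so the mixing rule below, the scalar gate `beta`, and the function name are assumptions made for illustration.

```python
def adaptive_context(visual_context, sentinel, beta):
    """Mix a visual context vector with a visual sentinel vector.

    beta near 1.0 relies mostly on the sentinel (linguistic memory);
    beta near 0.0 relies mostly on the attended image features.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return [beta * s + (1.0 - beta) * c
            for c, s in zip(visual_context, sentinel)]

# Usage: an even mix of the two sources at one timestep.
mixed = adaptive_context([1.0, 2.0], [3.0, 4.0], 0.5)
```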