Efficient and fine-grained video retrieval

    Publication No.: US11568247B2

    Publication date: 2023-01-31

    Application No.: US16819513

    Filing date: 2020-03-16

    Abstract: A computer-implemented method executed by at least one processor for performing mini-batching in deep learning by improving cache utilization is presented. The method includes temporally localizing a candidate clip in a video stream based on a natural language query, encoding a state, via a state processing module, into a joint visual and linguistic representation, feeding the joint visual and linguistic representation into a policy learning module, wherein the policy learning module employs a deep learning network to selectively extract features for select frames for video-text analysis and includes a fully connected linear layer and a long short-term memory (LSTM), outputting a value function from the LSTM, generating an action policy based on the encoded state, wherein the action policy is a probabilistic distribution over a plurality of possible actions given the encoded state, and rewarding policy actions that return clips matching the natural language query.
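The policy-and-reward loop in this abstract can be illustrated with a minimal sketch. It assumes the candidate clip is a `(start, end)` pair in seconds, that the policy network's output is a list of logits (a stand-in for the fully connected layer and LSTM, which are not modeled here), and that the reward compares temporal intersection-over-union (IoU) before and after an action. The action set, the 1-second step size, and the +1/-1 reward values are hypothetical and are not taken from the patent.

```python
import math
import random


def temporal_iou(pred, target):
    """Intersection-over-union of two (start, end) clips."""
    inter = max(0.0, min(pred[1], target[1]) - max(pred[0], target[0]))
    union = (pred[1] - pred[0]) + (target[1] - target[0]) - inter
    return inter / union if union > 0 else 0.0


def softmax(logits):
    """Turn raw scores into a probability distribution over actions."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical action set: (change to start, change to end), in seconds.
ACTIONS = {
    "shift_left": (-1.0, -1.0),
    "shift_right": (1.0, 1.0),
    "expand": (-1.0, 1.0),
    "shrink": (1.0, -1.0),
    "stop": (0.0, 0.0),
}


def sample_action(logits, rng):
    """Sample one action name from the softmax policy."""
    names = list(ACTIONS)
    return rng.choices(names, weights=softmax(logits), k=1)[0]


def apply_action(clip, name):
    """Move the clip boundaries; keep the old clip if the result is invalid."""
    d_start, d_end = ACTIONS[name]
    start, end = clip[0] + d_start, clip[1] + d_end
    if start < 0 or end <= start:
        return clip
    return (start, end)


def reward(prev_clip, new_clip, target):
    """Assumed reward: +1 if the action improved IoU with the target, else -1."""
    improved = temporal_iou(new_clip, target) > temporal_iou(prev_clip, target)
    return 1.0 if improved else -1.0
```

In a training loop, the logits would come from the network conditioned on the encoded state, and the sampled action's reward would drive the policy-gradient update.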

    HIERARCHICAL WORD EMBEDDING SYSTEM
    Invention application

    Publication No.: US20220327489A1

    Publication date: 2022-10-13

    Application No.: US17714434

    Filing date: 2022-04-06

    Abstract: Systems and methods for matching job descriptions with job applicants are provided. The method includes allocating each of one or more job applicants' curriculum vitae (CV) into sections; applying max pooled word embedding to each section of the job applicants' CVs; using concatenated max-pooling and average-pooling to compose the section embeddings into an applicant's CV representation; allocating each of one or more job position descriptions into specified sections; applying max pooled word embedding to each section of the job position descriptions; using concatenated max-pooling and average-pooling to compose the section embeddings into a job representation; calculating a cosine similarity between each of the job representations and each of the CV representations to perform job-to-applicant matching; and presenting an ordered list of the one or more job applicants or an ordered list of the one or more job position descriptions to a user.
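The hierarchical pooling and matching steps in this abstract can be sketched as follows. The sketch assumes each document is already split into sections (lists of words) and that a word-embedding table is available as a plain dictionary. The function names and the zero-vector fallback for sections with no known words are illustrative choices, not part of the patent.

```python
import math


def max_pool(vectors):
    """Element-wise maximum over a list of equal-length vectors."""
    return [max(col) for col in zip(*vectors)]


def avg_pool(vectors):
    """Element-wise mean over a list of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def embed_section(words, table):
    """Max-pooled word embedding of one section."""
    dim = len(next(iter(table.values())))
    vecs = [table[w] for w in words if w in table]
    # Assumption: a section with no known words maps to the zero vector.
    return max_pool(vecs) if vecs else [0.0] * dim


def embed_document(sections, table):
    """Concatenate max-pooling and average-pooling over section embeddings."""
    section_vecs = [embed_section(s, table) for s in sections]
    return max_pool(section_vecs) + avg_pool(section_vecs)


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_applicants(job_sections, cvs, table):
    """Return (applicant, score) pairs ordered from best to worst match."""
    job_vec = embed_document(job_sections, table)
    scored = [(name, cosine(job_vec, embed_document(cv, table)))
              for name, cv in cvs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_applicants` with one job description and a dictionary of sectioned CVs yields the ordered applicant list. Swapping the roles (one CV against many job descriptions) gives the ordered list of positions.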

    LEARNING ORTHOGONAL FACTORIZATION IN GAN LATENT SPACE

    Publication No.: US20220254152A1

    Publication date: 2022-08-11

    Application No.: US17585754

    Filing date: 2022-01-27

    Abstract: A method for learning disentangled representations of videos is presented. The method includes feeding each frame of video data into an encoder to produce a sequence of visual features, passing the sequence of visual features through a deep convolutional network to obtain a posterior of a dynamic latent variable and a posterior of a static latent variable, sampling static and dynamic representations from the posterior of the static latent variable and the posterior of the dynamic latent variable, respectively, concatenating the static and dynamic representations to be fed into a decoder to generate reconstructed sequences, and applying three regularizers to the dynamic and static latent variables to trigger representation disentanglement. To facilitate the disentangled sequential representation learning, orthogonal factorization in generative adversarial network (GAN) latent space is leveraged to pre-train a generator as a decoder in the method.
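The sampling and concatenation steps in this abstract can be sketched in a few lines. The sketch assumes both posteriors are diagonal Gaussians given as `(mean, log_variance)` pairs, that the static representation is sampled once per sequence and shared across frames, and that one dynamic representation is sampled per frame. None of these details are stated in the abstract, and the encoder, decoder, and regularizers are omitted.

```python
import math
import random


def sample_gaussian(mean, log_var, rng):
    """Reparameterized sample: mean + std * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def build_decoder_inputs(static_post, dynamic_posts, rng):
    """Concatenate one shared static sample with a per-frame dynamic sample.

    static_post   -- (mean, log_var) for the static latent variable
    dynamic_posts -- list of (mean, log_var), one entry per frame
    """
    z_static = sample_gaussian(static_post[0], static_post[1], rng)
    inputs = []
    for mean, log_var in dynamic_posts:
        z_dynamic = sample_gaussian(mean, log_var, rng)
        inputs.append(z_static + z_dynamic)  # list concatenation
    return inputs
```

Each returned vector would be fed to the decoder for the corresponding frame, with the shared static prefix carrying the time-invariant content.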

    SELF-SUPERVISED SEQUENTIAL VARIATIONAL AUTOENCODER FOR DISENTANGLED DATA GENERATION

    Publication No.: US20210142120A1

    Publication date: 2021-05-13

    Application No.: US17088043

    Filing date: 2020-11-03

    Abstract: A computer-implemented method is provided for disentangled data generation. The method includes accessing, by a variational autoencoder, a plurality of supervision signals. The method further includes accessing, by the variational autoencoder, a plurality of auxiliary tasks that utilize the supervision signals as reward signals to learn a disentangled representation. The method also includes training the variational autoencoder to disentangle a sequential data input into a time-invariant factor and a time-varying factor using a self-supervised training approach which is based on outputs of the auxiliary tasks obtained by using the supervision signals to accomplish the plurality of auxiliary tasks.

    Pruning filters for efficient convolutional neural networks for image recognition in vehicles

    Publication No.: US10755136B2

    Publication date: 2020-08-25

    Application No.: US15979509

    Filing date: 2018-05-15

    Abstract: Systems and methods for surveillance are described, including an image capture device configured to be mounted to an autonomous vehicle, the image capture device including an image sensor. A storage device is included in communication with the processing system, the storage device including a pruned convolutional neural network (CNN) being trained to recognize obstacles in a road according to images captured by the image sensor by training a CNN with a dataset and removing filters from layers of the CNN that are below a significance threshold for image recognition to produce the pruned CNN. A processing device is configured to recognize the obstacles by analyzing the images captured by the image sensor with the pruned CNN and to predict movement of the obstacles such that the autonomous vehicle automatically and proactively avoids the obstacles according to the recognized obstacles and predicted movement.
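The filter-removal step in this abstract can be sketched as below. The abstract does not say how significance is measured, so this sketch assumes it is the sum of absolute weights (L1 norm) of each filter. Filters are represented as nested Python lists, and the function names are illustrative.

```python
def filter_significance(filt):
    """Assumed significance score: sum of absolute weights in one filter."""
    total = 0.0
    stack = [filt]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)
        else:
            total += abs(item)
    return total


def prune_layer(filters, threshold):
    """Keep filters whose significance is at or above the threshold.

    Returns the kept indices and the kept filters, in original order.
    """
    kept = [i for i, f in enumerate(filters)
            if filter_significance(f) >= threshold]
    return kept, [filters[i] for i in kept]


def drop_input_channels(next_filters, kept):
    """Remove the matching input channels from the following layer.

    next_filters[j][c] is the kernel that filter j applies to input channel c.
    """
    return [[f[c] for c in kept] for f in next_filters]
```

Removing a filter also removes one output channel, so the kernels in the next layer that read that channel are dropped as well. The pruned network is then typically retrained to recover accuracy.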
