Three-Dimensional (3D) Convolution With 3D Batch Normalization

    Publication No.: US20190213482A1

    Publication date: 2019-07-11

    Application No.: US16355290

    Filing date: 2019-03-15

    Abstract: A method of classifying three-dimensional (3D) data includes receiving the 3D data and processing it using a neural network that includes a plurality of subnetworks arranged in a sequence, with the data processed through each of the subnetworks. Each of the subnetworks is configured to receive an output generated by a preceding subnetwork in the sequence, process the output through a plurality of parallel 3D convolution layer paths of varying convolution volume, process the output through a parallel pooling path, and concatenate the outputs of the 3D convolution layer paths and the pooling path to generate an output representation from each of the subnetworks. After the data has been processed through the subnetworks, the method includes processing the output of the last subnetwork in the sequence through a vertical pooling layer to generate an output and classifying the received 3D data based upon the generated output.
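The subnetwork structure described in the abstract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `conv3d_same`, `maxpool3d_same`, `subnetwork` and `run_sequence` are hypothetical, each convolution path uses a constant averaging kernel instead of learned weights, input channels are averaged before entering the paths, and the kernel sizes 1, 3 and 5 are picked only as examples of "varying convolution volume". The final vertical pooling layer and the classifier are left out.

```python
from itertools import product

def _neighbors(vol, z, y, x, k):
    """Yield the values in the k*k*k window centred on (z, y, x), skipping out-of-bounds voxels."""
    D, H, W = len(vol), len(vol[0]), len(vol[0][0])
    r = k // 2
    for dz, dy, dx in product(range(-r, r + 1), repeat=3):
        zz, yy, xx = z + dz, y + dy, x + dx
        if 0 <= zz < D and 0 <= yy < H and 0 <= xx < W:
            yield vol[zz][yy][xx]

def _map_volume(vol, fn):
    """Build a new volume of the same shape by calling fn(z, y, x) at every voxel."""
    D, H, W = len(vol), len(vol[0]), len(vol[0][0])
    return [[[fn(z, y, x) for x in range(W)] for y in range(H)] for z in range(D)]

def conv3d_same(vol, k):
    """3D convolution path (assumption: constant 1/k^3 kernel, zero padding, stride 1)."""
    return _map_volume(vol, lambda z, y, x: sum(_neighbors(vol, z, y, x, k)) / k ** 3)

def maxpool3d_same(vol, k=3):
    """Pooling path: stride-1 max pooling over the in-bounds part of each window."""
    return _map_volume(vol, lambda z, y, x: max(_neighbors(vol, z, y, x, k)))

def subnetwork(channels, kernel_sizes=(1, 3, 5)):
    """Run parallel convolution paths plus a pooling path, then concatenate along channels."""
    n = len(channels)
    # Assumption: merge the incoming channels by averaging before the parallel paths.
    merged = _map_volume(channels[0], lambda z, y, x: sum(c[z][y][x] for c in channels) / n)
    paths = [conv3d_same(merged, k) for k in kernel_sizes]
    paths.append(maxpool3d_same(merged))
    return paths  # one output channel per path

def run_sequence(volume, depth=2):
    """Feed each subnetwork the output of the preceding one."""
    out = [volume]
    for _ in range(depth):
        out = subnetwork(out)
    return out
```

With three kernel sizes and one pooling path, every subnetwork in this sketch emits four channels of the same spatial size as its input.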

    HIERARCHICAL AND INTERPRETABLE SKILL ACQUISITION IN MULTI-TASK REINFORCEMENT LEARNING

    Publication No.: US20190130312A1

    Publication date: 2019-05-02

    Application No.: US15885727

    Filing date: 2018-01-31

    Abstract: The disclosed technology reveals a hierarchical policy network, for use by a software agent, to accomplish an objective that requires execution of multiple tasks. A terminal policy learned by training the agent on a terminal task set, serves as a base task set of the intermediate task set. An intermediate policy learned by training the agent on an intermediate task set serves as a base policy of the top policy. A top policy learned by training the agent on a top task set serves as a base task set of the top task set. The agent is configurable to accomplish the objective by traversal of the hierarchical policy network. A current task in a current task set is executed by executing a previously-learned task selected from a corresponding base task set governed by a corresponding base policy, or performing a primitive action selected from a library of primitive actions.
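The traversal described in the last sentence of the abstract above (a task is executed either by delegating to a previously learned task from the base level or by performing a primitive action) can be sketched as a small recursive executor. Everything here is hypothetical: the `Policy` class, the task names and the hand-written `plans` tables stand in for policies that would be learned by training the agent.

```python
class Policy:
    """One level of the hierarchy.

    `plans` maps a task name to a list of (kind, name) steps and is a
    hand-written stand-in for a learned policy (assumption).
    """

    def __init__(self, plans, base=None):
        self.plans = plans
        self.base = base  # policy one level down, or None for the terminal level

    def execute(self, task, trace=None):
        trace = [] if trace is None else trace
        for kind, name in self.plans[task]:
            if kind == "primitive":
                trace.append(name)              # perform a primitive action
            else:
                self.base.execute(name, trace)  # run a previously learned base task
        return trace

# Hypothetical three-level hierarchy: terminal -> intermediate -> top.
terminal = Policy({"find": [("primitive", "look"), ("primitive", "move")]})
intermediate = Policy(
    {"get": [("base", "find"), ("primitive", "pick_up")]}, base=terminal
)
top = Policy(
    {"stack": [("base", "get"), ("primitive", "move"), ("primitive", "put_down")]},
    base=intermediate,
)
```

Calling `top.execute("stack")` walks down through `get` and `find` and returns the flat list of primitive actions that were performed.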

    System and Method for Unsupervised Density Based Table Structure Identification

    Publication No.: US20210141781A1

    Publication date: 2021-05-13

    Application No.: US16680302

    Filing date: 2019-11-11

    Abstract: Embodiments described herein provide unsupervised density-based clustering to infer table structure from a document. Specifically, a number of words are identified from a block of text in a non-editable document, and the spatial coordinates of each word relative to the rectangular region are identified. Based on the word density of the rectangular region, the words are grouped into clusters using a heuristic radius search method. Words that are grouped into the same cluster are determined to be elements that belong to the same cell. In this way, the cells of the table structure can be identified. Once the cells are identified based on the word density of the block of text, the identified cells can be expanded horizontally or grouped vertically to identify rows or columns of the table structure.
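A rough sketch of density-based word clustering along the lines of the abstract above. The abstract does not give the radius heuristic, so the formula used here (a fixed fraction of the average spacing implied by the word density) is an assumption, as are the function name `cluster_words`, the `factor` parameter and the `(text, x, y)` word representation.

```python
import math

def cluster_words(words, region_w, region_h, factor=0.5):
    """Group words into cells by a radius search.

    words: list of (text, x, y) tuples giving each word's centre in the region.
    Assumption: the search radius is `factor` times the average spacing
    sqrt(area / word_count); the patent's actual heuristic is not stated.
    """
    n = len(words)
    if n == 0:
        return []
    radius = factor * math.sqrt(region_w * region_h / n)
    unvisited = set(range(n))
    clusters = []
    for seed in range(n):
        if seed not in unvisited:
            continue
        unvisited.remove(seed)
        queue, members = [seed], []
        while queue:  # grow the cluster through chains of nearby words
            i = queue.pop()
            members.append(i)
            _, xi, yi = words[i]
            near = [j for j in unvisited
                    if math.hypot(words[j][1] - xi, words[j][2] - yi) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
        clusters.append([words[i][0] for i in sorted(members)])
    return clusters
```

Each returned cluster is a candidate table cell; merging cells into rows and columns is a later step that this sketch does not cover.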

    Weakly Supervised Natural Language Localization Networks

    Publication No.: US20200372116A1

    Publication date: 2020-11-26

    Application No.: US16531343

    Filing date: 2019-08-05

    Abstract: Systems and methods are provided for weakly supervised natural language localization (WSNLL), for example, as implemented in a neural network or model. The WSNLL network is trained with long, untrimmed videos, i.e., videos that have not been temporally segmented or annotated. The WSNLL network or model defines or generates a video-sentence pair, which corresponds to a pairing of an untrimmed video with an input text sentence. According to some embodiments, the WSNLL network or model is implemented with a two-branch architecture, where one branch performs segment sentence alignment and the other one conducts segment selection.
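The two-branch idea in the abstract above can be illustrated with a few lines of Python. The abstract does not say how the branches are combined, so the merge used here (element-wise product of per-segment alignment scores and a softmax selection distribution, summed into one video-sentence score) is an assumption, as are the function names and the use of raw logits as inputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def score_video_sentence(align_logits, select_logits):
    """Combine the two branches for one video-sentence pair.

    align_logits:  one logit per candidate segment from the alignment branch.
    select_logits: one logit per candidate segment from the selection branch.
    Returns (video_score, best_segment_index, merged_scores).
    """
    alignment = [sigmoid(a) for a in align_logits]   # how well each segment matches
    selection = softmax(select_logits)               # segments compete with each other
    merged = [a * s for a, s in zip(alignment, selection)]
    best = max(range(len(merged)), key=merged.__getitem__)
    return sum(merged), best, merged
```

Under this assumption only the summed video-level score needs a label during training, while the per-segment `merged` scores can be read off at inference time to localize the sentence.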

    DATA PRIVACY PROTECTED MACHINE LEARNING SYSTEMS

    Publication No.: US20200272940A1

    Publication date: 2020-08-27

    Application No.: US16398757

    Filing date: 2019-04-30

    Abstract: Approaches for private and interpretable machine learning systems include a system for processing a query. The system includes one or more teacher modules for receiving a query and generating a respective output, one or more privacy sanitization modules for privacy sanitizing the respective output of each of the one or more teacher modules, and a student module for receiving a query and the privacy sanitized respective output of each of the one or more teacher modules and generating a result. Each of the one or more teacher modules is trained using a respective private data set. The student module is trained using a public data set. In some embodiments, human understandable interpretations of an output from the student module are provided to a model user.
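The data flow in the abstract above (teachers answer a query, each answer is sanitized, the student sees only the sanitized answers) can be sketched as follows. The abstract does not name a sanitization mechanism; adding Laplace noise to each teacher's output is an assumption made for illustration, and `laplace`, `sanitize` and `student_result` are hypothetical names. The "student" here simply averages what it receives and is a stand-in for a trained model.

```python
import random

def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def sanitize(output, scale, rng):
    """Privacy sanitization stand-in (assumption): add Laplace noise to every score."""
    return [p + laplace(scale, rng) for p in output]

def student_result(query, teachers, scale, rng):
    """Return (label, mean_scores) computed from sanitized teacher outputs only."""
    sanitized = [sanitize(teacher(query), scale, rng) for teacher in teachers]
    n_classes = len(sanitized[0])
    mean = [sum(s[c] for s in sanitized) / len(sanitized) for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean

# Hypothetical teachers: each would be trained on its own private data set.
teachers = [lambda q: [0.9, 0.1], lambda q: [0.8, 0.2], lambda q: [0.7, 0.3]]
label, mean = student_result("example query", teachers, scale=0.01, rng=random.Random(0))
```

A larger `scale` hides more about any single teacher's private data at the cost of noisier targets for the student.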

    Dense Video Captioning
    Patent application

    Publication No.: US20200084465A1

    Publication date: 2020-03-12

    Application No.: US16687405

    Filing date: 2019-11-18

    Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.
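To make the "differentiable mask" in the abstract above concrete, here is one possible construction, offered as an assumption rather than the patented design: a soft window built from two sigmoids around a proposed start and end position, multiplied into the per-frame encoder outputs. The names `soft_mask`, `mask_encoder_outputs` and the `sharpness` parameter are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_mask(length, start, end, sharpness=10.0):
    """Smooth 0..1 window over frame positions 0..length-1.

    The value is close to 1 between `start` and `end` and close to 0 elsewhere.
    Because it is a smooth function of `start` and `end`, gradients can flow
    back to whatever module proposed them.
    """
    return [sigmoid(sharpness * (t - start)) * sigmoid(sharpness * (end - t))
            for t in range(length)]

def mask_encoder_outputs(encoder_outputs, mask):
    """Scale each frame's feature vector by its mask value."""
    return [[m * v for v in frame] for frame, m in zip(encoder_outputs, mask)]

# Ten frames with two features each; the proposal covers frames 3 to 6.
features = [[1.0, 2.0] for _ in range(10)]
mask = soft_mask(10, start=2.5, end=6.5)
masked = mask_encoder_outputs(features, mask)
```

A hard 0/1 mask would block gradients to the proposal decoder, which is why a smooth window is a natural fit for the end-to-end training the abstract describes.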

    HYBRID TRAINING OF DEEP NETWORKS
    Patent application

    Publication No.: US20190188568A1

    Publication date: 2019-06-20

    Application No.: US15926768

    Filing date: 2018-03-20

    CPC classification number: G06N3/082 G06N3/0427

    Abstract: Hybrid training of deep networks includes a multi-layer neural network. The training includes setting a current learning algorithm for the multi-layer neural network to a first learning algorithm. The training further includes iteratively applying training data to the neural network, determining a gradient for parameters of the neural network based on the applying of the training data, updating the parameters based on the current learning algorithm, and determining whether the current learning algorithm should be switched to a second learning algorithm based on the updating. The training further includes, in response to the determining that the current learning algorithm should be switched to a second learning algorithm, changing the current learning algorithm to the second learning algorithm and initializing a learning rate of the second learning algorithm based on the gradient and a step used by the first learning algorithm to update the parameters of the neural network.
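The abstract above says the second algorithm's learning rate is initialized from the gradient and the step taken by the first algorithm, but it does not give a formula. One plausible choice, shown here purely as an assumption, is to pick the rate `lr` for which the plain gradient step `-lr * grad`, projected onto the first algorithm's step, reproduces that step. The function name `lr_from_step` is hypothetical.

```python
def lr_from_step(step, grad):
    """Estimate a learning rate for the second algorithm.

    step: parameter update just applied by the first algorithm (new - old).
    grad: gradient at the old parameters.
    Solves proj_step(-lr * grad) == step for lr, which gives
    lr = (step . step) / -(step . grad).
    Returns None when the step is not a descent direction, in which case
    the caller would keep using the first algorithm.
    """
    step_dot_step = sum(p * p for p in step)
    step_dot_grad = sum(p * g for p, g in zip(step, grad))
    if step_dot_grad >= 0:
        return None
    return step_dot_step / -step_dot_grad

# If the first algorithm moved by -0.1 along a gradient of 2.0, a plain
# gradient step with lr = 0.05 would have produced the same move.
lr = lr_from_step(step=[-0.1, 0.0], grad=[2.0, 0.0])
```

In a training loop this estimate would be recomputed after every update, with the switch triggered once some stability criterion on the estimate is met; the criterion itself is not specified in the abstract.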

    SENTINEL LONG SHORT-TERM MEMORY (Sn-LSTM)
    Patent application

    Publication No.: US20180144248A1

    Publication date: 2018-05-24

    Application No.: US15817165

    Filing date: 2017-11-18

    Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.
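The two ideas in the abstract above, a sentinel gate that turns LSTM memory into a visual sentinel and an adaptive mix between image and language information, can be sketched with element-wise arithmetic. The exact equations are not in the abstract, so the forms below (sigmoid gate times tanh of the memory, and a convex combination weighted by a scalar `beta`) are assumptions, and the function names are hypothetical. The gate pre-activation would normally be computed from the current input and previous hidden state with learned weights; here it is passed in directly.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def visual_sentinel(gate_preactivation, memory):
    """Sentinel vector: gate (0..1) applied element-wise to tanh of the LSTM memory cell."""
    return [sigmoid(g) * math.tanh(m) for g, m in zip(gate_preactivation, memory)]

def adaptive_context(beta, sentinel, visual_context):
    """Mix language-side and image-side information.

    beta = 1 relies only on the sentinel (linguistic memory);
    beta = 0 relies only on the attended image features.
    """
    return [beta * s + (1.0 - beta) * c for s, c in zip(sentinel, visual_context)]

sentinel = visual_sentinel([20.0, 0.0], [20.0, 0.0])
mixed = adaptive_context(0.25, sentinel, [0.4, 0.8])
```

Words such as "of" or "the" would be expected to push `beta` toward 1, while visually grounded words would pull it toward 0.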
