Automatic navigation using deep reinforcement learning

    Publication No.: US11613249B2

    Publication date: 2023-03-28

    Application No.: US15944563

    Filing date: 2018-04-03

    Abstract: A method for training an autonomous vehicle to reach a target location. The method includes detecting the state of an autonomous vehicle in a simulated environment, and using a neural network to navigate the vehicle from an initial location to a target destination. During the training phase, a second neural network may reward the first neural network for a desired action taken by the autonomous vehicle, and may penalize the first neural network for an undesired action taken by the autonomous vehicle. A corresponding system and computer program product are also disclosed and claimed herein.

    Hierarchical encoder for speech conversion system

    Publication No.: US11410667B2

    Publication date: 2022-08-09

    Application No.: US16457150

    Filing date: 2019-06-28

    Abstract: A speech conversion system is described that includes a hierarchical encoder and a decoder. The system may comprise a processor and memory storing instructions executable by the processor. The instructions may comprise instructions to: using a second recurrent neural network (RNN) (GRU1) and a first set of encoder vectors derived from a spectrogram as input to the second RNN, determine a second concatenated sequence; determine a second set of encoder vectors by doubling a stack height and halving a length of the second concatenated sequence; using the second set of encoder vectors, determine a third set of encoder vectors; and decode the third set of encoder vectors using an attention block.
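
The "doubling a stack height and halving a length" step can be read as concatenating adjacent time steps, so a sequence of T vectors of size D becomes T/2 vectors of size 2D. The sketch below illustrates that reading only. The abstract does not say how frames are paired or what happens to an odd trailing frame, so the pairwise concatenation, the dropped trailing frame, and the name `stack_adjacent_frames` are all assumptions.

```python
from typing import List

Frame = List[float]


def stack_adjacent_frames(sequence: List[Frame]) -> List[Frame]:
    """Halve the sequence length and double the per-step vector size.

    Assumption: adjacent time steps are concatenated pairwise, and a trailing
    unpaired frame is dropped. The abstract specifies neither detail.
    """
    usable = len(sequence) - (len(sequence) % 2)
    return [sequence[i] + sequence[i + 1] for i in range(0, usable, 2)]


# Four frames of size 2 become two frames of size 4.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
stacked = stack_adjacent_frames(frames)
```

Applying such a step between recurrent layers shortens the sequence the attention block has to scan, at the cost of wider vectors per step.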

    DYNAMICALLY ROUTED PATCH DISCRIMINATOR

    Publication No.: US20210264284A1

    Publication date: 2021-08-26

    Application No.: US16800950

    Filing date: 2020-02-25

    Abstract: The present disclosure discloses a system and a method. In an example implementation, the system and the method can generate, at a discriminator, a plurality of image patches from an image, determine a plurality of routing coefficients within a capsule network based on the plurality of image patches, generate a prediction indicating whether the image is synthetic or sourced from a real distribution based on the plurality of routing coefficients, and update one or more weights of a generator based on the prediction, wherein the generator is connected to the discriminator.
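
The first step in the abstract, generating a plurality of image patches from an image, can be sketched as follows. This covers only patch extraction, not the capsule network or routing coefficients. The abstract does not state patch size, stride, or overlap, so the non-overlapping square patches, the discarded partial edges, and the name `extract_patches` are assumptions made for illustration.

```python
from typing import List

Image = List[List[float]]


def extract_patches(image: Image, patch_size: int) -> List[Image]:
    """Split a 2-D image into non-overlapping square patches in row-major order.

    Assumption: rows and columns that do not fill a whole patch are discarded.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    height = len(image)
    width = len(image[0]) if image else 0
    patches: List[Image] = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# A 4x4 image with pixel values 0..15 yields four 2x2 patches.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = extract_patches(image, 2)
```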

    Hybrid Metric-Topological Camera-Based Localization

    Publication No.: US20210082145A1

    Publication date: 2021-03-18

    Application No.: US17103835

    Filing date: 2020-11-24

    Abstract: Various examples of hybrid metric-topological camera-based localization are described. A single image sensor captures an input image of an environment. The input image is localized to one of a plurality of topological nodes of a hybrid simultaneous localization and mapping (SLAM) metric-topological map which describes the environment as the plurality of topological nodes at a plurality of discrete locations in the environment. A metric pose of the image sensor can be determined using a Perspective-n-Point (PnP) projection algorithm. A convolutional neural network (CNN) can be trained to localize the input image to one of the plurality of topological nodes and a direction of traversal through the environment.

    Joint automatic speech recognition and text to speech conversion using adversarial neural networks

    Publication No.: US11574622B2

    Publication date: 2023-02-07

    Application No.: US16919315

    Filing date: 2020-07-02

    Abstract: An end-to-end deep-learning-based system that can solve both ASR and TTS problems jointly using unpaired text and audio samples is disclosed herein. An adversarially-trained approach is used to generate a more robust independent TTS neural network and an ASR neural network that can be deployed individually or simultaneously. The process for training the neural networks includes generating an audio sample from a text sample using the TTS neural network, then feeding the generated audio sample into the ASR neural network to regenerate the text. The difference between the regenerated text and the original text is used as a first loss for training the neural networks. A similar process is used for an audio sample. The difference between the regenerated audio and the original audio is used as a second loss. Text and audio discriminators are similarly used on the output of the neural network to generate additional losses for training.
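
The text cycle described above (text to audio via TTS, audio back to text via ASR, then a loss from the difference) can be sketched with toy stand-ins. Everything named here is hypothetical: `toy_tts`, `toy_asr`, and `noisy_asr` are placeholder functions rather than neural networks, and the position-wise mismatch count is an assumed difference measure, since the abstract does not specify one. A real training setup would need a differentiable loss.

```python
from typing import Callable, List, Sequence


def mismatch_count(original: Sequence, regenerated: Sequence) -> int:
    """Count differing positions, plus any difference in length."""
    shared = min(len(original), len(regenerated))
    differing = sum(1 for i in range(shared) if original[i] != regenerated[i])
    return differing + abs(len(original) - len(regenerated))


def text_cycle_loss(
    text: str,
    tts: Callable[[str], List[int]],
    asr: Callable[[List[int]], str],
) -> int:
    """Run text -> audio -> text and score the round trip against the input."""
    audio = tts(text)
    regenerated = asr(audio)
    return mismatch_count(text, regenerated)


def toy_tts(text: str) -> List[int]:
    """Hypothetical stand-in for a TTS model: one 'audio' value per character."""
    return [ord(ch) for ch in text]


def toy_asr(audio: List[int]) -> str:
    """Hypothetical stand-in for an ASR model: exact inverse of toy_tts."""
    return "".join(chr(code) for code in audio)


def noisy_asr(audio: List[int]) -> str:
    """Hypothetical imperfect recognizer that confuses 'e' with 'a'."""
    return toy_asr(audio).replace("e", "a")


perfect_loss = text_cycle_loss("hello", toy_tts, toy_asr)
noisy_loss = text_cycle_loss("hello", toy_tts, noisy_asr)
```

The audio cycle in the abstract mirrors this one with the roles swapped: audio goes through ASR, then TTS, and the regenerated audio is compared with the original.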
