-
Publication Number: US20240127075A1
Publication Date: 2024-04-18
Application Number: US18212629
Application Date: 2023-06-21
Applicant: NVIDIA Corporation
Inventor: Shalini De Mello , Christian Jacobsen , Xunlei Wu , Stephen Tyree , Alice Li , Wonmin Byeon , Shangru Li
IPC: G06N3/0985
CPC classification number: G06N3/0985
Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make predictions about new data. To reduce the costs associated with collecting and labeling real-world datasets for training the model, computer processes can synthetically generate datasets that simulate real-world data. The present disclosure improves the effectiveness of such synthetic datasets for training machine learning models used in real-world applications, in particular by generating a synthetic dataset that is specifically targeted to a specified downstream task (e.g., a particular computer vision task or a particular natural language processing task).
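A minimal sketch of one way such targeting could work, not taken from the patent itself: candidate simulator parameters are scored by how well a model trained on the resulting synthetic data performs on a small real validation set for the downstream task. The callables train_on and evaluate_on are hypothetical placeholders supplied by the caller.

```python
import random

def select_simulator_params(candidate_params, train_on, evaluate_on, real_val_set,
                            trials=20, seed=0):
    """Random-search sketch: pick the simulator parameters whose synthetic data
    yields the best downstream validation score on real data.

    candidate_params: list of parameter dicts for a synthetic-data generator (assumed).
    train_on(params) -> model: renders a synthetic dataset from params and trains a task model on it.
    evaluate_on(model, val_set) -> float: downstream-task score on a real validation set.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = rng.choice(candidate_params)
        model = train_on(params)                    # train on synthetic data only
        score = evaluate_on(model, real_val_set)    # score on the real downstream task
        if score > best_score:
            best_params, best_score = params, score
    return best_params
```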
-
Publication Number: US20230177810A1
Publication Date: 2023-06-08
Application Number: US17853631
Application Date: 2022-06-29
Applicant: NVIDIA Corporation
Inventor: Jiarui Xu , Shalini De Mello , Sifei Liu , Wonmin Byeon , Thomas Breuel , Jan Kautz
IPC: G06V10/774 , G06V10/26
CPC classification number: G06V10/774 , G06V10/26
Abstract: Semantic segmentation is the task of providing pixel-wise annotations for a given image. To train a machine learning environment to perform semantic segmentation, image/caption pairs are retrieved from one or more databases. Each image/caption pair includes an image and an associated textual caption. The image portion of each pair is passed to an image encoder of the machine learning environment that outputs potential pixel groupings (e.g., potential segments of pixels) within each image, while nouns are extracted from the caption portion and converted to text prompts, which are then passed to a text encoder that outputs a corresponding text representation. Contrastive loss operations are then performed on features extracted from these pixel groupings and text representations to determine, for each noun of each caption, the extracted feature that most closely matches the extracted features for the associated image.
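A minimal PyTorch sketch of a contrastive objective between pixel-group features and noun text features, under the simplifying assumption of one encoded noun per image and pre-computed, fixed-size group features; the actual loss formulation in the application may differ.

```python
import torch
import torch.nn.functional as F

def group_text_contrastive_loss(group_feats, text_feats, temperature=0.07):
    """Contrast pixel-group features against noun text features across a batch.

    group_feats: [B, G, D] features for G candidate pixel groupings per image.
    text_feats:  [B, D] one text feature per image (encoded prompt for one extracted noun).
    """
    group_feats = F.normalize(group_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)

    # sim[i, j]: similarity of text j to the best-matching pixel group of image i.
    sim = torch.einsum("igd,jd->ijg", group_feats, text_feats).max(dim=-1).values  # [B, B]
    sim = sim / temperature

    targets = torch.arange(sim.size(0), device=sim.device)
    # Symmetric InfoNCE: image-to-text and text-to-image directions.
    return 0.5 * (F.cross_entropy(sim, targets) + F.cross_entropy(sim.t(), targets))
```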
-
Publication Number: US20230088912A1
Publication Date: 2023-03-23
Application Number: US17952866
Application Date: 2022-09-26
Applicant: NVIDIA Corporation
Inventor: Ruben Villegas , Alejandro Troccoli , Iuri Frosio , Stephen Tyree , Wonmin Byeon , Jan Kautz
Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.
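A minimal PyTorch sketch of the described pipeline, with illustrative feature sizes and maneuver-class counts: a per-object LSTM encodes the state feature, a bi-directional LSTM over the set of observed objects encodes the spatial feature, and linear heads predict lateral and longitudinal maneuvers. The actual architecture may differ.

```python
import torch
import torch.nn as nn

class ManeuverPredictor(nn.Module):
    """Per-object state encoding, bi-directional spatial encoding over objects,
    and lateral / longitudinal maneuver heads (sizes are illustrative assumptions)."""

    def __init__(self, in_dim=4, state_dim=64, n_lat=3, n_lon=2):
        super().__init__()
        self.state_encoder = nn.LSTM(in_dim, state_dim, batch_first=True)
        self.spatial_encoder = nn.LSTM(state_dim, state_dim, batch_first=True, bidirectional=True)
        self.lat_head = nn.Linear(state_dim * 3, n_lat)   # state + bi-directional spatial feature
        self.lon_head = nn.Linear(state_dim * 3, n_lon)

    def forward(self, trajectories):
        # trajectories: [num_objects, time_steps, in_dim] past states observed by the ego-vehicle.
        _, (state, _) = self.state_encoder(trajectories)          # [1, num_objects, state_dim]
        state = state.squeeze(0)                                  # [num_objects, state_dim]
        spatial, _ = self.spatial_encoder(state.unsqueeze(0))     # [1, num_objects, 2 * state_dim]
        feats = torch.cat([state, spatial.squeeze(0)], dim=-1)    # [num_objects, 3 * state_dim]
        return self.lat_head(feats), self.lon_head(feats)
```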
-
Publication Number: US11062471B1
Publication Date: 2021-07-13
Application Number: US16868342
Application Date: 2020-05-06
Applicant: NVIDIA Corporation
Inventor: Yiran Zhong , Wonmin Byeon , Charles Loop , Stanley Thomas Birchfield
Abstract: Stereo matching generates a disparity map indicating pixel offsets between matched points in a stereo image pair. A neural network may be used to generate disparity maps in real time by matching image features in stereo images using only 2D convolutions. The proposed method is faster than 3D convolution-based methods, with only a slight accuracy loss and higher generalization capability. An efficient 3D cost aggregation volume is generated by combining cost maps for each disparity level. Different disparity levels correspond to different amounts of shift between pixels in the left and right image pair. In general, each disparity level is inversely proportional to a different distance from the viewpoint.
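A minimal PyTorch sketch of building per-disparity cost maps from 2D feature correlation and stacking them into a cost volume; the 2D feature encoder and the subsequent aggregation and disparity regression are assumed and not shown.

```python
import torch

def correlation_cost_volume(left_feat, right_feat, max_disparity):
    """Build a cost volume using only 2D feature correlation (no 3D convolution).

    left_feat, right_feat: [B, C, H, W] feature maps from a shared 2D encoder (assumed).
    Returns: [B, max_disparity, H, W], one correlation cost map per disparity level.
    """
    B, C, H, W = left_feat.shape
    cost_maps = []
    for d in range(max_disparity):
        cost = torch.zeros(B, H, W, device=left_feat.device)
        # Shift the right features by d pixels; larger shifts correspond to closer surfaces.
        cost[:, :, d:] = (left_feat[:, :, :, d:] * right_feat[:, :, :, : W - d]).mean(dim=1)
        cost_maps.append(cost)
    return torch.stack(cost_maps, dim=1)
```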
-
Publication Number: US20240153093A1
Publication Date: 2024-05-09
Application Number: US18310414
Application Date: 2023-05-01
Applicant: NVIDIA Corporation
Inventor: Jiarui Xu , Shalini De Mello , Sifei Liu , Arash Vahdat , Wonmin Byeon
CPC classification number: G06T7/10 , G06V10/40 , G06T2207/20081 , G06T2207/20084
Abstract: An open-vocabulary diffusion-based panoptic segmentation system is not limited to performing segmentation using only object categories seen during training; it can also successfully segment object categories not seen during training and encountered only during testing and inferencing. In contrast with conventional techniques, a text-conditioned diffusion (generative) model is used to perform the segmentation. The text-conditioned diffusion model is pre-trained to generate images from text captions, including computing internal representations that provide spatially well-differentiated object features. The internal representations computed within the diffusion model comprise object masks and a semantic visual representation of each object. The semantic visual representation may be extracted from the diffusion model and used in conjunction with a text representation of a category label to classify the object. Objects are classified by associating the text representations of category labels with the object masks and their semantic visual representations to produce panoptic segmentation data.
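A minimal sketch of the classification step described above, assuming the diffusion model's internal visual features and the predicted masks have already been extracted; mask-average pooling and cosine similarity are illustrative choices, not necessarily those used in the application.

```python
import torch
import torch.nn.functional as F

def classify_masks(pixel_feats, masks, label_text_feats):
    """Match each predicted mask against text features of candidate category labels.

    pixel_feats:      [D, H, W] semantic visual representation assumed extracted from the
                      text-conditioned diffusion model.
    masks:            [N, H, W] binary object masks.
    label_text_feats: [K, D] text features for K category labels (seen or unseen).
    Returns: [N] predicted label index per mask.
    """
    D, H, W = pixel_feats.shape
    flat = pixel_feats.reshape(D, H * W)                                   # [D, HW]
    m = masks.reshape(masks.size(0), H * W).float()                        # [N, HW]
    pooled = (m @ flat.t()) / m.sum(dim=1, keepdim=True).clamp(min=1.0)    # [N, D] mask-average pooling
    sim = F.normalize(pooled, dim=-1) @ F.normalize(label_text_feats, dim=-1).t()  # [N, K]
    return sim.argmax(dim=-1)
```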
-
Publication Number: US20240013462A1
Publication Date: 2024-01-11
Application Number: US17859615
Application Date: 2022-07-07
Applicant: Nvidia Corporation
Inventor: Yeongho Seol , Simon Yuen , Dmitry Aleksandrovich Korobchenko , Mingquan Zhou , Ronan Browne , Wonmin Byeon
CPC classification number: G06T13/205 , G06T13/40 , G06T17/20 , G10L25/63 , G10L15/16
Abstract: A deep neural network can be trained to output motion or deformation information for a character that is representative of the character uttering speech contained in audio input, which is accurate for an emotional state of the character. The character can have different facial components or regions (e.g., head, skin, eyes, tongue) modeled separately, such that the network can output motion or deformation information for each of these different facial components. During training, the network can be provided with emotion and/or style vectors that indicate information to be used in generating realistic animation for input speech, as may relate to one or more emotions to be exhibited by the character, a relative weighting of those emotions, and any style or adjustments to be made to how the character expresses that emotional state. The network output can be provided to a renderer to generate audio-driven facial animation that is emotion-accurate.
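A minimal PyTorch sketch of an audio-to-animation network conditioned on an emotion/style vector, with one output head per modeled facial component; the region names, dimensions, and architecture are illustrative assumptions, not the patent's actual network.

```python
import torch
import torch.nn as nn

class EmotionConditionedFaceNet(nn.Module):
    """Map per-frame audio features plus a weighted emotion/style code to
    per-region motion or deformation outputs (sizes are illustrative)."""

    def __init__(self, audio_dim=128, emotion_dim=8, hidden=256, region_dims=None):
        super().__init__()
        region_dims = region_dims or {"head": 6, "skin": 300, "eyes": 4, "tongue": 30}
        self.backbone = nn.Sequential(
            nn.Linear(audio_dim + emotion_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One deformation head per modeled facial component.
        self.heads = nn.ModuleDict({name: nn.Linear(hidden, dim) for name, dim in region_dims.items()})

    def forward(self, audio_feat, emotion_vec):
        # audio_feat: [B, audio_dim] per-frame audio features.
        # emotion_vec: [B, emotion_dim] relative weighting of emotions plus style adjustments.
        h = self.backbone(torch.cat([audio_feat, emotion_vec], dim=-1))
        return {name: head(h) for name, head in self.heads.items()}
```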
-
Publication Number: US20220254029A1
Publication Date: 2022-08-11
Application Number: US17500338
Application Date: 2021-10-13
Applicant: NVIDIA Corporation
Inventor: Eugene Vorontsov , Wonmin Byeon , Shalini De Mello , Varun Jampani , Ming-Yu Liu , Pavlo Molchanov
Abstract: The neural network includes an encoder, a common decoder, and a residual decoder. The encoder encodes input images into a latent space. The latent space disentangles unique features from common features. The common decoder decodes common features resident in the latent space to generate translated images that lack the unique features. The residual decoder decodes unique features resident in the latent space to generate image deltas corresponding to the unique features. The neural network combines the translated images with the image deltas to generate combined images that may include both common and unique features. The combined images can be used to drive autoencoding. Once training is complete, the residual decoder can be modified to generate segmentation masks that indicate any regions of a given input image where a unique feature resides.
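A minimal PyTorch sketch of the encoder / common-decoder / residual-decoder layout; the mechanism that disentangles unique from common features in the latent space is not shown, and all layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CommonResidualAutoencoder(nn.Module):
    """Shared encoder, a common decoder that reconstructs only common content,
    and a residual decoder that outputs an image delta for the unique features."""

    def __init__(self, channels=1, latent=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent, 3, stride=2, padding=1), nn.ReLU(),
        )
        def make_decoder():
            return nn.Sequential(
                nn.ConvTranspose2d(latent, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, channels, 4, stride=2, padding=1),
            )
        self.common_decoder = make_decoder()    # translated image without unique features
        self.residual_decoder = make_decoder()  # image delta holding the unique features

    def forward(self, x):
        z = self.encoder(x)
        translated = self.common_decoder(z)
        delta = self.residual_decoder(z)
        combined = translated + delta           # reconstruction used to drive autoencoding
        return translated, delta, combined
```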
-
Publication Number: US11989642B2
Publication Date: 2024-05-21
Application Number: US17952866
Application Date: 2022-09-26
Applicant: NVIDIA Corporation
Inventor: Ruben Villegas , Alejandro Troccoli , Iuri Frosio , Stephen Tyree , Wonmin Byeon , Jan Kautz
Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.
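This is the granted counterpart of publication US20230088912A1 above. Complementing the maneuver-prediction sketch given there, the hypothetical decoder below rolls an object's combined feature and its predicted maneuver probabilities forward into future (x, y) locations; the horizon and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryDecoder(nn.Module):
    """Decode future (x, y) locations from an object's combined feature and its
    predicted lateral / longitudinal maneuver probabilities (sizes are illustrative)."""

    def __init__(self, feat_dim=192, n_lat=3, n_lon=2, horizon=25):
        super().__init__()
        self.horizon = horizon
        self.decoder = nn.LSTM(feat_dim + n_lat + n_lon, 128, batch_first=True)
        self.to_xy = nn.Linear(128, 2)

    def forward(self, feats, lat_probs, lon_probs):
        # feats: [N, feat_dim]; lat_probs: [N, n_lat]; lon_probs: [N, n_lon].
        cond = torch.cat([feats, lat_probs, lon_probs], dim=-1)
        steps = cond.unsqueeze(1).repeat(1, self.horizon, 1)   # one conditioning copy per future step
        hidden, _ = self.decoder(steps)                        # [N, horizon, 128]
        return self.to_xy(hidden)                              # [N, horizon, 2] future locations
```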
-
Publication Number: US20230146647A1
Publication Date: 2023-05-11
Application Number: US17520448
Application Date: 2021-11-05
Applicant: NVIDIA Corporation
Inventor: Wonmin Byeon , Shalini De Mello , Ankur Arjun Mali
IPC: G06N3/04
CPC classification number: G06N3/04
Abstract: Apparatuses, systems, and techniques to perform and facilitate preservation of neural coding network weights over time. In at least one embodiment, a convolutional neural coding network is trained using a set of tasks such that said convolutional neural coding network retains an ability to perform inferencing based on tasks from previous training.
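The abstract does not state how weights are preserved, so the sketch below only shows the sequential-task training loop with an optional, caller-supplied preservation penalty and a check that earlier tasks still inference correctly after each new task; everything beyond that structure is an assumption.

```python
import torch

def train_sequentially(model, tasks, loss_fn, preservation_penalty=None, lr=1e-3, epochs=1):
    """Sequentially train a classifier on a list of tasks and re-check earlier tasks.

    tasks: list of (train_loader, val_loader) pairs, one per task.
    preservation_penalty(model) -> scalar tensor: optional, caller-supplied regularizer
    standing in for the (unspecified) weight-preservation mechanism.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for task_id, (train_loader, _) in enumerate(tasks):
        for _ in range(epochs):
            for x, y in train_loader:
                loss = loss_fn(model(x), y)
                if preservation_penalty is not None:
                    loss = loss + preservation_penalty(model)   # keep earlier-task knowledge
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        # After each task, verify accuracy on all tasks seen so far.
        with torch.no_grad():
            for seen_id, (_, val_loader) in enumerate(tasks[: task_id + 1]):
                correct = sum((model(x).argmax(dim=-1) == y).sum().item() for x, y in val_loader)
                total = sum(len(y) for _, y in val_loader)
                print(f"after task {task_id}: accuracy on task {seen_id} = {correct / total:.3f}")
    return model
```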
-
Publication Number: US20230015989A1
Publication Date: 2023-01-19
Application Number: US17365877
Application Date: 2021-07-01
Applicant: Nvidia Corporation
Inventor: Zhiding Yu , Rui Huang , Wonmin Byeon , Sifei Liu , Guilin Liu , Thomas Breuel , Anima Anandkumar , Jan Kautz
Abstract: The disclosure provides a learning framework that unifies semantic segmentation and semantic edge detection. A learnable recurrent message-passing layer is disclosed in which semantic edges act as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
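A minimal sketch of one edge-gated spatial-propagation step, in which the propagation weight is the learned affinity attenuated by the semantic edge strength so that messages do not cross object boundaries. A full implementation would propagate in all four directions; the specific gating form shown is an illustrative assumption.

```python
import torch

def edge_gated_smoothing_step(feat, affinity, edge):
    """One left-to-right spatial-propagation pass over a semantic feature map.

    feat:     [B, C, H, W] semantic feature map.
    affinity: [B, 1, H, W] affinity values in [0, 1].
    edge:     [B, 1, H, W] semantic edge probabilities in [0, 1].
    """
    gate = affinity * (1.0 - edge)            # stop propagation where an edge is detected
    out = feat.clone()
    for x in range(1, feat.size(-1)):
        g = gate[..., x]                      # [B, 1, H], broadcast over channels
        out[..., x] = (1.0 - g) * feat[..., x] + g * out[..., x - 1]
    return out
```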