-
Publication number: US20240114170A1
Publication date: 2024-04-04
Application number: US17955754
Application date: 2022-09-29
Applicant: Nvidia Corporation
Inventor: Aurobinda Maharana, Abhijit Patait
CPC classification number: H04N19/70, G06T5/50, G06T7/246, G06T7/60, G06V10/25, G06V10/82, G06V40/161, H04N7/15, G06T2207/20132, G06T2207/20221, G06V2201/07
Abstract: Systems and methods relate to facial video encoding and reconstruction, particularly in ultra-low bandwidth settings. In embodiments, a video conferencing or other streaming application uses automatically tracked feature cropping information. A bounding shape size—used to identify the cropped region—varies and is dynamically determined to maintain a proportion for feature reconstruction, such as resizing in the event of a zoom-in on a face (or other feature of interest) or a zoom-out. The tracking scheme may be used to smooth sudden movements, including lateral ones, to generate more natural transitions between frames. Tracking and cropping information (e.g., size and position of the cropped region) may be embedded within an encoded bitstream as supplemental enhancement information (“SEI”), for eventual decoding by a receiver and for compositing a decoded face at a proper location in the applicable stream.
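As a rough illustration of the smoothing and proportional resizing this abstract describes, the sketch below blends the previous crop toward each newly detected face box with an exponential moving average. The `Box` type, the `alpha` smoothing factor, the `margin` scale, and the dictionary payload are all assumptions made for the example; the abstract does not specify the filter used, and real SEI serialization is not shown.

```python
from dataclasses import dataclass


@dataclass
class Box:
    cx: float    # center x in pixels
    cy: float    # center y in pixels
    size: float  # side length of the square region in pixels


def smooth_crop(prev: Box, detected: Box, alpha: float = 0.2, margin: float = 1.5) -> Box:
    """Blend the previous crop toward the newly detected face box.

    alpha is a hypothetical smoothing factor (0 = frozen, 1 = no smoothing).
    margin is a hypothetical scale that keeps the crop a fixed proportion
    larger than the face, so the crop grows on zoom-in and shrinks on zoom-out.
    """
    target_size = detected.size * margin
    return Box(
        cx=prev.cx + alpha * (detected.cx - prev.cx),
        cy=prev.cy + alpha * (detected.cy - prev.cy),
        size=prev.size + alpha * (target_size - prev.size),
    )


def crop_metadata(box: Box) -> dict:
    """Integer position/size payload a sender could attach to a frame.

    A plain dict stands in for the side information here; it is not an
    actual SEI message format.
    """
    half = box.size / 2
    return {
        "x": round(box.cx - half),
        "y": round(box.cy - half),
        "w": round(box.size),
        "h": round(box.size),
    }
```

A smaller `alpha` gives steadier framing at the cost of lagging behind fast lateral movement, which is the trade-off a tracker of this kind would need to tune.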
-
Publication number: US20240114162A1
Publication date: 2024-04-04
Application number: US17955734
Application date: 2022-09-29
Applicant: Nvidia Corporation
Inventor: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.
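One way to picture a set of intra-frames that differ from each other in some attribute of the object is the sketch below. It assumes the attribute is a head-yaw angle in degrees and uses a hypothetical `min_diff` threshold; the abstract names neither, so both are illustrative choices rather than the claimed method.

```python
def select_reference_frames(frames, min_diff=15.0):
    """Keep a frame as a reference only if its attribute differs enough
    from every reference already kept.

    frames: list of (frame_id, yaw) pairs, where yaw is a hypothetical
    head-yaw angle in degrees standing in for "an attribute of the object".
    """
    references = []
    for frame_id, yaw in frames:
        if all(abs(yaw - ref_yaw) >= min_diff for _, ref_yaw in references):
            references.append((frame_id, yaw))
    return references


def pick_reference(references, yaw):
    """Choose the stored reference whose attribute is closest to the current frame's."""
    return min(references, key=lambda ref: abs(ref[1] - yaw))


frames = [(0, 0.0), (1, 5.0), (2, 20.0), (3, 25.0), (4, -30.0)]
refs = select_reference_frames(frames)
```

With these inputs, frames 1 and 3 are skipped because each lies within 15 degrees of a reference already in the set.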
-
Publication number: US12047595B2
Publication date: 2024-07-23
Application number: US17955734
Application date: 2022-09-29
Applicant: Nvidia Corporation
Inventor: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.
-
Publication number: US20220351392A1
Publication date: 2022-11-03
Application number: US17246149
Application date: 2021-04-30
Applicant: Nvidia Corporation
Inventor: Aurobinda Maharana, Vignesh Ungrapalli, Abhijit Patait
Abstract: Apparatuses, systems, and techniques are presented to track objects represented in images or video data. In at least one embodiment, motion of one or more objects within a plurality of digital images is determined based, at least in part, on flow information corresponding to the one or more objects.
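A minimal sketch of using flow information to determine object motion: the box is moved by the mean flow vector of the pixels it covers. The data layout (a grid of per-pixel `(dx, dy)` vectors) and the use of a simple mean are assumptions for illustration; the abstract does not say how the flow is aggregated.

```python
def shift_box_by_flow(box, flow):
    """Move a bounding box by the mean flow vector of the pixels inside it.

    box: (x, y, w, h) in integer pixels.
    flow: 2-D grid (list of rows) of (dx, dy) per-pixel motion vectors.
    """
    x, y, w, h = box
    vectors = [flow[row][col] for row in range(y, y + h) for col in range(x, x + w)]
    mean_dx = sum(v[0] for v in vectors) / len(vectors)
    mean_dy = sum(v[1] for v in vectors) / len(vectors)
    return (x + round(mean_dx), y + round(mean_dy), w, h)


# A 4x4 flow field where only the 2x2 region under the box is moving.
flow = [[(0.0, 0.0)] * 4 for _ in range(4)]
for row in (1, 2):
    for col in (1, 2):
        flow[row][col] = (2.0, -1.0)

moved = shift_box_by_flow((1, 1, 2, 2), flow)
```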
-
Publication number: US20230410650A1
Publication date: 2023-12-21
Application number: US18462287
Application date: 2023-09-06
Applicant: NVIDIA Corporation
Inventor: Ambrish Dantrey, Atousa Torabi, Anshul Jain, Ram Ganapathi, Abhijit Patait, Revanth Reddy Nalla, Niranjan Avadhanam
IPC: G08G1/0965, H04R3/00, G08G1/0967, G06N3/048
CPC classification number: G08G1/0965, H04R3/005, G08G1/096708, G06N3/048, H04R2430/20
Abstract: In various examples, audio alerts of emergency response vehicles may be detected and classified using audio captured by microphones of an autonomous or semi-autonomous machine in order to identify travel directions, locations, and/or types of emergency response vehicles in the environment. For example, a plurality of microphone arrays may be disposed on an autonomous or semi-autonomous machine and used to generate audio signals corresponding to sounds in the environment. These audio signals may be processed to determine a location and/or direction of travel of an emergency response vehicle (e.g., using triangulation). Additionally, to identify siren types—and thus emergency response vehicle types corresponding thereto—the audio signals may be used to generate representations of a frequency spectrum that may be processed using a deep neural network (DNN) that outputs probabilities of alert types being represented by the audio data.
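The triangulation step mentioned in this abstract can be sketched as the intersection of two bearing lines, one from each microphone array. The array positions, the bearing convention (radians counter-clockwise from the +x axis), and the assumption that each array already yields a bearing are illustrative; the abstract does not describe how bearings are obtained.

```python
import math


def triangulate(p1, bearing1, p2, bearing2):
    """Intersect two bearing lines to estimate a sound source position.

    p1, p2: (x, y) positions of two microphone arrays in metres.
    bearing1, bearing2: direction to the sound from each array, in radians
    measured counter-clockwise from the +x axis.
    Returns (x, y), or None when the bearings are (nearly) parallel.
    """
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None
    # Solve p1 + t * d1 = p2 + s * d2 for t using the 2-D cross product.
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (rx * d2[1] - ry * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


# Arrays 2 m apart, hearing the siren at 45 and 135 degrees respectively.
source = triangulate((0.0, 0.0), math.radians(45), (2.0, 0.0), math.radians(135))
```

Repeating the estimate over successive audio frames would give a sequence of positions from which a direction of travel could be inferred.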
-
Publication number: US11816987B2
Publication date: 2023-11-14
Application number: US16951224
Application date: 2020-11-18
Applicant: NVIDIA Corporation
Inventor: Ambrish Dantrey, Atousa Torabi, Anshul Jain, Ram Ganapathi, Abhijit Patait, Revanth Reddy Nalla, Niranjan Avadhanam
IPC: G08G1/0965, H04R3/00, G08G1/0967, G06N3/048
CPC classification number: G08G1/0965, G06N3/048, G08G1/096708, H04R3/005, H04R2430/20
Abstract: In various examples, audio alerts of emergency response vehicles may be detected and classified using audio captured by microphones of an autonomous or semi-autonomous machine in order to identify travel directions, locations, and/or types of emergency response vehicles in the environment. For example, a plurality of microphone arrays may be disposed on an autonomous or semi-autonomous machine and used to generate audio signals corresponding to sounds in the environment. These audio signals may be processed to determine a location and/or direction of travel of an emergency response vehicle (e.g., using triangulation). Additionally, to identify siren types—and thus emergency response vehicle types corresponding thereto—the audio signals may be used to generate representations of a frequency spectrum that may be processed using a deep neural network (DNN) that outputs probabilities of alert types being represented by the audio data. The locations, direction of travel, and/or siren type may allow an ego-vehicle or ego-machine to identify an emergency response vehicle and to make planning and/or control decisions in response.
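The two ends of the classification path in this abstract, a frequency-spectrum representation going in and per-type probabilities coming out, can be sketched as follows. The naive DFT and the softmax are generic stand-ins chosen for the example; the network between them is not shown, and the frame length and test tone are made up.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short illustrative frames)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# A 32-sample frame containing a pure tone that completes 4 cycles.
frame = [math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
spectrum = magnitude_spectrum(frame)
dominant_bin = spectrum.index(max(spectrum))

# Hypothetical raw scores for three alert types, as a network might emit.
probabilities = softmax([2.0, 1.0, 0.1])
```

In practice an FFT over overlapping windows would replace the naive DFT, giving a spectrogram rather than a single spectrum.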
-
Publication number: US20220157165A1
Publication date: 2022-05-19
Application number: US16951224
Application date: 2020-11-18
Applicant: NVIDIA Corporation
Inventor: Ambrish Dantrey, Atousa Torabi, Anshul Jain, Ram Ganapathi, Abhijit Patait, Revanth Reddy Nalla, Niranjan Avadhanam
IPC: G08G1/0965, H04R3/00, G08G1/0967, G06N3/04
Abstract: In various examples, audio alerts of emergency response vehicles may be detected and classified using audio captured by microphones of an autonomous or semi-autonomous machine in order to identify travel directions, locations, and/or types of emergency response vehicles in the environment. For example, a plurality of microphone arrays may be disposed on an autonomous or semi-autonomous machine and used to generate audio signals corresponding to sounds in the environment. These audio signals may be processed to determine a location and/or direction of travel of an emergency response vehicle (e.g., using triangulation). Additionally, to identify siren types—and thus emergency response vehicle types corresponding thereto—the audio signals may be used to generate representations of a frequency spectrum that may be processed using a deep neural network (DNN) that outputs probabilities of alert types being represented by the audio data. The locations, direction of travel, and/or siren type may allow an ego-vehicle or ego-machine to identify an emergency response vehicle and to make planning and/or control decisions in response.
-
Publication number: US12283187B2
Publication date: 2025-04-22
Application number: US18462287
Application date: 2023-09-06
Applicant: NVIDIA Corporation
Inventor: Ambrish Dantrey, Atousa Torabi, Anshul Jain, Ram Ganapathi, Abhijit Patait, Revanth Reddy Nalla, Niranjan Avadhanam
IPC: G08G1/0965, G06N3/048, G08G1/0967, H04R3/00
Abstract: In various examples, audio alerts of emergency response vehicles may be detected and classified using audio captured by microphones of an autonomous or semi-autonomous machine in order to identify travel directions, locations, and/or types of emergency response vehicles in the environment. For example, a plurality of microphone arrays may be disposed on an autonomous or semi-autonomous machine and used to generate audio signals corresponding to sounds in the environment. These audio signals may be processed to determine a location and/or direction of travel of an emergency response vehicle (e.g., using triangulation). Additionally, to identify siren types—and thus emergency response vehicle types corresponding thereto—the audio signals may be used to generate representations of a frequency spectrum that may be processed using a deep neural network (DNN) that outputs probabilities of alert types being represented by the audio data.
-
Publication number: US20240397077A1
Publication date: 2024-11-28
Application number: US18780242
Application date: 2024-07-22
Applicant: Nvidia Corporation
Inventor: Aurobinda Maharana, Arun Mallya, Ming-Yu Liu, Abhijit Patait
Abstract: Systems and methods herein address reference frame selection in video streaming applications using one or more processing units to decode a frame of an encoded video stream that uses an inter-frame depicting an object and an intra-frame depicting the object, the intra-frame being included in a set of intra-frames based at least in part on at least one attribute of the object as depicted in the intra-frame being different from the at least one attribute of the object as depicted in other intra-frames of the set of intra-frames.