Patent search ap:("ADOBE INC.") AND inv:"Zeyu JIN"

1.

Invention publication
STUDIO QUALITY AUDIO ENHANCEMENT

Status: Pending (published)

Publication (announcement) number: US20240331720A1

Publication (announcement) date: 2024-10-03

Application number: US18191763

Application date: 2023-03-28

Applicant: Adobe Inc., The Trustees of Princeton University

Inventor: Zeyu JIN, Jiaqi SU, Adam FINKELSTEIN

IPC: G10L21/034, G06N5/022, G10L21/0232, G10L25/18, G10L25/24, G10L25/60

CPC classification number: G10L21/034, G06N5/022, G10L21/0232, G10L25/18, G10L25/24, G10L25/60, G10L21/0364, G10L25/30

Abstract: Embodiments are disclosed for converting audio data to studio quality audio data. The method includes obtaining audio data having a first quality for conversion to studio quality audio. A first machine learning model predicts a set of acoustic features. A spectral mask is applied to the audio data during the prediction of the set of acoustic features. A second machine learning model generates studio quality audio from the set of acoustic features and the audio data.
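
The abstract does not say how the spectral mask is computed or applied. As a rough illustration only, the Python sketch below assumes the common reading in which a mask of weights in [0, 1] is multiplied element-wise with a magnitude spectrogram. The function name `apply_spectral_mask` and the toy values are hypothetical and are not taken from the patent.

```python
def apply_spectral_mask(spectrogram, mask):
    """Multiply each time-frequency bin by its mask weight.

    Both arguments are lists of frames; each frame is a list with one
    value per frequency bin. This layout is an assumption of the sketch.
    """
    if len(spectrogram) != len(mask):
        raise ValueError("spectrogram and mask need the same number of frames")
    masked = []
    for frame, weights in zip(spectrogram, mask):
        if len(frame) != len(weights):
            raise ValueError("each frame needs one weight per frequency bin")
        masked.append([m * w for m, w in zip(frame, weights)])
    return masked


# Toy magnitudes: 2 frames x 3 frequency bins (made-up values).
noisy = [[1.0, 0.5, 0.2], [0.8, 0.4, 0.1]]
# A weight of 0.0 suppresses a bin entirely; 1.0 keeps it unchanged.
mask = [[1.0, 0.5, 0.0], [1.0, 1.0, 0.0]]
cleaned = apply_spectral_mask(noisy, mask)
```

In this toy case the highest frequency bin is zeroed in both frames, which is one way a mask could remove energy that a model judges to be noise.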

2.

Invention application
SECURE AUDIO WATERMARKING BASED ON NEURAL NETWORKS

Status: Granted (in force)

Publication (announcement) number: US20210256978A1

Publication (announcement) date: 2021-08-19

Application number: US16790301

Application date: 2020-02-13

Applicant: ADOBE INC.

Inventor: Zeyu JIN, Oona Shigeno RISSE-ADAMS

IPC: G10L17/18, G10L17/04, G10L19/018, G10L15/08, G10L15/06, G06N3/08

Abstract: Embodiments provide systems, methods, and computer storage media for secure audio watermarking and audio authenticity verification. An audio watermark detector may include a neural network trained to detect a particular audio watermark and embedding technique, which may indicate source software used in a workflow that generated an audio file under test. For example, the watermark may indicate an audio file was generated using voice manipulation software, so detecting the watermark can indicate manipulated audio such as deepfake audio and other attacked audio signals. In some embodiments, the audio watermark detector may be trained as part of a generative adversarial network in order to make the underlying audio watermark more robust to neural network-based attacks. Generally, the audio watermark detector may evaluate time domain samples from chunks of an audio clip under test to detect the presence of the audio watermark and generate a classification for the audio clip.
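
The last sentence of the abstract describes evaluating time-domain chunks and producing one classification for the clip. A minimal Python sketch of that flow follows. It assumes non-overlapping chunks and a majority vote, neither of which the abstract specifies, and `toy_detector` is a hypothetical stand-in for the trained neural network.

```python
def split_into_chunks(samples, chunk_size):
    """Split samples into full, non-overlapping chunks; a short tail is dropped."""
    last_start = len(samples) - chunk_size
    return [samples[i:i + chunk_size] for i in range(0, last_start + 1, chunk_size)]


def classify_clip(samples, chunk_size, chunk_detector, threshold=0.5):
    """Return True when more than half of the chunks look watermarked.

    chunk_detector maps one chunk to a score in [0, 1]. Majority voting
    is an assumption of this sketch, not the patented aggregation rule.
    """
    chunks = split_into_chunks(samples, chunk_size)
    if not chunks:
        return False
    votes = [chunk_detector(chunk) >= threshold for chunk in chunks]
    return sum(votes) > len(votes) / 2


def toy_detector(chunk):
    # Hypothetical placeholder: scores 1.0 when the chunk mean is positive.
    return 1.0 if sum(chunk) / len(chunk) > 0 else 0.0


# Three full chunks of four samples plus a one-sample tail that is dropped.
clip = [1.0] * 4 + [-1.0] * 4 + [1.0] * 4 + [0.5]
is_watermarked = classify_clip(clip, 4, toy_detector)
```

Voting over chunks means a single corrupted chunk does not flip the clip-level result, which is one plausible reason to classify in chunks.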

3.

Invention application
FACE-AWARE SCALE MAGNIFICATION VIDEO EFFECTS

Status: Granted (in force)

Publication (announcement) number: US20250140292A1

Publication (announcement) date: 2025-05-01

Application number: US18431103

Application date: 2024-02-02

Applicant: ADOBE INC.

Inventor: Anh Lan TRUONG, Deepali ANEJA, Hijung SHIN, Rubaiat HABIB, Jakub FISER, Kishore RADHAKRISHNA, Joel Richard BRANDT, Matthew David FISHER, Zeyu JIN, Kim Pascal PIMMEL, Wilmot LI, Lubomira Assenova DONTCHEVA

IPC: G11B27/036, G06V20/40, G06V40/16, H04N5/262

Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for cutting down a user's larger input video into an edited video comprising the most important video segments and applying corresponding video effects. Some embodiments of the present invention are directed to adding face-aware scale magnification to the trimmed video (e.g., applying scale magnification to simulate a camera zoom effect that hides shot cuts with respect to the subject's face). For example, as the trimmed video transitions from one video segment to the next video segment, a scale magnification may be applied that zooms in on a detected face at a boundary between the video segments to smooth the transition between video segments.
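
The zoom on a detected face can be pictured as choosing a crop rectangle around the face and resizing it to the full frame. The Python sketch below is one possible geometry, assuming the face detector supplies a center point in pixel coordinates and the crop keeps the frame's aspect ratio. The function `face_zoom_crop` and its parameters are hypothetical.

```python
def face_zoom_crop(frame_w, frame_h, face_cx, face_cy, scale):
    """Return (left, top, width, height) of a crop centered on the face.

    Resizing the crop back to frame_w x frame_h magnifies the picture by
    `scale`. The crop is shifted when needed so it stays inside the frame.
    """
    if scale < 1.0:
        raise ValueError("scale must be at least 1.0 for a zoom-in")
    crop_w = frame_w / scale
    crop_h = frame_h / scale
    # Center on the face, then clamp so the crop does not leave the frame.
    left = min(max(face_cx - crop_w / 2, 0.0), frame_w - crop_w)
    top = min(max(face_cy - crop_h / 2, 0.0), frame_h - crop_h)
    return (left, top, crop_w, crop_h)


# Face in the middle of a 1920x1080 frame, 2x magnification.
centered = face_zoom_crop(1920, 1080, 960, 540, 2.0)
# Face near the top-left corner: the crop is clamped to the frame edge.
cornered = face_zoom_crop(1920, 1080, 100, 100, 2.0)
```

Applying a slightly larger `scale` on the segment after a cut than on the segment before it would produce the camera-zoom impression the abstract mentions; the amount is left open here.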

4.

Invention publication
SPOKEN LANGUAGE RECOGNITION

Status: Pending (published)

Publication (announcement) number: US20240257798A1

Publication (announcement) date: 2024-08-01

Application number: US18104434

Application date: 2023-02-01

Applicant: ADOBE INC.

Inventor: Oriol NIETO-CABALLERO, Zeyu JIN, Justin Jonathan SALAMON, Franck DERNONCOURT

IPC: G10L15/00, G10L25/30

CPC classification number: G10L15/005, G10L25/30

Abstract: Some aspects of the technology described herein employ a neural network with an efficient and lightweight architecture to perform spoken language recognition. Given an audio signal comprising speech, features are generated from the audio signal, for instance, by converting the audio signal to a normalized spectrogram. The features are input to the neural network, which has one or more convolutional layers and an output activation layer. Each neuron of the output activation layer corresponds to a language from a set of languages and generates an activation value. Based on the activation values, an indication of zero or more languages from the set of languages is provided for the audio signal.
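
Because the output may name zero or more languages, each output neuron can be read as an independent yes/no score instead of one entry in a single-winner distribution. The Python sketch below illustrates that final step only, assuming a sigmoid activation and a fixed threshold of 0.5. Both choices, and the names used, are assumptions of this sketch.

```python
import math


def sigmoid(x):
    """Map a raw network output to an activation in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def detect_languages(logits, languages, threshold=0.5):
    """Return every language whose activation reaches the threshold.

    logits holds one raw output per language, in the same order as
    `languages`. The result may be empty or contain several languages.
    """
    if len(logits) != len(languages):
        raise ValueError("need exactly one logit per language")
    activations = [sigmoid(z) for z in logits]
    return [lang for lang, a in zip(languages, activations) if a >= threshold]


langs = ["en", "es", "fr"]
mixed = detect_languages([2.0, -3.0, 0.1], langs)    # two languages pass
silent = detect_languages([-1.0, -2.0, -3.0], langs)  # none pass
```

A softmax output would always sum to one and push toward a single language, so independent activations are the simpler fit for clips with no speech or with several languages.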

5.

Invention application
CAPTIONING USING GENERATIVE ARTIFICIAL INTELLIGENCE

Status: Granted (in force)

Publication (announcement) number: US20250139161A1

Publication (announcement) date: 2025-05-01

Application number: US18431134

Application date: 2024-02-02

Applicant: ADOBE INC.

Inventor: Deepali ANEJA, Zeyu JIN, Hijung SHIN, Anh Lan TRUONG, Dingzeyu LI, Hanieh DEILAMSALEHY, Rubaiat HABIB, Matthew David FISHER, Kim Pascal PIMMEL, Wilmot LI, Lubomira Assenova DONTCHEVA

IPC: G06F16/783, G06F16/738, G06V20/40, G06V40/16

Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for cutting down a user's larger input video into an edited video comprising the most important video segments and applying corresponding video effects. Some embodiments of the present invention are directed to adding captioning video effects to the trimmed video (e.g., applying face-aware and non-face-aware captioning to emphasize extracted video segment headings, important sentences, quotes, words of interest, extracted lists, etc.). For example, a prompt is provided to a generative language model to identify portions of a transcript (e.g., extracted scene summaries, important sentences, lists of items discussed in the video, etc.) to apply to corresponding video segments as captions depending on the type of caption (e.g., an extracted heading may be captioned at the start of a corresponding video segment, important sentences and/or extracted list items may be captioned when they are spoken).
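
The placement rule in the final example (headings at the start of a segment, sentences and list items when they are spoken) can be sketched in Python as follows. The dictionary layout, the type names, and the function `schedule_captions` are hypothetical, and the step that queries the generative language model is omitted.

```python
def schedule_captions(segment_start, captions):
    """Return (show_at_seconds, type, text) tuples sorted by display time.

    Each caption is a dict with 'type' and 'text'; every type other than
    'heading' also needs 'spoken_at', the time the words are spoken.
    """
    scheduled = []
    for cap in captions:
        if cap["type"] == "heading":
            # Headings appear as soon as the segment begins.
            show_at = segment_start
        else:
            # Sentences and list items appear when they are spoken.
            show_at = cap["spoken_at"]
        scheduled.append((show_at, cap["type"], cap["text"]))
    return sorted(scheduled, key=lambda item: item[0])


# Made-up captions for a segment that starts 10 seconds into the video.
captions = [
    {"type": "sentence", "text": "Always back up first.", "spoken_at": 14.5},
    {"type": "heading", "text": "Setup"},
    {"type": "list_item", "text": "Install the tool", "spoken_at": 12.0},
]
timeline = schedule_captions(10.0, captions)
```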

6.

Invention publication
HIGH FIDELITY AUDIO SUPER RESOLUTION

Status: Pending (published)

Publication (announcement) number: US20230162725A1

Publication (announcement) date: 2023-05-25

Application number: US17534221

Application date: 2021-11-23

Applicant: Adobe Inc., The Trustees of Princeton University

Inventor: Zeyu JIN, Jiaqi SU, Adam FINKELSTEIN

IPC: G10L15/16, G10L15/06, G06N3/04

CPC classification number: G10L15/16, G10L15/063, G06N3/0454

Abstract: Embodiments are disclosed for generating full-band audio from narrow-band audio using a GAN-based audio super resolution model. A method of generating full-band audio may include receiving narrow-band input audio data, upsampling the narrow-band input audio data to generate upsampled audio data, providing the upsampled audio data to an audio super resolution model, the audio super resolution model trained to perform bandwidth expansion from narrow-band to wide-band, and returning wide-band output audio data corresponding to the narrow-band input audio data.
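
The method steps (receive, upsample, run the model, return) can be sketched in Python as below. Linear interpolation is used only as a simple stand-in for the upsampling step, since the abstract does not name a resampling method, and `model` is a hypothetical callable in place of the trained GAN-based network.

```python
def upsample_linear(samples, factor):
    """Raise the sample rate by inserting linearly interpolated points.

    Output length is (len(samples) - 1) * factor + 1. A production
    resampler would normally use a band-limited filter instead.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out


def super_resolve(narrow_band, factor, model):
    """Upsample the input, then let the model fill in the missing band."""
    upsampled = upsample_linear(narrow_band, factor)
    return model(upsampled)


# Identity function as a placeholder model, to show the data flow only.
wide = super_resolve([0.0, 1.0, 0.0], 2, lambda audio: audio)
```

Upsampling first gives the model an input that already has the target sample rate, so the network only has to add the missing high-frequency content.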
