Patent search ap:("Apple Inc.") AND inv:"Sachin KAJAREKAR"

1.

Patent application
DEVICE ARBITRATION FOR DIGITAL ASSISTANT-BASED INTERCOM SYSTEMS (Granted)

Publication number: US20210350810A1

Publication date: 2021-11-11

Application number: US17073092

Filing date: 2020-10-16

Applicant: Apple Inc.

Inventor: Benjamin S. PHIPPS, Sachin KAJAREKAR, Eugene RAY, Mahesh Ramaray SHANBHAG, Kisun YOU, Patrick L. COFFMAN

IPC: G10L17/22, G10L17/18

Abstract: Systems and processes for operating an intercom system via a digital assistant are provided. The intercom system is trigger-free, in that users communicate, in real-time, via devices without employing a trigger to speak. Acoustic fingerprints are employed to associate users with devices. Acoustic fingerprints include vector embeddings of speech input in an acoustic-feature vector space. Speech heard at multiple devices, as embedded in a fingerprint, may be clustered in the vector space, and the structure of the clusters is employed to associate users and devices. Based on the fingerprints, a device is mapped to a user, and the user employs that device to participate in a conversation, via the intercom service.
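
The clustering idea in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the method claimed in the application: the greedy cosine-similarity clustering, the `energy` value attached to each observation, the rule of picking the device that heard a speaker most strongly, and all function names are assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_fingerprints(observations, threshold=0.9):
    """Greedily cluster (device_id, embedding, energy) observations.

    Assumption: one cluster corresponds to one speaker. An observation
    joins the most similar existing cluster if the similarity reaches
    `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for obs in observations:
        _, embedding, _ = obs
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(embedding, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": list(embedding), "members": [obs]})
        else:
            best["members"].append(obs)
            n = len(best["members"])
            # Running mean keeps the centroid representative of the cluster.
            best["centroid"] = [
                (c * (n - 1) + e) / n for c, e in zip(best["centroid"], embedding)
            ]
    return clusters


def map_users_to_devices(clusters):
    """Assign each cluster (user) to the device that heard it most strongly."""
    mapping = {}
    for index, cluster in enumerate(clusters):
        energy_by_device = defaultdict(float)
        for device_id, _, energy in cluster["members"]:
            energy_by_device[device_id] += energy
        mapping[f"user_{index}"] = max(energy_by_device, key=energy_by_device.get)
    return mapping


# Two speakers, each heard by both devices at different levels.
observations = [
    ("kitchen", [1.0, 0.0], 0.9),
    ("living_room", [0.98, 0.05], 0.3),
    ("living_room", [0.0, 1.0], 0.8),
    ("kitchen", [0.05, 0.99], 0.2),
]
user_device_map = map_users_to_devices(cluster_fingerprints(observations))
```

Here the same speaker is heard on two devices, the two embeddings fall into one cluster, and the louder device is chosen for that speaker. A real system would use learned speaker embeddings rather than the two-dimensional toy vectors above.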

2.

Patent application
USING TEXT FOR AVATAR ANIMATION (Granted)

Publication number: US20210248804A1

Publication date: 2021-08-12

Application number: US17153728

Filing date: 2021-01-20

Applicant: Apple Inc.

Inventor: Ahmed Serag El Din HUSSEN ABDELAZIZ, Justin BINDER, Sachin KAJAREKAR, Anushree PRASANNA KUMAR, Chloé Ann SEIVWRIGHT

IPC: G06T13/80, G06F40/20, G06N3/08, G06N3/04

Abstract: Systems and processes for animating an avatar are provided. An example process of animating an avatar includes, at an electronic device having one or more processors and memory: receiving text; determining an emotional state; and generating, using a neural network, a speech data set representing the received text and a set of parameters representing one or more movements of an avatar, based on the received text and the determined emotional state.
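
The data flow in this abstract (text plus an emotional state in, speech data plus avatar movement parameters out) might look roughly like the toy sketch below. Everything here is hypothetical: the class name, the emotion label set, the character-hashing text encoder, and the layer sizes are assumptions, and the weights are random placeholders rather than a trained model.

```python
import math
import random

EMOTIONS = ["neutral", "happy", "sad"]  # hypothetical label set


def make_layer(rng, n_in, n_out):
    """Random placeholder weights; a real network would be trained."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, inputs, activation=None):
    """Dense layer without bias, with an optional elementwise activation."""
    outputs = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return [activation(v) for v in outputs] if activation else outputs


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class TextToAvatarSketch:
    """Toy network with a shared layer and two output heads."""

    def __init__(self, text_dim=16, hidden=8, speech_dim=4, motion_dim=3, seed=0):
        rng = random.Random(seed)
        self.text_dim = text_dim
        self.shared = make_layer(rng, text_dim + len(EMOTIONS), hidden)
        self.speech_head = make_layer(rng, hidden, speech_dim)
        self.motion_head = make_layer(rng, hidden, motion_dim)

    def encode(self, text, emotion):
        """Hash characters into a fixed-size vector and append a one-hot emotion."""
        features = [0.0] * self.text_dim
        for ch in text.lower():
            features[ord(ch) % self.text_dim] += 1.0
        one_hot = [1.0 if label == emotion else 0.0 for label in EMOTIONS]
        return features + one_hot

    def generate(self, text, emotion):
        """Return (speech_data, movement_parameters) for the given inputs."""
        hidden = apply_layer(self.shared, self.encode(text, emotion), math.tanh)
        speech = apply_layer(self.speech_head, hidden)
        # Movement parameters are squashed into (0, 1), like blend weights.
        motion = apply_layer(self.motion_head, hidden, sigmoid)
        return speech, motion


model = TextToAvatarSketch()
speech_data, movement_params = model.generate("hello", "happy")
```

The point of the sketch is only that both outputs come from one forward pass conditioned on the text and the emotional state, so changing the emotion changes the movement parameters for the same text.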

3.

Patent application
MULTIMODAL APPROACH FOR AVATAR ANIMATION (Granted)

Publication number: US20210090314A1

Publication date: 2021-03-25

Application number: US16723866

Filing date: 2019-12-20

Applicant: Apple Inc.

Inventor: Ahmed Serag El Din HUSSEN ABDELAZIZ, Nicholas APOSTOLOFF, Justin BINDER, Paul Richard DIXON, Sachin KAJAREKAR, Reinhard KNOTHE, Sebastian MARTIN, Barry-John THEOBALD, Thibaut WEISE

IPC: G06T13/40, G06T13/20, G06F3/16, G06K9/00, G06N3/08, G06N3/04

Abstract: Systems and methods for animating an avatar are provided. An example method of animating an avatar includes, at an electronic device having one or more processors and memory: receiving an audio input; receiving a video input including at least a portion of a user's face, wherein the video input is separate from the audio input; determining one or more movements of the user's face based on the received audio input and the received video input; and generating, using a neural network separately trained with a set of audio training data and a set of video training data, a set of characteristics for controlling an avatar representing the one or more movements of the user's face.
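
One way to picture combining the two inputs is the sketch below. It deliberately swaps in a simple confidence-weighted blend in place of the neural network the abstract describes, so it illustrates only the fusion idea. The control names, the per-control confidence values, and the function name are assumptions.

```python
def fuse_face_movements(audio_estimate, video_estimate, video_confidence):
    """Blend per-control movement estimates from audio and video.

    Assumptions for this sketch: each estimate maps a control name to a
    value in [0, 1]; a control seen by only one modality is passed through
    unchanged (for example, when the video shows only part of the face);
    `video_confidence` gives the weight placed on video for shared
    controls, defaulting to an even split.
    """
    fused = {}
    for control in sorted(set(audio_estimate) | set(video_estimate)):
        audio_value = audio_estimate.get(control)
        video_value = video_estimate.get(control)
        if video_value is None:
            fused[control] = audio_value
        elif audio_value is None:
            fused[control] = video_value
        else:
            weight = video_confidence.get(control, 0.5)
            fused[control] = weight * video_value + (1.0 - weight) * audio_value
    return fused


# The mouth is partly hidden in the video, so audio dominates "jaw_open".
audio = {"jaw_open": 0.8, "lip_pucker": 0.2}
video = {"jaw_open": 0.4, "brow_raise": 0.6}
confidence = {"jaw_open": 0.25}
avatar_controls = fuse_face_movements(audio, video, confidence)
```

In this example the eyebrow control comes only from video, the lip control only from audio, and the jaw control is a blend weighted toward audio.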

4.

Patent application
REDUCING DEVICE PROCESSING OF UNINTENDED AUDIO (Granted)

Publication number: US20220093095A1

Publication date: 2022-03-24

Application number: US17123428

Filing date: 2020-12-16

Applicant: Apple Inc.

Inventor: Pranay DIGHE, Erik MARCHI, Srikanth VISHNUBHOTLA, Sachin KAJAREKAR, Devang K. NAIK

IPC: G10L15/22, G10L15/26, G10L15/30

Abstract: An example process includes: receiving an audio stream; determining a plurality of acoustic representations of the audio stream, where each acoustic representation of the plurality of acoustic representations corresponds to a respective frame of the audio stream; obtaining a respective plurality of scores indicating whether each respective frame of the audio stream is directed to an electronic device, where the obtaining includes: determining, using a triggering model operating on the electronic device, for each acoustic representation, a score indicating whether the respective frame of the audio stream is directed to the electronic device; determining, based on the respective plurality of scores, a likelihood that the audio stream is directed to the electronic device; determining whether the likelihood is above or below a threshold; and in response to determining that the likelihood is below the threshold, ceasing to process the audio stream.
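
The steps in this abstract (per-frame scores, an overall likelihood, a threshold test) can be sketched as below. This is a minimal sketch under stated assumptions: the triggering model is replaced by a toy energy-based stand-in, the likelihood is taken to be the mean of the frame scores, and the threshold value and function names are illustrative.

```python
from statistics import fmean


def toy_triggering_model(frame):
    """Stand-in for the on-device triggering model (assumption).

    Maps a frame's acoustic representation to a score in [0, 1] using
    mean absolute amplitude; a real model would be learned.
    """
    return min(1.0, sum(abs(x) for x in frame) / len(frame))


def score_frames(acoustic_frames, triggering_model):
    """One score per frame: is this frame directed at the device?"""
    return [triggering_model(frame) for frame in acoustic_frames]


def directed_likelihood(scores):
    """Aggregate frame scores into one likelihood (assumed: the mean)."""
    return fmean(scores) if scores else 0.0


def should_keep_processing(acoustic_frames, triggering_model, threshold=0.5):
    """Return False when the stream should stop being processed."""
    scores = score_frames(acoustic_frames, triggering_model)
    return directed_likelihood(scores) >= threshold


directed_stream = [[0.9, 0.8], [0.7, 0.9]]
background_stream = [[0.1, 0.0], [0.05, 0.05]]
keep_directed = should_keep_processing(directed_stream, toy_triggering_model)
keep_background = should_keep_processing(background_stream, toy_triggering_model)
```

With these toy inputs the first stream stays above the threshold and continues to be processed, while the second falls below it and is dropped early, which is the saving the title refers to.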

5.

Patent application
ROBUST END-POINTING OF SPEECH SIGNALS USING SPEAKER RECOGNITION (Under examination, published)

Publication number: US20150371665A1

Publication date: 2015-12-24

Application number: US14701147

Filing date: 2015-04-30

Applicant: Apple Inc.

Inventor: Devang K. NAIK, Sachin KAJAREKAR

IPC: G10L25/87, G10L17/22

CPC classification number: G10L25/87, G10L17/00, G10L17/22, G10L25/78

Abstract: Systems and processes for robust end-pointing of speech signals using speaker recognition are provided. In one example process, a stream of audio having a spoken user request can be received. A first likelihood that the stream of audio includes user speech can be determined. A second likelihood that the stream of audio includes user speech spoken by an authorized user can be determined. A start-point or an end-point of the spoken user request can be determined based at least in part on the first likelihood and the second likelihood.
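
A rough sketch of end-pointing from the two likelihoods is shown below. The abstract does not say how the likelihoods are combined, so the per-frame product, the threshold, the `hangover` count of inactive frames that closes the request, and the function name are all assumptions for illustration.

```python
def find_endpoints(speech_likelihoods, speaker_likelihoods, threshold=0.5, hangover=3):
    """Locate the start and end frame of a spoken request.

    speech_likelihoods[i]:  likelihood that frame i contains speech.
    speaker_likelihoods[i]: likelihood that frame i is the authorized user.
    A frame is active when the product of the two reaches `threshold`
    (assumed combination rule). The request ends once `hangover`
    consecutive inactive frames follow an active one.
    Returns (start_index, end_index), or None if no frame is active.
    """
    start = None
    last_active = None
    inactive_run = 0
    for index, (speech, speaker) in enumerate(
        zip(speech_likelihoods, speaker_likelihoods)
    ):
        if speech * speaker >= threshold:
            if start is None:
                start = index
            last_active = index
            inactive_run = 0
        elif start is not None:
            inactive_run += 1
            if inactive_run >= hangover:
                break
    if start is None:
        return None
    return start, last_active


# Frames 4 and 5 contain speech, but from someone other than the authorized user.
speech = [0.1, 0.9, 0.9, 0.9, 0.9, 0.9, 0.2, 0.1, 0.1, 0.9]
speaker = [0.1, 0.9, 0.9, 0.8, 0.2, 0.1, 0.1, 0.1, 0.1, 0.9]
endpoints = find_endpoints(speech, speaker)
```

In this example a detector based on speech likelihood alone would keep the request open through frames 4 and 5. Gating on the speaker likelihood closes it at frame 3, where the authorized user stops talking.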
