Patent search ap:("Apple Inc.") AND inv:"Sachin S. Kajarekar" Page 1

1.

Granted invention patent
Determining whether speech input is intended for a digital assistant (Status: Active)

Publication (announcement) number: US12190873B2

Publication (announcement) date: 2025-01-07

Application number: US17952005

Application date: 2022-09-23

Applicant: Apple Inc.

Inventor: Ahmed S. Hussen Abdelaziz, Saurabh Adya, Alexander W. Churchill, Pranay Dighe, Sachin S. Kajarekar, Chaitanya Mannemala, Erik Marchi, Seyedmahdad Mirsamadi, Ognjen Rudovic, Ahmed H. Tewfik, Barry-John Theobald, Srikanth Vishnubhotla

IPC: G10L15/22, G06T7/70, G06V40/16, G10L15/16, G10L15/197, G10L25/78, G10L15/08

Abstract: An example process includes: receiving a speech input representing a user utterance; determining, based on a textual representation of the speech input, a first score corresponding to a type of the user utterance; determining, based on the textual representation of the speech input, a second score representing a correspondence between the user utterance and a domain recognized by a digital assistant; determining, based on the first score and the second score, whether the speech input is intended for the digital assistant; in accordance with a determination that the speech input is intended for the digital assistant: initiating, by the digital assistant, a task based on the speech input; and providing an output indicative of the initiated task.
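The two-score decision in this abstract can be illustrated with a minimal sketch. The abstract does not say how the scores are combined, so the weighted average, the 0.5 threshold, and the function name below are assumptions made only for illustration.

```python
def is_intended_for_assistant(type_score: float,
                              domain_score: float,
                              type_weight: float = 0.5,
                              threshold: float = 0.5) -> bool:
    """Decide whether a speech input is intended for the assistant.

    type_score:   score for the type of the utterance (assumed in [0, 1]).
    domain_score: score for how well the utterance matches a domain the
                  assistant recognizes (assumed in [0, 1]).
    The weighted average and the threshold are hypothetical choices.
    """
    combined = type_weight * type_score + (1.0 - type_weight) * domain_score
    return combined >= threshold


if __name__ == "__main__":
    print(is_intended_for_assistant(0.9, 0.8))
    print(is_intended_for_assistant(0.1, 0.2))
```

Only when the function returns True would the assistant initiate a task and provide an output for it.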

2.

Granted invention patent
Training speaker recognition models for digital assistants (Status: Active)

Publication (announcement) number: US10789959B2

Publication (announcement) date: 2020-09-29

Application number: US15997174

Application date: 2018-06-04

Applicant: Apple Inc.

Inventor: Sachin S. Kajarekar

IPC: G10L17/22, G10L15/22, G10L17/06, G10L17/24, G10L17/04, G10L17/00

Abstract: Techniques for training a speaker recognition model used for interacting with a digital assistant are provided. In some examples, user authentication information is obtained at a first time. At a second time, a user utterance representing a user request is received. A voice print is generated from the user utterance. A determination is made as to whether a plurality of conditions are satisfied. The plurality of conditions includes a first condition that the user authentication information corresponds to one or more authentication credentials assigned to a registered user of an electronic device. The plurality of conditions further includes a second condition that the first time and the second time are not separated by more than a predefined time period. In accordance with a determination that the plurality of conditions are satisfied, a speaker profile assigned to the registered user is updated based on the voice print.
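The condition check described in this abstract can be sketched as follows. The class name, the 60-second default window, and the representation of a voice print as a list of floats are hypothetical; the abstract specifies only the two conditions and the resulting profile update.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpeakerProfile:
    """Hypothetical container for a registered user's voice prints."""
    voice_prints: List[List[float]] = field(default_factory=list)


def maybe_update_profile(profile: SpeakerProfile,
                         voice_print: List[float],
                         credentials_match: bool,
                         auth_time: float,
                         utterance_time: float,
                         max_gap_seconds: float = 60.0) -> bool:
    """Update the profile only if both conditions hold.

    Condition 1: the authentication information matched the registered
                 user's credentials (credentials_match).
    Condition 2: the authentication time and the utterance time are not
                 separated by more than max_gap_seconds (assumed value).
    Returns True if the profile was updated.
    """
    within_window = abs(utterance_time - auth_time) <= max_gap_seconds
    if credentials_match and within_window:
        profile.voice_prints.append(voice_print)
        return True
    return False
```

A call with a matching credential and a 30-second gap updates the profile; a 120-second gap or a failed credential check leaves it unchanged.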

3.

Invention patent application
SYSTEM AND METHOD OF PERFORMING AUTOMATIC SPEECH RECOGNITION USING END-POINTING MARKERS GENERATED USING ACCELEROMETER-BASED VOICE ACTIVITY DETECTOR (Status: Under examination, published)

Publication (announcement) number: US20170365249A1

Publication (announcement) date: 2017-12-21

Application number: US15188861

Application date: 2016-06-21

Applicant: Apple Inc.

Inventor: Sorin V. Dusan, Devang K. Naik, Sachin S. Kajarekar

IPC: G10L15/05, G10L21/0208, G10L15/30, G10L25/21, H04R1/10

CPC classification number: G10L15/05, G10L15/30, G10L25/21, G10L25/78, H04R1/1016, H04R3/005, H04R2201/403, H04R2410/01, H04R2420/07, H04R2430/20

Abstract: A method of performing automatic speech recognition (ASR) using end-pointing markers generated using an accelerometer-based voice activity detector starts with a voice activity detector (VAD) generating an accelerometer VAD output (VADa) based on data output by at least one accelerometer that is included in at least one earbud. The at least one accelerometer detects vibration of the user's vocal cords. A voice processor detects a speech signal based on acoustic signals from at least one microphone. An end-pointer generates the end-pointing markers based on the VADa output, and an ASR engine performs ASR on the speech signal based on the end-pointing markers. Other embodiments are also described.
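One way to picture the end-pointer step is as a scan over a frame-wise VADa output that records where speech starts and stops. The abstract does not describe the VADa format, so the binary per-frame sequence and the (start, end) marker pairs below are assumptions.

```python
from typing import List, Sequence, Tuple


def endpoint_markers(vad_frames: Sequence[int]) -> List[Tuple[int, int]]:
    """Turn a binary per-frame VAD output into (start, end) frame markers.

    vad_frames: 1 where the accelerometer-based VAD reports voice activity,
                0 otherwise (an assumed representation).
    Each marker is a half-open frame range [start, end).
    """
    markers = []
    start = None
    for index, active in enumerate(vad_frames):
        if active and start is None:
            start = index                    # speech begins
        elif not active and start is not None:
            markers.append((start, index))   # speech ends
            start = None
    if start is not None:                    # speech runs to the last frame
        markers.append((start, len(vad_frames)))
    return markers


if __name__ == "__main__":
    print(endpoint_markers([0, 1, 1, 0, 0, 1]))  # [(1, 3), (5, 6)]
```

An ASR engine could then restrict recognition to the frame ranges given by the markers.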

4.

Granted invention patent
Speaker identification and unsupervised speaker adaptation techniques (Status: Active)

Publication (announcement) number: US10127911B2

Publication (announcement) date: 2018-11-13

Application number: US14835169

Application date: 2015-08-25

Applicant: Apple Inc.

Inventor: Yoon Kim, Sachin S. Kajarekar

IPC: G10L15/26, G10L17/26, G10L17/04, G10L17/06, G10L15/18

Abstract: Systems and processes for generating a speaker profile for use in performing speaker identification for a virtual assistant are provided. One example process can include receiving an audio input including user speech and determining whether a speaker of the user speech is a predetermined user based on a speaker profile for the predetermined user. In response to determining that the speaker of the user speech is the predetermined user, the user speech can be added to the speaker profile and operation of the virtual assistant can be triggered. In response to determining that the speaker of the user speech is not the predetermined user, the user speech can be added to an alternate speaker profile and operation of the virtual assistant may not be triggered. In some examples, contextual information can be used to verify results produced by the speaker identification process.
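The routing of user speech between the two profiles can be sketched as below. The function name and the use of plain lists as profiles are hypothetical, and the speaker decision itself is taken as an input rather than computed.

```python
from typing import List


def route_user_speech(speech: str,
                      is_predetermined_user: bool,
                      speaker_profile: List[str],
                      alternate_profile: List[str]) -> bool:
    """Add speech to the matching profile and report whether to trigger.

    Returns True when the virtual assistant should be triggered, which in
    this sketch happens only for the predetermined user.
    """
    if is_predetermined_user:
        speaker_profile.append(speech)    # adapt the user's own profile
        return True
    alternate_profile.append(speech)      # keep other speakers separate
    return False
```

Because every utterance lands in one of the two profiles without labeled enrollment data, the adaptation is unsupervised in the sense used by the title.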

5.

Granted invention patent
Voice identification in digital assistant systems (Status: Active)

Publication (announcement) number: US11423898B2

Publication (announcement) date: 2022-08-23

Application number: US16815984

Application date: 2020-03-11

Applicant: Apple Inc.

Inventor: Stephen H. Shum, Corey J. Peterson, Sachin S. Kajarekar, Benjamin S. Phipps, Erik Marchi, Jessica Peck, Anumita Biswas, Chaitanya Mannemala

IPC: G10L15/22, G10L15/18, G10L17/14, G06F3/16, G06F21/32, G10L17/00

Abstract: Systems and processes for operating an intelligent automated assistant are provided. An example method includes receiving, from one or more external electronic devices, a plurality of speaker profiles for a plurality of users; receiving a natural language speech input; determining, based on comparing the natural language speech input to the plurality of speaker profiles: a first likelihood that the natural language speech input corresponds to a first user of the plurality of users; and a second likelihood that the natural language speech input corresponds to a second user of the plurality of users; determining whether the first likelihood and the second likelihood are within a first threshold; and in accordance with determining that the first likelihood and the second likelihood are not within the first threshold: providing a response to the natural language speech input, the response being personalized for the first user.
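The threshold test on the two likelihoods can be sketched as follows. Reading "within a first threshold" as an absolute difference between the likelihoods is an assumption, as are the function name and the default threshold of 0.1.

```python
def should_personalize_for_first_user(first_likelihood: float,
                                      second_likelihood: float,
                                      threshold: float = 0.1) -> bool:
    """Return True when the two likelihoods are NOT within the threshold.

    When the likelihoods are far enough apart, the speaker is treated as
    unambiguous and the response can be personalized for the first user.
    The absolute-difference comparison is a hypothetical interpretation.
    """
    return abs(first_likelihood - second_likelihood) > threshold


if __name__ == "__main__":
    print(should_personalize_for_first_user(0.9, 0.2))    # True
    print(should_personalize_for_first_user(0.55, 0.50))  # False
```

The abstract states only the branch taken when the likelihoods are not within the threshold; what happens in the ambiguous case is left out of this sketch.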

6.

Invention patent application
ACOUSTIC ENVIRONMENT AWARE STREAM SELECTION FOR MULTI-STREAM SPEECH RECOGNITION (Status: Under examination, published)

Publication (announcement) number: US20200312315A1

Publication (announcement) date: 2020-10-01

Application number: US16368403

Application date: 2019-03-28

Applicant: Apple Inc.

Inventor: Feipeng Li, Mehrez Souden, Joshua D. Atkins, John Bridle, Charles P. Clark, Stephen H. Shum, Sachin S. Kajarekar, Haiying Xia, Erik Marchi

IPC: G10L15/20

Abstract: An acoustic environment aware method for selecting a high quality audio stream during multi-stream speech recognition. A number of input audio streams are processed to determine if a voice trigger is detected, and if so a voice trigger score is calculated for each stream. An acoustic environment measurement is also calculated for each audio stream. The trigger score and acoustic environment measurement are combined for each audio stream, to select as a preferred audio stream the audio stream with the highest combined score. The preferred audio stream is output to an automatic speech recognizer. Other aspects are also described and claimed.
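The selection step in this abstract can be sketched as picking the stream with the highest combined score. The abstract does not give the combination rule, so the weighted sum, the weight value, and the dictionary layout below are assumptions.

```python
from typing import Dict, Tuple


def select_preferred_stream(streams: Dict[str, Tuple[float, float]],
                            env_weight: float = 0.5) -> str:
    """Pick the stream whose combined score is highest.

    streams:    maps a stream name to (trigger_score, env_measurement),
                both assumed to be "higher is better".
    env_weight: hypothetical weight of the acoustic environment
                measurement in the combined score.
    """
    def combined(item: Tuple[str, Tuple[float, float]]) -> float:
        trigger_score, env_measurement = item[1]
        return (1.0 - env_weight) * trigger_score + env_weight * env_measurement

    name, _ = max(streams.items(), key=combined)
    return name


if __name__ == "__main__":
    candidates = {"a": (0.9, 0.2), "b": (0.7, 0.8)}
    print(select_preferred_stream(candidates))  # b
```

With equal weights, stream "b" wins despite its lower trigger score because its acoustic environment measurement is much better; the selected stream would then be passed to the speech recognizer.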

7.

Granted invention patent
Speaker identification and unsupervised speaker adaptation techniques (Status: Active)

Publication (announcement) number: US10438595B2

Publication (announcement) date: 2019-10-08

Application number: US16155662

Application date: 2018-10-09

Applicant: Apple Inc.

Inventor: Yoon Kim, Sachin S. Kajarekar

IPC: G10L17/26, G10L17/04, G10L15/26, G10L17/06, G10L15/18

Abstract: Systems and processes for generating a speaker profile for use in performing speaker identification for a virtual assistant are provided. One example process can include receiving an audio input including user speech and determining whether a speaker of the user speech is a predetermined user based on a speaker profile for the predetermined user. In response to determining that the speaker of the user speech is the predetermined user, the user speech can be added to the speaker profile and operation of the virtual assistant can be triggered. In response to determining that the speaker of the user speech is not the predetermined user, the user speech can be added to an alternate speaker profile and operation of the virtual assistant may not be triggered. In some examples, contextual information can be used to verify results produced by the speaker identification process.

8.

Granted invention patent
Automatic accent detection using acoustic models (Status: Active)

Publication (announcement) number: US10255907B2

Publication (announcement) date: 2019-04-09

Application number: US14846650

Application date: 2015-09-04

Applicant: Apple Inc.

Inventor: Udhyakumar Nallasamy, Sachin S. Kajarekar, Matthias Paulik, Matthew Seigel

IPC: G10L15/07, G10L15/16, G10L15/10, G10L15/06, G10L25/51

Abstract: Systems and processes for automatic accent detection are provided. In accordance with one example, a method includes, at an electronic device with one or more processors and memory, receiving a user input, determining a first similarity between a representation of the user input and a first acoustic model of a plurality of acoustic models, and determining a second similarity between the representation of the user input and a second acoustic model of the plurality of acoustic models. The method further includes determining whether the first similarity is greater than the second similarity. In accordance with a determination that the first similarity is greater than the second similarity, the first acoustic model may be selected; and in accordance with a determination that the first similarity is not greater than the second similarity, the second acoustic model may be selected.
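The comparison between the two similarities can be sketched as below. The abstract does not say how similarity is measured, so cosine similarity over feature vectors is swapped in here purely as an assumption, along with the function names.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_acoustic_model(representation: Sequence[float],
                          first_model: Sequence[float],
                          second_model: Sequence[float]) -> str:
    """Select the first model only if its similarity is strictly greater.

    Each model is represented by a hypothetical reference vector. When the
    first similarity is not greater (including ties), the second model is
    selected, matching the two branches in the abstract.
    """
    first_similarity = cosine_similarity(representation, first_model)
    second_similarity = cosine_similarity(representation, second_model)
    return "first" if first_similarity > second_similarity else "second"
```

For example, a representation of [1.0, 0.0] compared against reference vectors [1.0, 0.1] and [0.0, 1.0] selects the first model.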

9.

Granted invention patent
Context-based endpoint detection (Status: Active)

Publication (announcement) number: US10186254B2

Publication (announcement) date: 2019-01-22

Application number: US14846667

Application date: 2015-09-04

Applicant: Apple Inc.

Inventor: Shaun E. Williams, Henry G. Mason, Mahesh Krishnamoorthy, Matthias Paulik, Neha Agrawal, Sachin S. Kajarekar, Selen Uguroglu, Ali S. Mohamed

IPC: G10L15/00, G10L17/00, G10L21/00, G10L15/04, G10L25/87, G10L17/02, G10L25/78

Abstract: The present disclosure generally relates to context-based endpoint detection in user speech input. A method for identifying an endpoint of a spoken request by a user may include receiving user input of natural language speech including one or more words; identifying at least one context associated with the user input; generating a probability, based on the at least one context associated with the user input, that a location in the user input is an endpoint; determining whether the probability is greater than a threshold; and in accordance with a determination that the probability is greater than the threshold, identifying the location in the user input as the endpoint.
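The threshold test on the endpoint probability can be sketched as follows. How the probability is generated from context is not covered here; the per-location probability list, the 0.8 threshold, and the function name are assumptions.

```python
from typing import Optional, Sequence


def find_endpoint(location_probabilities: Sequence[float],
                  threshold: float = 0.8) -> Optional[int]:
    """Return the first location whose endpoint probability exceeds the threshold.

    location_probabilities: one context-derived probability per candidate
                            location in the user input (assumed given).
    Returns None when no location exceeds the threshold.
    """
    for location, probability in enumerate(location_probabilities):
        if probability > threshold:   # strictly greater, as in the abstract
            return location
    return None


if __name__ == "__main__":
    print(find_endpoint([0.1, 0.4, 0.9, 0.95]))  # 2
    print(find_endpoint([0.1, 0.8]))             # None
```

A probability exactly equal to the threshold does not count as an endpoint in this sketch, because the abstract requires the probability to be greater than the threshold.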
