-
Publication No.: US11557293B2
Publication Date: 2023-01-17
Application No.: US17321994
Filing Date: 2021-05-17
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Matthew Sharifi, Ondrej Skopek, Justin Lu, Daniel Valcarce, Kevin Kilgour, Mohamad Hassan Rom, Nicolo D'Ercole, Michael Golikov
Abstract: Some implementations process, using warm word model(s), a stream of audio data to determine a portion of the audio data that corresponds to particular word(s) and/or phrase(s) (e.g., a warm word) associated with an assistant command, process, using an automatic speech recognition (ASR) model, a preamble portion of the audio data (e.g., that precedes the warm word) and/or a postamble portion of the audio data (e.g., that follows the warm word) to generate ASR output, and determine, based on processing the ASR output, whether a user intended the assistant command to be performed. Additional or alternative implementations can process the stream of audio data using a speaker identification (SID) model to determine whether the audio data is sufficient to identify the user that provided a spoken utterance captured in the stream of audio data, and determine if that user is authorized to cause performance of the assistant command.
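As a rough illustration of the gating idea in this abstract, the sketch below decides whether a detected warm word should trigger its command by looking at the ASR text before and after it. The abstract does not say how intent is determined; the "warm word must stand alone" rule, the `WarmWordHit` type, and the function name are hypothetical stand-ins, not the patented method.

```python
from dataclasses import dataclass


@dataclass
class WarmWordHit:
    """A detected warm word plus ASR text for the surrounding audio (hypothetical type)."""
    warm_word: str
    preamble_text: str   # ASR output for audio preceding the warm word
    postamble_text: str  # ASR output for audio following the warm word


def intended_as_command(hit: WarmWordHit) -> bool:
    """Toy rule (an assumption): the warm word counts as a command only
    when no other speech surrounds it."""
    return not hit.preamble_text.strip() and not hit.postamble_text.strip()
```

Under this toy rule a bare "stop" would trigger the command, while "I'll stop by later" would not, because the preamble and postamble are non-empty.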
-
Publication No.: US20220366911A1
Publication Date: 2022-11-17
Application No.: US17337804
Filing Date: 2021-06-03
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Krishna Sapkota, Behshad Behzadi, Julia Proskurnia, Jacopo Sannazzaro Natta, Justin Lu, Magali Boizot-Roche, Márius Sajgalík, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations—without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements. In this way, the application can infer content arrangement operations from a spoken utterance that only specifies the textual content.
-
Publication No.: US20230061929A1
Publication Date: 2023-03-02
Application No.: US17532315
Filing Date: 2021-11-22
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Antonio Gaetani, Bastiaan Van Eeckhoudt, Daniel Valcarce, Michael Golikov, Justin Lu, Ondrej Skopek, Nicolo D'Ercole, Zaheed Sabur, Behshad Behzadi, Luv Kothari
IPC: G10L17/22
Abstract: Implementations described herein relate to configuring a dynamic warm word button, that is associated with a client device, with particular assistant commands based on detected occurrences of warm word activation events at the client device. In response to detecting an occurrence of a given warm word activation event at the client device, implementations can determine whether user verification is required for a user that actuated the warm word button. Further, in response to determining that the user verification is required for the user that actuated the warm word button, the user verification can be performed. Moreover, in response to determining that the user that actuated the warm word button has been verified, implementations can cause an automated assistant to perform the particular assistant command associated with the warm word activation event. Audio-based and/or non-audio-based techniques can be utilized to perform the user verification.
-
Publication No.: US20220366910A1
Publication Date: 2022-11-17
Application No.: US17322765
Filing Date: 2021-05-17
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Alvin Abdagic, Behshad Behzadi, Jacopo Sannazzaro Natta, Julia Proskurnia, Krzysztof Andrzej Goj, Srikanth Pandiri, Viesturs Zarins, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
IPC: G10L15/26, G10L15/22, G10L15/18, G06F3/0488, G06N20/00
Abstract: Systems and methods described herein relate to determining whether to incorporate recognized text, that corresponds to a spoken utterance of a user of a client device, into a transcription displayed at the client device, or to cause an assistant command, that is associated with the transcription and that is based on the recognized text, to be performed by an automated assistant implemented by the client device. The spoken utterance is received during a dictation session between the user and the automated assistant. Implementations can process, using automatic speech recognition model(s), audio data that captures the spoken utterance to generate the recognized text. Further, implementations can determine whether to incorporate the recognized text into the transcription or cause the assistant command to be performed based on touch input being directed to the transcription, a state of the transcription, and/or audio-based characteristic(s) of the spoken utterance.
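The dictate-or-command decision in this abstract can be pictured with a small routing function over the three signal types it names: touch input on the transcription, the transcription's state, and an audio-based characteristic. This is only a sketch under assumptions: the phrase list, the one-second pause threshold, and all names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

# Hypothetical phrases that could be either dictated text or an assistant command.
COMMAND_PHRASES = {"send it", "delete that", "clear"}


@dataclass
class DictationSignals:
    recognized_text: str          # ASR output for the spoken utterance
    touch_on_transcription: bool  # touch input is directed at the transcription
    transcription_empty: bool     # state of the transcription
    pause_before_seconds: float   # audio-based characteristic (assumed: leading silence)


def route(signals: DictationSignals) -> str:
    """Return "command" or "dictate" for the recognized text (toy heuristic)."""
    text = signals.recognized_text.strip().lower()
    if text not in COMMAND_PHRASES:
        return "dictate"
    if signals.transcription_empty:
        return "dictate"  # nothing in the transcription for a command to act on
    if signals.touch_on_transcription or signals.pause_before_seconds >= 1.0:
        return "command"
    return "dictate"
```

A real system would presumably weigh these signals with a learned model rather than fixed rules; the fixed thresholds here only make the control flow visible.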
-
Publication No.: US12106758B2
Publication Date: 2024-10-01
Application No.: US17322765
Filing Date: 2021-05-17
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Alvin Abdagic, Behshad Behzadi, Jacopo Sannazzaro Natta, Julia Proskurnia, Krzysztof Andrzej Goj, Srikanth Pandiri, Viesturs Zarins, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
CPC classification number: G10L15/26, G06F3/0488, G06N20/00, G10L15/18, G10L15/22, G10L2015/223
Abstract: Systems and methods described herein relate to determining whether to incorporate recognized text, that corresponds to a spoken utterance of a user of a client device, into a transcription displayed at the client device, or to cause an assistant command, that is associated with the transcription and that is based on the recognized text, to be performed by an automated assistant implemented by the client device. The spoken utterance is received during a dictation session between the user and the automated assistant. Implementations can process, using automatic speech recognition model(s), audio data that captures the spoken utterance to generate the recognized text. Further, implementations can determine whether to incorporate the recognized text into the transcription or cause the assistant command to be performed based on touch input being directed to the transcription, a state of the transcription, and/or audio-based characteristic(s) of the spoken utterance.
-
Publication No.: US12033637B2
Publication Date: 2024-07-09
Application No.: US17337804
Filing Date: 2021-06-03
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Krishna Sapkota, Behshad Behzadi, Julia Proskurnia, Jacopo Sannazzaro Natta, Justin Lu, Magali Boizot-Roche, Márius Šajgalík, Nicolo D'Ercole, Zaheed Sabur, Luv Kothari
CPC classification number: G10L15/26, G10L15/22, G10L2015/223
Abstract: Implementations described herein relate to an application and/or automated assistant that can identify arrangement operations to perform for arranging text during speech-to-text operations—without a user having to expressly identify the arrangement operations. In some instances, a user that is dictating a document (e.g., an email, a text message, etc.) can provide a spoken utterance to an application in order to incorporate textual content. However, in some of these instances, certain corresponding arrangements are needed for the textual content in the document. The textual content that is derived from the spoken utterance can be arranged by the application based on an intent, vocalization features, and/or contextual features associated with the spoken utterance and/or a type of the application associated with the document, without the user expressly identifying the corresponding arrangements. In this way, the application can infer content arrangement operations from a spoken utterance that only specifies the textual content.
-
Publication No.: US20230402035A1
Publication Date: 2023-12-14
Application No.: US18238898
Filing Date: 2023-08-28
Applicant: GOOGLE LLC
Inventor: Andrea Terwisscha van Scheltinga, Nicolo D'Ercole, Zaheed Sabur, Bibo Xu, Megan Knight, Alvin Abdagic, Jan Lamecki, Bo Zhang
CPC classification number: G10L15/22, G10L15/083, G06F3/167, G10L2015/223
Abstract: Determining whether, upon cessation of a second automated assistant session that interrupted and supplanted a prior first automated assistant session: (1) to automatically resume the prior first automated assistant session, or (2) to transition to an alternative automated assistant state in which the prior first session is not automatically resumed. Implementations further relate to selectively causing, based on the determining and upon cessation of the second automated assistant session, either the automatic resumption of the prior first automated assistant session that was interrupted, or the transition to the state in which the first session is not automatically resumed.
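The two-way choice described in this abstract (resume the interrupted first session, or move to a state where it is not resumed) might be sketched as below. The abstract does not list the criteria used, so the two inputs, the 120-second default, and every name here are hypothetical and serve only to show the shape of the decision.

```python
from dataclasses import dataclass
from enum import Enum


class NextState(Enum):
    RESUME_FIRST_SESSION = "resume"
    ALTERNATIVE_STATE = "alternative"  # first session is not automatically resumed


@dataclass
class InterruptionInfo:
    # Hypothetical signals; the abstract does not specify what is considered.
    first_session_is_ongoing_content: bool  # e.g. a multi-step walkthrough vs. a one-off answer
    interruption_seconds: float             # how long the second session lasted


def on_second_session_end(info: InterruptionInfo,
                          max_gap_seconds: float = 120.0) -> NextState:
    """Pick the state to enter once the interrupting (second) session ends."""
    if info.first_session_is_ongoing_content and info.interruption_seconds <= max_gap_seconds:
        return NextState.RESUME_FIRST_SESSION
    return NextState.ALTERNATIVE_STATE
```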
-
Publication No.: US11783832B2
Publication Date: 2023-10-10
Application No.: US17552887
Filing Date: 2021-12-16
Applicant: Google LLC
Inventor: Andrea Terwisscha van Scheltinga, Nicolo D'Ercole, Zaheed Sabur, Bibo Xu, Megan Knight, Alvin Abdagic, Jan Lamecki, Bo Zhang
CPC classification number: G10L15/22, G06F3/167, G10L15/083, G10L2015/223
Abstract: Determining whether, upon cessation of a second automated assistant session that interrupted and supplanted a prior first automated assistant session: (1) to automatically resume the prior first automated assistant session, or (2) to transition to an alternative automated assistant state in which the prior first session is not automatically resumed. Implementations further relate to selectively causing, based on the determining and upon cessation of the second automated assistant session, either the automatic resumption of the prior first automated assistant session that was interrupted, or the transition to the state in which the first session is not automatically resumed.
-
Publication No.: US11217247B2
Publication Date: 2022-01-04
Application No.: US16618920
Filing Date: 2019-05-01
Applicant: Google LLC
Inventor: Andrea Terwisscha van Scheltinga, Nicolo D'Ercole, Zaheed Sabur, Bibo Xu, Megan Knight, Alvin Abdagic, Jan Lamecki, Bo Zhang
Abstract: Determining whether, upon cessation of a second automated assistant session that interrupted and supplanted a prior first automated assistant session: (1) to automatically resume the prior first automated assistant session, or (2) to transition to an alternative automated assistant state in which the prior first session is not automatically resumed. Implementations further relate to selectively causing, based on the determining and upon cessation of the second automated assistant session, either the automatic resumption of the prior first automated assistant session that was interrupted, or the transition to the state in which the first session is not automatically resumed.
-
Publication No.: US20210065701A1
Publication Date: 2021-03-04
Application No.: US16618920
Filing Date: 2019-05-01
Applicant: Google LLC
Inventor: Andrea Terwisscha van Scheltinga, Nicolo D'Ercole, Zaheed Sabur, Bibo Xu, Megan Knight, Alvin Abdagic, Jan Lamecki, Bo Zhang
Abstract: Determining whether, upon cessation of a second automated assistant session that interrupted and supplanted a prior first automated assistant session: (1) to automatically resume the prior first automated assistant session, or (2) to transition to an alternative automated assistant state in which the prior first session is not automatically resumed. Implementations further relate to selectively causing, based on the determining and upon cessation of the second automated assistant session, either the automatic resumption of the prior first automated assistant session that was interrupted, or the transition to the state in which the first session is not automatically resumed.