-
Publication Number: US12225338B2
Publication Date: 2025-02-11
Application Number: US18103171
Filing Date: 2023-01-30
Applicant: Samsung Electronics Co., Ltd.
Inventor: Hwa-Sung Kim, Dong Hyun Sohn, Ji-Eun Lee
Abstract: A home appliance includes an electrical equipment compartment disposed in an upper portion of the home appliance, and including an upper side that is open, an electrical equipment compartment cover to cover the open upper side of the electrical equipment compartment, and including a speaker hole, a microphone accommodating portion which protrudes upward from an upper side of the electrical equipment compartment cover and including an accommodating space and a front portion that includes microphone holes laterally spaced apart from each other and which face toward a front of the home appliance, a microphone unit including a printed circuit board (PCB) disposed in the accommodation space behind the microphone holes and including microphone chips mounted on the PCB, and a speaker unit disposed in the electrical equipment compartment to correspond to the speaker hole.
-
Publication Number: US12211506B2
Publication Date: 2025-01-28
Application Number: US18501786
Filing Date: 2023-11-03
Applicant: Apple Inc.
Inventor: Timothy J. Millet, Manu Gulati, Michael F. Culbert
IPC: G10L15/22, G06F1/32, G06F1/3228, G06F1/3287, G06F3/16, G10L13/00, G10L15/00, G10L15/28, G10L15/08, G10L25/48
Abstract: In an embodiment, an integrated circuit may include one or more CPUs, a memory controller, and a circuit configured to remain powered on when the rest of the SOC is powered down. The circuit may be configured to receive audio samples from a microphone, and match those audio samples against a predetermined pattern to detect a possible command from a user of the device that includes the SOC. In response to detecting the predetermined pattern, the circuit may cause the memory controller to power up so that audio samples may be stored in the memory to which the memory controller is coupled. The circuit may also cause the CPUs to be powered on and initialized, and the operating system (OS) may boot. During the time that the CPUs are initializing and the OS is booting, the circuit and the memory may be capturing the audio samples.
-
Publication Number: US12190889B2
Publication Date: 2025-01-07
Application Number: US18316427
Filing Date: 2023-05-12
Applicant: Sorenson IP Holdings, LLC
Inventor: Michael Holm, Jasper C. Pan
IPC: G10L15/26, G10L15/01, G10L15/18, G10L15/28, G10L15/30, H04M1/247, H04M1/253, H04M3/42, H04M7/00, H04M1/72412
Abstract: A system is provided that includes a first network interface for a first network type and a second network interface for a second network type that is different from the first network type. The system also includes at least one processor configured to cause the system to perform operations. The operations may include obtaining, from the first network interface, audio from a communication session with a remote device established over the first network and obtaining an indication of a communication device available to participate in the communication session and direct audio obtained from the communication session to a remote transcription system. The operations may also include directing the audio to the second network interface for transmission to the communication device, obtaining transcript data from the remote transcription system based on the audio, and directing the transcript data to the second network interface for transmission to the communication device.
-
Publication Number: US20240361982A1
Publication Date: 2024-10-31
Application Number: US18765101
Filing Date: 2024-07-05
Applicant: GOOGLE LLC
Inventor: Joseph Lange, Marcin Nowak-Przygodzki
CPC classification number: G06F3/167, G10L15/22, G10L15/28, G10L2015/223
Abstract: Implementations set forth herein relate to an automated assistant that can provide a selectable action intent suggestion when a user is accessing a third party application that is controllable via the automated assistant. The action intent can be initialized by the user without explicitly invoking the automated assistant using, for example, an invocation phrase (e.g., “Assistant . . . ”). Rather, the user can initialize performance of the corresponding action by identifying one or more action parameters. In some implementations, the selectable suggestion can indicate that a microphone is active for the user to provide a spoken utterance that identifies a parameter(s). When the action intent is initialized in response to the spoken utterance from the user, the automated assistant can control the third party application according to the action intent and any identified parameter(s).
-
Publication Number: US20240347047A1
Publication Date: 2024-10-17
Application Number: US18298488
Filing Date: 2023-04-11
Applicant: c/o Nuance Communications, Inc.
Inventor: Felix Weninger, Marco Gaudesi, Puming Zhan
Abstract: A method, computer program product, and computing system for dividing a speech signal into a plurality of chunks. A context window is defined for processing a chunk of the plurality of chunks using a neural network of a speech processing system. A processing load associated with the speech processing system is determined. The context window is dynamically adjusted based upon, at least in part, the processing load associated with the speech processing system.
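The chunking and load-dependent context window described in this abstract can be illustrated with a minimal sketch. The abstract does not say how load maps to window size, so the linear policy, the function names `split_into_chunks` and `adjust_context_window`, and the `min_window`/`max_window` bounds are all assumptions made here for illustration.

```python
def split_into_chunks(samples, chunk_size):
    """Divide a speech signal (a list of samples) into fixed-size chunks."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def adjust_context_window(load, min_window=2, max_window=16):
    """Hypothetical policy: shrink the context window linearly as load rises.

    `load` is a processing-load estimate in [0, 1]; values outside that
    range are clamped. The bounds are illustrative, not from the patent.
    """
    load = min(max(load, 0.0), 1.0)
    return round(max_window - load * (max_window - min_window))


def context_for(chunks, index, window):
    """Return the chunk at `index` plus up to `window` preceding chunks."""
    start = max(0, index - window)
    return chunks[start:index + 1]
```

Under this assumed policy, an idle system processes each chunk with the widest context, while a saturated one falls back to the narrowest, trading accuracy for latency.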
-
Publication Number: US12080276B2
Publication Date: 2024-09-03
Application Number: US18188238
Filing Date: 2023-03-22
Applicant: Google LLC
Inventor: Matthew Sharifi, Aleksandar Kracun
CPC classification number: G10L15/06, G10L15/16, G10L15/22, G10L15/28, G10L25/90, G10L2015/088, G10L2025/783
Abstract: A method for optimizing speech recognition includes receiving a first acoustic segment characterizing a hotword detected by a hotword detector in streaming audio captured by a user device, extracting one or more hotword attributes from the first acoustic segment, and adjusting, based on the one or more hotword attributes extracted from the first acoustic segment, one or more speech recognition parameters of an automated speech recognition (ASR) model. After adjusting the speech recognition parameters of the ASR model, the method also includes processing, using the ASR model, a second acoustic segment to generate a speech recognition result. The second acoustic segment characterizes a spoken query/command that follows the first acoustic segment in the streaming audio captured by the user device.
-
Publication Number: US20240282305A1
Publication Date: 2024-08-22
Application Number: US18649054
Filing Date: 2024-04-29
Applicant: GOOGLE LLC
Inventor: Victor Carbune, Matthew Sharifi
CPC classification number: G10L15/22, G06F16/63, G10L15/18, G10L15/28, G10L2015/228
Abstract: Systems and methods for providing audio data, from an initially invoked automated assistant to a subsequently invoked automated assistant. An initially invoked automated assistant may be invoked by a user utterance, followed by audio data that includes a query. The query is provided to a secondary automated assistant for processing. Subsequently, the user can submit a query that is related to the first query. In response, the initially invoked automated assistant provides the query to the secondary automated assistant in lieu of providing the query to other secondary automated assistants based on similarity between the first query and the subsequent query.
-
Publication Number: US12033632B2
Publication Date: 2024-07-09
Application Number: US17701387
Filing Date: 2022-03-22
Applicant: Amazon Technologies, Inc.
Inventor: Joseph White, Ravi Kiran Rachakonda, Vinodth Kumar Mohanam, Lalithkumar Rajendran, Deepak Uttam Shah, Maziyar Khorasani, Venkata Snehith Cherukuri
CPC classification number: G10L15/22, G10L15/1815, G10L15/28, G10L25/84, G10L2015/223
Abstract: This disclosure describes, in part, context-based device arbitration techniques to select a voice-enabled device from multiple voice-enabled devices to provide a response to a command included in a speech utterance of a user. In some examples, the context-driven arbitration techniques may include determining a ranked list of voice-enabled devices that are ranked based on audio signal metric values for audio signals generated by each voice-enabled device, and iteratively moving through the list to determine, based on device states of the voice-enabled devices, whether one of the voice-enabled devices can perform an action responsive to the command. If the voice-enabled devices that detected the speech utterance are unable to perform the action responsive to the command, all other voice-enabled devices associated with an account may be analyzed to determine whether one of the other voice-enabled devices can perform the action responsive to the command in the speech utterance.
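The arbitration flow in this abstract (rank the devices that heard the utterance by audio signal metric, walk the ranked list checking device state, then fall back to other devices on the account) might be sketched as follows. The `Device` fields, the single-number `signal_metric`, and the boolean `can_perform` standing in for a device-state check are assumptions for illustration, not details from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    name: str
    signal_metric: float          # hypothetical audio signal metric; higher is better
    can_perform: bool             # stand-in for "device state allows the action"
    heard_utterance: bool = True  # False for other devices on the same account


def arbitrate(devices: List[Device]) -> Optional[Device]:
    """Pick the device that should respond to a spoken command."""
    # Step 1: rank the devices that detected the utterance by signal metric.
    ranked = sorted(
        (d for d in devices if d.heard_utterance),
        key=lambda d: d.signal_metric,
        reverse=True,
    )
    # Step 2: iterate through the ranked list and take the first capable device.
    for device in ranked:
        if device.can_perform:
            return device
    # Step 3: fall back to the account's other devices that did not hear it.
    for device in devices:
        if not device.heard_utterance and device.can_perform:
            return device
    return None
```

Returning `None` when no device qualifies leaves the caller free to decide how to report the failure to the user.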
-
Publication Number: US12020683B2
Publication Date: 2024-06-25
Application Number: US17513335
Filing Date: 2021-10-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Tapan Bohra, Akshay Mallipeddi, Amit Srivastava, Ana Karen Parra
IPC: G10L13/08, G10L13/04, G10L15/08, G10L15/187, G10L15/28
CPC classification number: G10L13/08, G10L13/04, G10L15/083, G10L15/187, G10L15/285
Abstract: A real-time name mispronunciation detection feature can enable a user to receive instant feedback anytime they have mispronounced another person's name in an online meeting. The feature can receive audio input of a speaker and obtain a transcript of the audio input; identify a name from text of the transcript based on names of meeting participants; and extract a portion of the audio input corresponding to the name identified from the text of the transcript. The feature can obtain a reference pronunciation for the name using a user identifier associated with the name; and can obtain a pronunciation score for the name based on a comparison between the reference pronunciation for the name and the portion of the audio input corresponding to the name. The feature can then determine whether the pronunciation score is below a threshold; and in response, notify the speaker of a pronunciation error.
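The final steps of this abstract (score the spoken name against a reference pronunciation, then notify the speaker when the score falls below a threshold) can be sketched in a few lines. The abstract does not say how the comparison is made; this sketch assumes both pronunciations are already available as phoneme sequences and swaps in a generic sequence-similarity ratio from Python's standard `difflib`. The function names and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def pronunciation_score(reference, spoken):
    """Similarity in [0, 1] between two phoneme sequences (assumed inputs)."""
    return SequenceMatcher(None, reference, spoken).ratio()


def is_mispronounced(reference, spoken, threshold=0.8):
    """True when the score is below the (illustrative) threshold."""
    return pronunciation_score(reference, spoken) < threshold


def find_name(transcript_words, participant_names):
    """Return the first transcript word that matches a meeting participant."""
    lowered = {name.lower() for name in participant_names}
    for word in transcript_words:
        if word.lower() in lowered:
            return word
    return None
```

A real system would compare acoustic features rather than symbolic phonemes; the symbolic form is used here only to keep the threshold logic visible.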
-
Publication Number: US11990126B2
Publication Date: 2024-05-21
Application Number: US17750983
Filing Date: 2022-05-23
Applicant: Google LLC
Inventor: Raunaq Shah, Matt Van Der Staay
IPC: G10L15/22, G06F3/16, G10L15/28, G10L15/30, H04M1/27, H04M3/493, H04N21/20, H04N21/239, H04N21/40, H04N21/41, H04N21/4147, H04N21/422, H04N21/47, H04N21/4722, H04N21/45, H04N21/475
CPC classification number: G10L15/22, G06F3/167, G10L15/28, G10L15/30, H04M1/271, H04M3/493, H04N21/20, H04N21/2393, H04N21/40, H04N21/4104, H04N21/4112, H04N21/4147, H04N21/42203, H04N21/42204, H04N21/47, H04N21/4722, G10L2015/223, H04N21/42206, H04N21/4532, H04N21/4751
Abstract: A method is implemented to move media content display between two media output devices. A server system determines in a voice message recorded by an electronic device a media transfer request that includes a user voice command to transfer media content to a destination media output device and a user voice designation of the destination media output device. The server system then obtains from a source cast device instant media play information including information of a media play application, the media content that is being played, and a temporal position. The server system further identifies a destination cast device associated in a user domain coupled to the destination media output device, and sends to the destination cast device a media play request including the instant media play information, thereby enabling the destination cast device to execute the media play application for playing the media content from the temporal location.