-
Publication No.: US12111657B2
Publication date: 2024-10-08
Application No.: US17204410
Filing date: 2021-03-17
IPC classification: G05D1/00, B64U10/14, B64U101/30, G10L25/48, H04R1/32
CPC classification: G05D1/005, G10L25/48, H04R1/32, B64U10/14, B64U2101/30, B64U2201/00, B64U2201/104, H04R2440/00
Abstract: An unmanned aerial vehicle includes: a sensor including at least a microphone that generates sound data; and a processor. The processor determines a quality of a target sound by using the sound data generated by the microphone, acquires a positional relationship between the unmanned aerial vehicle and a sound source of the target sound by using data generated by the sensor, determines a destination to which the sound source is to move based on the quality of the target sound and the positional relationship, and presents target movement information that prompts the sound source to move toward the destination.
-
Publication No.: US20240296835A1
Publication date: 2024-09-05
Application No.: US18664348
Filing date: 2024-05-15
Applicant: Google LLC
Inventors: Jason Sanders, Gabriel Taubman, John J. Lee
IPC classification: G10L15/08, G06F16/683, G10L15/18, G10L15/22, G10L15/26, G10L21/0208, G10L21/0272, G10L25/48, H04M3/493
CPC classification: G10L15/08, G06F16/685, G10L15/1815, G10L15/22, G10L15/26, G10L21/0272, G10L25/48, H04M3/4936, G10L2015/225, G10L21/0208, H04M2201/40, H04M2203/352
Abstract: Implementations relate to techniques for providing context-dependent search results. A computer-implemented method includes receiving an audio stream at a computing device during a time interval, the audio stream comprising user speech data and background audio, separating the audio stream into a first substream that includes the user speech data and a second substream that includes the background audio, identifying concepts related to the background audio, generating a set of terms related to the identified concepts, influencing a speech recognizer based on at least one of the terms related to the background audio, and obtaining a recognized version of the user speech data using the speech recognizer.
-
Publication No.: US20240289840A1
Publication date: 2024-08-29
Application No.: US18660785
Filing date: 2024-05-10
Applicant: YAHOO AD TECH LLC
Inventors: Yugandhar Reddy Boyapally, Janith Kaiprath Valiyalappil, Sreeram Ramji, Rajesh Lalwani, Tianyuan Zhang
IPC classification: G06Q30/0251, G06F40/295, G10L25/03, G10L25/48
CPC classification: G06Q30/0251, G06F40/295, G10L25/03, G10L25/48
Abstract: The present teaching relates to a method and system for evaluating a conversion. The method extracts meta-information including a conversion parameter and a reward. The meta-information corresponds to a conversion associated with an advertisement displayed previously by a plurality of entities. The method receives a plurality of claims for the conversion from one or more entities, and selects a claim corresponding to an entity from the plurality of claims based on the conversion parameter and information included in the plurality of claims. Further, the method transmits information related to the selected claim.
-
Publication No.: US12073851B2
Publication date: 2024-08-27
Application No.: US17811868
Filing date: 2022-07-11
Applicant: BetterUp, Inc.
Inventors: Andrew Reece, Peter Bull, Gus Cooney, Casey Fitzpatrick, Gabriella Rosen Kellerman, Ryan Sonnek
IPC classification: G10L25/00, G06N5/04, G06N20/00, G06V10/764, G06V10/774, G10L15/04, G10L15/16, G10L15/22, G10L15/24, G10L25/48, G10L25/63, G06V40/16, G06V40/18
CPC classification: G10L25/48, G06N5/04, G06N20/00, G06V10/764, G06V10/774, G10L15/04, G10L15/16, G10L15/22, G10L15/24, G10L25/63, G06V40/174, G06V40/18
Abstract: Technology is provided for conversation analysis. The technology includes receiving multiple utterance representations, where each utterance representation represents a portion of a conversation performed by at least two users, and each utterance representation is associated with video data, acoustic data, and text data. The technology further includes generating a first utterance output by applying video data, acoustic data, and text data of the first utterance representation to a respective video processing part of the machine learning system to generate video, text, and acoustic-based outputs. A second utterance output is further generated for a second user. Conversation analysis indicators are generated by applying, to a sequential machine learning system, the combined speaker features and a previous state of the sequential machine learning system.
-
Publication No.: US20240265933A1
Publication date: 2024-08-08
Application No.: US18618636
Filing date: 2024-03-27
IPC classification: G10L21/043, G10L21/055, G10L25/48, H04N21/432
CPC classification: G10L21/043, G10L21/055, G10L25/48, H04N21/4325
Abstract: Systems and methods for intelligent playback of media content may include an intelligent media playback system that, in response to determining the speech tempo in audio content by measuring syllable density of speech in the audio content, automatically adjusts a playback speed of the audio content as the audio content is being played based on the determined speech tempo. In some embodiments, the system may automatically and dynamically adjust the playback speed to result in a desired target speech tempo. In addition, the system may determine whether to automatically adjust playback speed of the audio content, as the media is being played, based on the detected speech tempo of the speech in the audio content and the determined type of content of media. Such automatic adjustments in playback speed result in more efficient playback of the audio content.
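Illustrative note (not part of the patent record): the abstract above does not state how playback speed is derived from speech tempo. A minimal sketch, assuming tempo is measured as syllables per second over a window and the rate is simply the ratio of a target tempo to the measured tempo, clamped to a comfortable range, might look like the following. The function name `playback_rate`, its parameters, and the clamp bounds are hypothetical.

```python
def playback_rate(syllables: int, seconds: float, target_tempo: float,
                  min_rate: float = 0.5, max_rate: float = 2.0) -> float:
    """Return a playback-speed multiplier for one analysis window.

    Assumption: speech tempo is syllable density (syllables per second),
    and the multiplier is target_tempo / measured_tempo, clamped to
    [min_rate, max_rate]. None of this is taken from the patent text.
    """
    # With no speech detected (or an empty window), leave the speed unchanged.
    if syllables <= 0 or seconds <= 0:
        return 1.0
    measured_tempo = syllables / seconds
    rate = target_tempo / measured_tempo
    # Clamp so that very slow or very fast speech does not produce extreme rates.
    return max(min_rate, min(max_rate, rate))


if __name__ == "__main__":
    # 30 syllables in 10 s is 3 syllables/s; a 4.5 syllables/s target gives 1.5x.
    print(playback_rate(30, 10.0, 4.5))
```

Under these assumptions, slow speech is sped up toward the target and fast speech is slowed down, with the clamp acting as a safety limit.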
-
Publication No.: US12014737B2
Publication date: 2024-06-18
Application No.: US17530227
Filing date: 2021-11-18
Inventors: Heiko Rahmel, Li-Juan Qin, Xuedong Huang, Wei Xiong
IPC classification: G10L15/22, G06F3/01, G06F3/0484, G06F3/04842, G06N20/00, G10L15/08, G10L15/30, G10L25/48, G10L25/90
CPC classification: G10L15/22, G06F3/017, G06F3/04842, G06N20/00, G10L15/08, G10L25/48, G10L2015/088, G10L2015/223, G10L15/30, G10L25/90
Abstract: Systems, methods, and computer-readable storage devices are disclosed for generating smart notes for a meeting based on participant actions and machine learning. One method includes: receiving meeting data from a plurality of participant devices participating in an online meeting; continuously generating text data based on the received audio data from each participant device of the plurality of participant devices; iteratively performing the following steps until receiving meeting data for the meeting has ended, the steps including: receiving an indication that a predefined action has occurred on the first participating device; generating a participant segment of the meeting data for at least the first participant device from a first predetermined time before when the predefined action occurred to when the predefined action occurred; determining whether the receiving meeting data of the meeting has ended; and generating a summary of the meeting.
-
Publication No.: US12002486B2
Publication date: 2024-06-04
Application No.: US17279009
Filing date: 2019-09-13
Inventors: Ryo Masumura, Tomohiro Tanaka
Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t−1)-th utterance sequence information vector u_{t-1} that includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector to generate a t-th utterance sequence information vector u_t, where t is a natural number, and a tagging unit that determines a tag l_t that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector u_t.
-
Publication No.: US11996116B2
Publication date: 2024-05-28
Application No.: US17000583
Filing date: 2020-08-24
Applicant: Google LLC
Inventors: Joel Shor, Ronnie Maor, Oran Lang, Omry Tuval, Marco Tagliasacchi, Ira Shavitt, Felix de Chaumont Quitry, Dotan Emanuel, Aren Jansen
Abstract: Examples relate to on-device non-semantic representation fine-tuning for speech classification. A computing system may obtain audio data having a speech portion and train a neural network to learn a non-semantic speech representation based on the speech portion of the audio data. The computing system may evaluate performance of the non-semantic speech representation based on a set of benchmark tasks corresponding to a speech domain and perform a fine-tuning process on the non-semantic speech representation based on one or more downstream tasks. The computing system may further generate a model based on the non-semantic representation and provide the model to a mobile computing device. The model is configured to operate locally on the mobile computing device.
-
Publication No.: US11983738B2
Publication date: 2024-05-14
Application No.: US18061788
Filing date: 2022-12-05
Applicant: YAHOO AD TECH LLC
Inventors: Yugandhar Reddy Boyapally, Janith Kaiprath Valiyalappil, Sreeram Ramji, Rajesh Lalwani, Tianyuan Zhang
IPC classification: G06Q30/00, G06F40/295, G06Q30/0251, G10L25/03, G10L25/48
CPC classification: G06Q30/0251, G06F40/295, G10L25/03, G10L25/48
Abstract: The present teaching relates to a method and system for evaluating a conversion. The method extracts meta-information including a conversion parameter and a reward. The meta-information corresponds to a conversion associated with an advertisement displayed previously by a plurality of entities. The method receives a plurality of claims for the conversion from one or more entities, and selects a claim corresponding to an entity from the plurality of claims based on the conversion parameter and information included in the plurality of claims. Further, the method transmits information related to the selected claim.
-
Publication No.: US20240055016A1
Publication date: 2024-02-15
Application No.: US18384769
Filing date: 2023-10-27
Applicant: Google LLC
CPC classification: G10L25/48, G06F9/452, G06F9/54, G06F9/4806
Abstract: Systems, methods and apparatus for invoking actions at a second user device from a first user device. A method includes determining that a first user device has an associated second user device; accessing specification data that specifies a set of user device actions that the second user device is configured to perform; receiving command inputs for the first user device; for each command input, determining whether the command input resolves to one of the user device actions; for each command input not determined to resolve to any of the user device actions, causing the command input to be processed at the first user device; and for each command input determined to resolve to one of the user device actions, causing the first user device to display in a user interface a dialog by which a user may either accept or deny invoking the user device action at the second user device.