SPATIAL AUDIO CONVERSATIONAL ANALYSIS FOR ENHANCED CONVERSATION DISCOVERY

    Publication No.: US20240355331A1

    Publication Date: 2024-10-24

    Application No.: US18760626

    Filing Date: 2024-07-01

    IPC Classification: G10L15/26 G06F3/16 H04M3/56

    CPC Classification: G10L15/26 G06F3/165 H04M3/568

    Abstract: Systems and methods for providing enhanced teleconferencing. An example method includes receiving audio streams from a plurality of client devices of participants of a teleconference; converting the audio streams for a first conversation within the teleconference into first text; converting the audio streams for a second conversation within the teleconference into a second text; analyzing the first text to identify one or more topics being discussed in the first conversation; analyzing the second text to identify one or more topics being discussed in the second conversation; and presenting, in a teleconference user interface, at least one of the one or more topics being discussed in the first conversation or the one or more topics being discussed in the second conversation.
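
The per-conversation topic step in this abstract can be illustrated with a minimal sketch. The abstract does not say how topics are identified, so the keyword-frequency approach, the stopword list, and the names `top_topics` and `topics_by_conversation` are all assumptions made for illustration, not the claimed method.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "and", "for", "that", "with", "this", "are", "was", "our"}


def top_topics(transcript: str, k: int = 2) -> List[str]:
    """Return the k most frequent non-stopword terms in a transcript.

    Keyword frequency is a stand-in for whatever topic analysis the
    patented system actually performs.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]


def topics_by_conversation(transcripts: Dict[str, str], k: int = 2) -> Dict[str, List[str]]:
    """Map each conversation id to its top topics, ready to show in a UI."""
    return {conv_id: top_topics(text, k) for conv_id, text in transcripts.items()}
```

A caller would pass one transcript per conversation, for example `topics_by_conversation({"room-a": text_a, "room-b": text_b})`, and render the returned lists next to each conversation in the teleconference interface.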

    SYSTEM AND METHOD FOR TRANSCRIBING AUDIBLE INFORMATION

    Publication No.: US20240355329A1

    Publication Date: 2024-10-24

    Application No.: US18138707

    Filing Date: 2023-04-24

    IPC Classification: G10L15/26 G06F40/166

    CPC Classification: G10L15/26 G06F40/166

    Abstract: Embodiments herein include a processing system and method for transcribing audible information, including converting audible information or data received from a user into alphanumeric data. The alphanumeric data can be processed allowing a user to provide input that improves the accuracy of the alphanumeric data, facilitates the transfer of the alphanumeric data to other electronic devices by a computer or other electronic device, and/or improves the communicative or expressive properties of the alphanumeric data in an electronic communication that is provided to one or more users. In some embodiments, the transcribed and/or translated text can be automatically formatted by a program for use in a software application. More specifically, embodiments of the present application disclose a system and program that can embellish transcribed text to alert and provide suggestions for the correction of potentially inaccurate transcribed and/or translated text, and/or provide potential emojis to add into or replace text.

    SYSTEM AND METHOD FOR HYBRID GENERATION OF TEXT FROM AUDIO

    Publication No.: US20240355328A1

    Publication Date: 2024-10-24

    Application No.: US18138295

    Filing Date: 2023-04-24

    Applicant: Verbit, Inc.

    IPC Classification: G10L15/26 G10L15/02 G10L15/22

    CPC Classification: G10L15/26 G10L15/02 G10L15/22

    Abstract: A method, system and computer program product for transcribing audio signals, the method comprising: obtaining a source audio signal; obtaining meta data associated with the audio signal; analyzing the meta data; extracting acoustic features from the source audio signal; determining a difficulty level assessment of transcribing the audio signal, based at least on the meta data and acoustic features; selecting, based on the level of transcription difficulty, a first transcription option; and providing a related audio signal which is related to the source audio signal to the first transcription option over a communication channel, to obtain a transcription of the related audio signal.
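
The selection step in this abstract (score the difficulty from metadata and acoustic features, then pick a transcription option) can be sketched as follows. The feature names, weights, threshold, and option labels are hypothetical; the abstract does not disclose them.

```python
from dataclasses import dataclass


@dataclass
class AudioJob:
    snr_db: float        # assumed acoustic feature: signal-to-noise ratio
    speaker_count: int   # assumed metadata field
    domain: str          # assumed metadata field, e.g. "legal"


def difficulty_score(job: AudioJob) -> float:
    """Combine metadata and acoustic features into a score between 0 and 1."""
    # Lower SNR means noisier audio; 30 dB or better counts as clean here.
    noise = min(max((30.0 - job.snr_db) / 30.0, 0.0), 1.0)
    # Each speaker beyond the first adds difficulty, capped at five speakers.
    speakers = min(max(job.speaker_count - 1, 0) / 4.0, 1.0)
    # Specialist vocabulary is treated as harder (illustrative domain list).
    jargon = 1.0 if job.domain in {"legal", "medical"} else 0.0
    return 0.5 * noise + 0.3 * speakers + 0.2 * jargon


def select_transcription_option(job: AudioJob, threshold: float = 0.5) -> str:
    """Route hard audio to human review and easy audio to automatic ASR."""
    if difficulty_score(job) >= threshold:
        return "human_review"
    return "automatic_asr"
```

For instance, `select_transcription_option(AudioJob(snr_db=5.0, speaker_count=4, domain="legal"))` routes to the human-review option under these assumed weights, while a clean single-speaker recording goes to automatic recognition.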

    REAL-TIME INTERACTIVE VOICE CONVERSATION STATE MANAGEMENT IN LARGE LANGUAGE MODELS

    Publication No.: US20240347058A1

    Publication Date: 2024-10-17

    Application No.: US18634800

    Filing Date: 2024-04-12

    Applicant: Animato, Inc.

    IPC Classification: G10L15/22 G10L13/08 G10L15/26

    CPC Classification: G10L15/22 G10L13/08 G10L15/26

    Abstract: A method or system for managing interruptions during oral interactions between users and Large Language Models (LLMs). Initially, a user's spoken input is received and converted to text, which forms a prompt for the LLM. Upon generating a text response by the LLM, the text response is then converted back into speech and played to the user. If the user interrupts while the response is being played, the playback stops, and the interruption is captured as a new spoken input. This interruption is used to generate a new prompt for the LLM. Subsequently, the LLM generates a second text response based on the interruption, which is converted to speech and played back to the user. This process ensures that user interruptions are effectively managed, allowing for a more dynamic and interactive conversation with the LLM and enhancing the user's experience by adapting the conversation flow to real-time inputs.
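
The interruption flow in this abstract can be sketched as a small state holder. Speech-to-text and text-to-speech are omitted, the LLM is a plain callable supplied by the caller, and the class and method names (`VoiceSession`, `on_user_speech`, `on_playback_finished`) are hypothetical rather than taken from the patent.

```python
from typing import Callable, List


class VoiceSession:
    """Tracks playback state so that new speech during playback interrupts it."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm                 # stand-in for the LLM call
        self.playing = False           # True while a response is being played
        self.interruptions = 0         # number of playbacks cut short
        self.prompts: List[str] = []   # every prompt sent to the LLM

    def on_user_speech(self, text: str) -> str:
        """Handle transcribed user speech; `text` is the speech-to-text output."""
        if self.playing:
            # The user spoke over the response: stop playback and treat
            # the new speech as the next prompt.
            self.playing = False
            self.interruptions += 1
        self.prompts.append(text)
        response = self.llm(text)
        self.playing = True            # playback of the synthesized response begins
        return response

    def on_playback_finished(self) -> None:
        """Called when the response has been played to the end."""
        self.playing = False
```

A caller would wire `on_user_speech` to the recognizer's output and `on_playback_finished` to the audio player's completion callback; any speech that arrives in between is counted as an interruption and becomes the next prompt.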

    Determining multilingual content in responses to a query

    Publication No.: US12118981B2

    Publication Date: 2024-10-15

    Application No.: US17475897

    Filing Date: 2021-09-15

    Applicant: GOOGLE LLC

    Abstract: Implementations relate to determining multilingual content to render at an interface in response to a user submitted query. Those implementations further relate to determining a first language response and a second language response to a query that is submitted to an automated assistant. Some of those implementations relate to determining multilingual content that includes a response to the query in both the first and second languages. Other implementations relate to determining multilingual content that includes a query suggestion in the first language and a query suggestion in a second language. Some of those implementations relate to pre-fetching results for the query suggestions prior to rendering the multilingual content.