Patent search ipc:G10L15/01 Page 1

1.

Invention application
SEMIAUTOMATED RELAY METHOD AND APPARATUS (in force)

Publication (announcement) No.: US20250069601A1

Publication (announcement) date: 2025-02-27

Application No.: US18943527

Application date: 2024-11-11

Applicant: Ultratec, Inc.

Inventor: Robert M. Engelke, Kevin R. Colwell, Christopher Engelke

IPC: G10L15/26, G10L15/01, G10L15/18, G10L25/48, G10L25/60, H04M1/247, H04M3/42

Abstract: A method to transcribe communications includes the steps of obtaining a plurality of hypothesis transcriptions of a voice signal generated by a speech recognition system, determining consistent words that are included in at least first and second of the plurality of hypothesis transcriptions, in response to determining the consistent words, providing the consistent words to a device for presentation of the consistent words to an assisted user, and presenting the consistent words via a display screen on the device, wherein a rate of the presentation of the words on the display screen is variable.
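As a rough illustration of the "consistent words" idea in this abstract, the sketch below keeps only the leading words on which all supplied hypothesis transcriptions agree. This is an assumption made for illustration: the abstract does not say how consistency is determined, and the function name `consistent_prefix` and the case-insensitive prefix comparison are hypothetical, not taken from the patent.

```python
from typing import List


def consistent_prefix(hypotheses: List[str]) -> List[str]:
    """Return the leading words shared by every hypothesis (case-insensitive).

    Hypothetical helper: prefix agreement is one simple way to define
    "consistent words"; the patent may define it differently.
    """
    if not hypotheses:
        return []
    tokenized = [h.split() for h in hypotheses]
    prefix: List[str] = []
    # zip stops at the shortest hypothesis, so positions always line up.
    for words in zip(*tokenized):
        first = words[0].lower()
        if all(w.lower() == first for w in words):
            prefix.append(words[0])
        else:
            break  # first disagreement ends the stable prefix
    return prefix


hyps = ["please call me back at noon", "please call me back afternoon"]
print(consistent_prefix(hyps))  # ['please', 'call', 'me', 'back']
```

Only the agreed words would be sent to the display; the disputed tail waits until later hypotheses settle it.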

2.

Invention application
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM (in force)

Publication (announcement) No.: US20250069377A1

Publication (announcement) date: 2025-02-27

Application No.: US18727106

Application date: 2023-01-25

Applicant: Sony Group Corporation

Inventor: Junji OTSUKA, Atsushi IRIE, Masakazu YOSHIMURA

IPC: G06V10/776, G06V10/774, G06V10/98, G06V20/70, G10L15/01, G10L15/06

Abstract: An information processing apparatus according to an embodiment of the present technology includes a generation unit, an evaluation unit, and an update unit. The generation unit generates input data on the basis of a predetermined parameter. The evaluation unit generates evaluation data on the basis of first output data that includes evaluation target data and is output by inputting first input data generated by the generation unit to a first recognition model, and second output data that includes a pseudo label as a pseudo correct answer of the evaluation target data and is output by inputting second input data generated by the generation unit to a second recognition model. The update unit updates the predetermined parameter on the basis of the evaluation data.

3.

Granted invention patent
Systems and methods for accessible websites and/or applications for people with disabilities (in force)

Publication (announcement) No.: US12211487B1

Publication (announcement) date: 2025-01-28

Application No.: US18416084

Application date: 2024-01-18

Applicant: Morgan Stanley Services Group Inc.

Inventor: Aratrika Sarkar, Ayyapparaj Radhakrishnan Ganesan, Mayank Jain, Mehak Mehta

IPC: G10L15/06, G09B21/00, G10L15/01, G10L15/18, G10L15/22, G10L15/30, G10L25/18, G10L25/78, G10L25/84

Abstract: A system and method for creating accessibility of any website or application for people with sight, hearing or speech disabilities. The system and method can include receiving input of the website or the application to be accessed and an indicator as to specific disabilities of a user, scoring the website or the application for its accessibility based on the specific disabilities of the user, and if the score is below a threshold, determining an alternative form for the input of the website or the application to accommodate the specific disabilities of the user.

4.

Invention application
METHOD AND APPARATUS FOR TRANSCRIBING AUDIO (in force)

Publication (announcement) No.: US20250006187A1

Publication (announcement) date: 2025-01-02

Application No.: US18885132

Application date: 2024-09-13

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Hongtao ZOU, Si CHEN

IPC: G10L15/183, G10L15/01, G10L15/06

Abstract: The present disclosure provides a method and apparatus for transcribing audio, and relates to the field of artificial intelligence technology. A specific embodiment of the method includes: receiving audio information uploaded through a scenario entry of a storage service application installed on a client; determining, based on the scenario entry, a scenario type of the audio information; performing speech recognition on the audio information to obtain text information corresponding to the audio information; and inputting the text information and a prompt corresponding to the scenario type into a language model to obtain summary information, where the language model is obtained by performing supervised fine-tuning on a pre-trained model using samples corresponding to various scenario types, and the prompts corresponding to the various scenario types are obtained by tuning initial prompts corresponding to the various scenario types using the language model.

5.

Granted invention patent
Label confidence scoring (in force)

Publication (announcement) No.: US12148417B1

Publication (announcement) date: 2024-11-19

Application No.: US17354215

Application date: 2021-06-22

Applicant: Amazon Technologies, Inc.

Inventor: Aidan Thomas Cardella, Anand Victor, Vipin Gupta, Zheng Du, John Rajiv Malik, Li Erran Li, Jarrett Alegre Bato, Peng Yang, Alejandro Ricardo Mottini D'Oliveira

IPC: G10L15/01, G10L15/16, G10L15/18, G10L15/26

Abstract: Devices and techniques are generally described for confidence score generation for label generation. In some examples, first data may be received from a first computing device. In various further examples, first label data classifying at least one aspect of the first data may be received. First metadata associated with how the first label data was generated may be received. In some cases, the first label data may be generated by a first user. In various examples, a first machine learning model may generate a first confidence score associated with the first label data based at least in part on the first data and second data related to label generation by the first person. In various examples, output data comprising the first confidence score may be sent to the first computing device.

6.

Granted invention patent
Cohort determination in natural language processing (in force)

Publication (announcement) No.: US12112752B1

Publication (announcement) date: 2024-10-08

Application No.: US17688279

Application date: 2022-03-07

Applicant: Amazon Technologies, Inc.

Inventor: Rahul Gupta, Jwala Dhamala, Apurv Verma, Qingwen Ye, Mayur Himmatbhai Dabhi, Srinivasan Rengarajan Veeravanallur, Spyridon Matsoukas, Melanie C B Gens, Seyed Omid Razavi, Avni Khatri, Premkumar Natarajan

IPC: G10L15/22, G10L15/01, G10L15/06, G10L15/08

CPC classification number: G10L15/22, G10L15/01, G10L15/063, G10L15/08, G10L2015/0631, G10L2015/223

Abstract: Devices and techniques are generally described for cohort determination in natural language processing. In various examples, a first natural language input to a natural language processing system may be determined. The first natural language input may be associated with a first account identifier. A first machine learning model may determine first data representing one or more words of the first natural language input. A second machine learning model may determine second data representing one or more acoustic characteristics of the first natural language input. Third data may be determined, the third data including a predicted performance for processing the first natural language input by the natural language processing system. The third data may be determined based on the first data representation and the second data representation.

7.

Granted invention patent
Electronic device and control method thereof (in force)

Publication (announcement) No.: US12112745B2

Publication (announcement) date: 2024-10-08

Application No.: US17292116

Application date: 2019-09-09

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Jisun Park, Minjin Rho

IPC: G10L15/22, G10L15/01

CPC classification number: G10L15/22, G10L15/01, G10L2015/223

Abstract: An electronic device is disclosed. The present electronic device comprises: a voice receiving unit; and a processor, wherein the processor: when a user's voice is received through the voice receiving unit, determines an accumulation level of utterance history information corresponding to the characteristics of the user's voice; when the accumulation level of utterance history information is below a predetermined threshold level, provides response information corresponding to the user's voice on the basis of user information related to the characteristics of the user's voice; and when the accumulation level of utterance history information is equal to or higher than the predetermined threshold level, provides response information corresponding to the user's voice on the basis of the user information and the utterance history information.

8.

Granted invention patent
Creative work systems and methods thereof (in force)

Publication (announcement) No.: US12112740B2

Publication (announcement) date: 2024-10-08

Application No.: US17545815

Application date: 2021-12-08

Applicant: SOCIETE BIC

Inventor: David Duffy, Bernadette Elliott-Bowman

IPC: G10L15/01, G10L13/027, G10L15/02, G10L15/16, G10L15/26

CPC classification number: G10L15/01, G10L13/027, G10L15/02, G10L15/16, G10L15/26, G10L2015/025, G10L2015/027

Abstract: A computer-implemented method for measuring cognitive load of a user creating a creative work in a creative work system, may include generating at least one verbal statement capable of provoking at least one verbal response from the user, prompting the user to vocally interact with the creative work system by vocalizing the at least one generated verbal statement to the user via an audio interface of the creative work system, and obtaining the at least one verbal response from the user via the audio interface, and determining the cognitive load of the user based on the at least one verbal response obtained from the user, wherein generating the at least one verbal statement is based on at least one predicted verbal response suitable for determining the cognitive load of the user.

9.

Granted invention patent
Automatic speech recognition word error rate estimation applications, including foreign language detection (in force)

Publication (announcement) No.: US12087276B1

Publication (announcement) date: 2024-09-10

Application No.: US17155825

Application date: 2021-01-22

Applicant: Cisco Technology, Inc.

Inventor: Mohamed Hariri Nokob, Mohamed Gamal Mohamed Mahmoud, Ahmad Abdulkader

IPC: G10L15/00, G10L15/01, G10L15/02, G10L15/22, G10L15/32, G10L25/78

CPC classification number: G10L15/005, G10L15/01, G10L15/02, G10L15/22, G10L15/32, G10L25/78

Abstract: A plurality of audio datasets associated with captured audio are provided to a plurality of automatic speech recognition engines, wherein each of the automatic speech recognition engines is configured to recognize speech of a first language. Word error rate estimates that comprise at least one word error rate estimate for each of the plurality of audio datasets are determined from outputs of the plurality of automatic speech recognition engines. From the word error rate estimates, audio in the plurality of audio datasets is determined to include speech in a second language.
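To make the idea in this abstract more concrete, the sketch below uses disagreement between engine outputs as a stand-in for a word error rate estimate and flags audio whose estimate exceeds a threshold. This is an assumption throughout: the abstract does not state how the estimates are computed, and the names `estimated_wer` and `likely_other_language`, the pairwise-disagreement proxy, and the 0.5 threshold are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def disagreement_rate(a: str, b: str) -> float:
    """Edit distance between two transcripts, normalized by the longer one."""
    wa, wb = a.lower().split(), b.lower().split()
    denom = max(len(wa), len(wb))
    return word_edit_distance(wa, wb) / denom if denom else 0.0


def estimated_wer(outputs: List[str]) -> float:
    """Mean pairwise disagreement across engines (a proxy, not a true WER)."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 0.0
    return sum(disagreement_rate(a, b) for a, b in pairs) / len(pairs)


def likely_other_language(outputs: List[str], threshold: float = 0.5) -> bool:
    """Flag audio when engines for the first language disagree heavily."""
    return estimated_wer(outputs) > threshold


# Engines trained on the same language tend to agree on in-language speech
# and diverge on out-of-language speech.
print(likely_other_language(["turn on the light", "turn on the light"]))
print(likely_other_language(["bone sure come on sell of",
                             "bun jaw common salad",
                             "bon sure comb and sally"]))
```

In practice the threshold would need to be tuned on labeled audio.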

10.

Invention publication
NON-SPEECH INPUT TO SPEECH PROCESSING SYSTEM (under examination, published)

Publication (announcement) No.: US20240296829A1

Publication (announcement) date: 2024-09-05

Application No.: US18663831

Application date: 2024-05-14

Applicant: Amazon Technologies, Inc.

Inventor: Travis Grizzel

IPC: G10L15/01, G06F3/01, G10L13/00, G10L15/08, G10L15/18, G10L15/187, G10L15/24

CPC classification number: G10L15/01, G06F3/017, G10L13/00, G10L15/18, G10L15/187, G10L15/24, G10L2015/088

Abstract: A system and method for associating motion data with utterance audio data for use with a speech processing system. A device, such as a wearable device, may be capable of capturing utterance audio data and sending it to a remote server for speech processing, for example for execution of a command represented in the utterance. The device may also capture motion data using motion sensors of the device. The motion data may correspond to gestures, such as head gestures, that may be interpreted by the speech processing system to determine and execute commands. The device may associate the motion data with the audio data so the remote server knows what motion data corresponds to what portion of audio data for purposes of interpreting and executing commands. Metadata sent with the audio data and/or motion data may include association data such as timestamps, session identifiers, message identifiers, etc.
