Systems and methods for formatting informal utterances

    Publication No.: US12198685B2

    Publication Date: 2025-01-14

    Application No.: US18184432

    Filing Date: 2023-03-15

    Applicant: PAYPAL, INC.

    Abstract: Methods and systems are presented for translating informal utterances into formal texts. Informal utterances may include abbreviated words or typographical errors. The informal utterances may be processed by mapping each word in an utterance to a well-defined token. The mapping from the words to the tokens may be based on a context associated with the utterance, derived by analyzing the utterance on a character-by-character basis. The token mapped to each word can be one of a vocabulary token that corresponds to a formal word in a pre-defined word corpus, an unknown token that corresponds to an unknown word, or a masked token. Formal text may then be generated based on the mapped tokens. Processing informal utterances with the techniques disclosed herein both normalizes and sanitizes them.
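
    The three token types named in the abstract can be illustrated with a minimal sketch. It is a rule-based stand-in, not the character-level context model the abstract describes: the vocabulary, the abbreviation table, the token spellings, and the digit-run rule for masking are all hypothetical.

```python
import re

# Hypothetical vocabulary and abbreviation table (illustrative only).
VOCAB = {"please", "refund", "my", "account", "thanks", "you"}
ABBREVIATIONS = {"pls": "please", "acct": "account", "thx": "thanks", "u": "you"}

UNK, MASK = "<unk>", "<mask>"

# Assumption: runs of four or more digits are treated as sensitive.
SENSITIVE = re.compile(r"\d{4,}")


def map_word(word: str) -> str:
    """Map one informal word to a vocabulary, unknown, or masked token."""
    w = word.lower().strip(".,!?")
    if SENSITIVE.fullmatch(w):
        return MASK                   # sanitize: hide sensitive content
    w = ABBREVIATIONS.get(w, w)       # normalize: expand known abbreviations
    return w if w in VOCAB else UNK   # anything outside the corpus is unknown


def format_utterance(utterance: str) -> str:
    """Generate formal text from the mapped tokens."""
    return " ".join(map_word(w) for w in utterance.split())
```

    For example, `format_utterance("pls refund my acct 12345678 thx")` yields `please refund my account <mask> thanks`.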

    SYSTEMS AND METHODS FOR TRAINING VOICE QUERY MODELS

    Publication No.: US20240428779A1

    Publication Date: 2024-12-26

    Application No.: US18823383

    Filing Date: 2024-09-03

    Abstract: Methods for automatically evaluating ASR outputs and providing annotations, including corrections, on the transcriptions in order to improve recognition may be based on an analysis of sessions of user voice queries, utilizing time-ordered ASR transcriptions of user voice queries (i.e., user utterances). This utterance-based approach may involve extracting both session-level and query-level characteristics from voice query sessions and identifying patterns of query reformulation in order to detect erroneous transcriptions and automatically determine an appropriate correction. Alternatively, or in addition, ASR outputs may be evaluated based on user behavior. The outcomes may be classified as positive or negative. An ASR transcription may be labeled using the description of the outcome. The labeled transcription may be used as training data to train a model to output improved transcriptions of voice queries.
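
    One simple way to picture query-reformulation detection is sketched below. It is only an illustration under stated assumptions: the word-overlap (Jaccard) measure, the 30-second gap, and the 0.5 threshold are hypothetical choices, not values from the abstract.

```python
from typing import List, Tuple

Query = Tuple[float, str]  # (timestamp in seconds, ASR transcription)


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard overlap between two transcriptions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def find_reformulations(
    session: List[Query], max_gap: float = 30.0, min_overlap: float = 0.5
) -> List[Tuple[str, str]]:
    """Return (suspect transcription, candidate correction) pairs.

    Assumption: a query that is quickly followed by a similar but not
    identical query was probably transcribed incorrectly.
    """
    pairs = []
    for (t1, q1), (t2, q2) in zip(session, session[1:]):
        overlap = jaccard(q1, q2)
        if t2 - t1 <= max_gap and min_overlap <= overlap < 1.0:
            pairs.append((q1, q2))
    return pairs
```

    Each returned pair could serve as a labeled example: the first element is a transcription with a negative outcome, and the second is a proposed correction.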

    METHOD, APPARATUS, AND COMPUTER-READABLE RECORDING MEDIUM FOR CONTROLLING RESPONSE UTTERANCE BEING REPRODUCED AND PREDICTING USER INTENTION

    Publication No.: US20240379097A1

    Publication Date: 2024-11-14

    Application No.: US18464240

    Filing Date: 2023-09-10

    Applicant: Saltlux Inc.

    Abstract: A method for controlling a response utterance being reproduced and predicting a user intention includes: a voice signal analysis step of, when a second voice signal is received from a user while a first response utterance for responding to a first voice signal is output, starting analysis on the second voice signal; an utterance control step of, when one of preset keywords is identified to be included in a first sentence corresponding to the second voice signal, controlling the first response utterance being output to correspond to the identified keyword; an intention prediction step of, when a third voice signal is received in a state where the first response utterance is controlled, analyzing the third voice signal to predict an intention of the user, which is reflected in a second sentence corresponding to the third voice signal; and a response utterance output step of outputting a response utterance.

    AUTOMATIC LEARNING OF ENTITIES, WORDS, PRONUNCIATIONS, AND PARTS OF SPEECH

    Publication No.: US20240379092A1

    Publication Date: 2024-11-14

    Application No.: US18783423

    Filing Date: 2024-07-25

    Inventor: Anton V. RELIN

    Abstract: Systems for automatic speech recognition and/or natural language understanding automatically learn new words by finding subsequences of phonemes that, if they were a new word, would enable a successful tokenization of a phoneme sequence. Systems can learn alternate pronunciations of words by finding phoneme sequences with a small edit distance to existing pronunciations. Systems can learn the part of speech of words by finding part-of-speech variations that would enable parses by syntactic grammars. Systems can learn what types of entities a word describes by finding sentences that could be parsed by a semantic grammar but for the words not being on an entity list.
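
    The idea of finding phoneme sequences within a small edit distance of existing pronunciations can be sketched with a standard Levenshtein distance over phoneme lists. The lexicon layout, the function names, and the threshold of 1 are assumptions made for illustration; the phoneme symbols in the usage note are examples only.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def candidate_words(
    phonemes: Sequence[str],
    lexicon: Dict[str, List[str]],
    max_distance: int = 1,
) -> List[str]:
    """Words whose known pronunciation is within max_distance edits."""
    return sorted(
        word for word, pron in lexicon.items()
        if edit_distance(phonemes, pron) <= max_distance
    )
```

    With a hypothetical lexicon `{"tomato": ["T", "AH", "M", "EY", "T", "OW"]}`, the observed sequence `["T", "AH", "M", "AA", "T", "OW"]` is one substitution away, so it would be proposed as an alternate pronunciation of "tomato".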

    Methods and systems for automatic call data generation

    Publication No.: US12014144B2

    Publication Date: 2024-06-18

    Application No.: US17390573

    Filing Date: 2021-07-30

    Applicant: INTUIT INC.

    Abstract: A processor may receive a call transcript including text and form a text string including at least a portion of the text. The processor may generate a situation description of the call transcript, which may comprise processing the text string using a transformer-based machine learning model. The processor may generate a trouble description of the call transcript, which may comprise creating a sentence embedding of the situation description, creating sentence embeddings for a plurality of utterances within the portion of the text, determining respective similarities between the sentence embedding of the situation description and each of the sentence embeddings for each respective one of the plurality of utterances, and selecting at least one of the plurality of utterances having at least one highest determined respective similarity as the trouble description. The processor may store a call summary comprising the situation description and the trouble description in a non-transitory memory.
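
    The selection step, choosing the utterance most similar to the situation description, can be sketched as follows. Bag-of-words counts stand in for sentence embeddings here purely so the example is self-contained; a real system would presumably use a learned embedding model, and all function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(sentence: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector (assumption)."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def select_trouble_description(situation: str, utterances: List[str]) -> str:
    """Pick the utterance most similar to the situation description."""
    target = embed(situation)
    return max(utterances, key=lambda u: cosine(target, embed(u)))
```

    Given the situation description "customer cannot log in to account", the utterance "i cannot log in to my account" scores highest among unrelated greetings and would be stored as the trouble description.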

    TRAINING SPEECH RECOGNITION SYSTEMS USING WORD SEQUENCES

    Publication No.: US20240127798A1

    Publication Date: 2024-04-18

    Application No.: US18538957

    Filing Date: 2023-12-13

    Inventor: David Thomson

    Abstract: A method may include obtaining a text string that is a transcription of audio data and selecting a sequence of words from the text string as a first word sequence. The method may further include encrypting the first word sequence and comparing the encrypted first word sequence to multiple encrypted word sequences. Each of the multiple encrypted word sequences may be associated with a corresponding one of multiple counters. The method may also include in response to the encrypted first word sequence corresponding to one of the multiple encrypted word sequences based on the comparison, incrementing a counter of the multiple counters associated with the one of the multiple encrypted word sequences and adapting a language model of an automatic transcription system using the multiple encrypted word sequences and the multiple counters.
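
    Counting word sequences without storing them in plain text can be sketched as below. A keyed hash (HMAC-SHA256) is swapped in for the encryption step because it is deterministic, so equal sequences always compare equal; that substitution, the trigram length, and the choice to start a new counter for unseen sequences are assumptions, not details from the abstract.

```python
import hashlib
import hmac
from collections import Counter


def protect(words, key: bytes) -> str:
    """Deterministic keyed digest of a word sequence (stand-in for encryption)."""
    message = " ".join(words).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def count_sequences(transcript: str, counters: Counter, key: bytes, n: int = 3) -> None:
    """Increment the counter for every n-word sequence in the transcript.

    Only digests are stored, so the counter table holds no plain text.
    Assumption: an unseen digest starts a new counter at 1.
    """
    words = transcript.lower().split()
    for i in range(len(words) - n + 1):
        counters[protect(words[i:i + n], key)] += 1
```

    The resulting digest counts could then feed language model adaptation, for example by weighting frequent sequences more heavily, without the adaptation step ever seeing the raw transcripts.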

    Systems and methods for scripted audio production

    Publication No.: US11875797B2

    Publication Date: 2024-01-16

    Application No.: US17355023

    Filing Date: 2021-06-22

    Applicant: Pozotron Inc.

    Abstract: A scripted audio production system decreases production time by improving computerized processes and technological systems for pronunciation research and script preparation, narration, editing, proofing, and mastering. The system enables the user to upload their manuscript and recorded audio of the narration of the manuscript. The system then compares the recorded audio against the previously uploaded manuscript, and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on the recorded audio by the narrator first uploading the audio and manuscript to the scripted audio production technology system.
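
    The comparison of narration against manuscript can be pictured with a word-level diff. This sketch assumes the recorded audio has already been transcribed to text by some speech recognizer, which is outside the example; it uses Python's standard `difflib` module, and the function name is hypothetical.

```python
import difflib
from typing import List, Tuple


def find_deviations(manuscript: str, transcript: str) -> List[Tuple[str, str, str]]:
    """List (operation, manuscript words, narrated words) for each mismatch.

    Assumption: `transcript` is ASR output for the recorded narration.
    """
    expected = manuscript.lower().split()
    spoken = transcript.lower().split()
    matcher = difflib.SequenceMatcher(a=expected, b=spoken, autojunk=False)
    return [
        (tag, " ".join(expected[i1:i2]), " ".join(spoken[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

    For the manuscript "the quick brown fox jumps" and the narration "the quick red fox jumps", the function reports a single `replace` of "brown" with "red", which is the kind of deviation the system would highlight for the user.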
