-
Publication No.: US11417330B2
Publication Date: 2022-08-16
Application No.: US16798242
Filing Date: 2020-02-21
Applicant: BetterUp, Inc.
Inventor: Andrew Reece , Peter Bull , Gus Cooney , Casey Fitzpatrick , Gabriella Kellerman , Ryan Sonnek
Abstract: Technology is provided for conversation analysis. The technology includes receiving multiple utterance representations, where each utterance representation represents a portion of a conversation performed by at least two users, and each utterance representation is associated with video data, acoustic data, and text data. The technology further includes generating a first utterance output by applying the video data, acoustic data, and text data of the first utterance representation to respective video, acoustic, and text processing parts of the machine learning system to generate video-, acoustic-, and text-based outputs. A second utterance output is further generated for a second user. Conversation analysis indicators are generated by applying, to a sequential machine learning system, the combined speaker features and a previous state of the sequential machine learning system.
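As a rough illustration only (not the patented model), the per-modality outputs can be concatenated into combined speaker features and fed, together with the previous state, through a recurrent update; the fixed mixing factor here stands in for whatever the sequential model would learn:

```python
def fuse_modalities(video_out, acoustic_out, text_out):
    """Concatenate per-modality feature vectors into combined speaker features."""
    return list(video_out) + list(acoustic_out) + list(text_out)

def sequential_step(prev_state, features, decay=0.5):
    """Toy recurrent update: blend the previous state with the current features.

    A real sequential model (e.g. an LSTM or GRU) would learn this mixing;
    the fixed `decay` is purely illustrative.
    """
    return [decay * s + (1.0 - decay) * f for s, f in zip(prev_state, features)]
```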
-
Publication No.: US20220254336A1
Publication Date: 2022-08-11
Application No.: US17597548
Filing Date: 2020-08-12
Applicant: 100 BREVETS POUR LA FRENCH TECH
Inventor: Vincent LORPHELIN
Abstract: The method (3000) of enriching digital content representative of a conversation comprises, in an iterative manner: a step (3005) of capturing an audio signal representative of a voice message; a step (3010) of segmenting the voice message into a segment, said segmentation step comprising a silence detection step, the segment being obtained as a function of the detection of a silence; a step (3015) of converting the audio segment into text, called a "contribution"; and a step (3020) of storing, in a memory, a contribution; then: a step (3025) of detecting user sentiment towards at least one stored contribution; a step (3030) of associating, in a memory and in relation to at least one stored contribution, at least one attribute corresponding to at least one detected sentiment; and a step (3035) of displaying at least one stored contribution and at least one attribute with respect to said at least one contribution.
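The capture-segment-convert loop hinges on the silence detection step. A minimal sketch of splitting an amplitude stream at runs of silence (the threshold and run length are hypothetical parameters, not the claimed method):

```python
def segment_on_silence(samples, threshold=0.05, min_silence=3):
    """Split a sequence of amplitude samples into voiced segments, cutting
    wherever `min_silence` consecutive samples fall below `threshold`."""
    segments, current, run = [], [], 0
    for s in samples:
        run = run + 1 if abs(s) < threshold else 0
        current.append(s)
        if run >= min_silence:
            voiced = current[:-run]  # drop the trailing silent run
            if voiced:
                segments.append(voiced)
            current, run = [], 0
    if any(abs(x) >= threshold for x in current):
        segments.append(current)  # keep a final segment with voiced audio
    return segments
```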
-
Publication No.: US11410660B2
Publication Date: 2022-08-09
Application No.: US16837250
Filing Date: 2020-04-01
Applicant: Google LLC
Inventor: Petar Aleksic , Pedro J. Moreno Mengibar
IPC: G06F16/00 , G06F16/33 , G10L15/06 , G10L15/26 , G06F16/632 , G10L15/19 , G10L15/197 , G10L15/04 , G10L15/08 , G10L15/22 , G10L15/183
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.
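The core idea — boosting the weight of contexts observed in an earlier segment, then rescoring later candidates under the adjusted weights — can be sketched as follows (the flat word-weight map and boost value are illustrative assumptions, not Google's implementation):

```python
def boost_contexts(weights, first_transcription, boost=1.0):
    """Raise the weight of each context term that appears in the first segment."""
    adjusted = dict(weights)
    for word in first_transcription.split():
        if word in adjusted:
            adjusted[word] += boost
    return adjusted

def rescore(candidates, weights):
    """Pick the candidate transcription whose words carry the most context weight."""
    return max(candidates, key=lambda c: sum(weights.get(w, 0.0) for w in c.split()))
```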
-
Publication No.: US11373655B2
Publication Date: 2022-06-28
Application No.: US17498811
Filing Date: 2021-10-12
Applicant: SAS Institute Inc.
Inventor: Xiaolong Li , Xiaozhuo Cheng , Xu Yang
Abstract: An apparatus includes processor(s) to: perform preprocessing operations of a segmentation technique, including divide a speech data set into data chunks representing chunks of speech audio, use an acoustic model with each data chunk to identify pauses in the speech audio, and analyze a length of time of each identified pause to identify a candidate set of likely sentence pauses in the speech audio; and perform speech-to-text operations, including divide the speech data set into data segments that each represent a segment of the speech audio based on the candidate set of likely sentence pauses, use the acoustic model with each data segment to identify likely speech sounds in the speech audio, analyze the identified likely speech sounds to identify candidate sets of words likely spoken in the speech audio, and generate a transcript of the speech data set based at least on the candidate sets of words likely spoken.
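The pause-length analysis of the preprocessing pass can be approximated by keeping only pauses well above the average pause duration (the comparison factor is an assumed parameter):

```python
def likely_sentence_pauses(pauses, factor=2.0):
    """From (offset_seconds, duration_seconds) pairs, keep pauses whose
    duration is at least `factor` times the mean pause length -- these
    become the candidate sentence breaks for the speech-to-text pass."""
    if not pauses:
        return []
    mean = sum(d for _, d in pauses) / len(pauses)
    return [t for t, d in pauses if d >= factor * mean]
```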
-
Publication No.: US11367029B2
Publication Date: 2022-06-21
Application No.: US16802538
Filing Date: 2020-02-26
Inventor: James Murison , Johnson Tse , Gaurav Mehrotra , Anthony Lam
Abstract: A system and method are presented for adaptive skill level assignments of agents in contact center environments. A client and a service collaborate to automatically determine the effectiveness of an agent handling an interaction that has been routed using skills-based routing. Evaluation operations may be performed including emotion detection, transcription of audio to text, keyword analysis, and sentiment analysis. The results of the evaluation are aggregated with other information such as the interaction's duration, agent skills and agent skill levels, and call requirement skills and skill levels, to update the agent's profile which is then used for subsequent routing operations.
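A toy version of aggregating an evaluation result with the interaction's duration into an updated skill level might look like this (the weighting scheme, the 0..1 score scale, and the smoothing factor are all assumptions for illustration):

```python
def update_skill(profile, skill, eval_score, duration_s, target_s, alpha=0.3):
    """Blend an evaluation score (0..1) with a duration-based efficiency
    ratio, then smooth the result into the agent's existing skill level."""
    efficiency = min(target_s / duration_s, 1.0) if duration_s else 0.0
    observed = 0.7 * eval_score + 0.3 * efficiency  # assumed weighting
    updated = dict(profile)
    updated[skill] = (1 - alpha) * updated.get(skill, 0.5) + alpha * observed
    return updated
```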
-
Publication No.: US20220180859A1
Publication Date: 2022-06-09
Application No.: US17115158
Filing Date: 2020-12-08
Applicant: QUALCOMM Incorporated
Inventor: Soo Jin PARK , Sunkuk MOON , Lae-Hoon KIM , Erik VISSER
IPC: G10L15/07 , G10L15/16 , G10L15/04 , G06F1/3231
Abstract: A device includes processors configured to determine, in a first power mode, whether an audio stream corresponds to speech of at least two talkers. The processors are configured to, based on determining that the audio stream corresponds to speech of at least two talkers, analyze, in a second power mode, audio feature data of the audio stream to generate a segmentation result. The processors are configured to perform a comparison of a plurality of user speech profiles to an audio feature data set of a plurality of audio feature data sets of a talker-homogenous audio segment to determine whether the audio feature data set matches any of the user speech profiles. The processors are configured to, based on determining that the audio feature data set does not match any of the plurality of user speech profiles, generate a user speech profile based on the plurality of audio feature data sets.
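The match-or-enroll decision can be sketched with cosine similarity over speaker embeddings (the embedding format and threshold are assumptions; the patent's two power modes and segmentation stage are omitted):

```python
import math

def match_or_enroll(profiles, segment_embeddings, threshold=0.8):
    """Average the embeddings of a talker-homogeneous segment, match them
    against stored profiles, and enroll a new profile on a miss."""
    def avg(vecs):
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]

    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    emb = avg(segment_embeddings)
    best = max(profiles, key=lambda p: cos(p["embedding"], emb), default=None)
    if best is not None and cos(best["embedding"], emb) >= threshold:
        return best, profiles               # matched an existing speaker
    new = {"id": len(profiles), "embedding": emb}
    return new, profiles + [new]            # enroll a new speaker profile
```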
-
Publication No.: US20220180857A1
Publication Date: 2022-06-09
Application No.: US17112418
Filing Date: 2020-12-04
Applicant: Google LLC
Inventor: Asaf Aharoni , Yaniv LEVIATHAN , Eyal SEGALIS , Gal ELIDAN , Sasha Goldshtein , Tomer Amiaz , Deborah Cohen
Abstract: Implementations are directed to providing a voice bot development platform that enables a third-party developer to train a voice bot based on training instance(s). The training instance(s) can each include training input and training output. The training input can include a portion of a corresponding conversation and a prior context of the corresponding conversation. The training output can include a corresponding ground truth response to the portion of the corresponding conversation. Subsequent to training, the voice bot can be deployed for conducting conversations on behalf of a third party. In some implementations, the voice bot is further trained based on a corresponding feature emphasis input that directs the voice bot's attention to a particular feature of the portion of the corresponding conversation. In some additional or alternative implementations, the voice bot is further trained to interact with third-party system(s) via remote procedure calls (RPCs).
-
Publication No.: US11350885B2
Publication Date: 2022-06-07
Application No.: US16784032
Filing Date: 2020-02-06
Applicant: Samsung Electronics Co., Ltd.
Inventor: Korosh Vatanparvar , Viswam Nathan , Ebrahim Nematihosseinabadi , Md Mahbubur Rahman , Jilong Kuang
Abstract: A method includes identifying, by an electronic device, one or more segments within a first audio recording that includes one or more non-speech segments and one or more speech segments. The method also includes generating, by the electronic device, one or more synthetic speech segments that include natural speech audio characteristics and that preserve one or more non-private features of the one or more speech segments. The method also includes generating, by the electronic device, an obfuscated audio recording by replacing the one or more speech segments with the one or more synthetic speech segments while maintaining the one or more non-speech segments, wherein the one or more synthetic speech segments prevent recognition of some content of the obfuscated audio recording.
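Structurally, the replacement step keeps non-speech audio intact and swaps each speech segment for a synthetic one of the same length; in this sketch a silent placeholder stands in for actual speech synthesis, and the segment labels are invented:

```python
def obfuscate(segments):
    """Replace each ('speech', samples) segment with a synthetic same-length
    segment while passing non-speech segments through unchanged."""
    result = []
    for kind, samples in segments:
        if kind == "speech":
            # A real system would synthesize speech preserving non-private
            # features (e.g. timing, pitch contour); zeros are a stand-in.
            result.append(("synthetic", [0.0] * len(samples)))
        else:
            result.append((kind, samples))
    return result
```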
-
Publication No.: US11315561B2
Publication Date: 2022-04-26
Application No.: US16638540
Filing Date: 2018-03-14
Applicant: D&M Holdings, Inc.
Inventor: Yukihiro Yoshida
IPC: G10L15/00 , G10L15/04 , G10L15/22 , H05B47/12 , H05B45/20 , H05B47/155 , H05B45/325 , H05B45/10 , G06F3/16 , G10L15/08 , G10L15/28 , H04R1/08
Abstract: [Problem] To provide an audio device having a voice operation receiving function with which the state of a voice recognition process can be notified in detail without affecting an audio playback environment, and which is inexpensive and has an excellent degree of freedom in design. [Solution] A wireless speaker 1 has a voice operation receiving function that receives an operation by a voice input into a microphone 11. The wireless speaker comprises: an LED 12; an LED control unit 18 that subjects the LED 12 to PWM control; and a lighting pattern storage unit 17 that stores a lighting pattern in which the brightness is changed on a time axis for each state of a voice recognition process. The LED control unit 18 subjects the LED 12 to PWM control in accordance with the lighting pattern stored in the lighting pattern storage unit 17 corresponding to the state of the voice recognition process performed on the voice input into the microphone 11.
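The per-state lighting pattern amounts to a table of brightness values sampled along a time axis, which the LED controller replays via PWM. A hypothetical sketch of the lookup (the state names and patterns are invented, not taken from the patent):

```python
# Brightness patterns (0-255 PWM duty) sampled on a time axis, one per state.
PATTERNS = {
    "listening":  [0, 64, 128, 192, 255, 192, 128, 64],  # breathing ramp
    "processing": [255, 0] * 4,                          # fast blink
    "done":       [255] * 8,                             # solid on
}

def duty_cycle(state, tick):
    """Return the PWM duty for a recognition state at a given time tick."""
    pattern = PATTERNS.get(state, [0] * 8)  # unknown states stay dark
    return pattern[tick % len(pattern)]
```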
-
Publication No.: US20220122609A1
Publication Date: 2022-04-21
Application No.: US17567491
Filing Date: 2022-01-03
Applicant: Verint Systems Ltd.
Inventor: Roni Romano , Yair Horesh , Jeremie Dreyfuss
Abstract: A method of zoning a transcription of audio data includes separating the transcription of audio data into a plurality of utterances. A probability that each word in an utterance is a meaning unit boundary is calculated. The utterance is split into two new utterances at a word with a maximum calculated probability. At least one of the two new utterances that is shorter than a maximum utterance threshold is identified as a meaning unit.
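The split-at-maximum-probability step can be sketched recursively; the per-word boundary probabilities and the length threshold are inputs that the method would compute upstream, and this simplified version keeps splitting until every piece fits the threshold:

```python
def split_utterance(words, probs, max_len):
    """Recursively split an over-long utterance at the word with the highest
    meaning-unit-boundary probability until all pieces fit `max_len`."""
    if len(words) <= max_len:
        return [words]
    # Best split point: the word most likely to start a new meaning unit.
    i = max(range(1, len(words)), key=lambda k: probs[k])
    return (split_utterance(words[:i], probs[:i], max_len)
            + split_utterance(words[i:], probs[i:], max_len))
```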