Publication number: US20170256262A1
Publication date: 2017-09-07
Application number: US15070827
Filing date: 2016-03-15
Applicant: Wipro Limited
Inventor: Manjunath RAMACHANDRA, Priyanshu SHARMA
IPC: G10L15/26, G06F17/22, G10L15/187, G10L15/25, G10L15/14
CPC classification number: G10L15/265, G06F17/2288, G10L15/14, G10L15/187, G10L15/25, G10L15/26, G10L2015/025
Abstract: This disclosure relates generally to speech recognition, and more particularly to a system and method for speech-to-text conversion using audio as well as video input. In one embodiment, a method is provided for performing speech-to-text conversion. The method comprises receiving audio data and video data of a user while the user is speaking, generating a first raw text based on the audio data via one or more audio-to-text conversion algorithms, generating a second raw text based on the video data via one or more video-to-text conversion algorithms, determining one or more errors by comparing the first raw text and the second raw text, and correcting the one or more errors by applying one or more rules. The one or more rules employ at least one of a domain-specific word database, a context of conversation, and a prior communication history.
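
The comparison and correction steps named in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: it assumes the two raw texts have already been produced by separate audio and video recognizers, it treats every span where the two texts disagree as a candidate error, and it applies a single illustrative rule based on a domain-specific word set. The function names (`domain_score`, `reconcile`), the word-level diff, and the tie-breaking choice in favor of the audio text are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def domain_score(words, domain_words):
    """Fraction of words found in the domain vocabulary (0.0 for an empty span)."""
    if not words:
        return 0.0
    return sum(w in domain_words for w in words) / len(words)


def reconcile(audio_text, video_text, domain_words):
    """Compare two raw transcripts word by word and resolve disagreements.

    Returns (corrected_text, errors), where errors lists each disagreeing
    pair of spans as (audio_words, video_words).
    Assumed rule: keep the span with the higher domain score; on a tie,
    keep the audio span.
    """
    audio = audio_text.lower().split()
    video = video_text.lower().split()
    matcher = SequenceMatcher(a=audio, b=video, autojunk=False)

    corrected, errors = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        a_span, v_span = audio[i1:i2], video[j1:j2]
        if tag == "equal":
            corrected.extend(a_span)
            continue
        errors.append((a_span, v_span))
        if domain_score(v_span, domain_words) > domain_score(a_span, domain_words):
            corrected.extend(v_span)
        else:
            corrected.extend(a_span)
    return " ".join(corrected), errors


if __name__ == "__main__":
    # Hypothetical transcripts and vocabulary, for illustration only.
    text, errs = reconcile(
        "restart the rooter and check the firewall",
        "restart the router and check the fire wall",
        {"router", "firewall"},
    )
    print(text)
    print(errs)
```

A rule based on the context of conversation or on prior communication history, which the abstract also mentions, would replace or supplement `domain_score` with a different scoring function; the diff-and-select structure would stay the same under these assumptions.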