METHOD AND A SYSTEM FOR GENERATING A CONTEXTUAL SUMMARY OF MULTIMEDIA CONTENT

    Publication number: US20180336417A1

    Publication date: 2018-11-22

    Application number: US15638404

    Application date: 2017-06-30

    Applicant: WIPRO LIMITED

    CPC classification number: G06K9/00684 G06F17/241 G06K9/00744 G06K9/00751

    Abstract: The disclosed subject matter relates to paraphrasing multimedia content, including a method and system for generating a contextual summary of multimedia content. A contextual summary generator retrieves the multimedia content, which comprises one or more scenes, from a multimedia content database and generates a scene descriptor describing each scene. An emotion factor is then identified in each scene based on the scene descriptor, speech descriptor and textual descriptor associated with that scene. Upon identifying the emotion factor, a context descriptor indicating the context of each scene is generated based on analysis of the emotion factor and non-speech descriptors. Finally, the scene descriptors, textual descriptors and context descriptors are correlated based on a dynamically configured threshold value to generate the contextual summary of the multimedia content, which saves the time and effort of watching or listening to the entire multimedia content, parts of which may be redundant.
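
    The final correlation step can be pictured with a minimal sketch. The abstract does not say how the descriptors are scored or how the threshold is configured, so everything here is an assumption made for illustration: the `Scene` record, its `score` field, and the choice of the mean score as the "dynamically configured" threshold are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Scene:
    # Descriptor names follow the abstract; the fields themselves are hypothetical.
    scene_descriptor: str
    textual_descriptor: str
    context_descriptor: str
    score: float  # assumed correlation score in [0, 1]; not defined in the source


def dynamic_threshold(scenes):
    # Assumption: the threshold adapts to the content by using the mean score.
    return mean(s.score for s in scenes)


def summarize(scenes):
    """Keep only the scenes whose score reaches the threshold."""
    if not scenes:
        return []
    threshold = dynamic_threshold(scenes)
    return [
        f"{s.scene_descriptor} ({s.context_descriptor}): {s.textual_descriptor}"
        for s in scenes
        if s.score >= threshold
    ]


scenes = [
    Scene("two people argue in a kitchen", "I told you not to go", "tense", 0.9),
    Scene("street establishing shot", "", "neutral", 0.2),
    Scene("phone call at night", "He is safe", "relieved", 0.7),
]
summary = summarize(scenes)
```

    With these sample scores the mean is 0.6, so the first and third scenes are kept and the low-scoring establishing shot is dropped.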

    System and Method for Speech-to-Text Conversion

    Publication number: US20170256262A1

    Publication date: 2017-09-07

    Application number: US15070827

    Application date: 2016-03-15

    Applicant: Wipro Limited

    Abstract: This disclosure relates generally to speech recognition, and more particularly to a system and method for speech-to-text conversion using audio as well as video input. In one embodiment, a method is provided for performing speech-to-text conversion. The method comprises receiving audio data and video data of a user while the user is speaking, generating a first raw text based on the audio data via one or more audio-to-text conversion algorithms, generating a second raw text based on the video data via one or more video-to-text conversion algorithms, determining one or more errors by comparing the first raw text and the second raw text, and correcting the one or more errors by applying one or more rules. The one or more rules employ at least one of a domain-specific word database, a context of conversation, and a prior communication history.
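
    The compare-and-correct step can be illustrated with a minimal sketch. It assumes the two raw texts are already available as strings and stands in for the rules with a single, hypothetical one: where the transcripts disagree, prefer the video-derived words if they appear in a domain word set and the audio-derived words do not. The function name `reconcile`, the word-level alignment via `difflib.SequenceMatcher`, and the rule itself are assumptions, not the method claimed in the patent.

```python
from difflib import SequenceMatcher


def reconcile(audio_text, video_text, domain_words):
    """Merge two raw transcripts, resolving disagreements with a domain word set."""
    audio = audio_text.split()
    video = video_text.split()
    merged = []
    # Align the two word sequences; every non-"equal" opcode marks a candidate error.
    matcher = SequenceMatcher(a=audio, b=video, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        audio_span, video_span = audio[i1:i2], video[j1:j2]
        if tag == "equal":
            merged.extend(audio_span)
            continue
        video_known = bool(video_span) and all(
            w.lower() in domain_words for w in video_span
        )
        audio_known = all(w.lower() in domain_words for w in audio_span)
        # Hypothetical rule: trust the video text only when it is in-domain
        # and the audio text is not; otherwise keep the audio text.
        merged.extend(video_span if video_known and not audio_known else audio_span)
    return " ".join(merged)


corrected = reconcile(
    "restart the rooter now",
    "restart the router now",
    domain_words={"router"},
)
```

    Here the transcripts differ only in the third word, and `router` is in the domain word set while `rooter` is not, so the video-derived word is chosen. Rules based on conversation context or prior communication history would need additional inputs that this sketch does not model.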
