Training set sufficiency for image analysis

    公开(公告)号:US10902288B2

    公开(公告)日:2021-01-26

    申请号:US15977517

    申请日:2018-05-11

    Abstract: Aspects of the technology described herein improve an object recognition system by specifying a type of picture that would improve the accuracy of the object recognition system if used to retrain the object recognition system. The technology described herein can take the form of an improvement model that improves an object recognition model by suggesting the types of training images that would improve the object recognition model's performance. For example, the improvement model could suggest that a picture of a person smiling be used to retrain the object recognition system. Once trained, the improvement model can be used to estimate a performance score for an image recognition model given the set characteristics of a set of training of images. The improvement model can then select a feature of an image, which if added to the training set, would cause a meaningful increase in the recognition system's performance.

    Multichannel audio speech classification

    公开(公告)号:US11900961B2

    公开(公告)日:2024-02-13

    申请号:US17804606

    申请日:2022-05-31

    CPC classification number: G10L25/69 G10L19/008 G10L19/173 G10L25/06 G10L25/21

    Abstract: Examples of the present disclosure describe systems and methods for multichannel audio speech classification. In examples, an audio signal comprising multiple audio channels is received at a processing device. Each of the audio channels in the audio signal is transcoded to a predefined audio format. For each of the transcoded audio channels, an average power value is calculated for one or more data windows in the audio signal. A correlation value is calculated between the average power value for each audio channel and the combined average power value of the other audio channels in the audio signal. Each of the correlation values (or an aggregated correlation value for the audio channels) is then compared against a threshold value to determine whether the audio signal is to be classified as a speech-based communication. Based on the classification, an action associated with the audio signal may be performed.

    System and method for speaker role determination and scrubbing identifying information

    公开(公告)号:US11182504B2

    公开(公告)日:2021-11-23

    申请号:US16397738

    申请日:2019-04-29

    Abstract: Methods for speaker role determination and scrubbing identifying information are performed by systems and devices. In speaker role determination, data from an audio or text file is divided into respective portions related to speaking parties. Characteristics classifying the portions of the data for speaking party roles are identified in the portions to generate data sets from the portions corresponding to the speaking party roles and to assign speaking party roles for the data sets. For scrubbing identifying information in data, audio data for speaking parties is processed using speech recognition to generate a text-based representation. Text associated with identifying information is determined based on a set of key words/phrases, and a portion of the text-based representation that includes a part of the text is identified. A segment of audio data that corresponds to the identified portion is replaced with different audio data, and the portion is replaced with different text.

    System and method for speaker role determination and scrubbing identifying information

    公开(公告)号:US11062706B2

    公开(公告)日:2021-07-13

    申请号:US16397745

    申请日:2019-04-29

    Abstract: Methods for speaker role determination and scrubbing identifying information are performed by systems and devices. In speaker role determination, data from an audio or text file is divided into respective portions related to speaking parties. Characteristics classifying the portions of the data for speaking party roles are identified in the portions to generate data sets from the portions corresponding to the speaking party roles and to assign speaking party roles for the data sets. For scrubbing identifying information in data, audio data for speaking parties is processed using speech recognition to generate a text-based representation. Text associated with identifying information is determined based on a set of key words/phrases, and a portion of the text-based representation that includes a part of the text is identified. A segment of audio data that corresponds to the identified portion is replaced with different audio data, and the portion is replaced with different text.

    Video segmentation and searching by segmentation dimensions

    公开(公告)号:US10560734B2

    公开(公告)日:2020-02-11

    申请号:US15492972

    申请日:2017-04-20

    Abstract: In various embodiments, methods and systems for implementing video segmentation are provided. A video management system implements a video segment manager that supports generating enhanced segmented video. Enhanced segmented video is a time-based segment of video content. Enhanced segmented video is generated based on a video content cognitive index, segmentation dimensions, segmentation rules and segment reconstruction rules. The video content cognitive index is built for indexing video content. Segmentation rules are applied to the video content to break the video content into time-based segments, the time-based segments are associated with corresponding segmentation dimensions for the video content. Segment reconstruction rules are then applied to the time-based segments to reconstruct the time-based segments into enhanced segmented video. The enhanced segmented video and corresponding values of the segmentation dimensions can be leveraged as distinct portions of the video content for different types of functionality in the video management system.

    Training set sufficiency for image analysis

    公开(公告)号:US12169984B2

    公开(公告)日:2024-12-17

    申请号:US17157427

    申请日:2021-01-25

    Abstract: Aspects of the technology described herein improve an object recognition system by specifying a type of picture that would improve the accuracy of the object recognition system if used to retrain the object recognition system. The technology described herein can take the form of an improvement model that improves an object recognition model by suggesting the types of training images that would improve the object recognition model's performance. For example, the improvement model could suggest that a picture of a person smiling be used to retrain the object recognition system. Once trained, the improvement model can be used to estimate a performance score for an image recognition model given the set characteristics of a set of training of images. The improvement model can then select a feature of an image, which if added to the training set, would cause a meaningful increase in the recognition system's performance.

Patent Agency Ranking