摘要:
An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation so that a listener has the ability to recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced. A modified audio signal is then synthesized with the original prosodic information and the modified vocal tract transfer function to produce unintelligible speech that preserves the pitch and energy of the speech as well as environmental sounds.
摘要:
An audio privacy system reduces the intelligibility of speech in an audio signal while preserving prosodic information, such as pitch, relative energy and intonation so that a listener has the ability to recognize environmental sounds but not the speech itself. An audio signal is processed to separate non-vocalic information, such as pitch and relative energy of speech, from vocalic regions, after which syllables are identified within the vocalic regions. Representations of the vocalic regions are computed to produce a vocal tract transfer function and an excitation. The vocal tract transfer function for each syllable is then replaced with the vocal tract transfer function from another prerecorded vocalic sound. In one aspect, the identity of the replacement vocalic sound is independent of the identity of the syllable being replaced. A modified audio signal is then synthesized with the original prosodic information and the modified vocal tract transfer function to produce unintelligible speech that preserves the pitch and energy of the speech as well as environmental sounds.
摘要:
Systems and methods for interactive, user-driven detection, creation and completion of form fields in a digital document are provided. A document with form fields that require completion by a user is received, after which form fields are detected at the direction of the user. Once the user selects a possible form field, the system creates the appropriate fillable form field based on size, type, location, related text and other parameters of the form field and surrounding document. Additional levels of interaction include predictive text, pattern development and automatic completion of previously completed fields.
摘要:
The subject invention relates to a system and method for video summarization, and more specifically to a system for segmenting and classifying data from a video in order to create a summary video that preserves and summarizes relevant content. In one embodiment, the system first extracts appearance, motion, and audio features from a video in order to create video segments corresponding to the extracted features. The video segments are then classified as dynamic or static depending on the appearance-based and motion-based features extracted from each video segment. The classified video segments are then grouped into clusters to eliminate redundant content. Select video segments from each cluster are selected as summary segments, and the summary segments are compiled to form a summary video. The parameters for any of the steps in the summarization of the video can be altered so that a user can adapt the system to any type of video, although the system is designed to summarize unstructured videos where the content is unknown. In another aspect, audio features can also be used to further summarize video with certain audio properties.
摘要:
The subject invention relates to a system and method for video summarization, and more specifically to a system for segmenting and classifying data from a video in order to create a summary video that preserves and summarizes relevant content. In one embodiment, the system first extracts appearance, motion, and audio features from a video in order to create video segments corresponding to the extracted features. The video segments are then classified as dynamic or static depending on the appearance-based and motion-based features extracted from each video segment. The classified video segments are then grouped into clusters to eliminate redundant content. Select video segments from each cluster are selected as summary segments, and the summary segments are compiled to form a summary video. The parameters for any of the steps in the summarization of the video can be altered so that a user can adapt the system to any type of video, although the system is designed to summarize unstructured videos where the content is unknown. In another aspect, audio features can also be used to further summarize video with certain audio properties.
摘要:
A system helps filter and correct video captured and streamed from a mobile device. In particular, the system detects and streams content shown on screens, allowing anyone to stream screen content immediately without needing to develop hooks into external software (i.e. without installing a screen recorder software in the computer). The system can use a variety of user-selectable techniques to detect the screen, and utilizes the mobile device's touchscreen to allow users to manually override detected corners. However, some of these approaches could potentially be applied to other types of content, such as identifying TV screens, appliance LCD screens, other mobile devices' screens, multifunction devices. (e.g. a remote technician could help troubleshoot a malfunctioning MFD by having the end-user point his cellphone to the LCD screen of the MFD).
摘要:
Embodiments of the present invention include a video server that can detect and track the image of a pointing indicator in an input video stream representation of a computer display. The video server checks ordered frames of the video signal and determines movements for a pointing indicator such as a mouse arrow. Certain motions by the pointing indicator, such as lingering over a button or menu item or circling a button or menu item can provoke a control action on the server.
摘要:
A method for navigating instructional video presentations is disclosed. The method includes determining a pause mode of a video presentation, and playing the video presentation on a display device. The video presentation has one or more predetermined pause positions. The method also includes, while playing the video presentation, determining that the video presentation has reached one of the one or more pause positions. The method further includes, in accordance with a determination that the video presentation is in a first pause mode, pausing the video presentation at the one of the one or more pause positions and maintaining a display of a paused frame of the video presentation, and, in accordance with a determination that the video presentation is in a second pause mode distinct from the first pause mode, continuing to play the video presentation through the one of the one or more pause positions.
摘要:
A method and system for delivery of targeted advertisement via multifunction document imaging devices. Imaging devices used for copying, scanning, faxing and printing documents are used to deliver advertisements, coupons, and other promotional material to users. The imaging device is capable of delivering targeted promotional material based on analysis of the documents content passing through the device. Targeting is based on device history, user history or user demographics. Device history and user history are compiled from the contents of the documents processed respectively at a device and by a user. Demographics are inferred from a demographics model using user identity or document content input to the model. Advertisements may be delivered via paper, the device display, and other means.