-
Publication No.: US08135221B2
Publication date: 2012-03-13
Application No.: US12574716
Filing date: 2009-10-07
CPC classification: G06K9/00765 , G10L25/00
Abstract: A method for determining a classification for a video segment, comprising the steps of: breaking the video segment into a plurality of short-term video slices, each including a plurality of video frames and an audio signal; analyzing the video frames for each short-term video slice to form a plurality of region tracks; analyzing each region track to form a visual feature vector and a motion feature vector; analyzing the audio signal for each short-term video slice to determine an audio feature vector; forming a plurality of short-term audio-visual atoms for each short-term video slice by combining the visual feature vector and the motion feature vector for a particular region track with the corresponding audio feature vector; and using a classifier to determine a classification for the video segment responsive to the short-term audio-visual atoms.
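The atom-forming step in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the use of plain concatenation to "combine" the vectors, and the nearest-centroid rule standing in for the unspecified classifier. The abstract does not say how features are extracted or which classifier is used, so feature vectors are taken as given inputs.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
# One region track contributes a (visual, motion) pair of feature vectors.
RegionTrack = Tuple[Sequence[float], Sequence[float]]
# One short-term slice: its region tracks plus a single audio feature vector.
Slice = Tuple[List[RegionTrack], Sequence[float]]


def form_atoms(region_tracks: List[RegionTrack], audio: Sequence[float]) -> List[Vector]:
    """Build one atom per region track of a slice.

    Assumption: 'combining' is modelled as concatenating the visual,
    motion, and audio feature vectors.
    """
    return [list(visual) + list(motion) + list(audio) for visual, motion in region_tracks]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def classify_segment(slices: List[Slice], centroids: Dict[str, Vector]) -> str:
    """Label a video segment from its atoms.

    Assumption: a nearest-centroid rule on the mean atom is used purely
    as a stand-in classifier; it is not the patented method.
    """
    atoms = [atom for tracks, audio in slices for atom in form_atoms(tracks, audio)]
    pooled = mean_vector(atoms)

    def squared_distance(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(pooled, centroids[label]))


# Toy data: one slice with two region tracks and one audio feature vector.
toy_slices = [([([1.0], [0.5]), ([0.0], [0.2])], [0.9])]
toy_centroids = {"a": [1.0, 0.5, 0.9], "b": [0.0, 0.0, 0.0]}
label = classify_segment(toy_slices, toy_centroids)
```

Each slice yields as many atoms as it has region tracks, and every atom of a slice shares that slice's audio feature vector.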
-
Publication No.: US20120281969A1
Publication date: 2012-11-08
Application No.: US13099391
Filing date: 2011-05-03
Applicant: Wei Jiang , Alexander C. Loui , Courtenay Cotton
Inventor: Wei Jiang , Alexander C. Loui , Courtenay Cotton
IPC classification: G11B27/00
CPC classification: G11B27/034 , G11B27/11
Abstract: A method for producing an audio-visual slideshow for a video sequence having an audio soundtrack and a corresponding video track including a time sequence of image frames, comprising: segmenting the audio soundtrack into a plurality of audio segments; subdividing the audio segments into a sequence of audio frames; determining a corresponding audio classification for each audio frame; automatically selecting a subset of the audio segments responsive to the audio classification for the corresponding audio frames; for each of the selected audio segments automatically analyzing the corresponding image frames to select one or more key image frames; merging the selected audio segments to form an audio summary; forming an audio-visual slideshow by combining the selected key frames with the audio summary, wherein the selected key frames are displayed synchronously with their corresponding audio segment; and storing the audio-visual slideshow in a processor-accessible storage memory.
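The selection and synchronization steps in the abstract above can be sketched roughly as follows. All names and rules are hypothetical: the abstract does not state how segments are chosen from the frame classifications or how key frames are picked, so this sketch assumes a simple "fraction of wanted labels" threshold and a precomputed per-frame score. Audio classification and image analysis themselves are treated as given inputs.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class AudioSegment:
    start: float             # seconds in the original video
    end: float
    frame_labels: List[str]  # one classification label per audio frame


def select_segments(segments: List[AudioSegment], wanted: Set[str],
                    min_fraction: float = 0.5) -> List[AudioSegment]:
    """Keep segments where enough audio frames carry a wanted label.

    Assumption: the threshold rule is illustrative only.
    """
    selected = []
    for seg in segments:
        if not seg.frame_labels:
            continue
        hits = sum(1 for lab in seg.frame_labels if lab in wanted)
        if hits / len(seg.frame_labels) >= min_fraction:
            selected.append(seg)
    return selected


def pick_key_frame(seg: AudioSegment,
                   frame_scores: List[Tuple[float, float]]) -> Optional[float]:
    """Return the timestamp of the best-scoring image frame inside the segment.

    frame_scores holds (timestamp, score) pairs; the score is a
    hypothetical stand-in for whatever image analysis is used.
    """
    inside = [(t, s) for t, s in frame_scores if seg.start <= t < seg.end]
    if not inside:
        return None
    return max(inside, key=lambda pair: pair[1])[0]


def build_slideshow(segments: List[AudioSegment],
                    frame_scores: List[Tuple[float, float]],
                    wanted: Set[str]) -> List[Tuple[float, float, float]]:
    """Return (key_frame_time, show_from, show_until) on the summary timeline.

    Selected segments are laid end to end, which models merging them into
    an audio summary; each key frame is shown for its segment's duration.
    Segments with no image frame are dropped from the summary as well.
    """
    slides = []
    cursor = 0.0
    for seg in select_segments(segments, wanted):
        key = pick_key_frame(seg, frame_scores)
        if key is None:
            continue
        duration = seg.end - seg.start
        slides.append((key, cursor, cursor + duration))
        cursor += duration
    return slides


toy_segments = [
    AudioSegment(0.0, 2.0, ["speech", "speech", "noise"]),
    AudioSegment(2.0, 5.0, ["noise", "noise"]),
    AudioSegment(5.0, 6.0, ["music"]),
]
toy_scores = [(0.5, 0.2), (1.5, 0.9), (3.0, 0.8), (5.5, 0.4)]
slides = build_slideshow(toy_segments, toy_scores, {"speech", "music"})
```

Laying the selected segments end to end keeps each key frame aligned with its own audio, which is the synchronization property the abstract describes.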
-
Publication No.: US20110081082A1
Publication date: 2011-04-07
Application No.: US12574716
Filing date: 2009-10-07
CPC classification: G06K9/00765 , G10L25/00
Abstract: A method for determining a classification for a video segment, comprising the steps of: breaking the video segment into a plurality of short-term video slices, each including a plurality of video frames and an audio signal; analyzing the video frames for each short-term video slice to form a plurality of region tracks; analyzing each region track to form a visual feature vector and a motion feature vector; analyzing the audio signal for each short-term video slice to determine an audio feature vector; forming a plurality of short-term audio-visual atoms for each short-term video slice by combining the visual feature vector and the motion feature vector for a particular region track with the corresponding audio feature vector; and using a classifier to determine a classification for the video segment responsive to the short-term audio-visual atoms.
-
Publication No.: US10134440B2
Publication date: 2018-11-20
Application No.: US13099391
Filing date: 2011-05-03
Applicant: Wei Jiang , Alexander C. Loui , Courtenay Cotton
Inventor: Wei Jiang , Alexander C. Loui , Courtenay Cotton
IPC classification: G11B27/034 , G11B27/11
Abstract: A method for producing an audio-visual slideshow for a video sequence having an audio soundtrack and a corresponding video track including a time sequence of image frames, comprising: segmenting the audio soundtrack into a plurality of audio segments; subdividing the audio segments into a sequence of audio frames; determining a corresponding audio classification for each audio frame; automatically selecting a subset of the audio segments responsive to the audio classification for the corresponding audio frames; for each of the selected audio segments automatically analyzing the corresponding image frames to select one or more key image frames; merging the selected audio segments to form an audio summary; forming an audio-visual slideshow by combining the selected key frames with the audio summary, wherein the selected key frames are displayed synchronously with their corresponding audio segment; and storing the audio-visual slideshow in a processor-accessible storage memory.