-
Publication (Announcement) No.: US11342003B1
Publication (Announcement) Date: 2022-05-24
Application No.: US16711797
Filing Date: 2019-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Christian Garcia Siagian , Christian Ciabattoni , David Niu , Lawrence Kyuil Chang , Gordon Zheng , Ritesh Pase , Shiva Krishnamurthy , Ramakanth Mudumba
Abstract: Disclosed are various embodiments for segmenting and classifying video content using sounds. In one embodiment, a plurality of segments of a video content item are generated by analyzing audio accompanying the video content item. A subset of the plurality of segments that correspond to music segments is selected based at least in part on an audio characteristic of the subset of the plurality of segments. Individual segments of the subset of the plurality of segments are processed to determine whether a classification applies to the individual segments. A list of segments of the video content item to which the classification applies is generated.
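The flow in this abstract (segment by audio, keep the music segments, test each for a classification, list the matches) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not say how segments are formed or how the classification is decided, so the per-frame labels, the `Segment` type, and the duration-based placeholder classifier are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    kind: str     # hypothetical audio label, e.g. "music" or "speech"


def group_frames(labels: List[str], frame_seconds: float) -> List[Segment]:
    """Merge consecutive frames that share a label into one segment."""
    segments: List[Segment] = []
    for i, label in enumerate(labels):
        start = i * frame_seconds
        if segments and segments[-1].kind == label:
            segments[-1].end = start + frame_seconds
        else:
            segments.append(Segment(start, start + frame_seconds, label))
    return segments


def segments_with_classification(
    segments: List[Segment], kind: str, applies: Callable[[Segment], bool]
) -> List[Segment]:
    """Keep segments of the given kind for which the classifier returns True."""
    return [s for s in segments if s.kind == kind and applies(s)]


# Hypothetical per-frame audio labels, one label per second of audio.
labels = ["speech", "speech", "music", "music", "music", "speech", "music"]
segments = group_frames(labels, frame_seconds=1.0)

# Placeholder classifier (an assumption): music lasting at least 2 seconds.
matches = segments_with_classification(
    segments, "music", lambda s: s.end - s.start >= 2.0
)
```

In practice the placeholder lambda would be replaced by whatever model decides that the classification applies to a segment.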
-
Publication (Announcement) No.: US11120839B1
Publication (Announcement) Date: 2021-09-14
Application No.: US16711841
Filing Date: 2019-12-12
Applicant: Amazon Technologies, Inc.
Inventor: Christian Garcia Siagian , Christian Ciabattoni , David Niu , Lawrence Kyuil Chang , Gordon Zheng , Ritesh Pase , Shiva Krishnamurthy , Ramakanth Mudumba
Abstract: Disclosed are various embodiments for segmenting and classifying video content using conversation. In one embodiment, a plurality of segments of a video content item are generated by analyzing audio accompanying the video content item. A subset of the plurality of segments that correspond to conversation segments is selected. Individual segments of the subset of the plurality of segments are processed to determine whether a classification applies to the individual segments. A list of segments of the video content item to which the classification applies is generated.
-
Publication (Announcement) No.: US12190871B1
Publication (Announcement) Date: 2025-01-07
Application No.: US17468415
Filing Date: 2021-09-07
Applicant: Amazon Technologies, Inc.
Inventor: Christian Garcia Siagian , Charles Effinger , Nicholas Ren-Jie Capel , Jobel Kyle Petallana Vecino , Gordon Zheng , Kymry Michael Burwell , Stephen Andrew Low
IPC: G10L15/04 , G06Q30/0241 , G10L15/16 , G10L15/18
Abstract: Techniques and methods are disclosed for detecting long-form audio content in one or more audio files. A computing system receives first audio data corresponding to a first version of an audio file and second audio data corresponding to a second version of the audio file. The computing system generates a first transcript of the first audio data and a second transcript of the second audio data. The computing system compares the first audio data and the second audio data and the first transcript and the second transcript to identify advertisement portions and content portions of the audio data. Using a semantic model based on a machine learning (ML) transformer, the computing system can determine advertisement segments within the advertisement portions, the advertisement segments corresponding to separate advertisements. Information corresponding to the duration and location of the advertisement segments is stored in a data store of the computing system.
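The transcript comparison step can be illustrated with a small sketch. It assumes, which the abstract does not state, that the two versions share their content and differ only in the advertisements inserted, so word runs common to both transcripts are treated as content and runs unique to one version as candidate advertisement portions. Python's standard `difflib.SequenceMatcher` stands in for the comparison; the audio comparison and the transformer-based semantic model that separates individual advertisements are not shown. The function name and sample transcripts are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def split_shared_and_unique(
    words_a: List[str], words_b: List[str]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (shared, unique_a, unique_b) as lists of word runs."""
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    shared, unique_a, unique_b = [], [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Present in both versions: treated as content.
            shared.append(" ".join(words_a[i1:i2]))
        else:
            # Present in only one version: candidate advertisement portion.
            if i2 > i1:
                unique_a.append(" ".join(words_a[i1:i2]))
            if j2 > j1:
                unique_b.append(" ".join(words_b[j1:j2]))
    return shared, unique_a, unique_b


# Hypothetical transcripts of two versions of the same audio file.
version_a = "welcome to the show buy fresh coffee today now our guest".split()
version_b = "welcome to the show try the new phone plan now our guest".split()

shared, ads_a, ads_b = split_shared_and_unique(version_a, version_b)
```

A real system would also need word timestamps from the transcripts to convert these runs into the duration and location information the abstract says is stored.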
-
Publication (Announcement) No.: US12026199B1
Publication (Announcement) Date: 2024-07-02
Application No.: US17690931
Filing Date: 2022-03-09
Applicant: Amazon Technologies, Inc.
Inventor: Christian Garcia Siagian , Vedant Ulhas Shete , Timothy William Stephani , Jobel Kyle Petallana Vecino , Gordon Zheng
IPC: G06F16/683 , G06F16/34 , G06F16/68 , G06F40/284 , G06F40/295
CPC classification number: G06F16/685 , G06F16/345 , G06F16/686 , G06F40/284 , G06F40/295
Abstract: Pages describing episodes of podcasts or other media entities are constructed by interpreting content of the media entities. A transcript of an episode is determined by one or more natural language understanding techniques and divided into chapters. For each of the chapters, a summary sentence of the chapter and one or more key phrases are determined from the transcript, and participants in the chapter are identified. A summary of the episode is determined from the summary sentences of each of the chapters. A page that describes the episode of the podcast including the summary of the episode, as well as one or more of the key phrases and identities of the participants is generated and provided to prospective listeners to the episode.
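The chapter-to-episode summarization flow can be sketched as follows. This is only an illustration under stated assumptions: the transcript is taken as already divided into chapters, and a simple word-frequency extractive score stands in for whatever summarization and key-phrase techniques the patent actually uses. Participant identification and page generation are omitted, and the stopword list, function names, and sample sentences are hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple

# Small illustrative stopword list (an assumption, not exhaustive).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on"}


def content_words(text: str) -> List[str]:
    """Lowercase the text and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize_chapter(
    sentences: List[str], num_keywords: int = 2
) -> Tuple[str, List[str]]:
    """Pick the sentence with the highest average word frequency, plus top words."""
    counts = Counter(w for s in sentences for w in content_words(s))

    def score(sentence: str) -> float:
        words = content_words(sentence)
        return sum(counts[w] for w in words) / max(len(words), 1)

    summary = max(sentences, key=score)
    keywords = [w for w, _ in counts.most_common(num_keywords)]
    return summary, keywords


# Hypothetical chapters, each a list of transcript sentences.
chapters = [
    ["Coffee roasting changes flavor.",
     "Light roasting keeps coffee bright.",
     "Our host likes tea."],
    ["Brewing needs clean water.",
     "Water temperature also shapes brewing.",
     "Grinders vary a lot."],
]

chapter_results = [summarize_chapter(c) for c in chapters]
# The episode summary is built from the per-chapter summary sentences.
episode_summary = " ".join(summary for summary, _ in chapter_results)
```

The point of the sketch is the two-level structure: per-chapter summary sentences and key phrases are computed first, and the episode summary is derived from the chapter summaries rather than from the full transcript.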
-