AUTOMATIC STORY GENERATION FOR LIVE MEDIA
    Invention Application

    Publication No.: US20190197315A1

    Publication Date: 2019-06-27

    Application No.: US15850697

    Filing Date: 2017-12-21

    Applicant: Facebook, Inc.

    Abstract: Exemplary embodiments relate to the automatic generation of captions for visual media in the form of a consistent story or narrative. According to some embodiments, story generation may be applied to a live video. As a user records live video, a system may analyze metadata, the frames of the video, and/or the audio to extract context information. The system may integrate this information with information from the user's social network and a personalized language model built using public-facing language from the user. The system may generate multiple captions for the video, where subsequent captions are based at least partially on previous captions. Captions may be generated in a story format so as to be consistent with each other. Information that is inconsistent with the story may be excluded from the captions unless contextual factors indicate that the story should change subject.
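
    The stateful captioning described in the abstract can be sketched in a few lines of Python. This is a toy illustration, not Facebook's implementation: StoryState, generate_caption, and the subject_change_score threshold are all invented names standing in for the patent's story state, caption generator, and contextual subject-change signal.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class StoryState:
        # Running narrative state: later captions are conditioned on it
        # so they stay consistent with the captions generated before them.
        captions: list = field(default_factory=list)
        subject: Optional[str] = None

    def generate_caption(context: dict, state: StoryState) -> str:
        # `context` stands in for the fused signals the abstract lists:
        # metadata, frame/audio analysis, and social-network information.
        subject = context.get("subject", state.subject)
        # Exclude information that contradicts the story so far, unless a
        # strong contextual cue indicates the story should change subject.
        if state.subject and subject != state.subject:
            if context.get("subject_change_score", 0.0) < 0.8:
                subject = state.subject
        detail = context.get("detail", "the scene")
        if not state.captions:
            caption = f"Live from {subject}: {detail}."
        else:
            caption = f"Still at {subject}: now {detail}."
        state.subject = subject
        state.captions.append(caption)
        return caption

    state = StoryState()
    print(generate_caption({"subject": "the beach", "detail": "waves rolling in"}, state))
    # A low-confidence subject change is suppressed to keep the story consistent.
    print(generate_caption({"subject": "a cafe", "detail": "a surfer paddling out",
                            "subject_change_score": 0.2}, state))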

    METHOD AND SYSTEM FOR GENERATING A TIME-LAPSE VIDEO

    Publication No.: US20190191226A1

    Publication Date: 2019-06-20

    Application No.: US15846351

    Filing Date: 2017-12-19

    Applicant: Oath Inc.

    Abstract: One or more computing devices, systems, and/or methods for generating and/or presenting time-lapse videos and/or live-stream videos are provided. For example, a plurality of video frames may be extracted from a video. A first set of video frames and a second set of video frames may be identified from the plurality of video frames. The first set of video frames may be combined to generate a first time-lapse video frame and the second set of video frames may be combined to generate a second time-lapse video frame. A time-lapse video may be generated based upon the first time-lapse video frame and the second time-lapse video frame. In another example, a time-lapse video may be generated based upon a recorded video associated with a live-stream video. The time-lapse video may be presented. Responsive to completion of the presentation of the time-lapse video, the live-stream video may be presented.
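
    A minimal sketch of the frame-combination step follows. It assumes plain averaging as the combining operation, which the abstract does not specify; combine_frames, make_time_lapse, and group_size are illustrative names, and NumPy arrays stand in for decoded video frames.

    import numpy as np

    def combine_frames(frames):
        # Collapse one set of consecutive frames into a single time-lapse
        # frame. Averaging is an assumed choice; the abstract only says
        # the frames are "combined".
        return np.mean(np.stack(frames), axis=0).astype(np.uint8)

    def make_time_lapse(frames, group_size=30):
        # Identify consecutive sets of frames and combine each set: the
        # first set yields the first time-lapse frame, the second set the
        # second, and so on.
        return [combine_frames(frames[i:i + group_size])
                for i in range(0, len(frames) - group_size + 1, group_size)]

    # Synthetic stand-in for a plurality of frames extracted from a video.
    frames = [np.full((720, 1280, 3), i % 256, dtype=np.uint8) for i in range(300)]
    time_lapse = make_time_lapse(frames)
    print(len(time_lapse))  # 300 frames in sets of 30 -> 10 time-lapse frames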

    Method and system for generating a time-lapse video

    Publication No.: US10327046B1

    Publication Date: 2019-06-18

    Application No.: US15846351

    Filing Date: 2017-12-19

    Applicant: Oath Inc.

    Abstract: One or more computing devices, systems, and/or methods for generating and/or presenting time-lapse videos and/or live-stream videos are provided. For example, a plurality of video frames may be extracted from a video. A first set of video frames and a second set of video frames may be identified from the plurality of video frames. The first set of video frames may be combined to generate a first time-lapse video frame and the second set of video frames may be combined to generate a second time-lapse video frame. A time-lapse video may be generated based upon the first time-lapse video frame and the second time-lapse video frame. In another example, a time-lapse video may be generated based upon a recorded video associated with a live-stream video. The time-lapse video may be presented. Responsive to completion of the presentation of the time-lapse video, the live-stream video may be presented.
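
    The presentation order in the abstract's second example, which this grant shares with the application above, can be sketched as follows. The code is a hypothetical illustration: present, display, and the fps parameter are invented, and live_stream is assumed to be any iterator that yields frames as they arrive.

    import time

    def display(frame):
        # Placeholder for an actual rendering call.
        pass

    def present(time_lapse_frames, live_stream, fps=30):
        # First play the time-lapse generated from the recorded video
        # associated with the live stream, catching the viewer up quickly.
        for frame in time_lapse_frames:
            display(frame)
            time.sleep(1 / fps)
        # Responsive to completing the time-lapse, present the live stream.
        for frame in live_stream:
            display(frame)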

    System and method for multi-modal fusion based fault-tolerant video content recognition

    Publication No.: US10013487B2

    Publication Date: 2018-07-03

    Application No.: US15007872

    Filing Date: 2016-01-27

    IPC Classes: G06K9/00 G06F17/30 G06K9/62

    Abstract: A system and a method for multi-modal fusion based fault-tolerant video content recognition are disclosed. The method conducts multi-modal recognition on an input video to extract multiple components and their respective appearance times in the video. Next, the multiple components are categorized and recognized respectively via different algorithms. When the recognition confidence of any component is insufficient, cross-validation with other components is performed to increase the recognition confidence and improve the fault tolerance of the components. Furthermore, when the recognition confidence of an individual component is insufficient, recognition continues and the component is tracked, spatially and temporally where applicable, until frames with high recognition confidence over a continuous time period are reached. Finally, multi-modal fusion is performed to summarize and resolve any recognition discrepancies between the multiple components, and to generate indices for every time frame for ease of future text-based queries.
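
    The cross-validation and fusion steps can be illustrated with a short sketch. The code below is an assumed toy model, not the patented algorithm: the component dictionaries, the additive confidence boost, and the 0.6 threshold are all invented for illustration; each entry stands for one recognized component (a face track, a speech segment, on-screen text) at a given time frame t.

    def cross_validate(component, others, boost=0.15):
        # If a component's recognition confidence is insufficient, check
        # whether co-occurring components from other modalities agree with
        # its label and raise the confidence for each agreement.
        agree = sum(1 for o in others if o["label"] == component["label"])
        component["confidence"] = min(1.0, component["confidence"] + boost * agree)
        return component

    def fuse(components, threshold=0.6):
        # Multi-modal fusion pass: cross-validate weak components, then
        # keep, per time frame, the highest-confidence label so the result
        # can be indexed for text-based queries.
        index = {}
        for c in components:
            if c["confidence"] < threshold:
                others = [o for o in components if o is not c and o["t"] == c["t"]]
                c = cross_validate(c, others)
            best = index.get(c["t"])
            if best is None or c["confidence"] > best["confidence"]:
                index[c["t"]] = c
        return index

    components = [
        {"t": 0, "modality": "face",   "label": "alice", "confidence": 0.45},
        {"t": 0, "modality": "speech", "label": "alice", "confidence": 0.80},
        {"t": 0, "modality": "ocr",    "label": "alice", "confidence": 0.70},
    ]
    # Agreement across modalities lifts the weak face track; the fused
    # index for time frame 0 keeps the highest-confidence component.
    print(fuse(components)[0])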