Dynamic detection and recognition of media subjects

    Publication No.: US11450107B1

    Publication date: 2022-09-20

    Application No.: US17197478

    Filing date: 2021-03-10

    Abstract: A system for indexing animated content receives detections extracted from a media file, where each one of the detections includes an image extracted from a corresponding frame of the media file that corresponds to a detected instance of an animated character. The system determines, for each of the received detections, an embedding defining a set of characteristics for the detected instance. The embedding associated with each detection is provided to a grouping engine that is configured to dynamically configure at least one grouping parameter based on a total number of the detections received. The grouping engine is also configured to sort the detections into groups using the grouping parameter and the embedding for each detection. A character ID is assigned to each one of the groups of detections, and the system indexes the groups of detections in a database in association with the character ID assigned to each group.
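
The grouping step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the patent: the abstract does not say which grouping parameter is configured or how, so the sketch uses a minimum group size that grows with the logarithm of the detection count, a greedy cosine-distance grouping, a plain dictionary in place of the database, and hypothetical names such as `dynamic_min_group_size` and `index_detections`.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dynamic_min_group_size(total_detections):
    """Hypothetical rule: the minimum group size grows with log2 of the
    total number of detections and never drops below 2."""
    if total_detections < 4:
        return 2
    return max(2, int(math.log2(total_detections)))


def index_detections(detections, max_distance=0.1):
    """Group (frame, embedding) pairs and index frames by character ID.

    Greedy grouping (an assumption): a detection joins the first group
    whose representative embedding is within max_distance, otherwise it
    starts a new group. Groups smaller than the dynamically configured
    minimum size are discarded as noise.
    """
    min_size = dynamic_min_group_size(len(detections))
    groups = []  # each entry: (representative_embedding, [frame, ...])
    for frame, embedding in detections:
        for representative, frames in groups:
            if cosine_distance(representative, embedding) < max_distance:
                frames.append(frame)
                break
        else:
            groups.append((embedding, [frame]))

    # Stand-in for the database: character ID -> frames with that character.
    kept = [frames for _, frames in groups if len(frames) >= min_size]
    return {f"character_{i}": frames for i, frames in enumerate(kept)}


if __name__ == "__main__":
    sample = [
        (0, [1.0, 0.0]),
        (1, [0.9, 0.1]),
        (2, [0.0, 1.0]),
        (3, [1.0, 0.05]),
        (4, [0.1, 0.95]),
        (5, [-1.0, -1.0]),  # isolated detection, dropped as noise
    ]
    print(index_detections(sample))
```

Tying the parameter to the detection count means a short clip with few detections can still form small groups, while a long file demands more evidence before a group is given a character ID. A production system would more likely use an established clustering algorithm; the greedy loop only keeps the sketch self-contained.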

Head position extrapolation based on a 3D model and image data

    Publication No.: US11386609B2

    Publication date: 2022-07-12

    Application No.: US17138207

    Filing date: 2020-12-30

    Abstract: An approach using 3D algorithms to solve 2D head localization problems is disclosed. A system can extrapolate aspects of one part of an object, e.g., extract characteristics of a person's head, using a 2D input image of another part of the object, e.g., a 2D image of the person's face. The system then selects an appropriate 3D model by the use of facial features detected in an image of a person's face. Using the selected 3D model and the 3D rotation angles provided by a face detector, the system rotates the model and then projects the model to a 2D shape. The system then scales and translates, e.g., transforms, the 2D shape to match the 2D face bounding box. Then, using the transformed 2D shape, the system extracts a bounding box for the extracted portion of an object, e.g., the head of the person depicted in the 2D input image.
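
The rotate, project, fit, and extract sequence in this abstract can be sketched as follows. The details are assumptions for illustration only: the abstract does not specify the model format, the Euler-angle convention, or the projection type, so the sketch uses a hypothetical point-cloud model with a flagged face subset, a pitch-then-yaw-then-roll rotation order, and an orthographic projection. The function names are likewise hypothetical.

```python
import math


def rotate(point, yaw, pitch, roll):
    """Rotate a 3D point by pitch (x axis), then yaw (y axis), then roll
    (z axis). Angles are in radians; the order is an assumption."""
    x, y, z = point
    c, s = math.cos(pitch), math.sin(pitch)
    y, z = y * c - z * s, y * s + z * c
    c, s = math.cos(yaw), math.sin(yaw)
    x, z = x * c + z * s, -x * s + z * c
    c, s = math.cos(roll), math.sin(roll)
    x, y = x * c - y * s, x * s + y * c
    return (x, y, z)


def bounding_box(points_2d):
    """Axis-aligned box (x0, y0, x1, y1) around a list of 2D points."""
    xs = [p[0] for p in points_2d]
    ys = [p[1] for p in points_2d]
    return (min(xs), min(ys), max(xs), max(ys))


def head_box_from_face_box(head_points, face_points, face_box,
                           yaw=0.0, pitch=0.0, roll=0.0):
    """Estimate a 2D head bounding box from a 2D face bounding box.

    head_points and face_points are 3D model points (face_points covers
    the face region only); face_box is (x0, y0, x1, y1) in image pixels.
    """
    def project(points):
        # Orthographic projection: rotate, then drop the z coordinate.
        return [rotate(p, yaw, pitch, roll)[:2] for p in points]

    fx0, fy0, fx1, fy1 = bounding_box(project(face_points))
    if fx1 == fx0 or fy1 == fy0:
        raise ValueError("projected face region is degenerate")

    # Scale and translation that map the projected face box onto face_box.
    bx0, by0, bx1, by1 = face_box
    sx = (bx1 - bx0) / (fx1 - fx0)
    sy = (by1 - by0) / (fy1 - fy0)

    all_points = project(list(head_points) + list(face_points))
    fitted = [(bx0 + (x - fx0) * sx, by0 + (y - fy0) * sy)
              for x, y in all_points]
    return bounding_box(fitted)


if __name__ == "__main__":
    face = [(-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
    rest_of_head = [(-1.5, -2, -1), (1.5, 1.5, -1)]
    print(head_box_from_face_box(rest_of_head, face, (100, 100, 200, 200)))
```

Because the same scale and translation are applied to every projected point, the head box always contains the face box it was fitted to. Selecting the 3D model from detected facial features, which the abstract also mentions, is outside the scope of this sketch.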