-
Publication number: US20240153228A1
Publication date: 2024-05-09
Application number: US17980322
Filing date: 2022-11-03
Applicant: Black Sesame Technologies Inc.
Inventor: HungTing Liu , Bo Li , Shuen Lyu
IPC: G06V10/25 , G06T7/13 , G06V10/44 , G06V10/74 , G06V10/762 , G06V10/771
CPC classification number: G06V10/25 , G06T7/13 , G06V10/44 , G06V10/761 , G06V10/762 , G06V10/771 , G06T2207/20132 , G06T2207/30201 , G06V2201/07
Abstract: Disclosed is a system for automatic cropping of an image of interest from a video sample using smart systems. The image of interest is an image representative of the video sample, which includes desirable characteristics as required by the user, such as a person or object of focus, a specific aspect-ratio, preferred landmarks, information/time-stamps etc. The system for automatic cropping analyzes the video sample and its content to detect at least one image feature. The image feature is then classified based on importance and a potential test cropping area is determined based on the cumulative importance of features detected within each frame. The smart cropping systems and methods disclosed ensure that the most relevant aspects of a video sample are included within the image of interest.
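The abstract's idea of picking a crop area by the cumulative importance of the features it contains can be sketched as follows. This is a minimal illustration, not the patented method: the `Feature` type, the `best_crop` function, the sliding-window search, and the `stride` parameter are all assumptions introduced here, and the importance scores are presumed to come from some upstream detector and classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A detected image feature (hypothetical structure, not from the patent)."""
    x: int             # feature center, in pixels
    y: int
    importance: float  # score assumed to come from an upstream classifier


def best_crop(features, frame_w, frame_h, crop_w, crop_h, stride=8):
    """Return (left, top, score) for the candidate window whose contained
    features have the highest summed importance.

    Candidate windows are scanned on a regular grid; ties keep the first
    window found (top-to-bottom, then left-to-right).
    """
    if crop_w > frame_w or crop_h > frame_h:
        raise ValueError("crop window must fit inside the frame")

    best = (0, 0, float("-inf"))
    for top in range(0, frame_h - crop_h + 1, stride):
        for left in range(0, frame_w - crop_w + 1, stride):
            # Cumulative importance of every feature inside this window.
            score = sum(
                f.importance
                for f in features
                if left <= f.x < left + crop_w and top <= f.y < top + crop_h
            )
            if score > best[2]:
                best = (left, top, score)
    return best
```

For a video sample, one plausible use is to run this per frame and keep the frame and window with the highest score; a fixed `crop_w`/`crop_h` ratio would enforce the requested aspect ratio.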
-
Publication number: US20230281843A1
Publication date: 2023-09-07
Application number: US17688694
Filing date: 2022-03-07
Applicant: Black Sesame Technologies Inc.
Inventor: Tiecheng Wu , Bo Li
CPC classification number: G06T7/50 , G06T7/13 , G06T2207/20081
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model configured to generate a predicted depth image, comprising: receiving data representing training samples that include a plurality of image pairs, each image pair including a target image and a reference image both capturing a particular scene from different orientations; for each of the plurality of image pairs, generating a compressed cost volume for the image pair; providing the compressed cost volume as an input to the machine learning model; generating, using the machine learning model, output data representing a predicted disparity map for the compressed cost volume; generating a total loss using the predicted disparity map for the compressed cost volume, the total loss including a boundary loss, an occlusion loss, and a transfer loss; and updating the plurality of parameters of the machine learning model by minimizing the total loss.
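The abstract states only that the total loss includes a boundary loss, an occlusion loss, and a transfer loss; it does not say how they are combined. One common convention, assumed here purely for illustration, is a weighted sum. The function name `total_loss` and the `weights` parameter are hypothetical.

```python
def total_loss(boundary, occlusion, transfer, weights=(1.0, 1.0, 1.0)):
    """Combine the three loss terms named in the abstract.

    A weighted sum is an assumption; the abstract does not specify the
    combination rule or any weights.
    """
    w_boundary, w_occlusion, w_transfer = weights
    return (
        w_boundary * boundary
        + w_occlusion * occlusion
        + w_transfer * transfer
    )
```

In a training loop, each term would be computed from the predicted disparity map and the resulting scalar minimized with respect to the model parameters.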
-
Publication number: US11636683B2
Publication date: 2023-04-25
Application number: US17474774
Filing date: 2021-09-14
Applicant: Black Sesame Technologies Inc.
Inventor: Fangwen Tu , Bo Li
Abstract: The present invention discloses a system for precise representation of object segmentation with multi-modal input for real-time video applications. The multi-modal segmentation system takes advantage of optical, temporal, and spatial information to enhance segmentation with accurate details for AR, VR, or other entertainment purposes. The system can segment foreground objects, such as humans and salient objects, within a video frame and allows an object of interest to be located for multiple purposes.
-
Publication number: US20230084980A1
Publication date: 2023-03-16
Application number: US17474965
Filing date: 2021-09-14
Applicant: Black Sesame Technologies Inc.
Abstract: The present invention discloses a liveliness detection technique that identifies facial attributes and classifies the face presented in an image as real or deceptive. The system and method include identifying the facial attributes and utilizing a multi-task learning network. The neural network includes segmentation and classification functionalities. The final output is used to obtain pixel-level semantic information and high-level semantic information.
-
Publication number: US12190535B2
Publication date: 2025-01-07
Application number: US17688694
Filing date: 2022-03-07
Applicant: Black Sesame Technologies Inc.
Inventor: Tiecheng Wu , Bo Li
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model configured to generate a predicted depth image, comprising: receiving data representing training samples that include a plurality of image pairs, each image pair including a target image and a reference image both capturing a particular scene from different orientations; for each of the plurality of image pairs, generating a compressed cost volume for the image pair; providing the compressed cost volume as an input to the machine learning model; generating, using the machine learning model, output data representing a predicted disparity map for the compressed cost volume; generating a total loss using the predicted disparity map for the compressed cost volume, the total loss including a boundary loss, an occlusion loss, and a transfer loss; and updating the plurality of parameters of the machine learning model by minimizing the total loss.