-
Publication No.: US20240169725A1
Publication date: 2024-05-23
Application No.: US18240419
Filing date: 2023-08-31
Applicant: Spherex, Inc.
Inventor: Teresa Ann Phillips , Pranav Anand Joshi , Kira Michelle McStay
IPC: G06V20/40 , G06F18/214 , G06F18/241 , G06F18/40 , G06T7/00 , H04N21/466 , H04N21/472
CPC classification number: G06V20/41 , G06F18/214 , G06F18/241 , G06F18/40 , G06T7/0002 , H04N21/4662 , H04N21/472 , G06T2207/10016 , G06V20/44 , G06V2201/10
Abstract: Various embodiments described herein support or provide for annotation of a media asset, such as an audio asset or a video asset, based on one or more events identified within content of the media asset. In particular, some embodiments can determine one or more of the following details with respect to content of a given media asset, which can represent annotations that enable determination of contextual information for the given media asset: events; event classification labels for events; subclassification labels for events; scenes comprising events; attributes of scenes; themes presented by the content; and title-level attributes of the given media asset.
-
Publication No.: US11983923B1
Publication date: 2024-05-14
Application No.: US18063107
Filing date: 2022-12-08
Applicant: NETFLIX, INC.
Inventor: Yadong Wang , Kyle Tacke , Shilpa Jois Rao
IPC: G06V20/40 , G10L25/57 , G10L25/60 , G11B27/031
CPC classification number: G06V20/41 , G06V20/48 , G10L25/57 , G10L25/60 , G11B27/031 , G06V2201/10
Abstract: The disclosed computer-implemented method may include receiving, as input, an audio/video data object; isolating a video stream of a visible potential speaker over a plurality of frames of the audio/video data object; isolating an audio stream over the plurality of frames; providing the isolated video stream and the isolated audio stream to a machine learning model trained with contrastive learning, the contrastive learning using (i) a corpus of video segments of visible speakers with corresponding original audio for positive samples; and (ii) a corpus of video segments of visible speakers with corresponding dubbed audio for negative samples; and evaluating a match between the isolated audio stream and the isolated video stream based at least in part on an output of the machine learning model. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication No.: US20240144708A1
Publication date: 2024-05-02
Application No.: US18404746
Filing date: 2024-01-04
Applicant: DELTA ELECTRONICS, INC. , NATIONAL CHENG KUNG UNIVERSITY
Inventor: Chih-Yang CHEN , Pau-Choo CHUNG CHAN , Sheng-Hao TSENG
IPC: G06V30/142 , A61B5/00 , G06F18/2433 , G06T7/00 , G06T7/11 , G06V10/25 , G06V10/54 , G06V10/72 , G16H50/20
CPC classification number: G06V30/142 , A61B5/0013 , A61B5/0071 , A61B5/0088 , A61B5/7264 , G06F18/2433 , G06T7/0012 , G06T7/11 , G06V10/25 , G06V10/54 , G06V10/72 , G16H50/20 , A61B2576/02 , G06F2218/08 , G06F2218/12 , G06T2207/10064 , G06T2207/10152 , G06T2207/30036 , G06T2207/30088 , G06T2207/30096 , G06V2201/03 , G06V2201/07 , G06V2201/10
Abstract: An examination system is provided. The examination system includes an optical detector and analyzer. The optical detector emits a detection light source toward a target object and detects a respondent light which is induced from the target object in response to the detection light source to generate image data. The image data indicates a detection image. The analyzer receives the image data and determines which region of the target object the detection image belongs to according to the image data. When the analyzer determines that the detection image belongs to a specific region of the target object, the analyzer extracts at least one feature of the image data to serve as a basis for classification of the specific region.
-
Publication No.: US20240144676A1
Publication date: 2024-05-02
Application No.: US17976528
Filing date: 2022-10-28
Applicant: Nielsen Consumer LLC
Inventor: Roberto Arroyo , Sergio Álvarez Pardo , Aitor Aller , Miguel Eduardo Ortiz , Luis Miguel Bergasa
CPC classification number: G06V20/36 , G06F40/40 , G06V10/82 , G06V2201/07 , G06V2201/10
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for providing responses to queries regarding store observation images. An example computer readable medium includes instructions that, when executed, cause a machine to at least obtain first metadata associated with a set of store dictionaries, select ones of the set of store dictionaries for use based on the associated first metadata, obtain second metadata associated with a set of question templates, select ones of the set of question templates for use based on the associated second metadata, generate question-answer pairs using the selected ones of the set of store dictionaries and the selected ones of the set of question templates, train a machine-learning model using the question-answer pairs, and provide query responses using the trained machine-learning model.
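The question-answer generation step in this abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: the record layout (`metadata`, `entries`, `question`, `answer`), the helper names, and the sample data are all hypothetical, and exact-match metadata filtering is an assumed selection rule. Model training is omitted.

```python
def select_by_metadata(items, required):
    """Keep items whose metadata matches every key/value in `required`.

    Exact matching is an assumption; the abstract only says selection is
    "based on" the associated metadata.
    """
    return [
        item for item in items
        if all(item["metadata"].get(key) == value for key, value in required.items())
    ]


def generate_qa_pairs(dictionaries, templates):
    """Fill every selected template with every entry of every selected dictionary."""
    pairs = []
    for dictionary in dictionaries:
        for entry in dictionary["entries"]:
            for template in templates:
                question = template["question"].format(**entry)
                answer = template["answer"].format(**entry)
                pairs.append((question, answer))
    return pairs


# Hypothetical sample data, for illustration only.
store_dictionaries = [
    {"metadata": {"country": "US"}, "entries": [{"product": "cereal", "aisle": "4"}]},
    {"metadata": {"country": "ES"}, "entries": [{"product": "aceite", "aisle": "2"}]},
]
question_templates = [
    {"metadata": {"topic": "location"},
     "question": "Which aisle holds {product}?", "answer": "Aisle {aisle}"},
    {"metadata": {"topic": "price"},
     "question": "How much is {product}?", "answer": "unknown"},
]

selected_dictionaries = select_by_metadata(store_dictionaries, {"country": "US"})
selected_templates = select_by_metadata(question_templates, {"topic": "location"})
qa_pairs = generate_qa_pairs(selected_dictionaries, selected_templates)
```

The resulting `qa_pairs` list is what would be handed to a training routine in the flow the abstract describes.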
-
Publication No.: US11966986B2
Publication date: 2024-04-23
Application No.: US17878778
Filing date: 2022-08-01
Applicant: Meta Platforms, Inc.
Inventor: Shivani Poddar , Seungwhan Moon , Paul Anthony Crook , Rajen Subba
IPC: H04L67/306 , G06F3/01 , G06F9/451 , G06F9/48 , G06F9/54 , G06F16/332 , G06F16/9032 , G06F16/9536 , G06F18/2321 , G06F40/205 , G06F40/242 , G06F40/253 , G06F40/30 , G06F40/35 , G06F40/56 , G06N3/045 , G06N3/047 , G06N3/08 , G06N20/00 , G06Q10/109 , G06Q50/00 , G06V10/20 , G06V10/764 , G06V10/82 , G06V20/00 , G06V20/20 , G06V20/30 , G06V40/16 , G06V40/20 , G10L15/06 , G10L15/08 , G10L15/16 , G10L15/18 , G10L15/22 , G10L15/30 , G10L15/32 , H04L51/18 , H04L51/212 , H04L51/222 , H04L51/224 , H04L51/52 , H04L67/75 , H04N7/14 , G06F3/16 , G06V20/40
CPC classification number: G06Q50/01 , G06F3/011 , G06F3/013 , G06F9/453 , G06F9/485 , G06F9/4881 , G06F9/547 , G06F16/3329 , G06F16/90332 , G06F16/9536 , G06F18/2321 , G06F40/205 , G06F40/242 , G06F40/253 , G06F40/30 , G06F40/35 , G06F40/56 , G06N3/045 , G06N3/047 , G06N3/08 , G06N20/00 , G06Q10/109 , G06V10/255 , G06V10/764 , G06V10/82 , G06V20/00 , G06V20/20 , G06V20/30 , G06V40/16 , G06V40/25 , G10L15/063 , G10L15/08 , G10L15/16 , G10L15/1815 , G10L15/1822 , G10L15/22 , G10L15/30 , G10L15/32 , H04L51/18 , H04L51/212 , H04L51/222 , H04L51/224 , H04L51/52 , H04L67/306 , H04L67/75 , H04N7/147 , G06F3/017 , G06F3/167 , G06V20/41 , G06V40/174 , G06V2201/10 , G10L2015/088 , G10L2015/223 , G10L2015/227
Abstract: In one embodiment, a method includes receiving, at a client system, an audio input, where the audio input comprises a coreference to a target object, accessing visual data from one or more cameras associated with the client system, where the visual data comprises images portraying one or more objects, resolving the coreference to the target object from among the one or more objects, resolving the target object to a specific entity, and providing, at the client system, a response to the audio input, where the response comprises information about the specific entity.
-
Publication No.: US20240112450A1
Publication date: 2024-04-04
Application No.: US18537951
Filing date: 2023-12-13
Applicant: OLYMPUS CORPORATION
Inventor: Koichi SHINTANI , Akira TANI , Osamu NONAKA , Manabu ICHIKAWA , Tomoko GOCHO
IPC: G06V10/774 , G06V10/764
CPC classification number: G06V10/774 , G06V10/764 , G06V2201/03 , G06V2201/10
Abstract: An information processing device of the present invention is capable of collaboration with a learning device to determine whether an image group that has been obtained in time series by the first endoscope is an image group obtained at a first time or at a second time, and to create a first inference model for image feature determination of images for the first endoscope by performing learning with results of having performed annotation on the image group that was obtained at the second time as training data, the information processing device comprising at least one or a plurality of classifying processors that classify image groups constituting training data candidates, within an image group from the first endoscope that has been newly acquired, or an image group from a second endoscope, using the image group that has been obtained at the first time, when the first inference model was created.
-
Publication No.: US20240112217A1
Publication date: 2024-04-04
Application No.: US17937342
Filing date: 2022-09-30
Applicant: Shutterstock, Inc.
Inventor: Emily Teoh , Diarmaid Finnerty , Lauren Sarah Burnham-King , Veronica Darling , Alessandra Sala
IPC: G06Q30/02 , G06F3/0482 , G06V20/30
CPC classification number: G06Q30/0242 , G06F3/0482 , G06V20/30 , G06V2201/10
Abstract: Methods and systems for predicting performance of creative content are disclosed. Exemplary implementations may: receive a collection of images; provide a context to a user; serially cause display of pairs of images on a computer interface; receive user responses indicating which image of each pair is preferred given the context; determine a resonance value for each image based on a number of times the user responses indicate each image is preferred when displayed in a pair of images; determine a confidence score for each image; generate one or more models for predicting image performance based on one or more of the resonance value and the confidence score for each image; receive a plurality of candidate images; determine, using at least one model, a first metric set for each candidate; and cause display of a listing of the candidate images, the listing including the first metric set for each candidate image.
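As a rough illustration of the resonance value mentioned in this abstract, the sketch below counts how often each image is preferred across the pairs in which it was shown. Normalizing the count into a win rate is an assumption made here for comparability; the abstract only says the value is based on the number of times an image is preferred. The function name and tuple layout are hypothetical.

```python
from collections import Counter


def resonance_values(responses):
    """Compute a per-image win rate from pairwise preference responses.

    `responses` is an iterable of (image_a, image_b, preferred) tuples,
    where `preferred` must be one of the two displayed images.
    """
    wins = Counter()
    shown = Counter()
    for image_a, image_b, preferred in responses:
        if preferred not in (image_a, image_b):
            raise ValueError("preferred image was not part of the displayed pair")
        shown[image_a] += 1
        shown[image_b] += 1
        wins[preferred] += 1
    # Share of appearances in which the image was the preferred one.
    return {image: wins[image] / shown[image] for image in shown}


# Hypothetical responses: "a" beats "b" and "c"; "c" beats "b".
example = resonance_values([("a", "b", "a"), ("a", "c", "a"), ("b", "c", "c")])
```

A confidence score, as the abstract also mentions, could plausibly depend on how many times each image was shown, which is why `shown` is tracked separately from `wins`.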
-
Publication No.: US20240096070A1
Publication date: 2024-03-21
Application No.: US17932813
Filing date: 2022-09-16
Applicant: WENEW, Inc.
Inventor: Michael De Leon Figge , Randy Chia-Wei Chung , Joseph Young Jim Kim
IPC: G06V10/774 , G06F16/583 , G06T7/00 , G06T11/00 , G06V10/77 , G06V10/94
CPC classification number: G06V10/7747 , G06F16/583 , G06T7/0002 , G06T11/00 , G06V10/7715 , G06V10/94 , G06T2207/30168 , G06V2201/10
Abstract: A technique is directed to methods and systems for generating and processing digital images associated with non-fungible tokens. The image generation system can download a group of related images from a blockchain network and inspect the images to verify attributes and traits of the images meet a quality threshold. The system can generate and add a rarity item, such as a unique feature, to one or more of the images.
-
Publication No.: US20240096063A1
Publication date: 2024-03-21
Application No.: US18078402
Filing date: 2022-12-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ganesh ANANTHANARAYANAN , Yuanchao SHU , Paramvir BAHL , Tsuwang HSIEH
IPC: G06V10/77
CPC classification number: G06V10/7715 , G06V2201/10
Abstract: Systems and methods are provided for reusing and retraining an image recognition model for video analytics. The image recognition model is used for inferring a frame of video data that is captured at edge devices. The edge devices periodically or under predetermined conditions transmit a captured frame of video data to perform inferencing. The disclosed technology is directed to selecting an image recognition model from a model store for reusing or for retraining. A model selector uses a gating network model to determine ranked candidate models for validation. The validation includes iterations of retraining the image recognition model and stopping the iteration when a rate of improving accuracy by retraining becomes smaller than the previous iteration step. Retraining a model includes generating reference data using a teacher model and retraining the model using the reference data. Integrating reuse and retraining of models enables improvement in accuracy and efficiency.
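The stopping rule in this abstract (halt retraining once the accuracy gain of an iteration is smaller than the gain of the previous iteration) can be sketched as follows. This is one illustrative reading, not the patented procedure: the function name is hypothetical, and the accuracies are passed in as plain numbers rather than produced by an actual retraining step.

```python
def iterations_before_stop(baseline_accuracy, accuracy_per_iteration):
    """Return how many retraining iterations run before the stop rule fires.

    The loop stops at the first iteration whose accuracy gain is smaller
    than the gain of the iteration before it. If that never happens, all
    supplied iterations are consumed.
    """
    previous_accuracy = baseline_accuracy
    previous_gain = None
    iterations = 0
    for accuracy in accuracy_per_iteration:
        iterations += 1
        gain = accuracy - previous_accuracy
        if previous_gain is not None and gain < previous_gain:
            break  # improvement is slowing down; stop retraining here
        previous_gain = gain
        previous_accuracy = accuracy
    return iterations


# Hypothetical accuracies: the third iteration gains less than the second,
# so the rule fires on iteration 3.
stopped_at = iterations_before_stop(0.50, [0.60, 0.72, 0.75, 0.76])
```

In a real pipeline `accuracy_per_iteration` would be a generator that retrains and validates the model lazily, so that no iteration runs after the rule fires.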
-
Publication No.: US20240089552A1
Publication date: 2024-03-14
Application No.: US18514215
Filing date: 2023-11-20
Applicant: Videokawa, Inc.
Inventor: Steven Selfors
IPC: H04N21/485 , G06F16/78 , G06V10/70 , G06V20/40 , H04N21/44 , H04N21/472 , H04N21/488 , H04N21/84 , H04N21/858
CPC classification number: H04N21/4856 , G06F16/7867 , G06V10/768 , G06V20/49 , H04N21/44 , H04N21/47217 , H04N21/4882 , H04N21/84 , H04N21/8586 , G06V2201/10
Abstract: Aspects described herein may provide systems, methods, and devices for facilitating language learning using videos. Subtitles may be displayed in a first, target language or a second, native language during display of the video. On a pause event, both the target language subtitle and the native language subtitle may be displayed simultaneously to facilitate understanding. While paused, a user may select an option to be provided with additional contextual information indicating usage and context associated with one or more words of the target language subtitle. The user may navigate through previous and next subtitles with additional contextual information while the video is paused. Other aspects may allow users to create auto-continuous video loops of definable duration, and may allow users to generate video segments by searching an entire database of subtitle text, and may allow users to create, save, share, and search video loops.