Abstract:
A method for selecting a crop score threshold for enhancing tracking of objects in a scene captured in a video sequence is disclosed. A respective track is obtained for each of two different objects, each track comprising crops of object instances of the respective object in a video sequence, each crop having a crop score and a feature vector. Each track is split into two or more tracklets, thereby forming four or more tracklets. For each candidate crop score threshold in a set of candidate crop score thresholds, a respective difference between each tracklet and each other tracklet is determined based on differences between the feature vectors of those crops of the two tracklets that have a crop score above the candidate crop score threshold. The crop score threshold selected from the set of candidate crop score thresholds is the one resulting in a maximum difference between the differences between tracklets of different tracks and the differences between tracklets of the same track.
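The selection described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation. It assumes the difference between two tracklets is the mean pairwise Euclidean distance between their retained feature vectors, and that the per-threshold differences are aggregated by their mean. The abstract fixes neither choice, and the names `tracklet_distance` and `select_threshold` are hypothetical.

```python
from itertools import combinations
from math import dist


def tracklet_distance(a, b, threshold):
    """Mean pairwise feature distance between crops scoring above threshold.

    Each tracklet is a list of (crop_score, feature_vector) pairs.
    Returns None when either tracklet has no crop above the threshold.
    """
    fa = [f for s, f in a if s > threshold]
    fb = [f for s, f in b if s > threshold]
    if not fa or not fb:
        return None
    return sum(dist(x, y) for x in fa for y in fb) / (len(fa) * len(fb))


def select_threshold(tracklets, candidates):
    """Pick the candidate maximising (different-track) minus (same-track) distance.

    tracklets: list of (track_id, crops), with at least two tracklets per track.
    """
    best, best_gap = None, float("-inf")
    for t in candidates:
        same, different = [], []
        for (id_a, a), (id_b, b) in combinations(tracklets, 2):
            d = tracklet_distance(a, b, t)
            if d is None:
                continue  # assumption: pairs with no usable crops are skipped
            (same if id_a == id_b else different).append(d)
        if not same or not different:
            continue
        gap = sum(different) / len(different) - sum(same) / len(same)
        if gap > best_gap:
            best, best_gap = t, gap
    return best
```

With crops whose low scores coincide with noisy feature vectors, the higher candidate threshold tends to win because it removes the noisy crops from both sides of the comparison.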
Abstract:
A method and an image processing entity for applying a convolutional neural network to an image are disclosed. The image processing entity processes the image using a convolutional kernel to render a feature map, whereby a second feature map size of the feature map is greater than a first feature map size of the feature maps with which a feature kernel was trained. Furthermore, the image processing entity repeatedly applies the feature kernel to the feature map in a stepwise manner, wherein the feature kernel was trained to identify a feature based on feature maps of the first feature map size, and wherein the feature kernel has the first feature map size.
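The stepwise application of the feature kernel can be sketched as a sliding window over the larger feature map. This is a minimal sketch under the assumption that each application is a plain sum of element-wise products. The function name `apply_feature_kernel` and the `stride` parameter are illustrative and do not come from the abstract.

```python
def apply_feature_kernel(feature_map, kernel, stride=1):
    """Slide a trained kernel over a feature map larger than the kernel.

    feature_map and kernel are lists of rows of numbers. The kernel has the
    (smaller) size of the feature maps it was trained on. Returns one
    response per kernel position.
    """
    kh, kw = len(kernel), len(kernel[0])
    fh, fw = len(feature_map), len(feature_map[0])
    responses = []
    for top in range(0, fh - kh + 1, stride):
        row = []
        for left in range(0, fw - kw + 1, stride):
            # Response of the kernel at this position of the feature map.
            row.append(sum(kernel[i][j] * feature_map[top + i][left + j]
                           for i in range(kh) for j in range(kw)))
        responses.append(row)
    return responses
```

For a 3x3 feature map and a 2x2 kernel with stride 1, this yields a 2x2 grid of responses, one per position at which the kernel fits.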
Abstract:
A method and system for determining a position of a camera are disclosed. The method and system include determining and registering geographical coordinates of a mobile device in the mobile device itself, presenting on a display of the mobile device a pattern representing the geographical coordinates of the mobile device, capturing by the camera an image of the display of the mobile device when presenting the pattern, translating in the camera the pattern in the captured image of the display of the mobile device into geographical coordinates, and determining in the camera the position of the camera based on the geographical coordinates translated from the pattern in the captured image.
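The round trip from coordinates to a displayed pattern and back can be illustrated with a toy encoding. This sketch assumes a hypothetical 8x8 grid of black and white cells that carries latitude and longitude as 32-bit fixed-point values in microdegrees. The abstract does not specify the pattern type, and image capture and cell detection are omitted.

```python
def encode_pattern(lat, lon):
    """Encode latitude/longitude (degrees) as an 8x8 grid of 0/1 cells."""
    bits = []
    for value in (lat, lon):
        # Microdegrees as a 32-bit two's-complement integer.
        fixed = round(value * 1_000_000) & 0xFFFFFFFF
        bits.extend((fixed >> shift) & 1 for shift in range(31, -1, -1))
    return [bits[r * 8:(r + 1) * 8] for r in range(8)]


def decode_pattern(grid):
    """Translate a grid produced by encode_pattern back into (lat, lon)."""
    bits = [b for row in grid for b in row]
    coords = []
    for start in (0, 32):
        fixed = 0
        for b in bits[start:start + 32]:
            fixed = (fixed << 1) | b
        if fixed >= 1 << 31:  # restore the sign of negative coordinates
            fixed -= 1 << 32
        coords.append(fixed / 1_000_000)
    return tuple(coords)
```

In this sketch the mobile device would call `encode_pattern` and the camera would call `decode_pattern` on the cells it reads from the captured image. A real system would more likely use an established code with error correction.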
Abstract:
In a method for tracking an object in video-monitoring scenes, multiple feature vectors are extracted (722) and assembled (724) in point clouds, wherein a point cloud may be assembled for each tracklet, i.e. for each separate part of a track. In order to determine whether different tracklets relate to the same object or to different objects, the point clouds of the tracklets are compared (734). Based on the outcome of the comparison, it is deduced whether a first object and a second object, tracked in different tracklets, may be considered to be the same object and, if so, the first object is associated (738) with the second object.
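One way to compare two point clouds of feature vectors is sketched below. It assumes a symmetric mean nearest-neighbour distance and a fixed decision threshold. The abstract states neither, so `cloud_distance`, `same_object` and `max_distance` are illustrative stand-ins only.

```python
from math import dist


def cloud_distance(cloud_a, cloud_b):
    """Symmetric mean nearest-neighbour distance between two point clouds.

    Each cloud is a non-empty list of feature vectors (tuples of floats).
    """
    def one_way(src, dst):
        return sum(min(dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a))


def same_object(cloud_a, cloud_b, max_distance=1.0):
    """Decide whether two tracklets may depict the same object.

    max_distance is a hypothetical threshold that would need tuning.
    """
    return cloud_distance(cloud_a, cloud_b) <= max_distance
```

If `same_object` returns true for the clouds of two tracklets, the objects of those tracklets would be associated with each other.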
Abstract:
A method for finding one or more candidate digital images being likely candidates for depicting a specific object comprising: receiving an object digital image depicting the specific object; determining, using a classification subnet of a convolutional neural network, a class for the specific object depicted in the object digital image; selecting, based on the determined class for the specific object depicted in the object digital image, a feature vector generating subnet from a plurality of feature vector generating subnets; determining, by the selected feature vector generating subnet, a feature vector of the specific object depicted in the object digital image; locating one or more candidate digital images being likely candidates for depicting the specific object depicted in the object digital image by comparing the determined feature vector and feature vectors registered in a database, wherein each registered feature vector is associated with a digital image.
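The flow of this method can be sketched with plain callables standing in for the neural subnets. The classifier, the per-class feature extractors and the database layout of `(feature_vector, image_id)` pairs are assumptions made for illustration, and Euclidean distance is one possible comparison among several.

```python
from math import dist


def find_candidates(image, classify, subnets, database, top_k=3):
    """Return ids of the database images closest to the object in `image`.

    classify: callable mapping an image to a class label (stand-in for the
              classification subnet).
    subnets:  dict mapping class label to a callable that returns a feature
              vector (stand-ins for the feature vector generating subnets).
    database: list of (feature_vector, image_id) pairs.
    """
    object_class = classify(image)          # determine the class
    embed = subnets[object_class]           # select the subnet for that class
    query = embed(image)                    # compute the feature vector
    ranked = sorted(database, key=lambda entry: dist(query, entry[0]))
    return [image_id for _, image_id in ranked[:top_k]]
```

Selecting the extractor by class lets each subnet specialise in the features that discriminate well within one class, for example persons versus vehicles.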
Abstract:
Methods and apparatus, including computer program products, for creating a quality annotated training data set of images for training a quality estimating neural network. A set of images depicting a same object is received. The images in the set of images have varying image quality. A probe image whose quality is to be estimated is selected from the set of images. A gallery of images is selected from the set of images. The gallery of images does not include the probe image. The probe image is compared to each image in the gallery and a match score is generated for each image comparison. Based on the match scores, a quality value is determined for the probe image. The probe image and its associated quality value are added to a quality annotated training data set for the neural network.
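The annotation loop can be sketched as follows. It assumes that the quality value is the mean of the match scores and that every image takes a turn as the probe. The abstract only says the quality value is determined "based on" the match scores, and the `match` callable is a hypothetical stand-in for an image comparison.

```python
def annotate_quality(images, match, probe_index):
    """Return (probe, quality) for one probe image.

    images: list of at least two images of the same object.
    match:  callable(probe, gallery_image) -> match score.
    """
    probe = images[probe_index]
    # The gallery is every image in the set except the probe itself.
    gallery = images[:probe_index] + images[probe_index + 1:]
    scores = [match(probe, g) for g in gallery]
    quality = sum(scores) / len(scores)  # assumption: mean match score
    return probe, quality


def build_training_set(images, match):
    """Use each image as the probe once and collect (image, quality) pairs."""
    return [annotate_quality(images, match, i) for i in range(len(images))]
```

A probe that matches the rest of the set poorly receives a low quality value, which is the label the quality estimating network would then be trained to predict.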
Abstract:
A method and an encoder for video encoding a sequence of frames are provided. The method comprises: receiving a sequence of frames depicting a moving object; predicting a movement of the moving object in the sequence of frames between a first time point and a second time point; defining, on the basis of the predicted movement of the moving object, a region of interest (ROI) in the frames which covers the moving object during its entire predicted movement between the first time point and the second time point; and encoding a first frame, corresponding to the first time point, in the ROI and one or more intermediate frames, corresponding to time points intermediate to the first and the second time point, in at least a subset of the ROI using a common encoding quality pattern defining which encoding quality to use in which portion of the ROI.
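Defining an ROI that covers the whole predicted movement can be sketched under a constant-velocity assumption. The motion model, the `(x, y, width, height)` box format and the name `predict_roi` are illustrative choices that the abstract does not prescribe.

```python
def predict_roi(box, velocity, steps):
    """Bounding region covering an object over its predicted movement.

    box:      (x, y, width, height) of the object at the first time point.
    velocity: (vx, vy) displacement per frame, assumed constant.
    steps:    number of frames between the first and second time point.
    Returns the ROI as (x, y, width, height).
    """
    x, y, w, h = box
    vx, vy = velocity
    end_x, end_y = x + vx * steps, y + vy * steps
    # With straight-line motion, the union of the start and end boxes also
    # contains every intermediate position of the object.
    left, top = min(x, end_x), min(y, end_y)
    right, bottom = max(x, end_x) + w, max(y, end_y) + h
    return (left, top, right - left, bottom - top)
```

Because the ROI stays fixed between the two time points, the same encoding quality pattern can be reused for the first frame and the intermediate frames.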
Abstract:
A method, device and computer program product for training neural networks being adapted to process image data and output a vector of values forming a feature vector for the processed image data. The training is performed using feature vectors from a reference neural network as ground truth. A system of devices for tracking an object using feature vectors outputted by neural networks running on the devices.
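Training against a reference network's outputs can be illustrated with a deliberately tiny model. In this sketch the "network" being trained is a single linear layer fitted by stochastic gradient descent on a squared-error loss, and `reference` is any callable returning the ground-truth feature vector. Both are stand-ins, since the abstract does not specify architectures or loss functions.

```python
def train_student(inputs, reference, dim_out, lr=0.05, epochs=500):
    """Fit a linear layer so its outputs mimic the reference feature vectors.

    inputs:    list of input vectors (tuples of floats).
    reference: callable(input) -> feature vector used as ground truth.
    Returns the weight matrix as a list of dim_out rows.
    """
    dim_in = len(inputs[0])
    weights = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(epochs):
        for x in inputs:
            target = reference(x)  # ground truth from the reference network
            pred = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
            for o in range(dim_out):
                err = pred[o] - target[o]
                for i in range(dim_in):
                    # Gradient step on 0.5 * err**2 for this weight.
                    weights[o][i] -= lr * err * x[i]
    return weights
```

Because every trained network is pulled towards the same reference outputs, feature vectors produced on different devices become comparable, which is what tracking an object across devices relies on.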