Abstract:
A device and method merge a first candidate area relating to a candidate feature in a first image and a second candidate area relating to a candidate feature in a second image. The first and second images have an overlapping region, and at least a portion of the first and second candidate areas are located in the overlapping region. An image overlap size is determined indicating a size of the overlapping region of the first and second images, and a candidate area overlap ratio is determined indicating a ratio of overlap between the first and second candidate areas. A merging threshold is then determined based on the image overlap size, and, on condition that the candidate area overlap ratio is larger than the merging threshold, the first candidate area and the second candidate area are merged, thereby forming a merged candidate area.
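The merging decision described above can be sketched in Python. This is a minimal, illustrative sketch only: the names (`Box`, `overlap_ratio`, `merge_threshold`), the intersection-over-union overlap measure, and the linear dependence of the threshold on the relative image overlap are all assumptions, not the patented method.

```python
# Hypothetical sketch of the candidate-area merging decision.
from dataclasses import dataclass

@dataclass
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

def overlap_ratio(a: Box, b: Box) -> float:
    # Intersection over union of the two candidate areas (assumed measure).
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0

def merge_threshold(image_overlap_size: float, image_size: float) -> float:
    # Assumption: the threshold grows with the relative image overlap.
    return 0.2 + 0.5 * (image_overlap_size / image_size)

def maybe_merge(a: Box, b: Box, image_overlap_size: float, image_size: float):
    # Merge only when the candidate overlap ratio exceeds the threshold.
    if overlap_ratio(a, b) > merge_threshold(image_overlap_size, image_size):
        # Merged candidate area: the bounding box of both candidates.
        return Box(min(a.x1, b.x1), min(a.y1, b.y1),
                   max(a.x2, b.x2), max(a.y2, b.y2))
    return None
```

Two strongly overlapping candidates merge into their common bounding box, while barely touching candidates are kept separate.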
Abstract:
A method may include determining a value indicative of an average intensity of blocks in an image. The blocks include a primary and outer blocks. Each of the outer blocks may have three, five, or more than five pixels. The image may describe an external pixel lying between the primary and at least one of the outer blocks. The external pixel may not contribute to the value indicative of the average intensity of any of the blocks. The image may also describe a common internal pixel lying within two of the blocks. The common pixel may contribute to the value indicative of the average intensity of the two of the blocks. The method may include comparing the value indicative of the average intensity of the primary block to the values of the outer blocks, and quantifying a feature represented by the image by generating a characteristic number.
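The block-comparison feature above can be sketched in the spirit of a local binary pattern. The 2x2 block geometry, the `(y, x)` corner convention, and the bit encoding are illustrative assumptions; the abstract itself does not fix them.

```python
# Illustrative sketch of comparing block average intensities and
# quantifying the result as a characteristic number.

def block_mean(image, y, x, h, w):
    # Average intensity of the h-by-w block with top-left corner (y, x).
    vals = [image[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    return sum(vals) / len(vals)

def characteristic_number(image, primary, outers, h=2, w=2):
    # primary and outers are (y, x) block corners. Each outer block whose
    # mean intensity is at least the primary block's mean sets one bit.
    p = block_mean(image, *primary, h, w)
    bits = 0
    for i, (y, x) in enumerate(outers):
        if block_mean(image, y, x, h, w) >= p:
            bits |= 1 << i
    return bits
```

Note that pixels between blocks simply never appear in any `block_mean` call, while a pixel inside two overlapping blocks contributes to both means, matching the external/common pixel cases in the abstract.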
Abstract:
A method and encoder for video encoding a sequence of frames are provided. The method comprises: receiving a sequence of frames depicting a moving object; predicting a movement of the moving object in the sequence of frames between a first time point and a second time point; defining, on the basis of the predicted movement of the moving object, a region of interest (ROI) in the frames which covers the moving object during its entire predicted movement between the first time point and the second time point; and encoding a first frame, corresponding to the first time point, in the ROI and one or more intermediate frames, corresponding to time points intermediate between the first and the second time point, in at least a subset of the ROI using a common encoding quality pattern defining which encoding quality to use in which portion of the ROI.
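The ROI-definition step can be sketched as follows, assuming a simple linear motion model between the two time points; the function names and the constant-velocity prediction are assumptions for illustration only.

```python
# Minimal sketch: predict the object's path, then take the ROI as the
# bounding box covering the entire predicted movement.

def predict_positions(box, velocity, n_frames):
    # box = (x1, y1, x2, y2); velocity = (dx, dy) per frame (assumed model).
    x1, y1, x2, y2 = box
    dx, dy = velocity
    return [(x1 + dx * t, y1 + dy * t, x2 + dx * t, y2 + dy * t)
            for t in range(n_frames + 1)]

def roi_covering(boxes):
    # ROI covering the object over its entire predicted path.
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))
```

Because the first and intermediate frames share this one ROI, a single common encoding quality pattern can be applied across them.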
Abstract:
A method of monitoring a scene by a camera (7) comprises marking a part (14) of the scene with light having a predefined spectral content and a spatial verification pattern. An analysis image is captured of the scene by a sensor sensitive to the predefined spectral content. The analysis image is segmented based on the predefined spectral content, to find a candidate image region. A spatial pattern is detected in the candidate image region, and a characteristic of the detected spatial pattern is compared to a corresponding characteristic of the spatial verification pattern. If the characteristics match, the candidate image region is identified as a verified image region corresponding to the marked part (14) of the scene. Image data representing the scene is obtained, and image data corresponding to the verified image region is processed in a first manner, and remaining image data is processed in a second manner.
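The pattern-verification step can be illustrated with one assumed characteristic: the spacing of stripes along a scan line of the candidate region. Both the choice of characteristic and the tolerance are hypothetical; the abstract only requires that some characteristic of the detected pattern match the verification pattern.

```python
# Hedged sketch of verifying a candidate image region against the
# spatial verification pattern via an assumed stripe-period characteristic.

def stripe_period(mask_row):
    # mask_row: binary samples along one line of the candidate region.
    # Average distance between successive rising edges.
    edges = [i for i in range(1, len(mask_row))
             if mask_row[i] and not mask_row[i - 1]]
    if len(edges) < 2:
        return None
    gaps = [b - a for a, b in zip(edges, edges[1:])]
    return sum(gaps) / len(gaps)

def is_verified(detected_row, reference_period, tol=0.2):
    # Match when the detected period is within tol of the reference.
    period = stripe_period(detected_row)
    return (period is not None
            and abs(period - reference_period) <= tol * reference_period)
```

A region whose detected period matches the projected pattern is accepted as verified; otherwise it is rejected, so its image data falls back to the second processing manner.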
Abstract:
A method for updating a coordinate of an annotated point in a digital image due to camera movement is performed by an image processing device, which obtains a current digital image of a scene. The current digital image has been captured by a camera subsequent to movement of the camera relative to the scene. The current digital image is associated with at least one annotated point. Each at least one annotated point has a respective coordinate in the current digital image. The method comprises identifying an amount of the movement by comparing position indicative information in the current digital image to position indicative information in a previous digital image of the scene. The previous digital image has been captured prior to movement of the camera. The method comprises updating the coordinate of each at least one annotated point in accordance with the identified amount of movement and a camera homography.
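The coordinate update via a homography can be sketched as below. The pure-Python 3x3 matrix form, and the assumption that `H` maps previous-image coordinates to current-image coordinates, are illustrative.

```python
# Sketch of updating annotated points with a camera homography H (3x3).

def apply_homography(H, point):
    # Homogeneous coordinates: [x', y', w'] = H @ [x, y, 1], then dehomogenize.
    x, y = point
    xp = H[0][0] * x + H[0][1] * y + H[0][2]
    yp = H[1][0] * x + H[1][1] * y + H[1][2]
    wp = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xp / wp, yp / wp)

def update_annotations(H, points):
    # Update every annotated point's coordinate after the camera movement.
    return [apply_homography(H, p) for p in points]
```

For a pure translation the homography reduces to an offset; a full homography also covers rotation and perspective changes of the camera.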
Abstract:
A method for encoding training data for training of a neural network comprises: obtaining training data including multiple datasets, each dataset comprising images annotated with at least one respective object class; forming, for each dataset, an individual background class associated with the object class; encoding the images of the datasets to be associated with their respective individual background class; encoding image patches belonging to annotated object classes to be associated with their respective object class; encoding each of the datasets to include an ignore attribute (“ignore”) for object classes that are annotated only in the other datasets and for background classes formed for the other datasets of the multiple datasets, the ignore attribute indicating that the assigned object classes and background classes do not contribute to adapting the neural network in training using the respective dataset; and providing the encoded training data for training of a neural network.
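The encoding scheme above can be illustrated with a small sketch. The dictionary layout, the `background_<name>` naming, and the string-based ignore list are assumptions chosen for readability, not the patented representation.

```python
# Illustrative encoding of per-dataset background classes and the
# ignore attribute across multiple datasets.

def encode_datasets(datasets):
    # datasets: mapping dataset name -> set of object classes annotated in it.
    all_classes = set().union(*datasets.values())
    encoded = {}
    for name, own_classes in datasets.items():
        other = {d for d in datasets if d != name}
        encoded[name] = {
            "classes": sorted(own_classes),
            "background": f"background_{name}",  # individual background class
            # Classes annotated only in other datasets, plus the other
            # datasets' background classes, are marked "ignore" so they
            # do not contribute when training on this dataset.
            "ignore": sorted((all_classes - own_classes)
                             | {f"background_{d}" for d in other}),
        }
    return encoded
```

During training, a loss function would then mask out any class carrying the ignore attribute for the dataset the current batch was drawn from.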
Abstract:
Methods and apparatus, including computer program products, implementing and using techniques for classifying an object occurring in a sequence of images. The object is tracked through the sequence of images. A set of temporally distributed image crops including the object is generated from the sequence of images. The set of image crops is fed to an artificial neural network trained for classifying an object. The artificial neural network determines a classification result for each image crop. A quality measure of each classification result is determined based on one or more of: a confidence measure of a classification vector output from the artificial neural network, and a resolution of the image crop. The classification result for each image crop is weighted by its quality measure, and an object class for the object is determined by combining the weighted output from the artificial neural network for the set of images.
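The quality-weighted combination can be sketched as a weighted vote over the per-crop results. Combining confidence and resolution by multiplication is an assumed quality measure; the abstract leaves the exact combination open.

```python
# Sketch of combining per-crop classification results weighted by quality.

def classify_track(results):
    # results: list of (class_scores, confidence, resolution) per image crop,
    # where class_scores maps class name -> network score for that crop.
    totals = {}
    for scores, confidence, resolution in results:
        quality = confidence * resolution  # assumed quality measure
        for cls, score in scores.items():
            totals[cls] = totals.get(cls, 0.0) + quality * score
    # Object class with the highest quality-weighted total wins.
    return max(totals, key=totals.get)
```

A high-confidence, high-resolution crop can thus outvote several low-quality crops that disagree with it.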
Abstract:
Methods and apparatus, including computer program products, implementing and using techniques for configuring an artificial neural network to a particular surveillance situation. A number of object classes characteristic of the surveillance situation are selected. The object classes form a subset of the total number of object classes for which the artificial neural network is trained. A database is accessed that includes activation frequency values for the neurons within the artificial neural network. The activation frequency values are a function of the object class. Those neurons having activation frequency values lower than a threshold value for the subset of selected object classes are removed from the artificial neural network.
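The pruning criterion can be sketched directly. The flat neuron-id and per-class frequency representation is an illustrative assumption about how the database might be queried.

```python
# Minimal sketch: a neuron is removed when its activation frequency is
# below the threshold for every selected object class.

def neurons_to_remove(freq, selected_classes, threshold):
    # freq: mapping neuron id -> {object class -> activation frequency}.
    return [n for n, per_class in freq.items()
            if all(per_class.get(c, 0.0) < threshold for c in selected_classes)]
```

Neurons that fire often for at least one selected class are kept, so the pruned network stays specialized to the surveillance situation.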