-
11.
Publication No.: US12108050B2
Publication date: 2024-10-01
Application No.: US17760017
Filing date: 2021-02-12
Applicant: Nokia Technologies Oy
Inventor: Jani Lainema, Francesco Cricri, Emre Baris Aksu, Alireza Zare, Miska Matias Hannuksela
IPC: H04N19/149, G06N3/045, H04N19/159, H04N19/176
CPC classification number: H04N19/149, G06N3/045, H04N19/159, H04N19/176
Abstract: The embodiments relate to a method for encoding and decoding, and apparatuses for the same. The method for encoding comprises receiving a block of a video frame for encoding (1510); making a decision on whether or not a learning-based model is to be applied as a processing step for encoding the block (1520); applying the learning-based model for said input block according to the decision, where the learning-based model has been selectively fine-tuned according to information relating to activation of the learning-based model of previously-decoded blocks (1530); encoding a signal corresponding to the decision on usage of the learning-based model into a bitstream (1540); and encoding the block into a bitstream with information on whether the block is to be used for fine-tuning (1550).
-
Publication No.: US11457244B2
Publication date: 2022-09-27
Application No.: US17043925
Filing date: 2019-03-29
Applicant: Nokia Technologies Oy
Inventor: Miska Hannuksela, Jani Lainema, Francesco Cricri
IPC: H04N7/12, H04N19/85, H04N19/124, H04N19/172, H04N19/176, G06N3/08
Abstract: A method comprising: obtaining a block of a picture or a picture in an encoder; determining if the block/picture is used for on-line learning; if affirmative, encoding the block/picture; reconstructing a coarse version of the block/picture or the respective prediction error block/picture; enhancing the coarse version using a neural net; fine-tuning the neural net with a training signal based on the coarse version; determining if the block/picture is enhanced using the neural net; and if affirmative, encoding the block/picture with enhancing using the neural net.
-
Publication No.: US11228767B2
Publication date: 2022-01-18
Application No.: US16771115
Filing date: 2018-12-03
Applicant: Nokia Technologies Oy
Inventor: Miska Hannuksela, Mikko Honkala, Jani Lainema, Francesco Cricri, Emre Aksu
IPC: H04N19/149, H04N19/176, H04N19/436, H04N19/65, G06N3/08
Abstract: A method comprising: deriving a first prediction block (608) at least partly based on an output of a neural net (602) using a first set of parameters; deriving a first encoded prediction error block (614-620) through encoding a difference of the first prediction block and a first input block; encoding (620) the first encoded prediction error block into a bitstream; deriving a first reconstructed prediction error block (624) from the first encoded prediction error block; deriving a training signal (628) from one or both of the first encoded prediction error block and/or the first reconstructed prediction error block (624); retraining (630) the neural net (602) with the training signal (628) to obtain a second set of parameters for the neural net (602); deriving a second prediction block (608) at least partly based on an output of the neural net using the second set of parameters; deriving a second encoded prediction error block (614-620) through encoding a difference of the second prediction block and a second input block; and encoding (620) the second encoded prediction error block into a bitstream. The invention relates to image or video encoding or decoding, especially by online training a neural network (602) that is in the prediction loop.
-
Publication No.: US20210314573A1
Publication date: 2021-10-07
Application No.: US17218967
Filing date: 2021-03-31
Applicant: Nokia Technologies Oy
Inventor: Honglei Zhang, Hamed Rezazadegan Tavakoli, Francesco Cricri, Miska Matias Hannuksela, Emre Aksu, Nam Le
IPC: H04N19/146, H04N19/85, H04N19/436, H04N19/103
Abstract: An apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; decode encoded residual features to generate decoded residual features; and generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data.
-
Publication No.: US20210218997A1
Publication date: 2021-07-15
Application No.: US17137609
Filing date: 2020-12-30
Applicant: Nokia Technologies Oy
Inventor: Hamed Rezazadegan Tavakoli, Francesco Cricri, Miska Matias Hannuksela, Emre Baris Aksu, Honglei Zhang, Nam Le
IPC: H04N19/61, H04N19/124, H04N19/134, H04N19/176, H04N19/192
Abstract: Data may be encoded to minimize distortion after decoding, but the quality required for presentation of the decoded data to a machine and the quality required for presentation to a human may be different. To accommodate different quality requirements, video data may be encoded to produce a first set of encoded data and a second set of encoded data, where the first set may be decoded for use by one of a machine consumer or a human consumer, and a combination of the first set and the second set may be decoded for use by the other of a machine consumer or a human consumer. The first and second set may be produced with a neural encoder and a neural decoder, and/or may be produced with the use of prediction and transform neural network modules. A human-targeted structure and a machine-targeted structure may produce the sets of encoded data.
-
Publication No.: US11062210B2
Publication date: 2021-07-13
Application No.: US16589620
Filing date: 2019-10-01
Applicant: Nokia Technologies Oy
Inventor: Caglar Aytekin, Francesco Cricri, Xingyang Ni
Abstract: A method, apparatus and computer program product provide an automated neural network training mechanism. The method, apparatus and computer program product receive a decoded noisy image and a set of input parameters for a neural network configured to optimize the decoded noisy image. A denoised image is generated based on the decoded noisy image and the set of input parameters. A denoised noisy error is computed representing an error between the denoised image and the decoded noisy image. The neural network is trained using the denoised noisy error and the set of input parameters and a ground truth noisy error value is received representing an error between the original image and the encoded image. The ground truth noisy error value is compared with the denoised noisy error to determine whether a difference between the ground truth noisy error value and the denoised noisy error is within a pre-determined threshold.
-
Publication No.: US20210195358A1
Publication date: 2021-06-24
Application No.: US16077856
Filing date: 2017-02-15
Applicant: Nokia Technologies Oy
Inventor: Francesco Cricri, Arto Lehtiniemi, Antti Eronen
Abstract: A method comprising: remotely sensing a real acoustic environment, in which multiple audio signals are captured; and enabling automatic control of mixing of the multiple captured audio signals based on the remote sensing of the real acoustic environment in which the multiple audio signals were captured.
-
Publication No.: US10831443B2
Publication date: 2020-11-10
Application No.: US16326306
Filing date: 2017-08-22
Applicant: Nokia Technologies Oy
Inventor: Francesco Cricri, Arto Lehtiniemi, Antti Eronen, Jussi Leppänen
Abstract: A method, apparatus and computer program code are provided. The method comprises: causing display of a virtual object at a first position in virtual space, the virtual object having a visual position and an aural position at the first position; processing positional audio data based on the aural position of the virtual object being at the first position; causing positional audio to be output to a user based on the processed positional audio data; changing the aural position of the virtual object from the first position to a second position in the virtual space, while maintaining the visual position of the virtual object at the first position; further processing positional audio data based on the aural position of the virtual object being at the second position; and causing positional audio to be output to the user based on the further processed positional audio data, while maintaining the visual position of the virtual object at the first position.
-
Publication No.: US09946957B2
Publication date: 2018-04-17
Application No.: US15185550
Filing date: 2016-06-17
Applicant: Nokia Technologies Oy
Inventor: Francesco Cricri
CPC classification number: G06K9/6218, G06K9/3275, G06K9/4604, G06K9/4676, G06K9/62, G06K9/6255, G06K9/6267, G06K9/629
Abstract: Examples of the present disclosure relate to a method, apparatus, computer program and system for image analysis. According to certain examples, there is provided a method comprising causing, at least in part, actions that result in: receiving orientation information of an image capturing device; receiving one or more features detected from an image captured by the image capturing device; and selecting a clustering model for clustering the features, wherein the clustering model is selected, at least in part, in dependence upon the orientation information.
-
Publication No.: US20170263032A1
Publication date: 2017-09-14
Application No.: US15452154
Filing date: 2017-03-07
Applicant: Nokia Technologies Oy
Inventor: Francesco Cricri, Jukka Saarinen
Abstract: A method, apparatus, and computer product for: determining that the location of a user satisfies at least one spatial boundary condition; and in response to said determination, causing the presentation of an avatar to the user, wherein the presentation of the avatar comprises presenting an instruction given by the avatar to the user.