Abstract:
Disclosed herein are a video decoding method and apparatus and a video encoding method and apparatus. In quantization and dequantization, multiple quantization methods and multiple dequantization methods may be used. The multiple quantization methods include a variable-rate step quantization method and a fixed-rate step quantization method. The variable-rate step quantization method may be a quantization method in which the increment in the quantization step for each increase of 1 in the value of the quantization parameter is not fixed. The fixed-rate step quantization method may be a quantization method in which the increment in the quantization step for each increase of 1 in the value of the quantization parameter is fixed.
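As an illustration of the distinction described above, the following Python sketch contrasts a variable-rate step mapping, where the step-size increment per unit increase in the quantization parameter (QP) is not constant, with a fixed-rate step mapping, where it is. The specific QP-to-step formulas are assumptions for illustration, not the mappings defined in the disclosure.

# Illustrative sketch (not the disclosure's exact mappings): contrast a
# variable-rate step quantization, where the step-size increment per +1 QP
# is not constant, with a fixed-rate step quantization, where it is.

def variable_rate_step(qp: int) -> float:
    """Exponential QP-to-step mapping (HEVC-like); the increment per +1 QP grows."""
    return 2.0 ** ((qp - 4) / 6.0)

def fixed_rate_step(qp: int, base: float = 0.5, increment: float = 0.25) -> float:
    """Linear QP-to-step mapping; each +1 QP adds the same fixed increment."""
    return base + increment * qp

def quantize(coeff: float, step: float) -> int:
    return round(coeff / step)

def dequantize(level: int, step: float) -> float:
    return level * step

if __name__ == "__main__":
    for qp in (10, 11, 12):
        print(qp,
              round(variable_rate_step(qp), 4),   # increment between rows is not fixed
              round(fixed_rate_step(qp), 4))      # increment between rows is fixed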
Abstract:
Methods for Joint Photographic Experts Group (JPEG) 2000 encoding and decoding based on a graphics processing unit (GPU) are provided. The method for JPEG 2000 encoding based on a GPU includes receiving input image data from a central processing unit (CPU), encoding the image data on the GPU, and transferring the encoded image data back to the CPU.
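The following Python sketch illustrates the CPU-to-GPU-to-CPU flow described above. CuPy is assumed as the GPU framework (the abstract names none), and a toy Haar-style split stands in for the actual JPEG 2000 stages (wavelet transform, quantization, EBCOT coding), which are omitted.

# A minimal sketch of the CPU -> GPU -> CPU flow; CuPy and the toy transform
# are assumptions, not the method's actual codec stages.
import numpy as np
import cupy as cp

def encode_on_gpu(host_image: np.ndarray) -> np.ndarray:
    # 1. Receive input image data from the CPU (host -> device copy).
    img = cp.asarray(host_image, dtype=cp.float32)

    # 2. "Encode" on the GPU: a toy horizontal Haar split into low/high bands.
    even, odd = img[:, 0::2], img[:, 1::2]
    low, high = (even + odd) / 2.0, (even - odd) / 2.0
    coded = cp.concatenate([low, high], axis=1)

    # 3. Transfer the encoded data back to the CPU (device -> host copy).
    return cp.asnumpy(coded)

if __name__ == "__main__":
    image = np.random.randint(0, 256, size=(64, 64)).astype(np.float32)
    print(encode_on_gpu(image).shape)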
Abstract:
Disclosed herein are a method and apparatus for compressing learning parameters for training of a deep-learning model and transmitting the compressed parameters in a distributed processing environment. Multiple electronic devices in the distributed processing environment perform training of a neural network. Through this training, parameters are updated. An electronic device may share its updated parameters with the other electronic devices. In order to share a parameter efficiently, the residual of the parameter is provided to the other electronic devices. When the residual of the parameter is provided, the other electronic devices update the parameter using the residual.
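A minimal Python sketch of the residual-sharing idea, under the assumption that the residual is simply the difference between a device's updated parameter and the last commonly shared value; the actual compression of the residual is omitted.

import numpy as np

def make_residual(updated: np.ndarray, shared_base: np.ndarray) -> np.ndarray:
    """Residual = locally updated parameter minus the last commonly shared value."""
    return updated - shared_base

def apply_residual(local_param: np.ndarray, residual: np.ndarray) -> np.ndarray:
    """A receiving device reconstructs the update by adding the residual."""
    return local_param + residual

if __name__ == "__main__":
    shared = np.zeros(4)                                   # value all devices hold
    updated = shared + np.array([0.1, -0.2, 0.0, 0.05])    # after local training
    residual = make_residual(updated, shared)              # small values, cheap to send
    print(apply_residual(shared, residual))                # other devices recover the update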
Abstract:
Disclosed herein are an apparatus and method for machine learning based on monotonically increasing quantization resolution. The method, in which a quantization coefficient is defined as a monotonically increasing function of time, includes initially setting the monotonically increasing function of time, performing machine learning based on a quantized learning equation using the quantization coefficient defined by the monotonically increasing function of time, determining whether the quantization coefficient satisfies a predetermined condition after increasing the time, newly setting the monotonically increasing function of time when the quantization coefficient satisfies the predetermined condition, and updating the quantization coefficient using the newly set monotonically increasing function of time. Here, performing the machine learning, determining whether the quantization coefficient satisfies the predetermined condition, newly setting the monotonically increasing function of time, and updating the quantization coefficient may be repeatedly performed.
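The sketch below illustrates the loop described above in Python, with assumed forms for the monotonically increasing function, the quantized learning equation, and the predetermined condition; it is not the disclosure's exact formulation.

import numpy as np

def make_q_func(anchor: float, slope: float):
    """Monotonically increasing function of time t (assumed linear form)."""
    return lambda t: anchor + slope * t

anchor, slope = 4.0, 4.0
q_func, t = make_q_func(anchor, slope), 0

target = np.array([0.33, -1.27, 2.05])          # toy regression target
w = np.zeros_like(target)
lr = 0.1

for step in range(300):
    q = q_func(t)                               # quantization coefficient at time t
    wq = np.round(w * q) / q                    # quantized learning equation (assumed)
    w -= lr * 2.0 * (wq - target)               # gradient step on ||wq - target||^2
    t += 1                                      # increase the time
    if q_func(t) >= 2.0 * anchor:               # predetermined condition (assumed)
        anchor, slope = q_func(t), slope * 0.5  # newly set the increasing function
        q_func, t = make_q_func(anchor, slope), 0   # update the quantization coefficient

print(np.round(w, 3), round(q_func(t), 1))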
Abstract:
Disclosed herein are a method and apparatus for deriving motion prediction information and performing encoding and/or decoding on a video using the derived motion prediction information. Each of an encoding apparatus and a decoding apparatus generates a list for inter prediction of a target block. In the generation of the list, whether the motion information of a candidate block is to be added to the list is determined based on information about the target block and the motion information. When the motion information passes a motion prediction boundary check, the motion information is added to the list. By means of the motion prediction boundary check, motion information that is available for prediction of the target block is selectively added to the list.
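The Python sketch below illustrates list construction with such a check, under the assumption that the boundary check requires the region referenced by a candidate's motion vector to lie inside the picture; the disclosure's actual criterion may differ.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class MotionInfo:
    mv: Tuple[int, int]      # motion vector (dx, dy) in pixels
    ref_idx: int             # reference picture index

def passes_boundary_check(target_xy: Tuple[int, int], block_size: Tuple[int, int],
                          mi: MotionInfo, pic_size: Tuple[int, int]) -> bool:
    """Assumed check: the region the MV points to must lie inside the picture."""
    x, y = target_xy
    w, h = block_size
    px, py = x + mi.mv[0], y + mi.mv[1]
    return 0 <= px and 0 <= py and px + w <= pic_size[0] and py + h <= pic_size[1]

def build_candidate_list(target_xy, block_size, candidates: List[MotionInfo],
                         pic_size, max_len: int = 5) -> List[MotionInfo]:
    merge_list: List[MotionInfo] = []
    for mi in candidates:
        if len(merge_list) == max_len:
            break
        if passes_boundary_check(target_xy, block_size, mi, pic_size):
            merge_list.append(mi)      # only usable motion information is kept
    return merge_list

if __name__ == "__main__":
    cands = [MotionInfo((-8, 0), 0), MotionInfo((-200, 0), 0), MotionInfo((4, 4), 1)]
    print(build_candidate_list((64, 64), (16, 16), cands, (176, 144)))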
Abstract:
Disclosed herein are a video decoding method and apparatus and a video encoding method and apparatus. A transformed block is generated by performing a first transformation that uses a prediction block for a target block. A reconstructed block for the target block is generated by performing a second transformation that uses the transformed block. The prediction block may be a block present in a reference image, or a reconstructed block present in a target image. The first transformation and the second transformation may be respectively performed by neural networks. Since each transformation is automatically performed by the corresponding neural network, information required for a transformation may be excluded from a bitstream.
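A minimal PyTorch sketch of the two-stage structure, with assumed (illustrative) network architectures, since the abstract does not specify them: the first network transforms the prediction block, and the second produces the reconstructed block, so no transform side-information needs to be parsed from the bitstream.

import torch
import torch.nn as nn

class FirstTransform(nn.Module):
    """Maps the prediction block to a transformed block (assumed architecture)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1))
    def forward(self, pred_block):
        return self.net(pred_block)

class SecondTransform(nn.Module):
    """Maps the transformed block to the reconstructed block (assumed architecture)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1))
    def forward(self, transformed_block):
        return self.net(transformed_block)

if __name__ == "__main__":
    pred = torch.rand(1, 1, 8, 8)            # prediction block (from a reference
                                             # image or the reconstructed target image)
    transformed = FirstTransform()(pred)     # first transformation
    recon = SecondTransform()(transformed)   # second transformation
    print(recon.shape)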
Abstract:
Disclosed herein are a method and apparatus for video decoding and a method and apparatus for video encoding. A prediction block for a target block is generated by predicting the target block using a prediction network, and a reconstructed block for the target block is generated based on the prediction block and a reconstructed residual block. The prediction network includes an intra-prediction network and an inter-prediction network and uses a spatial reference block and/or a temporal reference block when it performs prediction. For learning in the prediction network, a loss function is defined, and learning in the prediction network is performed based on the loss function.
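The following PyTorch sketch shows a prediction network with intra- and inter-prediction branches that consume a spatial and a temporal reference block, trained with an explicit loss; the branch architectures and the mean-squared-error loss are assumptions for illustration, not the disclosure's definitions.

import torch
import torch.nn as nn

class PredictionNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.intra = nn.Conv2d(1, 8, 3, padding=1)   # uses the spatial reference block
        self.inter = nn.Conv2d(1, 8, 3, padding=1)   # uses the temporal reference block
        self.fuse = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, spatial_ref, temporal_ref):
        feats = torch.cat([self.intra(spatial_ref), self.inter(temporal_ref)], dim=1)
        return self.fuse(feats)                      # prediction block

net = PredictionNetwork()
loss_fn = nn.MSELoss()                               # assumed loss function
optim = torch.optim.Adam(net.parameters(), lr=1e-3)

spatial_ref = torch.rand(1, 1, 16, 16)
temporal_ref = torch.rand(1, 1, 16, 16)
target_block = torch.rand(1, 1, 16, 16)

pred_block = net(spatial_ref, temporal_ref)
loss = loss_fn(pred_block, target_block)             # learning is based on the loss
loss.backward()
optim.step()

residual_block = torch.rand(1, 1, 16, 16)            # reconstructed residual (given)
recon_block = pred_block.detach() + residual_block   # reconstructed block
print(recon_block.shape)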
Abstract:
Disclosed herein are an inter-prediction method and apparatus using a reference frame generated based on deep learning. In the inter-prediction method and apparatus, a reference frame is selected, and a virtual reference frame is generated based on the selected reference frame. A reference picture list is configured to include the generated virtual reference frame, and inter prediction for a target block is performed based on the virtual reference frame. The virtual reference frame may be generated based on a deep-learning network architecture, and may be generated based on video interpolation and/or video extrapolation that use the selected reference frame.
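A minimal Python sketch of the flow: a virtual reference frame is generated from selected reference frames, added to the reference picture list, and used for inter prediction. A simple frame average stands in for the deep-learning interpolation/extrapolation network, whose architecture the abstract does not specify.

import numpy as np

def generate_virtual_frame(ref_a: np.ndarray, ref_b: np.ndarray) -> np.ndarray:
    """Stand-in for a deep video-interpolation network between two references."""
    return (ref_a + ref_b) / 2.0

ref0 = np.random.randint(0, 256, (64, 64)).astype(np.float32)   # selected reference
ref1 = np.random.randint(0, 256, (64, 64)).astype(np.float32)   # selected reference
virtual = generate_virtual_frame(ref0, ref1)

reference_picture_list = [ref0, ref1, virtual]    # list includes the virtual frame

# Inter prediction for a target block: copy the co-located block from the
# virtual reference frame (zero motion assumed for brevity).
x, y, w, h = 16, 16, 8, 8
prediction_block = reference_picture_list[-1][y:y + h, x:x + w]
print(prediction_block.shape)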
Abstract:
Disclosed herein are an apparatus and method for performing rate-distortion optimization based on cost. The encoding apparatus selects, from among multiple modes, an encoding mode to be used to encode a target block and performs computation for rate-distortion optimization in the selected encoding mode. The encoding apparatus calculates a cost for at least one of the multiple modes in relation to the encoding of the target block and selects the encoding mode from among the multiple modes based on the cost.
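The sketch below illustrates cost-based mode selection in Python using the conventional Lagrangian rate-distortion cost J = D + λ·R, which is assumed here; the per-mode distortion and rate values are placeholders.

def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R (conventional form)."""
    return distortion + lam * rate_bits

def select_mode(mode_stats: dict, lam: float) -> str:
    """Return the mode with the minimum rate-distortion cost."""
    return min(mode_stats, key=lambda m: rd_cost(*mode_stats[m], lam))

if __name__ == "__main__":
    # mode -> (distortion, rate in bits) for the target block (placeholder values)
    modes = {"intra_dc": (120.0, 40.0), "intra_angular": (90.0, 55.0),
             "inter_merge": (75.0, 70.0)}
    print(select_mode(modes, lam=0.85))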
Abstract:
Disclosed herein are a federated learning method and apparatus. The federated learning method includes receiving a feature vector extracted on the client side and label data corresponding to the feature vector, outputting a feature vector in which phase information is preserved by applying the received feature vector as input to a Self-Organizing Feature Map (SOFM), and training a neural network model by applying both the phase-preserving feature vector and the label data as input to the neural network model.
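A minimal Python sketch of the pipeline, with assumed details (SOFM grid size, the form of the phase-preserving output, and the classifier) that the abstract does not specify: a client feature vector is mapped through a small Self-Organizing Feature Map, and the mapped output together with the label trains a neural network model.

import numpy as np
import torch
import torch.nn as nn

class TinySOFM:
    """Minimal Self-Organizing Feature Map: a grid of units with weight vectors."""
    def __init__(self, grid=(4, 4), dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(grid[0] * grid[1], dim))

    def transform(self, x: np.ndarray) -> np.ndarray:
        # One activation per grid unit (negative distance); the grid's
        # neighborhood structure preserves the input topology ("phase").
        return -np.linalg.norm(self.w - x, axis=1)

sofm = TinySOFM(grid=(4, 4), dim=8)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
optim = torch.optim.SGD(model.parameters(), lr=0.1)

feature_vec = np.random.rand(8).astype(np.float32)   # received from a client
label = torch.tensor([1])                            # label data for that vector

mapped = torch.tensor(sofm.transform(feature_vec), dtype=torch.float32).unsqueeze(0)
loss = loss_fn(model(mapped), label)                 # train on SOFM output + label
loss.backward()
optim.step()
print(float(loss))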