Abstract:
Methods for processing 360-degree virtual reality images are disclosed. According to one method, coding flags for the target block are skipped for inactive blocks at an encoder side, or pixels for the target block are derived based on information identifying the target block as an inactive block at a decoder side. According to another method, when a target block is partially filled with inactive pixels, the best predictor is selected using rate-distortion optimization, where the distortion associated with the rate-distortion optimization is measured by excluding the inactive pixels of the target block. According to another method, the inactive pixels of a residual block are padded with values chosen to achieve the best rate-distortion performance. According to another method, the active pixels of the residual block are rearranged into a smaller block and coding is applied to the smaller block, or shape-adaptive transform coding is applied to the active pixels of the residual block.
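As an illustration of the second method, the sketch below computes a rate-distortion cost in which only active pixels contribute to the distortion term. The SSD distortion metric, the Lagrange multiplier value, and all function names are assumptions made for this example rather than details taken from the disclosure.

```python
import numpy as np

LAMBDA = 0.85  # hypothetical Lagrange multiplier for the RD cost

def rd_cost_excluding_inactive(original, predictor, active_mask, rate_bits,
                               lam=LAMBDA):
    """Rate-distortion cost for a partially inactive block.

    Only pixels marked True in `active_mask` contribute to the distortion
    term, mirroring the idea of excluding inactive pixels from the RD
    measurement.  All names and constants here are illustrative.
    """
    diff = (original.astype(np.int64) - predictor.astype(np.int64)) ** 2
    distortion = np.sum(diff[active_mask])          # SSD over active pixels only
    return distortion + lam * rate_bits

def select_best_predictor(original, predictors, rates, active_mask):
    """Pick the candidate predictor with the lowest masked RD cost."""
    costs = [rd_cost_excluding_inactive(original, p, active_mask, r)
             for p, r in zip(predictors, rates)]
    return int(np.argmin(costs))
```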
Abstract:
According to one method, at a source side or an encoder side, a selected pyramid projection format associated with the 360-degree virtual reality images is determined. One or more parameters related to the selected pyramid projection format are then determined. According to the present invention, one or more syntax elements for said one or more parameters are included in coded data of the 360-degree virtual reality images. The coded data of the 360-degree virtual reality images are provided as output data. At a receiver side or a decoder side, one or more syntax elements for one or more parameters are parsed from the coded data of the 360-degree virtual reality images. A selected pyramid projection format associated with the 360-degree virtual reality images is determined based on information including said one or more parameters. The 360-degree virtual reality images are then recovered according to the selected pyramid projection format.
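A minimal sketch of how such parameters might be serialized and parsed as syntax elements is given below; the field names (main face size, side-face scale) and their types are purely illustrative assumptions, not elements of any actual specification.

```python
from dataclasses import dataclass

@dataclass
class PyramidProjectionParams:
    # Hypothetical parameters of a pyramid projection format.
    main_face_width: int      # resolution of the front (main) face
    main_face_height: int
    side_face_scale: int      # down-scaling factor applied to the side faces

def write_params(params: PyramidProjectionParams) -> list:
    """Encoder side: serialize the parameters as (name, value) syntax elements."""
    return [("pyramid_main_face_width", params.main_face_width),
            ("pyramid_main_face_height", params.main_face_height),
            ("pyramid_side_face_scale", params.side_face_scale)]

def parse_params(syntax_elements: dict) -> PyramidProjectionParams:
    """Decoder side: recover the projection format from the parsed elements."""
    return PyramidProjectionParams(
        main_face_width=syntax_elements["pyramid_main_face_width"],
        main_face_height=syntax_elements["pyramid_main_face_height"],
        side_face_scale=syntax_elements["pyramid_side_face_scale"])
```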
Abstract:
A method and apparatus of video coding using a Non-Local (NL) denoising filter are disclosed. According to the present invention, the decoded picture or the processed decoded picture is divided into multiple blocks. The NL loop-filter is applied to a target block with NL on/off control to generate a filtered output. The NL loop-filter process comprises determining, for the target block, a patch group consisting of the K nearest reference blocks within a search window located in one or more reference regions, and deriving one filtered output, which can be one filtered block for the target block or one filtered patch group, based on pixel values of the target block and pixel values of the patch group. The filtered output is provided for further loop-filter processing if there is any, or stored in a reference picture buffer otherwise.
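The sketch below illustrates the patch-group step: the K reference blocks nearest to the target block (in SSD) are gathered from a square search window, and a simple average over the group stands in for the actual NL filtering. Block size, window size, K, and the averaging rule are all assumptions made for this example.

```python
import numpy as np

def find_patch_group(frame, top, left, size, search, k):
    """Collect the K reference patches most similar (in SSD) to the target
    block inside a square search window centered on the block."""
    target = frame[top:top + size, left:left + size]
    candidates = []
    h, w = frame.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if dy == 0 and dx == 0:
                continue
            y, x = top + dy, left + dx
            if 0 <= y <= h - size and 0 <= x <= w - size:
                ref = frame[y:y + size, x:x + size]
                ssd = np.sum((target.astype(np.int64) - ref.astype(np.int64)) ** 2)
                candidates.append((ssd, ref))
    candidates.sort(key=lambda c: c[0])
    return target, [ref for _, ref in candidates[:k]]

def nl_filter_block(frame, top, left, size=8, search=16, k=8):
    """Filter the target block as a plain average over its patch group
    (a stand-in for the actual NL filtering step)."""
    target, group = find_patch_group(frame, top, left, size, search, k)
    stack = np.stack([target] + group).astype(np.float64)
    return np.mean(stack, axis=0)
```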
Abstract:
A method and apparatus for video coding utilizing a motion vector predictor (MVP) for a motion vector (MV) of a block are disclosed. According to an embodiment, a mean candidate is derived from at least two candidates in the current candidate list. The mean candidate includes two MVs for bi-prediction or one MV for uni-prediction, and at least one MV of the mean candidate is derived as the mean of the MVs of said at least two candidates in one of list 0 and list 1. The mean candidate is added to the current candidate list to form a modified candidate list, and one candidate is selected from the modified candidate list as the MVP or MVPs for the current MV or MVs of the current block. The current block is then encoded or decoded in Inter, Merge, or Skip mode utilizing the selected MVP or MVPs.
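A minimal sketch of deriving and inserting such a mean candidate is shown below; the MV representation, the rounding rule for the average, and the choice of the first two candidates are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]  # (mv_x, mv_y), e.g. in quarter-pel units

@dataclass
class Candidate:
    mv_l0: Optional[MV] = None  # list 0 motion vector, None if unused
    mv_l1: Optional[MV] = None  # list 1 motion vector, None if unused

def mean_mv(a: MV, b: MV) -> MV:
    return ((a[0] + b[0]) // 2, (a[1] + b[1]) // 2)

def derive_mean_candidate(c0: Candidate, c1: Candidate) -> Candidate:
    """For each list, take the mean if both candidates have an MV in that
    list; otherwise reuse whichever MV is available."""
    def combine(a: Optional[MV], b: Optional[MV]) -> Optional[MV]:
        if a is not None and b is not None:
            return mean_mv(a, b)
        return a if a is not None else b
    return Candidate(mv_l0=combine(c0.mv_l0, c1.mv_l0),
                     mv_l1=combine(c0.mv_l1, c1.mv_l1))

def add_mean_candidate(candidate_list):
    """Append the mean of the first two candidates to form the modified list."""
    if len(candidate_list) >= 2:
        candidate_list.append(derive_mean_candidate(candidate_list[0],
                                                    candidate_list[1]))
    return candidate_list
```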
Abstract:
A method and apparatus of video coding incorporating a Deep Neural Network (DNN) are disclosed. A target signal is processed using the DNN, where the target signal provided to the DNN input corresponds to the reconstructed residual, the output from the prediction process, the reconstruction process, or one or more filtering processes, or a combination of them. The output data from the DNN are provided for the encoding process or the decoding process. The DNN can be used to restore pixel values of the target signal or to predict the sign of one or more residual pixels between the target signal and an original signal. The absolute value of said one or more residual pixels can be signalled in the video bitstream and used together with the predicted sign to reduce the residual error of the target signal.
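The sketch below illustrates the two described uses of the network output: restoring the target signal with a generic restoration model, and combining a predicted sign map with signaled absolute residual values. The callable `model`, the identity stand-in, and all array shapes are assumptions; no specific network architecture is implied.

```python
import numpy as np

def dnn_restore(target_signal, model):
    """Apply a restoration network to the target signal (e.g. the
    reconstructed residual or the reconstructed frame).  `model` is any
    callable mapping an array to an array; the network itself is outside
    the scope of this sketch."""
    return model(target_signal)

def apply_sign_correction(target_signal, predicted_sign, abs_residual):
    """Alternative use: the DNN predicts the sign (+1/-1 per pixel) of the
    remaining residual, while its absolute value is signaled in the
    bitstream; the two are combined to reduce the residual error."""
    return target_signal + predicted_sign * abs_residual

# Illustrative usage with an identity "model" and random data.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    recon = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
    sign = rng.choice([-1.0, 1.0], size=(8, 8))
    mag = rng.integers(0, 3, size=(8, 8)).astype(np.float64)
    restored = dnn_restore(recon, lambda x: x)   # identity stands in for a real DNN
    corrected = apply_sign_correction(restored, sign, mag)
```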
Abstract:
Method and apparatus of video coding using decoder-derived motion information based on bilateral matching or template matching are disclosed. According to one method, an initial motion vector (MV) index is signalled in a video bitstream at an encoder side or determined from the video bitstream at a decoder side. A selected MV is then derived using bilateral matching, template matching, or both to refine an initial MV associated with the initial MV index. In another method, when MVs for both list 0 and list 1 exist in template matching, the MV with the smaller cost may be used for uni-prediction template matching if its cost is lower than the bi-prediction template matching cost. According to yet another method, the refinement of the MV search is dependent on the block size. According to yet another method, a merge candidate MV pair is always used for bilateral matching or template matching.
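The uni/bi-prediction decision in template matching can be sketched as below, where the template cost is the SAD against the reference templates and bi-prediction is approximated by averaging the two reference templates; these cost definitions are assumptions made for illustration only.

```python
import numpy as np

def template_cost(template, reference_template):
    """Sum of absolute differences between the current block's template
    (neighboring reconstructed pixels) and a reference template."""
    return np.sum(np.abs(template.astype(np.int64)
                         - reference_template.astype(np.int64)))

def choose_prediction(template, ref_tmpl_l0, ref_tmpl_l1):
    """Sketch of the uni/bi decision: if the better single-list template
    cost beats the bi-prediction template cost, use uni-prediction with
    that list; otherwise keep bi-prediction."""
    cost_l0 = template_cost(template, ref_tmpl_l0)
    cost_l1 = template_cost(template, ref_tmpl_l1)
    bi_tmpl = (ref_tmpl_l0.astype(np.int64) + ref_tmpl_l1.astype(np.int64) + 1) // 2
    cost_bi = template_cost(template, bi_tmpl)
    best_uni = min(cost_l0, cost_l1)
    if best_uni < cost_bi:
        return "L0" if cost_l0 <= cost_l1 else "L1"
    return "BI"
```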
Abstract:
A method and apparatus of video encoding or decoding for a video encoding or decoding system applied to multi-face sequences corresponding to a 360-degree virtual reality sequence are disclosed. According to the present invention, one or more multi-face sequences representing the 360-degree virtual reality sequence are derived. If Inter prediction is selected for a current block in a current face, one virtual reference frame is derived for each face of said one or more multi-face sequences by assigning one target reference face to the center of the virtual reference frame and connecting the neighboring faces of the target reference face to it at its boundaries. Then, the current block in the current face is encoded or decoded using a current virtual reference frame derived for the current face to derive an Inter predictor for the current block.
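A sketch of assembling such a virtual reference frame is given below: the target reference face is placed at the center of a canvas three faces wide and three faces tall, and its four neighboring faces are attached at its boundaries. The equal face sizes and the omission of any edge-alignment rotation are simplifying assumptions.

```python
import numpy as np

def build_virtual_reference_frame(center_face, top, bottom, left, right):
    """Compose a virtual reference frame: the target reference face goes in
    the center and its neighboring faces are attached at its four
    boundaries.  Faces are assumed to be square arrays of equal size, and
    any rotation needed to align face edges is omitted in this sketch."""
    n = center_face.shape[0]
    frame = np.zeros((3 * n, 3 * n), dtype=center_face.dtype)
    frame[n:2 * n, n:2 * n] = center_face   # target reference face in the center
    frame[0:n,     n:2 * n] = top           # neighbors attached at each boundary
    frame[2 * n:,  n:2 * n] = bottom
    frame[n:2 * n, 0:n]     = left
    frame[n:2 * n, 2 * n:]  = right
    return frame
```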
Abstract:
Methods and apparatus of processing omnidirectional images are disclosed. According to one method, a current set of omnidirectional images converted from each spherical image in a 360-degree panoramic video sequence using a selected projection format is received, where the selected projection format belongs to a projection format group comprising a cubic-face format, and the current set of omnidirectional images with the cubic-face format consists of six cubic faces. If the selected projection format corresponds to the cubic-face format, one or more mapping syntax elements to map the current set of omnidirectional images into a current cubemap image are signaled. The coded data are then provided in a bitstream including said one or more mapping syntax elements for the current set of omnidirectional images.
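As an illustration, the sketch below places six cubic faces into a 3x2 cubemap image according to hypothetical mapping syntax elements (a face index and a rotation per position); the layout and field names are assumptions, not the syntax defined in the disclosure.

```python
import numpy as np

# Hypothetical mapping syntax: for each position in a 3x2 cubemap layout,
# which of the six faces goes there and how many 90-degree rotations to apply.
DEFAULT_MAPPING = [
    {"face": 0, "rot": 0}, {"face": 1, "rot": 0}, {"face": 2, "rot": 0},
    {"face": 3, "rot": 0}, {"face": 4, "rot": 0}, {"face": 5, "rot": 0},
]

def assemble_cubemap(faces, mapping=DEFAULT_MAPPING):
    """Place six equal-size square faces into a 3x2 cubemap image according
    to the mapping syntax elements."""
    n = faces[0].shape[0]
    cubemap = np.zeros((2 * n, 3 * n), dtype=faces[0].dtype)
    for pos, m in enumerate(mapping):
        row, col = divmod(pos, 3)
        face = np.rot90(faces[m["face"]], k=m["rot"])
        cubemap[row * n:(row + 1) * n, col * n:(col + 1) * n] = face
    return cubemap
```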
Abstract:
Aspects of the disclosure provide a method for denoising a reconstructed picture. The method can include receiving reconstructed video data corresponding to a picture, dividing the picture into current patches, forming patch groups each including a current patch and a number of reference patches similar to the current patch, denoising the patch groups to modify pixel values of the patch groups to create a filtered picture, and generating a reference picture based on the filtered picture for encoding or decoding a picture. The operation of denoising the patch groups includes deriving a variance of the compression noise in each patch group based on a compression noise model. The model parameters are selected based on coding-unit-level information.
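The sketch below pairs a hypothetical QP-based compression-noise model with a Wiener-like shrinkage of each patch group toward its mean; the model coefficients and the shrinkage rule are assumptions standing in for the actual noise model and filter.

```python
import numpy as np

def compression_noise_variance(qp, a=0.1, b=0.05):
    """Hypothetical compression-noise model: noise variance grows with the
    quantization step implied by the QP (Qstep ~ 2^((QP-4)/6) in HEVC).
    The coefficients a and b stand in for model parameters that could be
    selected per coding unit."""
    q_step = 2.0 ** ((qp - 4) / 6.0)
    return a * q_step ** 2 + b * q_step

def denoise_patch_group(patch_group, noise_var):
    """Wiener-like shrinkage of a patch group: pixels are pulled toward the
    per-pixel group mean in proportion to the estimated signal-to-noise
    ratio.  This is a sketch of one possible denoising step, not the
    filter defined in the disclosure."""
    group = patch_group.astype(np.float64)           # shape: (num_patches, h, w)
    mean = np.mean(group, axis=0)
    signal_var = np.maximum(np.var(group, axis=0) - noise_var, 0.0)
    gain = signal_var / (signal_var + noise_var + 1e-9)
    return mean + gain * (group - mean)
```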
Abstract:
A method and apparatus for sample-based Simplified Depth Coding (SDC) are disclosed. The system determines prediction samples for the current depth block based on reconstructed neighboring depth samples according to a selected Intra mode, and determines an offset value for the current depth block. The final reconstructed samples are derived by adding the offset value to each of the prediction samples. The offset value corresponds to a difference between a reconstructed depth value and a predicted depth value for the current depth block. The offset value can be derived from a residual value, and the residual value can be derived implicitly at a decoder side or transmitted in the bitstream. When the selected Intra mode corresponds to the Planar mode, the prediction samples are derived according to the Planar mode.
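A sketch of the sample-based SDC reconstruction is given below, using a simplified planar-style predictor built from the reconstructed neighboring depth samples and a single offset added to every prediction sample; the predictor is a stand-in for the HEVC Planar mode rather than its exact definition.

```python
import numpy as np

def planar_prediction(top_row, left_col):
    """Simplified planar-style prediction from reconstructed neighboring
    depth samples (top row and left column of length n); a stand-in for
    the HEVC Planar mode used only to obtain prediction samples."""
    n = len(top_row)
    pred = np.zeros((n, n), dtype=np.float64)
    for y in range(n):
        for x in range(n):
            horiz = (n - 1 - x) * left_col[y] + (x + 1) * top_row[n - 1]
            vert = (n - 1 - y) * top_row[x] + (y + 1) * left_col[n - 1]
            pred[y, x] = (horiz + vert + n) / (2 * n)
    return np.round(pred)

def sdc_reconstruct(pred_samples, offset):
    """Sample-based SDC reconstruction: add a single offset (the difference
    between the reconstructed and predicted depth values) to every
    prediction sample."""
    return pred_samples + offset
```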