Abstract:
A method and apparatus of deriving a motion vector predictor (MVP) for a current block in an Inter, Merge, or Skip mode are disclosed. Embodiments according to the present invention determine redundant MVP candidates according to a non-MV-value-based criterion. The redundant MVP candidates are then removed from the MVP candidate set. In other embodiments according to the present invention, motion IDs are assigned to MVP candidates to track the origin of the motion vectors associated with the MVP candidates. An MVP candidate having the same motion ID as a previous MVP candidate is redundant and can be removed from the MVP candidate set. In yet another embodiment, MVP candidates that would cause the second 2N×N or N×2N PU to be merged into a 2N×2N PU are treated as redundant and removed from the MVP candidate set.
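A minimal sketch of the motion-ID pruning idea in Python, assuming a hypothetical candidate record that carries the ID assigned when its motion information was first created; the names are illustrative, not the patent's:

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class MvpCandidate:
    mv: Tuple[int, int]      # motion vector (x, y)
    ref_idx: int             # reference picture index
    motion_id: int           # ID assigned when the motion info was first created

def prune_by_motion_id(candidates: List[MvpCandidate]) -> List[MvpCandidate]:
    """Remove MVP candidates whose motion ID repeats an earlier candidate's.

    Two candidates sharing a motion ID trace back to the same original
    motion vector, so the later one is redundant even without comparing
    MV values.
    """
    seen_ids = set()
    pruned = []
    for cand in candidates:
        if cand.motion_id in seen_ids:
            continue  # redundant: same origin as a previous candidate
        seen_ids.add(cand.motion_id)
        pruned.append(cand)
    return pruned

# Example: the first two candidates inherit motion ID 7 from the same
# neighbouring block, so only the first of them survives.
cands = [MvpCandidate((4, -2), 0, 7),
         MvpCandidate((4, -2), 0, 7),
         MvpCandidate((0, 1), 1, 9)]
print([c.motion_id for c in prune_by_motion_id(cands)])  # [7, 9]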
Abstract:
A video processing method includes: decoding a part of a bitstream to generate a decoded frame, where the decoded frame is a projection-based frame that includes projection faces in a projection layout; and remapping sample locations of the projection-based frame to locations on a sphere, where a sample location within the projection-based frame is converted into a local sample location within a projection face packed in the projection-based frame; in response to adjustment criteria being met, an adjusted local sample location within the projection face is generated by applying an adjustment to at least one coordinate value of the local sample location within the projection face, and the adjusted local sample location within the projection face is remapped to a location on the sphere; and in response to the adjustment criteria not being met, the local sample location within the projection face is remapped to a location on the sphere.
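A minimal sketch of this remapping flow in Python, using a single front cube face, a made-up boundary criterion, and a made-up half-sample adjustment purely to show the conditional structure:

import math

FACE_SIZE = 8  # samples per face edge (illustrative)

def needs_adjustment(u, v):
    # Illustrative criterion: the sample lies on the face's right/bottom
    # boundary column/row, where a half-sample shift is applied.
    return u >= FACE_SIZE - 1 or v >= FACE_SIZE - 1

def adjust(u, v):
    # Pull boundary samples half a sample inward (illustrative adjustment).
    return min(u, FACE_SIZE - 1.5), min(v, FACE_SIZE - 1.5)

def face_to_sphere(u, v):
    # Map local face coords onto the front face of a unit cube, then
    # normalize onto the unit sphere (standard cubemap remapping).
    a = 2.0 * (u + 0.5) / FACE_SIZE - 1.0   # [-1, 1]
    b = 2.0 * (v + 0.5) / FACE_SIZE - 1.0
    x, y, z = a, b, 1.0
    n = math.sqrt(x * x + y * y + z * z)
    return x / n, y / n, z / n

def remap(u, v):
    if needs_adjustment(u, v):
        u, v = adjust(u, v)            # adjusted local sample location
    return face_to_sphere(u, v)        # remap to a location on the sphere

print(remap(3, 3))   # interior sample: no adjustment
print(remap(7, 3))   # boundary sample: adjusted before remapping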
Abstract:
A video processing method includes a step of receiving a bitstream, and a step of decoding a part of the bitstream to generate a decoded frame, including parsing a plurality of syntax elements from the bitstream. The decoded frame is a projection-based frame that includes a plurality of projection faces packed at a plurality of face positions with different position indexes in a hemisphere cubemap projection layout. A portion of a 360-degree content of a sphere is mapped to the plurality of projection faces via hemisphere cubemap projection. Values of the plurality of syntax elements are indicative of face indexes of the plurality of projection faces packed at the plurality of face positions, respectively, and are constrained to meet a requirement of bitstream conformance.
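A minimal sketch of how a decoder might verify the stated conformance requirement on the parsed face indexes; the concrete constraints used here (distinctness and a valid index range) are assumptions for illustration only:

def check_face_index_conformance(face_indexes):
    """Illustrative conformance check on parsed face-index syntax elements.

    Assumed constraints: each packed face position must carry a distinct,
    valid face index, so no face appears twice and no index is out of range.
    """
    NUM_FACES = 6  # cubemap face indexes 0..5
    if any(not 0 <= idx < NUM_FACES for idx in face_indexes):
        raise ValueError("face index out of range: non-conforming bitstream")
    if len(set(face_indexes)) != len(face_indexes):
        raise ValueError("duplicate face index: non-conforming bitstream")
    return True

# One full face plus four half faces packed at five positions (illustrative).
print(check_face_index_conformance([0, 1, 2, 4, 5]))  # True: conforming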
Abstract:
A video processing method includes: decoding a part of a bitstream to generate a decoded frame, where the decoded frame is a projection-based frame that includes projection faces in a hemisphere cubemap projection layout; and remapping sample locations of the projection-based frame to locations on a sphere, where a sample location within the projection-based frame is converted into a local sample location within a projection face packed in the projection-based frame; in response to adjustment criteria being met, an adjusted local sample location within the projection face is generated by applying an adjustment to one coordinate value of the local sample location within the projection face, and the adjusted local sample location within the projection face is remapped to a location on the sphere; and in response to the adjustment criteria not being met, the local sample location within the projection face is remapped to a location on the sphere.
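This method differs from the earlier remapping sketch mainly in that exactly one coordinate value is adjusted. A minimal illustration follows; compressing the horizontal coordinate for a packed half face is an assumed placement, not the patent's actual rule:

import math

FACE_SIZE = 8  # samples per face edge (illustrative)

def to_cube_coord(t):
    # Local sample index -> cube-face coordinate in [-1, 1].
    return 2.0 * (t + 0.5) / FACE_SIZE - 1.0

def remap_half_face(u, v, is_half_face=True):
    a, b = to_cube_coord(u), to_cube_coord(v)
    # For a half face of the hemisphere layout, only one coordinate is
    # adjusted (here: compress the horizontal axis into [-1, 0]).
    if is_half_face:
        a = (a - 1.0) / 2.0
    x, y, z = a, b, 1.0                # front cube face (illustrative)
    n = math.sqrt(x * x + y * y + z * z)
    return x / n, y / n, z / n         # location on the unit sphere

print(remap_half_face(3, 3))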
Abstract:
A method and apparatus of video coding incorporating a Deep Neural Network (DNN) are disclosed. A target signal is processed using the DNN, where the target signal provided to the DNN input corresponds to the reconstructed residual or to the output from the prediction process, the reconstruction process, one or more filtering processes, or a combination thereof. The output data from the DNN is provided for the encoding process or the decoding process. The DNN can be used to restore pixel values of the target signal or to predict a sign of one or more residual pixels between the target signal and an original signal. An absolute value of said one or more residual pixels can be signalled in the video bitstream and used with the predicted sign to reduce the residual error of the target signal.
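A minimal numeric sketch of the sign-plus-magnitude idea: the decoder takes the absolute value of a residual pixel from the bitstream, takes its sign from the DNN prediction, and applies the signed residual to the target signal. The tiny sign "predictor" below is a stand-in for a trained network:

def dnn_predict_sign(context):
    """Stand-in for a trained DNN: predict the residual's sign (+1/-1)
    from neighbouring reconstructed pixels (here: their mean bias)."""
    return 1 if sum(context) / len(context) >= 0 else -1

def refine_pixel(target_pixel, abs_residual, context):
    """Combine the signalled |residual| with the predicted sign."""
    sign = dnn_predict_sign(context)
    return target_pixel + sign * abs_residual

# original = 100, target (reconstructed) = 96 -> true residual = +4.
# Encoder signals |4|; decoder predicts the sign from local context.
print(refine_pixel(96, 4, context=[2, 3, 1]))  # 100 if sign predicted as +1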
Abstract:
A video decoding method includes: decoding a part of a bitstream to generate a decoded frame, including parsing a syntax element from the bitstream. The decoded frame is a projection-based frame that includes at least one projection face and at least one guard band packed in a projection layout with padding, and at least a portion of a 360-degree content of a sphere is mapped to the at least one projection face via projection. The syntax element specifies a guard band type of the at least one guard band.
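A minimal sketch of parsing such a syntax element, assuming a hypothetical fixed-length 3-bit field and made-up type codes; the real syntax name, bit width, and code points are not given by the abstract:

class BitReader:
    def __init__(self, data: bytes):
        self.bits = "".join(f"{b:08b}" for b in data)
        self.pos = 0
    def u(self, n):                 # read n bits as an unsigned integer
        val = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return val

# Hypothetical 3-bit syntax element naming the guard band type.
GUARD_BAND_TYPES = {0: "edge-repetition", 1: "geometry padding"}

reader = BitReader(bytes([0b00100000]))
gb_type = reader.u(3)
print(gb_type, GUARD_BAND_TYPES.get(gb_type, "reserved"))  # 1 geometry padding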
Abstract:
Methods and apparatus of processing 360-degree virtual reality images are disclosed. According to one method, the method receives coded data for an extended 2D (two-dimensional) frame including an encoded 2D frame with one or more encoded guard bands, wherein the encoded 2D frame is projected from a 3D (three-dimensional) sphere using a target projection, wherein said one or more encoded guard bands are based on a blending of one or more guard bands with an overlapped region when the overlapped region exists. The method then decodes the coded data into a decoded extended 2D frame including a decoded 2D frame with one or more decoded guard bands, and derives a reconstructed 2D frame from the decoded extended 2D frame.
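A minimal sketch of the blending step at the encoder side, assuming per-sample weighted averaging over a 1-D strip of samples; the 50/50 weight is an assumption rather than a prescribed value:

def blend_guard_band(guard_band, overlapped, w=0.5):
    """Blend guard-band samples with the overlapped region's samples."""
    return [round(w * g + (1.0 - w) * o) for g, o in zip(guard_band, overlapped)]

print(blend_guard_band([100, 100, 100], [80, 90, 110]))  # [90, 95, 105]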
Abstract:
A video processing method includes: obtaining a plurality of projection faces from an omnidirectional content of a sphere, wherein the omnidirectional content of the sphere is mapped onto the projection faces via cubemap projection, and the projection faces comprise a first projection face; obtaining, by a re-sampling circuit, a first re-sampled projection face by re-sampling at least a portion of the first projection face through non-uniform mapping; generating a projection-based frame according to a projection layout of the cubemap projection, wherein the projection-based frame comprises the first re-sampled projection face packed in the projection layout; and encoding the projection-based frame to generate a part of a bitstream.
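The non-uniform mapping can be illustrated with the equi-angular transform used by the equi-angular cubemap (EAC), which spaces face samples by equal viewing angle instead of equal length; whether this method uses that particular mapping is an assumption, and the nearest-neighbour re-sampling below is purely illustrative:

import math

def equiangular(p):
    """Forward mapping: uniform face coordinate p in [-1, 1] to its
    equi-angular position (EAC-style non-uniform mapping)."""
    return (4.0 / math.pi) * math.atan(p)

def resample_row(row):
    """Re-sample one row of a face through the non-uniform mapping,
    fetching each output sample from the inverse-mapped source position."""
    n = len(row)
    out = []
    for i in range(n):
        q = 2.0 * (i + 0.5) / n - 1.0          # target coordinate in [-1, 1]
        p = math.tan(q * math.pi / 4.0)        # inverse of equiangular()
        src = min(n - 1, max(0, round((p + 1.0) * n / 2.0 - 0.5)))
        out.append(row[src])
    return out

print(resample_row(list(range(8))))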
Abstract:
A sample adaptive offset (SAO) filtering method for a reconstructed projection-based frame includes: obtaining at least one padding pixel in a padding area that acts as an extension of a face boundary of a first projection face, and applying SAO filtering to a block that has at least one pixel included in the first projection face. In the reconstructed projection-based frame, there is image content discontinuity between the face boundary of the first projection face and a face boundary of a second projection face. The at least one padding pixel is involved in the SAO filtering of the block.
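A minimal sketch of the idea, reduced to 1-D horizontal edge-offset SAO for brevity: the out-of-face neighbours come from padding pixels that extend the face boundary, rather than from the discontinuous neighbouring face. The offset table and names are illustrative:

def sao_edge_offset_row(row, left_pad, right_pad, offsets):
    """Apply 1-D horizontal SAO edge-offset to one row of a face.

    left_pad/right_pad are padding pixels extending the face boundary,
    used instead of the (discontinuous) neighbouring face's pixels.
    """
    ext = [left_pad] + row + [right_pad]
    out = []
    for i in range(1, len(ext) - 1):
        c, a, b = ext[i], ext[i - 1], ext[i + 1]
        # Standard SAO edge categories: local valley, concave corner,
        # convex corner, local peak, or none.
        if c < a and c < b:
            cat = 1
        elif (c < a and c == b) or (c == a and c < b):
            cat = 2
        elif (c > a and c == b) or (c == a and c > b):
            cat = 3
        elif c > a and c > b:
            cat = 4
        else:
            cat = 0
        out.append(c + offsets.get(cat, 0))
    return out

# Offsets per edge category; padding supplies the out-of-face neighbours.
print(sao_edge_offset_row([10, 12, 11, 11], left_pad=10, right_pad=13,
                          offsets={1: 1, 4: -1}))  # [10, 11, 11, 11]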
Abstract:
According to one method, at a source side or an encoder side, a selected viewport associated with the 360-degree virtual reality images is determined. One or more parameters related to a selected pyramid projection format corresponding to the viewport are then determined. According to the present invention, one or more syntax elements for said one or more parameters are included in coded data of the 360-degree virtual reality images. The coded data of the 360-degree virtual reality images are provided as output data. At a receiver side or a decoder side, one or more syntax elements for one or more parameters are parsed from the coded data of the 360-degree virtual reality images. A selected pyramid projection format associated with the 360-degree virtual reality images is determined based on information including said one or more parameters. The 360-degree virtual reality images are then recovered according to the selected viewport.
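A minimal sketch of the signalling round trip, with hypothetical parameter names and bit layout (the abstract does not define the actual syntax): the encoder includes the pyramid-projection parameters as syntax elements, and the decoder parses them back:

import struct

def write_syntax(front_face_ratio_idx, viewport_yaw, viewport_pitch):
    # Hypothetical layout: 8-bit ratio index plus two 16-bit signed angles.
    return struct.pack(">Bhh", front_face_ratio_idx, viewport_yaw, viewport_pitch)

def parse_syntax(coded):
    ratio_idx, yaw, pitch = struct.unpack(">Bhh", coded)
    return {"front_face_ratio_idx": ratio_idx,
            "viewport_yaw": yaw, "viewport_pitch": pitch}

coded = write_syntax(2, 45, -10)   # encoder side: include the parameters
print(parse_syntax(coded))         # decoder side: parse them back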