Abstract:
Spatial distortion (i.e., when a frame is viewed independently of other frames in a video sequence) may be quite different from temporal distortion (i.e., when frames are viewed continuously). To estimate temporal distortion, a sliding window approach is used. Specifically, multiple sliding windows around a current frame are considered. Within each sliding window, a large distortion density is calculated and a sliding window with the highest large distortion density is selected. A distance between the current frame and the closest frame with large distortion in the selected window is calculated. Subsequently, the temporal distortion is estimated as a function of the highest large distortion ratio, the spatial distortion for the current frame, and the distance. In another embodiment, a median of spatial distortion values is calculated for each sliding window and the maximum of median spatial distortion values is used to estimate the temporal distortion.
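The sliding-window procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed formula: the window length `win`, the `threshold` that defines "large" distortion, and the final combination `spatial[n] * density / (1 + distance)` are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def temporal_distortion(spatial: List[float], n: int, win: int = 5,
                        threshold: float = 0.5) -> Tuple[float, int, float]:
    """Return (highest density, distance, temporal estimate) for frame n."""
    best_density = 0.0
    best_large: List[int] = []
    # Slide over every window of length `win` that contains frame n.
    first = max(0, n - win + 1)
    last = min(n, len(spatial) - win)
    for start in range(first, last + 1):
        # Frames in this window whose spatial distortion counts as "large".
        large = [i for i in range(start, start + win) if spatial[i] > threshold]
        density = len(large) / win
        if density > best_density:
            best_density, best_large = density, large
    if not best_large:
        # No large distortion near frame n: nothing to propagate.
        return 0.0, -1, 0.0
    # Distance from frame n to the closest large-distortion frame.
    distance = min(abs(i - n) for i in best_large)
    # Hypothetical combination; the abstract only says "a function of" these.
    estimate = spatial[n] * best_density / (1 + distance)
    return best_density, distance, estimate
```

The function returns the intermediate quantities alongside the estimate so that a different combining function can be substituted without touching the window search.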
Abstract:
A particular implementation decomposes an image into a structure component and a texture component. An edge strength map is calculated for the structure component, and a texture strength map is calculated for the texture component. Using the edge strength and the texture strength, texture masking weights are calculated. The stronger the texture strength is, or the weaker the edge strength is, the more distortion can be tolerated by human eyes, and thus, the smaller the texture masking weight is. The local distortions are then weighted by the texture masking weights to generate an overall distortion level or an overall quality metric.
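The weighting step can be illustrated with a small sketch. It assumes the edge strength and texture strength maps have already been computed and flattened into lists; the weight formula `(1 + e) / (1 + e + t)` is a hypothetical choice that merely has the stated monotonic behavior (smaller for stronger texture, smaller for weaker edges).

```python
from typing import List


def masking_weights(edge: List[float], texture: List[float]) -> List[float]:
    """Weight falls as texture strength grows and rises with edge strength."""
    # Hypothetical form: equals 1 where there is no texture and tends to 0
    # in heavily textured areas without edges.
    return [(1.0 + e) / (1.0 + e + t) for e, t in zip(edge, texture)]


def overall_distortion(local: List[float], weights: List[float]) -> float:
    """Average of local distortions after texture-masking weighting."""
    return sum(w * d for w, d in zip(weights, local)) / len(local)
```

For instance, a pixel with edge strength 1 and texture strength 2 gets weight 0.5, so its local distortion counts half as much as that of a pixel in a flat region.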
Abstract:
A method and an apparatus for generating or decoding a bitstream representing a 3D model are described. The method comprises the steps of: accessing a first quantization parameter indicating the quality of the reconstructed 3D model; determining, as a function of the first quantization parameter, a second quantization parameter used for encoding or decoding patterns associated with the 3D model; performing the encoding or decoding of a pattern in response to the second quantization parameter; determining, as a function of the second quantization parameter, third and fourth quantization parameters for the transformation information of an instance represented as a transformation of the pattern; and performing encoding or decoding of the transformation for the instance in response to the third and fourth quantization parameters. A corresponding apparatus, a computer readable storage medium having stored thereon instructions for generating or decoding a bitstream, and a computer readable storage medium having stored thereon a bitstream generated according to the method are also provided.
Abstract:
When a scene moves homogeneously or fast, human eyes become sensitive to freezing artifacts. To measure the strength of motion homogeneity, a panning homogeneity parameter is estimated to account for isotropic motion vectors, for example, caused by camera panning, tilting, and translation; a zooming homogeneity parameter is estimated for radial symmetric motion vectors, for example, caused by camera zooming; and a rotation homogeneity parameter is estimated for rotational symmetric motion vectors, for example, caused by camera rotation. Subsequently, an overall motion homogeneity parameter is estimated based on the panning, zooming, and rotation homogeneity parameters. A freezing distortion factor can then be estimated using the overall motion homogeneity parameter. The freezing distortion factor, combined with compression and slicing distortion factors, can be used to estimate a video quality metric.
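One way to picture the three homogeneity parameters is to project each motion vector onto a common direction, onto the radius from the frame center, and onto the tangent around it. The sketch below does this; the exact definitions, the `center` argument, and taking the maximum as the overall parameter are assumptions for illustration, since the abstract does not give formulas.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def motion_homogeneity(mvs: List[Vec], positions: List[Vec],
                       center: Vec = (0.0, 0.0)) -> Tuple[float, float, float, float]:
    """Return (panning, zooming, rotation, overall) for one frame."""
    n = len(mvs)
    # Panning: magnitude of the mean motion vector.
    panning = math.hypot(sum(vx for vx, _ in mvs) / n,
                         sum(vy for _, vy in mvs) / n)
    radial = tangential = 0.0
    for (vx, vy), (px, py) in zip(mvs, positions):
        rx, ry = px - center[0], py - center[1]
        r = math.hypot(rx, ry)
        if r == 0.0:
            continue  # direction is undefined at the center itself
        radial += (vx * rx + vy * ry) / r      # component along the radius
        tangential += (rx * vy - ry * vx) / r  # component around the center
    zooming = abs(radial) / n
    rotation = abs(tangential) / n
    # Assumption: the strongest of the three serves as the overall parameter.
    overall = max(panning, zooming, rotation)
    return panning, zooming, rotation, overall
```

With motion vectors that all point the same way, only the panning value is non-zero; vectors pointing away from the center register only as zooming, and vectors perpendicular to the radius only as rotation.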
Abstract:
Because neighboring frames may affect how a current frame is perceived, we examine different neighborhoods of the current frame and select a neighborhood that impacts the perceived temporal distortion (i.e., when frames are viewed continuously) of the current frame most significantly. Based on spatial distortion (i.e., when a frame is viewed independently of other frames in a video sequence) of frames in the selected neighborhood, we can estimate initial temporal distortion. To refine the initial temporal distortion, we also consider the distribution of distortion in the selected neighborhood, for example, the distance between the current frame and a closest frame with large distortion, or whether distortion occurs in consecutive frames.
Abstract:
To estimate content complexity of a video, energy of prediction residuals is calculated. The prediction residuals are usually smaller when the video is less complex and more predictable. Scales of prediction residuals also depend on encoding configurations; for example, I pictures usually have larger prediction residuals than P and B pictures even when the contents are very similar and thus have similar perceived content complexity. To more closely reflect the content complexity, alignment scaling factors are estimated for different encoding configurations. Based on the energy of prediction residuals and alignment scaling factors, an overall content unpredictability parameter can be estimated to compute a compression distortion factor for the video. The compression distortion factor, combined with slicing and freezing distortion factors, can be used to estimate a video quality metric for the video.
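The alignment idea can be sketched in a few lines. The values in `ALIGNMENT_SCALE` are hypothetical placeholders (the abstract says such factors are estimated but gives no numbers), and averaging the aligned energies is an assumed way to form the overall parameter.

```python
from typing import Dict, List, Tuple

# Hypothetical alignment scaling factors per picture type.
ALIGNMENT_SCALE: Dict[str, float] = {"I": 0.5, "P": 1.0, "B": 1.2}


def residual_energy(residuals: List[float]) -> float:
    """Mean squared prediction residual of one picture."""
    return sum(r * r for r in residuals) / len(residuals)


def content_unpredictability(pictures: List[Tuple[str, List[float]]]) -> float:
    """Average residual energy after aligning picture types to a common scale."""
    aligned = [ALIGNMENT_SCALE[ptype] * residual_energy(res)
               for ptype, res in pictures]
    return sum(aligned) / len(aligned)
```

Scaling the I-picture energy down before averaging keeps a sequence with many I pictures from looking more complex than the same content coded mostly with P and B pictures.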
Abstract:
Accuracy and efficiency are major problems to be solved in video quality measurement. According to the invention, a method for accurately predicting video quality uses a rational function of the quantization parameter QP, corrected by a correction function that depends on content unpredictability CU. For example, the correction function is a power function of the CU. Both QP and CU can be computed from the video elementary stream without fully decoding the video, which ensures high efficiency.
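The shape of such a predictor might look like the sketch below. Everything beyond "rational function of QP" and "power function of CU" is assumed: the first-order rational form, the multiplicative way the correction is applied, and all coefficient values are placeholders that would have to be fitted to subjective data.

```python
def predict_quality(qp: float, cu: float,
                    a0: float = 5.0, a1: float = 0.0, b1: float = 0.1,
                    c: float = 1.0, d: float = -0.1) -> float:
    """Rational function of QP, corrected by a power function of CU."""
    base = (a0 + a1 * qp) / (1.0 + b1 * qp)  # rational function of QP
    correction = c * cu ** d                 # power function of CU
    return base * correction                 # assumed multiplicative correction
```

With the placeholder coefficients, `predict_quality(10, 1.0)` evaluates to 5 / (1 + 1) = 2.5, and a larger QP yields a lower predicted score.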
Abstract:
A method and an apparatus for image filtering are described. Structural information is employed during the calculation of filtering coefficients. The structural information is described by the regions defined through an edge map of the image. In one embodiment, the region correlation between the target pixel and a contributing pixel is selected as a structural filtering coefficient. The region correlation, which indicates the possibility of two pixels being in the same region, is calculated by evaluating the strongest edge cut by a path between the target pixel and a contributing pixel. The structural filtering coefficient is further combined with spatial information and intensity information to form a spatial-intensity-region (SIR) filter. The structural-information-based filter is applicable to applications such as denoising, tone mapping, and exposure fusion.
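A one-dimensional sketch makes the region-correlation idea concrete. It is a simplification under several assumptions: the signal is 1D, so the path between two samples is simply the segment between them; the edge map is taken as the absolute difference of neighboring samples; and all three coefficients are Gaussian with hypothetical widths `sigma_s`, `sigma_i`, and `sigma_r`.

```python
import math
from typing import List


def sir_filter_1d(x: List[float], radius: int = 2, sigma_s: float = 2.0,
                  sigma_i: float = 5.0, sigma_r: float = 1.0) -> List[float]:
    """Spatial-intensity-region filtering of a 1D signal."""
    # Edge map: edge[k] is the strength of the edge between x[k-1] and x[k].
    edge = [0.0] + [abs(x[k] - x[k - 1]) for k in range(1, len(x))]
    out = []
    for i in range(len(x)):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(x), i + radius + 1)):
            lo, hi = min(i, j), max(i, j)
            # Strongest edge cut by the straight path between i and j.
            strongest = max(edge[lo + 1:hi + 1], default=0.0)
            w_region = math.exp(-strongest ** 2 / (2 * sigma_r ** 2))
            w_space = math.exp(-(i - j) ** 2 / (2 * sigma_s ** 2))
            w_intensity = math.exp(-(x[i] - x[j]) ** 2 / (2 * sigma_i ** 2))
            w = w_region * w_space * w_intensity
            num += w * x[j]
            den += w
        out.append(num / den)
    return out
```

Because the region weight collapses whenever the path crosses a strong edge, samples on opposite sides of a step contribute almost nothing to each other, so the step survives the smoothing.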