-
Publication No.: US12132919B2
Publication date: 2024-10-29
Application No.: US17987844
Filing date: 2022-11-15
Inventors: Yang Yang, Hoang Cong Minh Le, Yinhao Zhu, Reza Pourreza, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
IPC classification: H04N19/124, H04N19/119, H04N19/147, H04N19/17, H04N19/436
CPC classification: H04N19/436, H04N19/119, H04N19/124, H04N19/147, H04N19/17
Abstract: A processor-implemented method for image compression using an artificial neural network (ANN) includes receiving, at an encoder of the ANN, an image and a spatial segmentation map corresponding to the image. The spatial segmentation map indicates one or more regions of interest. The encoder compresses the image according to a controllable spatial bit allocation. The controllable spatial bit allocation is based on a learned quantization bin size.
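
The idea of a spatially controllable bit allocation in the abstract above can be illustrated with a toy sketch. It is not the patented method: the bin sizes here are fixed constants rather than learned, there is no neural encoder, and the names `quantize_with_roi`, `bin_roi`, and `bin_bg` are hypothetical. It only shows how a segmentation map could select a finer quantization bin inside a region of interest and a coarser one elsewhere.

```python
def quantize_with_roi(latent, roi_mask, bin_roi=0.5, bin_bg=2.0):
    """Quantize a 2-D latent, choosing the bin size per position from an ROI mask.

    Assumption: bin_roi and bin_bg are fixed illustrative values; in the abstract
    the bin size is learned.
    """
    out = []
    for row, mask_row in zip(latent, roi_mask):
        out_row = []
        for value, in_roi in zip(row, mask_row):
            # A smaller bin means finer quantization, so more bits go to the ROI.
            step = bin_roi if in_roi else bin_bg
            out_row.append(round(value / step) * step)
        out.append(out_row)
    return out


latent = [[0.9, 1.3], [2.6, 3.1]]
roi_mask = [[1, 0], [0, 1]]  # 1 marks a region-of-interest position
reconstructed = quantize_with_roi(latent, roi_mask)
```

In this sketch the reconstruction error at an ROI position is at most half of `bin_roi`, while background positions may be off by up to half of `bin_bg`.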
-
Publication No.: US11798197B2
Publication date: 2023-10-24
Application No.: US17200694
Filing date: 2021-03-12
Inventors: Hoang Cong Minh Le, Reza Pourreza, Yang Yang, Yinhao Zhu, Amir Said, Yizhe Zhang, Taco Sebastiaan Cohen
CPC classification: G06T9/002, G06N3/08, G06T3/4046
Abstract: A method of image compression includes receiving an image. Multiple quantized latent representations are generated to represent features of the image. Each of the quantized latent representations has a different resolution and is generated at staggered timings. Each of the later generated quantized latent representations is conditioned on each of the prior generated quantized latent representations. The multiple quantized latent representations are decoded to reconstruct the image.
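
As a rough analogy for the multi-resolution scheme in the abstract above, the sketch below uses a plain coarse-to-fine residual pyramid on a 1-D signal. This is a swapped-in classical technique, not the neural method the patent describes: each later, finer quantized latent encodes what the earlier, coarser ones failed to capture, so it depends on all of them. The function names, the `factors` tuple, and the `step` parameter are all hypothetical.

```python
def downsample(signal, factor):
    """Average non-overlapping windows of length `factor`."""
    return [sum(signal[i:i + factor]) / factor for i in range(0, len(signal), factor)]


def upsample(signal, factor):
    """Repeat each sample `factor` times (nearest-neighbour upsampling)."""
    return [v for v in signal for _ in range(factor)]


def encode_pyramid(signal, factors=(4, 2, 1), step=1.0):
    """Produce one quantized latent per resolution, coarse to fine.

    Assumption: len(signal) is divisible by the largest factor.
    """
    latents = []
    prediction = [0.0] * len(signal)
    for factor in factors:
        # The residual depends on every latent generated so far.
        residual = [s - p for s, p in zip(signal, prediction)]
        latent = [round(v / step) for v in downsample(residual, factor)]
        latents.append(latent)
        prediction = [p + u * step for p, u in zip(prediction, upsample(latent, factor))]
    return latents


def decode_pyramid(latents, factors=(4, 2, 1), step=1.0):
    """Sum the upsampled, dequantized latents to reconstruct the signal."""
    recon = [0.0] * (len(latents[0]) * factors[0])
    for latent, factor in zip(latents, factors):
        recon = [r + u * step for r, u in zip(recon, upsample(latent, factor))]
    return recon


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
latents = encode_pyramid(signal)
recon = decode_pyramid(latents)
```

Because the last level in this sketch works at full resolution, the reconstruction error per sample is bounded by half of `step`.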
-
Publication No.: US12118810B2
Publication date: 2024-10-15
Application No.: US17408779
Filing date: 2021-08-23
IPC classification: G06V20/40, G06F18/21, G06F18/22, G06F18/2411, G06N3/045, G06T1/00, G06T7/136, G06T7/174, G06T7/215, G06V10/40, G06V10/75, G06V10/94, G06V30/262
CPC classification: G06V30/274, G06F18/21, G06F18/22, G06F18/2411, G06N3/045, G06T1/0007, G06T7/136, G06T7/174, G06T7/215, G06V10/40, G06V10/751, G06V10/95, G06V20/46, G06V20/49, G06T2207/10016, G06T2207/20084, G06V10/759
Abstract: Systems, methods, and non-transitory media are provided for providing spatiotemporal recycling networks (e.g., for video segmentation). For example, a method can include obtaining video data including a current frame and one or more reference frames. The method can include determining, based on a comparison of the current frame and the one or more reference frames, a difference between the current frame and the one or more reference frames. Based on the difference being below a threshold, the method can include performing semantic segmentation of the current frame using a first neural network. The semantic segmentation can be performed based on higher-spatial resolution features extracted from the current frame by the first neural network and lower-resolution features extracted from the one or more reference frames by a second neural network. The first neural network has a smaller structure and/or a lower processing cost than the second neural network.
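
The threshold test in the abstract above can be sketched as a simple routing decision. The difference metric (mean absolute pixel difference), the threshold value, and the names `mean_abs_diff` and `choose_path` are assumptions made for illustration. The abstract also does not say what happens when the difference is not below the threshold, so falling back to the larger network is an assumption as well.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized 2-D frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def choose_path(current, reference, threshold=10.0):
    """Return "light" when the frames are similar enough, else "heavy".

    "light": segment with the smaller first network, reusing features that the
             larger second network extracted from the reference frame.
    "heavy": assumed fallback to the larger network (not stated in the abstract).
    """
    return "light" if mean_abs_diff(current, reference) < threshold else "heavy"


reference = [[10, 10], [10, 10]]
similar = [[12, 9], [10, 11]]      # small change relative to the reference
different = [[200, 10], [10, 10]]  # large change, e.g. a scene cut

path_similar = choose_path(similar, reference)
path_different = choose_path(different, reference)
```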
-