-
Publication Number: US12225221B2
Publication Date: 2025-02-11
Application Number: US17779380
Application Date: 2019-12-23
Applicant: Google LLC
Inventor: Shan Li , Claudionor Coelho , In Suk Chong , Aki Kuusela
IPC: H04N19/107 , H04N19/11 , H04N19/124 , H04N19/149 , H04N19/159 , H04N19/176 , H04N19/436 , H04N19/593
Abstract: Ultra-light models and decision fusion for increasing the speed of intra-prediction are described. Using a machine-learning (ML) model, an ML intra-prediction mode is obtained. A most-probable intra-prediction mode is obtained from amongst the available intra-prediction modes for encoding the current block. As an encoding intra-prediction mode, one of the ML intra-prediction mode or the most-probable intra-prediction mode is selected, and the encoding intra-prediction mode is encoded in a compressed bitstream. The current block is encoded using the encoding intra-prediction mode. Selection of the encoding intra-prediction mode is based on the relative reliabilities of the ML intra-prediction mode and the most-probable intra-prediction mode.
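A minimal Python sketch of the decision-fusion step described above: choose between the ML model's intra-prediction mode and the most-probable mode according to their relative reliabilities. The function name, the confidence-versus-prior comparison, and the threshold value are illustrative assumptions, not the patented selection rule.

```python
def select_encoding_mode(ml_mode, ml_confidence,
                         most_probable_mode, mpm_prior,
                         confidence_threshold=0.5):
    """Return the intra-prediction mode to signal in the bitstream.

    ml_mode / most_probable_mode: integer intra-prediction mode indices.
    ml_confidence: score the ML model assigned to its predicted mode.
    mpm_prior: empirical probability of the most-probable mode.
    """
    if ml_mode == most_probable_mode:
        return ml_mode  # both sources agree
    # Prefer the ML prediction only when it looks more reliable than the
    # statistically derived most-probable mode (illustrative heuristic).
    if ml_confidence >= confidence_threshold and ml_confidence > mpm_prior:
        return ml_mode
    return most_probable_mode


if __name__ == "__main__":
    # Confident ML model: its mode is selected.
    print(select_encoding_mode(13, 0.80, 0, 0.35))  # -> 13
    # Unsure ML model: fall back to the most-probable mode.
    print(select_encoding_mode(13, 0.30, 0, 0.35))  # -> 0
```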
-
Publication Number: US11310501B2
Publication Date: 2022-04-19
Application Number: US16868729
Application Date: 2020-05-07
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho , Dake He , Aki Kuusela , Shan Li
IPC: H04N19/124 , H04N19/164 , H04N19/176 , H04N19/96
Abstract: Encoding an image block using a quantization parameter includes presenting, to an encoder that includes a machine-learning model, the image block and a value derived from the quantization parameter, where the value is a result of a non-linear function using the quantization parameter as input, where the non-linear function relates to a second function used to calculate, using the quantization parameter, a Lagrange multiplier that is used in a rate-distortion calculation, and where the machine-learning model is trained to output mode decision parameters for encoding the image block; obtaining the mode decision parameters from the encoder; and encoding, in a compressed bitstream, the image block using the mode decision parameters.
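A minimal Python sketch of presenting a non-linear function of the quantization parameter (QP) to a mode-decision model, as described above. The transform shown mirrors the widely used HEVC-style relation lambda ≈ c · 2^((QP − 12)/3) between QP and the Lagrange multiplier; the constant, the dummy model, and the function names are assumptions for illustration only.

```python
def qp_to_nonlinear_value(qp, c=0.85):
    """Illustrative non-linear QP mapping patterned after a
    Lagrange-multiplier formula of the form c * 2^((QP - 12) / 3)."""
    return c * (2.0 ** ((qp - 12) / 3.0))


def encode_block(image_block, qp, model):
    """Present the block and the derived QP value to the model, then use
    the returned mode-decision parameters (illustrative flow only)."""
    qp_value = qp_to_nonlinear_value(qp)
    mode_decisions = model(image_block, qp_value)
    # ... the mode decisions would then drive the actual block encoding ...
    return mode_decisions


if __name__ == "__main__":
    # Dummy stand-in model: coarser partitioning at larger lambda-like values.
    dummy_model = lambda block, v: {"partition": "64x64" if v > 100 else "8x8"}
    print(round(qp_to_nonlinear_value(22), 2))            # ~8.57
    print(encode_block([[0] * 8] * 8, 45, dummy_model))   # high QP -> coarse
```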
-
Publication Number: US11025907B2
Publication Date: 2021-06-01
Application Number: US16289149
Application Date: 2019-02-28
Applicant: GOOGLE LLC
Inventor: Shan Li , Claudionor Coelho , Aki Kuusela , Dake He
IPC: H04N19/107 , H04N19/119 , H04N19/176 , H04N19/96 , G06N3/04 , G06N3/08
Abstract: Convolutional neural networks (CNNs) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting a stride value equal to a kernel size. The block has an N×N size, and a smallest partition output for the block has an S×S size. Classification layers of each classifier receive feature maps having a feature dimension. An initial classification layer receives the feature maps as an output of a final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α = 2, …, N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce the respective feature dimensions, and outputting, from a last layer of the classification layers, an output corresponding to an N/(αS)×N/(αS)×1 output map.
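A minimal PyTorch sketch (an assumed framework choice) of the structure outlined above for the example values N = 64 and S = 8: a non-overlapping convolution with stride equal to the kernel size for feature extraction, followed by one classifier per α in {2, 4, 8}, each using 1×1 kernels and producing an N/(αS) × N/(αS) × 1 decision map. The channel counts, pooling step, and layer depths are hypothetical.

```python
import torch
import torch.nn as nn


class PartitionCNN(nn.Module):
    """Illustrative feature-extraction + multi-classifier structure."""

    def __init__(self, n=64, s=8, channels=32):
        super().__init__()
        # Non-overlapping feature extraction: stride equals the kernel size,
        # so a 64x64 block becomes an 8x8 feature map.
        self.features = nn.Sequential(
            nn.Conv2d(1, channels, kernel_size=s, stride=s),
            nn.ReLU(),
        )
        # One classifier per alpha in {2, 4, ..., n/s}; each pools the feature
        # map to an (n/(alpha*s))-sized grid and applies 1x1 kernels.
        self.classifiers = nn.ModuleDict()
        alpha = 2
        while alpha <= n // s:
            grid = n // (alpha * s)
            self.classifiers[str(alpha)] = nn.Sequential(
                nn.AdaptiveAvgPool2d(grid),
                nn.Conv2d(channels, channels // 2, kernel_size=1),
                nn.ReLU(),
                nn.Conv2d(channels // 2, 1, kernel_size=1),
                nn.Sigmoid(),
            )
            alpha *= 2

    def forward(self, block):
        feats = self.features(block)
        return {a: head(feats) for a, head in self.classifiers.items()}


if __name__ == "__main__":
    maps = PartitionCNN()(torch.rand(1, 1, 64, 64))
    for alpha, decision_map in maps.items():
        print(alpha, tuple(decision_map.shape))  # '2' (1, 1, 4, 4), etc.
```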
-
Publication Number: US20230229895A1
Publication Date: 2023-07-20
Application Number: US18007871
Application Date: 2021-06-02
Applicant: Google LLC
Inventor: Claudionor Jose Nunes Coelho, Jr. , Piotr Zielinski , Aki Kuusela , Shan Li , Hao Zhuang
IPC: G06N3/0495 , G06N3/092
CPC classification number: G06N3/0495 , G06N3/092
Abstract: Systems and methods for producing a neural network architecture with improved energy consumption and performance tradeoffs are disclosed, such as would be deployed for use on mobile or other resource-constrained devices. In particular, the present disclosure provides systems and methods for searching a network search space for joint optimization of a size of a layer of a reference neural network model (e.g., the number of filters in a convolutional layer or the number of output units in a dense layer) and of the quantization of values within the layer. By defining the search space to correspond to the architecture of a reference neural network model, examples of the disclosed network architecture search can optimize models of arbitrary complexity. The resulting neural network models are able to be run using relatively fewer computing resources (e.g., less processing power, less memory usage, less power consumption, etc.), all while remaining competitive with or even exceeding the performance (e.g., accuracy) of current state-of-the-art, mobile-optimized models.
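A minimal, self-contained Python sketch of the joint search idea: for each layer of a reference model, candidate layer widths and quantization bit-widths are sampled together and scored with a quality-versus-cost tradeoff. Random sampling stands in for the actual search strategy, and the reference layers, cost proxy, and scoring function are hypothetical placeholders.

```python
import random

# Hypothetical reference model: (layer name, reference width) pairs.
REFERENCE_MODEL = [("conv1", 32), ("conv2", 64), ("dense", 128)]
WIDTH_MULTIPLIERS = [0.5, 0.75, 1.0]   # candidate layer-size scalings
BIT_WIDTHS = [4, 8]                    # candidate quantization levels


def candidate_cost(widths, bits):
    """Toy energy proxy: layer width weighted by its bit-width."""
    return sum(w * b for w, b in zip(widths, bits))


def candidate_quality(widths, bits):
    """Toy stand-in for measured accuracy; a real search would train or
    evaluate the candidate model here."""
    ref_total = sum(r for _, r in REFERENCE_MODEL)
    return sum(widths) / ref_total + 0.01 * sum(bits)


def search(num_samples=50, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(num_samples):
        widths = [int(r * rng.choice(WIDTH_MULTIPLIERS)) for _, r in REFERENCE_MODEL]
        bits = [rng.choice(BIT_WIDTHS) for _ in REFERENCE_MODEL]
        score = candidate_quality(widths, bits) - 1e-4 * candidate_cost(widths, bits)
        if best is None or score > best[0]:
            best = (score, widths, bits)
    return best


if __name__ == "__main__":
    _, widths, bits = search()
    for (name, _), w, b in zip(REFERENCE_MODEL, widths, bits):
        print(f"{name}: {w} units at {b}-bit")
```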
-
Publication Number: US20220201316A1
Publication Date: 2022-06-23
Application Number: US17601639
Application Date: 2019-03-21
Applicant: Google LLC
Inventor: Claudionor Coelho , Aki Kuusela , Joseph Young , Shan Li , Dake He
IPC: H04N19/147 , H04N19/176 , H04N19/96 , G06T9/00
Abstract: An apparatus for encoding an image block includes a processor that presents, to a machine-learning model, the image block, obtains a partition decision for encoding the image block from the model, and encodes the image block using the partition decision. The model is trained to output a partition decision for encoding the image block by using training data for a plurality of training blocks as input, the training data including, for a training block, partition decisions for encoding the training block and, for each partition decision, a rate-distortion value resulting from encoding the training block using the partition decision. The model is trained using a loss function combining a partition loss function based upon a relationship between the partition decisions and respective predicted partitions, and a rate-distortion cost loss function based upon a relationship between the rate-distortion values and respective predicted rate-distortion values.
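A minimal PyTorch-style sketch (an assumed framework choice) of the combined training loss described above: a partition-decision loss added to a rate-distortion-cost loss. The use of binary cross-entropy, mean-squared error, and the weighting factor are illustrative assumptions rather than the patented formulation.

```python
import torch
import torch.nn.functional as F


def combined_loss(pred_partitions, true_partitions,
                  pred_rd_costs, true_rd_costs, rd_weight=0.1):
    """pred_partitions / true_partitions: predicted probabilities and 0/1
    labels for candidate splits; pred_rd_costs / true_rd_costs: predicted
    and measured rate-distortion values for the same candidates."""
    partition_loss = F.binary_cross_entropy(pred_partitions, true_partitions)
    rd_loss = F.mse_loss(pred_rd_costs, true_rd_costs)
    return partition_loss + rd_weight * rd_loss


if __name__ == "__main__":
    pred_p = torch.tensor([0.9, 0.2, 0.7])
    true_p = torch.tensor([1.0, 0.0, 1.0])
    pred_rd = torch.tensor([1200.0, 800.0, 950.0])
    true_rd = torch.tensor([1100.0, 820.0, 990.0])
    print(combined_loss(pred_p, true_p, pred_rd, true_rd).item())
```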
-
Publication Number: US20200275101A1
Publication Date: 2020-08-27
Application Number: US16868729
Application Date: 2020-05-07
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho , Dake He , Aki Kuusela , Shan Li
IPC: H04N19/124 , H04N19/164 , H04N19/176 , H04N19/96
Abstract: Encoding an image block using a quantization parameter includes presenting, to an encoder that includes a machine-learning model, the image block and a value derived from the quantization parameter, where the value is a result of a non-linear function using the quantization parameter as input, where the non-linear function relates to a second function used to calculate, using the quantization parameter, a Lagrange multiplier that is used in a rate-distortion calculation, and where the machine-learning model is trained to output mode decision parameters for encoding the image block; obtaining the mode decision parameters from the encoder; and encoding, in a compressed bitstream, the image block using the mode decision parameters.
-
Publication Number: US10674152B2
Publication Date: 2020-06-02
Application Number: US16134134
Application Date: 2018-09-18
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho , Dake He , Aki Kuusela , Shan Li
IPC: H04N19/124 , H04N19/164 , H04N19/96 , H04N19/176
Abstract: A method for encoding an image block includes presenting, to a machine-learning model, the image block and a first value corresponding to a first quantization parameter; obtaining first mode decision parameters from the machine-learning model; and encoding the image block using the first mode decision parameters. The first value results from a non-linear function using the first quantization parameter as input. The machine-learning model is trained to output mode decision parameters by using training data. Each training datum includes a training block that is encoded by a second encoder, second mode decision parameters used by the second encoder for encoding the training block, and a second value corresponding to a second quantization parameter. The second encoder used the second quantization parameter for encoding the training block and the second value results from the non-linear function using the second quantization parameter as input.
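A minimal Python sketch of assembling the training data described above: each training datum pairs a training block with the mode-decision parameters a conventional ("second") encoder used for it and with the non-linear function of that encoder's quantization parameter. The dataclass layout, field names, and the transform constant are hypothetical.

```python
from dataclasses import dataclass


def qp_to_value(qp, c=0.85):
    """Illustrative non-linear QP mapping (Lagrange-multiplier style)."""
    return c * (2.0 ** ((qp - 12) / 3.0))


@dataclass
class TrainingDatum:
    block: list           # pixel data of the training block
    mode_decisions: dict  # decisions the second encoder made for this block
    qp_value: float       # non-linear function of the QP that encoder used


def build_training_set(encoded_blocks):
    """encoded_blocks: iterable of (block, mode_decisions, qp) triples
    harvested from a run of a conventional (second) encoder."""
    return [TrainingDatum(block, decisions, qp_to_value(qp))
            for block, decisions, qp in encoded_blocks]


if __name__ == "__main__":
    sample = [([[0] * 4] * 4, {"partition": "none", "intra_mode": 2}, 32)]
    print(round(build_training_set(sample)[0].qp_value, 1))  # ~86.4
```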
-
Publication Number: US20230007284A1
Publication Date: 2023-01-05
Application Number: US17779380
Application Date: 2019-12-23
Applicant: Google LLC
Inventor: Shan Li , Claudionor Coelho , In Suk Chong , Aki Kuusela
IPC: H04N19/436 , H04N19/176 , H04N19/593 , H04N19/159 , H04N19/11 , H04N19/124 , H04N19/149
Abstract: Ultra-light models and decision fusion for increasing the speed of intra-prediction are described. Using a machine-learning (ML) model, an ML intra-prediction mode is obtained. A most-probable intra-prediction mode is obtained from amongst the available intra-prediction modes for encoding the current block. As an encoding intra-prediction mode, one of the ML intra-prediction mode or the most-probable intra-prediction mode is selected, and the encoding intra-prediction mode is encoded in a compressed bitstream. The current block is encoded using the encoding intra-prediction mode. Selection of the encoding intra-prediction mode is based on the relative reliabilities of the ML intra-prediction mode and the most-probable intra-prediction mode.
-
Publication Number: US20200280717A1
Publication Date: 2020-09-03
Application Number: US16289149
Application Date: 2019-02-28
Applicant: GOOGLE LLC
Inventor: Shan Li , Claudionor Coelho , Aki Kuusela , Dake He
IPC: H04N19/107 , H04N19/119 , H04N19/176 , H04N19/96 , G06N3/04 , G06N3/08
Abstract: Convolutional neural networks (CNNs) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting a stride value equal to a kernel size. The block has an N×N size, and a smallest partition output for the block has an S×S size. Classification layers of each classifier receive feature maps having a feature dimension. An initial classification layer receives the feature maps as an output of a final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α = 2, …, N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce the respective feature dimensions, and outputting, from a last layer of the classification layers, an output corresponding to an N/(αS)×N/(αS)×1 output map.
-
Publication Number: US20200092556A1
Publication Date: 2020-03-19
Application Number: US16134134
Application Date: 2018-09-18
Applicant: GOOGLE LLC
Inventor: Claudionor Coelho , Dake He , Aki Kuusela , Shan Li
IPC: H04N19/124 , H04N19/176 , H04N19/96 , H04N19/164
Abstract: A method for encoding an image block includes presenting, to a machine-learning model, the image block and a first value corresponding to a first quantization parameter; obtaining first mode decision parameters from the machine-learning model; and encoding the image block using the first mode decision parameters. The first value results from a non-linear function using the first quantization parameter as input. The machine-learning model is trained to output mode decision parameters by using training data. Each training datum includes a training block that is encoded by a second encoder, second mode decision parameters used by the second encoder for encoding the training block, and a second value corresponding to a second quantization parameter. The second encoder used the second quantization parameter for encoding the training block and the second value results from the non-linear function using the second quantization parameter as input.