-
Publication No.: US20230038891A1
Publication Date: 2023-02-09
Application No.: US17469853
Filing Date: 2021-09-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Jun FANG , David Philip Lloyd THORSLEY , Chengyao SHEN , Joseph H. HASSOUN
Abstract: A method is disclosed for reducing the computation of a differentiable architecture search. By averaging channel outputs of the intermediate nodes of a normal cell of a neural network architecture, an output node is formed whose channel dimension is one-fourth of the channel dimension of the normal cell. The output node is preprocessed using a 1×1 convolution to form the channels of the input nodes for the next layer of cells in the neural network architecture. Forming the output node includes dividing the channel outputs of the intermediate nodes by a splitting parameter s to form s groups of channel outputs. An average channel output is computed for each group, and the output node is formed by concatenating the average channel output for each group with channel outputs of the intermediate nodes of the normal cell.
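A minimal PyTorch sketch of the channel-grouping and averaging step described in the abstract is given below; the tensor shapes, function names, and the choice of the splitting parameter s are illustrative assumptions rather than the patented implementation.

```python
import torch

def average_grouped_channels(x, s):
    """Split the channel dimension of x (N, C, H, W) into s groups and
    average the channels within each group, giving an (N, s, H, W) tensor.
    The grouping scheme and the value of s are illustrative."""
    groups = torch.chunk(x, s, dim=1)                      # s tensors of shape (N, C/s, H, W)
    means = [g.mean(dim=1, keepdim=True) for g in groups]  # each (N, 1, H, W)
    return torch.cat(means, dim=1)                         # (N, s, H, W)

def reduced_cell_output(intermediate_outputs, s=4):
    """Form a cell's output node with a reduced channel dimension from the
    channel outputs of its intermediate nodes (a DARTS-style normal cell
    typically has four intermediate nodes)."""
    # Concatenate the intermediate-node outputs as in a standard cell,
    # then shrink the channel dimension by group averaging.
    concatenated = torch.cat(intermediate_outputs, dim=1)  # (N, num_nodes * C, H, W)
    return average_grouped_channels(concatenated, s)
```

As the abstract describes, a 1×1 convolution (for example, torch.nn.Conv2d with kernel_size=1) would then preprocess the reduced output node into the input-node channels expected by the next layer of cells.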
-
Publication No.: US20230028226A1
Publication Date: 2023-01-26
Application No.: US17475330
Filing Date: 2021-09-14
Applicant: Samsung Electronics Co., Ltd.
Inventor: David Philip Lloyd THORSLEY , Joseph H. HASSOUN , Jun FANG , Chengyao SHEN
Abstract: A method is disclosed to reduce computation in a self-attention deep-learning model. A feature-map regularization term is added to a loss function while training the self-attention model. At least one low-magnitude feature is removed from at least one feature map of the self-attention model during inference. Weights of the self-attention model are quantized after the self-attention model has been trained. Adding the feature-map regularization term reduces activation values of the feature maps, and removing the at least one low-magnitude feature may be performed by setting the low-magnitude feature to zero based on its value being less than a predetermined threshold. Feature maps of the self-attention model are quantized and compressed.
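A minimal PyTorch sketch of the two steps described in the abstract (adding a feature-map regularization term during training and zeroing low-magnitude features at inference) follows; the L1 form of the regularizer, the coefficient lam, and the threshold value are illustrative assumptions rather than details taken from the abstract.

```python
import torch

def loss_with_feature_map_regularization(task_loss, feature_maps, lam=1e-4):
    """Training loss with an added feature-map regularization term.
    An L1 penalty on activations is assumed here; lam is an illustrative
    regularization coefficient."""
    reg = sum(fm.abs().mean() for fm in feature_maps)
    return task_loss + lam * reg

def prune_low_magnitude_features(feature_map, threshold=1e-2):
    """At inference, set low-magnitude features to zero based on their
    values being below a predetermined threshold (value illustrative)."""
    return torch.where(feature_map.abs() < threshold,
                       torch.zeros_like(feature_map),
                       feature_map)
```

Weight quantization of the trained model, also mentioned in the abstract, would be applied as a separate post-training step.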
-