PROCESSING IMAGES USING MIXTURE OF EXPERTS
    Invention Publication

    Publication Number: US20240289926A1

    Publication Date: 2024-08-29

    Application Number: US18564915

    Filing Date: 2022-05-27

    Applicant: Google LLC

    CPC classification number: G06T5/60 G06T2207/20084

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating predictions about images. One of the systems includes a neural network comprising a sequence of one or more network blocks that are each configured to perform operations comprising: obtaining a block input that represents an intermediate representation of an input image; determining a plurality of patches of the block input or of an updated representation of the block input, wherein each patch comprises a different subset of elements of the block input or of the updated representation of the block input; assigning each patch to one or more respective expert modules of a plurality of expert modules of the network block; for each patch of the plurality of patches, processing the patch using the corresponding expert modules to generate respective module outputs; and generating a block output by combining the module outputs.
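
    A minimal sketch of the routing this abstract describes, in PyTorch. The learned linear router, the top-k assignment, and the MLP expert architecture are assumptions for illustration; the abstract leaves these implementation choices open:

```python
import torch
import torch.nn as nn

class MoEBlock(nn.Module):
    """One network block: routes each patch to its top-k expert MLPs."""

    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 1):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # scores each patch per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, dim) -- an intermediate representation of the
        # input image, with each patch a different subset of its elements.
        weights, indices = self.router(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)  # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e  # patches assigned to expert e
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return x + out  # combine the module outputs into the block output (residual)

block = MoEBlock(dim=192, num_experts=4, top_k=2)
tokens = torch.randn(2, 196, 192)  # 2 images, 14x14 patch grid
assert block(tokens).shape == tokens.shape
```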

    Locked-Model Multimodal Contrastive Tuning
    Invention Publication

    Publication Number: US20240153256A1

    Publication Date: 2024-05-09

    Application Number: US18051106

    Filing Date: 2022-10-31

    Applicant: Google LLC

    CPC classification number: G06V10/778

    Abstract: A method may include obtaining a pretrained image encoder and a training sample comprising a training image and a training text string corresponding to the training image. The method may also include initializing a text encoder in an untrained state, determining, using the pretrained image encoder and based on the training image, a first latent representation of the training image, and determining, using the text encoder and based on the training text string, a second latent representation of the training text string. The method may further include determining a loss value based on the first latent representation and the second latent representation, updating, based on the loss value, one or more parameters of the text encoder while holding fixed parameters of the pretrained image encoder, and outputting the text encoder in a trained state.
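
    The training loop this abstract describes can be sketched as follows. The linear stand-in encoders, the temperature of 0.07, and the symmetric cross-entropy contrastive loss are assumptions for illustration; the abstract only requires that a loss be computed from the two latent representations and that the pretrained image encoder's parameters stay fixed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins: in practice the image encoder is a large pretrained model
# (weights already loaded) and the text encoder starts untrained.
image_encoder = nn.Linear(2048, 256)  # pretrained image encoder (assumed loaded)
text_encoder = nn.Linear(512, 256)    # text encoder in an untrained state

# Hold the pretrained image encoder's parameters fixed.
for p in image_encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(text_encoder.parameters(), lr=1e-4)

def training_step(images: torch.Tensor, texts: torch.Tensor) -> float:
    img_latent = F.normalize(image_encoder(images), dim=-1)  # first latent representation
    txt_latent = F.normalize(text_encoder(texts), dim=-1)    # second latent representation
    # Pairwise similarities; matching image/text pairs lie on the diagonal.
    logits = img_latent @ txt_latent.t() / 0.07
    targets = torch.arange(logits.size(0))
    loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # updates only the text encoder; the image encoder stays frozen
    return loss.item()

loss = training_step(torch.randn(8, 2048), torch.randn(8, 512))
```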

    TRAINING ULTRA-LARGE-SCALE VISION TRANSFORMER NEURAL NETWORKS

    Publication Number: US20240256835A1

    Publication Date: 2024-08-01

    Application Number: US18424420

    Filing Date: 2024-01-26

    Applicant: Google LLC

    CPC classification number: G06N3/0455 G06N3/088

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing an input through each of a plurality of layers of a neural network to generate an output using a plurality of hardware accelerators. The plurality of layers comprise a fully connected layer having a plurality of parameters arranged in a row dimension and a column dimension. One of the methods comprises: generating a plurality of parameter blocks by partitioning the plurality of parameters along the row dimension and the column dimension; determining a ratio of a number of parameters along the row dimension relative to a number of parameters along the column dimension; and determining whether to use row sharding or column sharding with the plurality of hardware accelerators to calculate an output for the fully connected layer and then calculating the output for the fully connected layer using either row sharding or column sharding.
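
    A runnable sketch of the row-versus-column sharding decision, with tensor chunks standing in for parameter blocks placed on separate hardware accelerators. The threshold of 1.0 on the row/column ratio is an assumption; the abstract does not fix the decision rule:

```python
import torch

def fc_sharded(x: torch.Tensor, w: torch.Tensor, num_shards: int) -> torch.Tensor:
    """Compute x @ w, choosing row or column sharding from the weight's shape.

    x: (batch, rows), w: (rows, cols). The chunks stand in for parameter
    blocks; on real hardware each chunk would live on its own accelerator.
    """
    rows, cols = w.shape
    ratio = rows / cols  # row-dimension size relative to column-dimension size
    if ratio > 1.0:
        # Row sharding: split the contraction dimension. Each shard holds a
        # slice of the rows of w (and of the features of x); the partial
        # products are summed (an all-reduce on real accelerators).
        x_parts = x.chunk(num_shards, dim=1)
        w_parts = w.chunk(num_shards, dim=0)
        return sum(xp @ wp for xp, wp in zip(x_parts, w_parts))
    else:
        # Column sharding: each shard holds a slice of the columns of w and
        # produces a slice of the output, concatenated at the end
        # (an all-gather on real accelerators).
        w_parts = w.chunk(num_shards, dim=1)
        return torch.cat([x @ wp for wp in w_parts], dim=1)

x = torch.randn(4, 1024)
w = torch.randn(1024, 256)  # more rows than columns -> row sharding
out = fc_sharded(x, w, num_shards=4)
assert torch.allclose(out, x @ w, atol=1e-3)
```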

    TRAINING NEURAL NETWORKS USING TRANSFER LEARNING

    Publication Number: US20220108171A1

    Publication Date: 2022-04-07

    Application Number: US17488166

    Filing Date: 2021-09-28

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training neural networks using transfer learning. One of the methods includes training a neural network to perform a first prediction task, including: obtaining trained model parameters for each of a plurality of candidate neural networks, wherein each candidate neural network has been pre-trained to perform a respective second prediction task that is different from the first prediction task; obtaining a plurality of training examples corresponding to the first prediction task; selecting a proper subset of the plurality of candidate neural networks using the plurality of training examples; generating, for each candidate neural network, one or more fine-tuned neural networks, wherein each fine-tuned neural network is generated by updating the model parameters of the candidate neural network using the plurality of training examples; and determining model parameters for the neural network using the respective fine-tuned neural networks.
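
    A toy sketch of this transfer-learning pipeline. The selection heuristic (ranking candidates by their initial loss on the new task's examples) and the choice of the single best fine-tuned model at the end are assumptions; the abstract specifies only that a proper subset is selected using the training examples and that final parameters are determined from the fine-tuned networks:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def select_and_finetune(candidates, train_x, train_y, steps: int = 100):
    """Pick promising pre-trained candidates, then fine-tune them on the new task.

    `candidates` is a list of nn.Modules pre-trained on other tasks, here
    assumed to already map features to the new (first) task's label space.
    """
    # Score each candidate on the new task's examples without training.
    with torch.no_grad():
        scores = [F.cross_entropy(m(train_x), train_y).item() for m in candidates]
    # Keep the better half: a proper subset of the candidate networks.
    keep = sorted(range(len(candidates)), key=lambda i: scores[i])
    keep = keep[: max(1, len(candidates) // 2)]

    finetuned = []
    for i in keep:
        model = candidates[i]
        opt = torch.optim.SGD(model.parameters(), lr=1e-2)
        for _ in range(steps):  # update the candidate's parameters on the new task
            opt.zero_grad()
            loss = F.cross_entropy(model(train_x), train_y)
            loss.backward()
            opt.step()
        finetuned.append((loss.item(), model))
    # Final model parameters come from the best fine-tuned candidate.
    return min(finetuned, key=lambda t: t[0])[1]

# Toy candidates standing in for networks pre-trained on other tasks.
cands = [nn.Linear(16, 3) for _ in range(4)]
best = select_and_finetune(cands, torch.randn(32, 16), torch.randint(0, 3, (32,)))
```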
