Method and apparatus for acquiring pre-trained model

    Publication Number: US12277401B2

    Publication Date: 2025-04-15

    Application Number: US17502108

    Filing Date: 2021-10-15

    Abstract: The present disclosure discloses a method and apparatus for acquiring a pre-trained model, and relates to natural language processing and deep learning technologies in the field of artificial intelligence. An implementation includes: acquiring training data, the training data including a single-modal language material and a multi-modal language material, the multi-modal language material including a language material pair formed by a first-modal language material and a second-modal language material; and performing a multi-task training operation on a pre-trained model using the training data, the multi-task training including at least one cross-modal contrastive learning task and at least one single-modal learning task. The pre-trained language model obtained in the present disclosure can learn from different forms of language materials, i.e., both single-modal and multi-modal language materials, such that it can effectively process information in various modalities.
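The cross-modal contrastive learning task mentioned in the abstract can be illustrated with a minimal sketch. This is not the patented method itself, only a generic InfoNCE-style loss over a batch of aligned first-modal/second-modal embedding pairs (e.g. text and image); the function name, temperature value, and normalization choices are assumptions for illustration.

```python
import numpy as np

def cross_modal_contrastive_loss(first_modal_emb, second_modal_emb, temperature=0.07):
    """InfoNCE-style loss over a batch of aligned cross-modal embedding pairs.

    Row i of each matrix is a matching (positive) pair; every other
    combination in the batch serves as a negative.
    (Hypothetical sketch, not the patented implementation.)
    """
    # L2-normalize so dot products become cosine similarities.
    a = first_modal_emb / np.linalg.norm(first_modal_emb, axis=1, keepdims=True)
    b = second_modal_emb / np.linalg.norm(second_modal_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (batch, batch) similarity matrix

    def cross_entropy(lg):
        # Softmax cross-entropy with the diagonal as the target class.
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Symmetrize over both retrieval directions (first->second, second->first).
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

With perfectly aligned pairs the loss approaches zero, while shuffling one modality's rows (breaking the alignment) drives it up, which is the signal that pulls matching cross-modal pairs together during multi-task pre-training.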

    Summary generation model training method and apparatus, device and storage medium

    Publication Number: US12093297B2

    Publication Date: 2024-09-17

    Application Number: US17577561

    Filing Date: 2022-01-18

    CPC classification number: G06F16/345 G06F40/51 G06F40/56

    Abstract: The present disclosure provides a summary generation model training method and apparatus, a device and a storage medium, and relates to the field of computer technologies, in particular to artificial intelligence fields such as natural language processing and deep learning. The method includes: acquiring a document representation corresponding to a document sample; constructing, based on the document representation, a corresponding summary representation, including a positive summary representation and a negative summary representation; constructing a total contrastive loss function based on the document representation, the positive summary representation and the negative summary representation; and training a summary generation model based on the total contrastive loss function. The present disclosure can improve the accuracy of the summary generation model.

Patent Agency Ranking