SYSTEMS AND METHODS FOR EDITING A LARGE LANGUAGE MODEL

    Publication No.: US20250124233A1

    Publication Date: 2025-04-17

    Application No.: US18428530

    Filing Date: 2024-01-31

    Abstract: Systems and methods for editing a large language model are provided. The large language model generates a sequence of tokens, a first probability of a pre-edit output based on the sequence of tokens, and a second probability of a target output based on the sequence of tokens. A loss function is provided based on the first probability and the second probability. A plurality of gradients of the large language model with respect to the loss function is computed. An edit location of the large language model is determined based on the plurality of gradients. The large language model is edited by editing weights at the edit location of the large language model, such that the updated large language model generates the target output for an input including the sequence of tokens.
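
The steps in the abstract can be illustrated with a minimal sketch in plain Python. Everything concrete here is an assumption made for illustration and is not taken from the patent: the "model" is a toy two-layer network standing in for a large language model, the loss is assumed to be log p(pre-edit) minus log p(target), gradients are estimated by finite differences, the edit location is assumed to be the layer with the largest gradient norm, and the edit is plain gradient descent restricted to that layer. The names `forward`, `loss`, `gradients`, `edit_location`, and `edit_model` are hypothetical.

```python
import math

def forward(layers, x):
    """Toy stand-in for an LLM: linear layers with tanh between them, then softmax."""
    h = x
    for i, W in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) for row in W]
        if i < len(layers) - 1:
            h = [math.tanh(v) for v in h]
    m = max(h)
    exps = [math.exp(v - m) for v in h]
    total = sum(exps)
    return [e / total for e in exps]

def loss(layers, x, pre_edit, target):
    # Assumed loss: falls as p(target) rises and p(pre_edit) falls.
    p = forward(layers, x)
    return math.log(p[pre_edit]) - math.log(p[target])

def gradients(layers, x, pre_edit, target, eps=1e-5):
    """Central finite-difference gradient of the loss for every weight."""
    grads = []
    for W in layers:
        G = [[0.0] * len(row) for row in W]
        for r in range(len(W)):
            for c in range(len(W[r])):
                old = W[r][c]
                W[r][c] = old + eps
                up = loss(layers, x, pre_edit, target)
                W[r][c] = old - eps
                down = loss(layers, x, pre_edit, target)
                W[r][c] = old  # restore the original weight exactly
                G[r][c] = (up - down) / (2 * eps)
        grads.append(G)
    return grads

def edit_location(grads):
    # Assumed selection rule: the layer whose gradient has the largest L2 norm.
    norms = [math.sqrt(sum(g * g for row in G for g in row)) for G in grads]
    return max(range(len(norms)), key=norms.__getitem__)

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def edit_model(layers, x, pre_edit, target, lr=0.5, max_steps=200):
    """Update only the weights at the chosen location until the target wins."""
    loc = edit_location(gradients(layers, x, pre_edit, target))
    for _ in range(max_steps):
        if argmax(forward(layers, x)) == target:
            break
        G = gradients(layers, x, pre_edit, target)[loc]
        layers[loc] = [[w - lr * g for w, g in zip(w_row, g_row)]
                       for w_row, g_row in zip(layers[loc], G)]
    return loc

# Hypothetical weights and input features, chosen only for this demo.
layers = [
    [[0.5, -0.2], [0.1, 0.4]],
    [[1.0, 0.3], [-0.4, 0.2], [0.2, -0.6]],
]
x = [1.0, 0.5]   # stands in for an encoded sequence of tokens
target = 2       # index of the desired output

before = argmax(forward(layers, x))              # pre-edit output
loc = edit_model(layers, x, pre_edit=before, target=target)
after = argmax(forward(layers, x))               # output after the edit
print("edited layer:", loc, "| output before:", before, "| output after:", after)
```

Restricting the update to a single location leaves every other layer untouched, which is the point of localizing the edit. A real implementation would use automatic differentiation instead of finite differences and would likely select locations at a finer granularity than a whole layer.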
