-
Publication No.: US20250045567A1
Publication Date: 2025-02-06
Application No.: US18498257
Application Date: 2023-10-31
Applicant: Salesforce, Inc.
Inventor: Weiran Yao, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, Jianguo Zhang, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
IPC: G06N3/0455, G06N3/092
Abstract: Embodiments described herein provide for optimizing a language model (LM) agent. In at least one embodiment, an LM agent comprises an “actor” LM and a “retrospective” LM which provides reflections on attempts by the actor LM. The reflections are used to update subsequent prompts to the actor LM. Optimizing the LM agent comprises fine-tuning parameters of the retrospective LM while keeping parameters of the actor LM frozen. A gradient may be determined by a change in reward from the environment based on actions taken by the actor LM with and without a reflection of the retrospective LM. Using this gradient, parameters of the retrospective LM may be updated via backpropagation.
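The reward-difference signal described in this abstract can be illustrated with a minimal sketch. It assumes a REINFORCE-style surrogate loss, which the abstract does not specify; the function names `reflection_advantage` and `policy_gradient_loss` and the use of summed token log-probabilities are hypothetical choices made for illustration only.

```python
from typing import Sequence


def reflection_advantage(reward_with: float, reward_without: float) -> float:
    """Change in environment reward attributable to the reflection.

    reward_with:    reward when the actor LM prompt includes the reflection
    reward_without: reward when the actor LM prompt omits it
    """
    return reward_with - reward_without


def policy_gradient_loss(token_log_probs: Sequence[float], advantage: float) -> float:
    """REINFORCE-style surrogate loss (an assumption, not the patented method).

    token_log_probs are the retrospective LM's log-probabilities for the
    tokens of the reflection it generated. Minimizing this loss raises the
    likelihood of reflections that increased the reward. The actor LM is
    frozen and contributes no trainable terms.
    """
    return -advantage * sum(token_log_probs)


# Example: the reflection raised the reward from 0.25 to 1.0.
advantage = reflection_advantage(1.0, 0.25)
loss = policy_gradient_loss([-0.5, -1.5], advantage)
```

In a real training setup the log-probabilities would be differentiable tensors from the retrospective LM, so that backpropagating this loss updates only that model's parameters.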
-
Publication No.: US20250139411A1
Publication Date: 2025-05-01
Application No.: US18498229
Application Date: 2023-10-31
Applicant: Salesforce, Inc.
Inventor: Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
IPC: G06N3/0455, G06N3/084
Abstract: Embodiments described herein provide a large language model (LLM) based AI agent that adopts Monte-Carlo Tree Search (MCTS) to execute a task. The LLM is prompted with a task description, and it responds with its first attempted list of actions. Based on the success or failure of the first attempt, the LLM is prompted with an updated prompt which includes feedback from the first attempt based on a determined reward. The prompt may include a relative “score” for each action taken at each step. A numeric score may be mapped to a set of pre-defined text labels, such as “high” or “low” value, putting the score in a form more suited for an LLM prompt. In this way, the LLM is iteratively given prompts which are updated with the scores from each action taken in each previous iteration, so that it traverses different paths on the tree in each iteration.
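The mapping from numeric scores to text labels mentioned in this abstract might look like the following sketch. The thresholds, the intermediate "medium" label, the assumption that scores lie in [0, 1], and the names `score_to_label` and `format_feedback` are all hypothetical; the abstract names only "high" and "low" as example labels.

```python
from typing import Sequence


def score_to_label(score: float, low: float = 0.33, high: float = 0.66) -> str:
    """Map a numeric score (assumed to lie in [0, 1]) to a coarse text label."""
    if score >= high:
        return "high"
    if score <= low:
        return "low"
    return "medium"  # hypothetical middle bucket, not named in the abstract


def format_feedback(actions: Sequence[str], scores: Sequence[float]) -> str:
    """Render per-step feedback lines for inclusion in the next prompt."""
    lines = []
    for step, (action, score) in enumerate(zip(actions, scores), start=1):
        lines.append(f"Step {step}: {action} (value: {score_to_label(score)})")
    return "\n".join(lines)


# Example feedback for a two-step attempt.
feedback = format_feedback(["search[red shoes]", "click[item 3]"], [0.9, 0.1])
```

Here `feedback` would be appended to the task description before the next attempt, so the LLM sees which earlier actions were judged valuable without having to interpret raw numbers.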
-
Publication No.: US20250053793A1
Publication Date: 2025-02-13
Application No.: US18494393
Application Date: 2023-10-25
Applicant: Salesforce, Inc.
Inventor: Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles Duque, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
Abstract: Embodiments described herein provide a method of predicting an action by a plurality of language model augmented agents (LAAs). In at least one embodiment, a controller receives a task instruction to be performed using an environment. The controller receives an observation of a first state from the environment. The controller selects an LAA from the plurality of LAAs based on the task instruction and the observation. The controller obtains an output from the selected LAA generated using an input combining the task instruction, the observation, and an LAA-specific prompt template. The controller determines the action based on the output. The controller causes the action to be performed on the environment, thereby causing the first state of the environment to change to a second state.
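The controller flow in this abstract (select an LAA, build its input from the task, observation, and template, derive an action, apply it to the environment) can be sketched as below. Every name here (`LAA`, `Controller`, `select`, `env_step`) is hypothetical, the language model is replaced by a stub callable, and treating the stripped model output as the action is a simplification assumed for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class LAA:
    """A language model augmented agent with its own prompt template."""
    name: str
    prompt_template: str          # must contain {task} and {observation}
    model: Callable[[str], str]   # stand-in for a language model call

    def run(self, task: str, observation: str) -> str:
        prompt = self.prompt_template.format(task=task, observation=observation)
        return self.model(prompt)


class Controller:
    def __init__(self, agents: Dict[str, LAA], select: Callable[[str, str], str]):
        self.agents = agents
        self.select = select      # returns the name of the LAA to use

    def step(self, task: str, observation: str, env_step: Callable[[str], str]) -> str:
        agent = self.agents[self.select(task, observation)]
        output = agent.run(task, observation)
        action = output.strip()   # assumed: the output text is the action
        return env_step(action)   # environment moves to its second state


# Usage with stub models and a toy environment.
agents = {
    "search": LAA("search", "Task: {task}\nObs: {observation}\nSearch:", lambda p: " search[shoes] "),
    "click": LAA("click", "Task: {task}\nObs: {observation}\nClick:", lambda p: "click[buy]"),
}
controller = Controller(agents, lambda task, obs: "search" if "search bar" in obs else "click")
new_state = controller.step("buy shoes", "page with search bar", lambda a: f"state after {a}")
```

Keeping the selection rule as a plain callable leaves open whether it is rule-based or itself a model call, which the abstract does not specify.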
-
Publication No.: US20240412000A1
Publication Date: 2024-12-12
Application No.: US18373132
Application Date: 2023-09-26
Applicant: Salesforce, Inc.
Inventor: Lik Mui
IPC: G06F40/35, G06F16/953, G06F40/279
Abstract: An application server or other processing entity may receive, via a cloud-based platform, user input that may include at least one request for data. The application server may classify the user input into a first of a plurality of deterministic-stochastic spectrum classifications based on the user input and a probability of mapping the at least one request for data to at least one data location. The application server may retrieve the data from the at least one data location and based on the first deterministic-stochastic spectrum classification. The application server may transmit, based on the first deterministic-stochastic spectrum classification and the user input, an input to a large language model. The application server may present a response to the user input, where the response is based on a combination of an output of the large language model and the data retrieved from the at least one data location.
-