-
Publication No.: US12204847B2
Publication Date: 2025-01-21
Application No.: US17938572
Application Date: 2022-10-06
Applicant: Salesforce, Inc.
Inventor: Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
IPC: G06F17/00, G06F40/166, G06F40/284, G06N20/00
Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
-
Publication No.: US20240370640A1
Publication Date: 2024-11-07
Application No.: US18774375
Application Date: 2024-07-16
Applicant: Salesforce, Inc.
Inventor: Wojciech Kryscinski, Alexander R. Fabbri, Jesse Vig
IPC: G06F40/166, G06F16/332, G06F16/34
Abstract: Embodiments described herein provide a query-focused summarization model that employs a single or dual encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments, and then concatenates encodings into a single embedding sequence for the decoder to output a summary.
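The end-to-end approach in this abstract (split the source into overlapping segments, encode each, concatenate the encodings for the decoder) can be illustrated with a minimal Python sketch. The function names, the `segment_len` and `stride` parameters, and the toy encoder are hypothetical and are not taken from the patent; a real system would use a neural encoder in place of the placeholder.

```python
from typing import Callable, List, Sequence


def split_overlapping(tokens: Sequence, segment_len: int, stride: int) -> List[list]:
    """Split tokens into windows of segment_len that start every `stride` tokens.

    Consecutive windows overlap by segment_len - stride tokens.
    Both parameters are illustrative assumptions, not values from the patent.
    """
    if segment_len <= 0 or stride <= 0 or stride > segment_len:
        raise ValueError("require 0 < stride <= segment_len")
    if len(tokens) == 0:
        return []
    segments = []
    start = 0
    while True:
        segments.append(list(tokens[start:start + segment_len]))
        # Stop once the current window reaches the end of the document.
        if start + segment_len >= len(tokens):
            break
        start += stride
    return segments


def encode_and_concat(segments: List[list], encode: Callable[[list], List[list]]) -> List[list]:
    """Encode each segment separately, then join the per-token vectors
    into one sequence that a decoder could attend over."""
    sequence = []
    for segment in segments:
        sequence.extend(encode(segment))
    return sequence


if __name__ == "__main__":
    doc = "the quick brown fox jumps over the lazy dog again".split()
    segs = split_overlapping(doc, segment_len=4, stride=2)

    # Placeholder encoder: one 1-dimensional vector per token (its length).
    def toy_encoder(segment):
        return [[float(len(tok))] for tok in segment]

    print(len(segs), len(encode_and_concat(segs, toy_encoder)))
```

Because windows overlap, tokens near segment boundaries appear in two encodings, which is one way to give the decoder context across boundaries.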
-
Publication No.: US20240242022A1
Publication Date: 2024-07-18
Application No.: US18156043
Application Date: 2023-01-18
Applicant: Salesforce, Inc.
Inventor: Victor Yee, Chien-Sheng Wu, Na Cheng, Alexander R. Fabbri, Zachary Alexander, Nicholas Feinig, Sameer Abhinkar, Shashank Harinath, Sitaram Asur, Jacob Nathaniel Huffman, Wojciech Kryscinski, Caiming Xiong
IPC: G06F40/174, G06F16/34
CPC classification number: G06F40/174, G06F16/345
Abstract: Embodiments described herein provide a structured conversation summarization framework. A user interface may be provided which allows an agent to perform a conversation with a customer, for example regarding resolving a customer support issue. Utterances by both the agent and customer may be stored, and at the end of the conversation, the utterances may be used to generate a structured summary. The structured summary may include components such as a general summary, an issue summary, and a resolution summary. Using neural network models and heuristics, each component of the summary may be automatically generated.
-
Publication No.: US12050855B2
Publication Date: 2024-07-30
Application No.: US17749837
Application Date: 2022-05-20
Applicant: Salesforce, Inc.
Inventor: Wojciech Kryscinski, Alexander R. Fabbri, Jesse Vig
IPC: G06F16/00, G06F16/332, G06F16/34, G06F40/166
CPC classification number: G06F40/166, G06F16/3329, G06F16/345
Abstract: Embodiments described herein provide a query-focused summarization model that employs a single or dual encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments, and then concatenates encodings into a single embedding sequence for the decoder to output a summary.
-
Publication No.: US20220277135A1
Publication Date: 2022-09-01
Application No.: US17749837
Application Date: 2022-05-20
Applicant: Salesforce, Inc.
Inventor: Wojciech Kryscinski, Alexander R. Fabbri, Jesse Vig
IPC: G06F40/166, G06F16/34, G06F16/332
Abstract: Embodiments described herein provide a query-focused summarization model that employs a single or dual encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments, and then concatenates encodings into a single embedding sequence for the decoder to output a summary.
-
Publication No.: US20240394539A1
Publication Date: 2024-11-28
Application No.: US18474949
Application Date: 2023-09-26
Applicant: Salesforce, Inc.
Inventor: Wojciech Kryscinski, Alexander R. Fabbri, Caiming Xiong, Shafiq Rayhan Joty, Chien-Sheng Wu, Divyansh Agarwal, Philippe Laban
IPC: G06N3/084, G06F40/166, G06F40/40
Abstract: Embodiments described herein provide systems and methods for training neural network based language models using human feedback. An existing (or generated) summary of a document is provided, and that summary may be used to generate a number of other summaries. A human annotator may reject the summary if there is any factuality issue with the summary. Summaries which are agreed to have no factuality problems are used as baseline summaries. Small atomic edits are made to the baseline summaries (e.g., replacing a single word or phrase) to create a group of summaries. Human annotators label each of these summaries as factual or not. The annotated summaries are used to train a summarization model and/or a factual detector model.
-
Publication No.: US20230419017A1
Publication Date: 2023-12-28
Application No.: US17938572
Application Date: 2022-10-06
Applicant: Salesforce, Inc.
Inventor: Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
IPC: G06F40/166, G06F40/284, G06N20/00
CPC classification number: G06F40/166, G06F40/284, G06N20/00
Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
-
Publication No.: US20230376677A1
Publication Date: 2023-11-23
Application No.: US17880502
Application Date: 2022-08-03
Applicant: Salesforce, Inc.
Inventor: Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Rajani
IPC: G06F40/166, G06N20/00
CPC classification number: G06F40/166, G06N20/00
Abstract: Embodiments described herein provide a document summarization framework that employs an ensemble of summarization models, each of which is a modified version of a base summarization model to control hallucination. For example, a base summarization model may first be trained on a full training data set. The trained base summarization model is then fine-tuned using a first filtered subset of the training data which contains noisy data, resulting in an “anti-expert” model. The parameters of the anti-expert model are subtracted from the parameters of the trained base model to produce a final summarization model which yields robust factual performance.
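The parameter subtraction described in this abstract can be pictured with a small Python sketch over plain dictionaries of weights. The abstract states only that anti-expert parameters are subtracted from base parameters; the specific form used here (subtracting the anti-expert's scaled deviation from the base model), the `alpha` coefficient, and all names are assumptions made for illustration, not the patented method.

```python
from typing import Dict, List

# A toy stand-in for a model: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def subtract_anti_expert(base: Params, anti: Params, alpha: float = 1.0) -> Params:
    """Move the base weights away from the anti-expert weights.

    Computes base - alpha * (anti - base) per weight. The scaling factor
    alpha and this exact form are assumptions, not taken from the abstract.
    """
    if base.keys() != anti.keys():
        raise ValueError("models must share the same parameter names")
    final: Params = {}
    for name, base_vals in base.items():
        anti_vals = anti[name]
        if len(base_vals) != len(anti_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        final[name] = [b - alpha * (a - b) for b, a in zip(base_vals, anti_vals)]
    return final


if __name__ == "__main__":
    base_model = {"w": [1.0, 2.0], "b": [0.5]}
    # Hypothetical weights after fine-tuning on the noisy subset.
    anti_expert = {"w": [1.5, 2.0], "b": [0.5]}
    print(subtract_anti_expert(base_model, anti_expert, alpha=0.5))
```

With `alpha=0` the result equals the base model, so the coefficient controls how strongly the behavior learned from noisy data is pushed away.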