Systems and methods for text summarization

    Publication Number: US12204847B2

    Publication Date: 2025-01-21

    Application Number: US17938572

    Filing Date: 2022-10-06

    Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
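
    The perturber step described in this abstract can be viewed as ordinary sequence-to-sequence training: the input is the compressed text together with the extra information entities, and the target is the text with those entities inserted. The sketch below only illustrates that idea, assuming a BART-style model from the Hugging Face transformers library; the model name, separators, and example data are assumptions, not details taken from the patent.

```python
# Illustrative sketch (not the patented implementation): train a perturber
# to insert the listed entities back into a compressed text.
import torch
from transformers import BartForConditionalGeneration, BartTokenizerFast

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")  # assumed model
perturber = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(perturber.parameters(), lr=3e-5)

example = {  # hypothetical training record
    "compressed_text": "The company reported higher quarterly profits.",
    "entities": ["Acme Corp", "Q3 2021"],
    "uncompressed_text": "Acme Corp reported higher quarterly profits in Q3 2021.",
}

# Source: entities prepended to the compressed text; target: text with the
# entities inserted (here approximated by the uncompressed text).
source = " ; ".join(example["entities"]) + " || " + example["compressed_text"]
inputs = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(example["uncompressed_text"], return_tensors="pt",
                   truncation=True).input_ids

# First training objective: token-level cross-entropy on the perturbed text.
loss = perturber(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```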

    SYSTEMS AND METHODS FOR QUERY-FOCUSED SUMMARIZATION

    Publication Number: US20220277135A1

    Publication Date: 2022-09-01

    Application Number: US17749837

    Filing Date: 2022-05-20

    Abstract: Embodiments described herein provide a query-focused summarization model that employs a single or dual encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments, and then concatenates encodings into a single embedding sequence for the decoder to output a summary.
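
    The two-step approach mentioned above (extract query-relevant parts, then synthesize them) can be illustrated as retrieval followed by abstractive summarization. The sketch below is only an illustration: the sentence-transformers retriever, the BART summarizer, the cosine-similarity scoring, and the top-k cutoff are stand-ins, not details from the application.

```python
# Illustrative two-step query-focused summarization: score sentences against
# the query, keep the most relevant ones, then summarize the extraction.
# Models and scoring are stand-ins, not the patented components.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

retriever = SentenceTransformer("all-MiniLM-L6-v2")                        # assumed extractor
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")   # assumed summarizer

def query_focused_summary(query: str, sentences: list[str], top_k: int = 5) -> str:
    # Step 1: extract the sentences most relevant to the query.
    query_emb = retriever.encode(query, convert_to_tensor=True)
    sent_embs = retriever.encode(sentences, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, sent_embs)[0]
    top_idx = scores.topk(min(top_k, len(sentences))).indices.tolist()
    extracted = " ".join(sentences[i] for i in sorted(top_idx))

    # Step 2: synthesize the extracted segments into a final summary.
    return summarizer(extracted, max_length=120, min_length=20)[0]["summary_text"]
```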

    SYSTEMS AND METHODS FOR QUERY-FOCUSED SUMMARIZATION

    Publication Number: US20240370640A1

    Publication Date: 2024-11-07

    Application Number: US18774375

    Filing Date: 2024-07-16

    Abstract: Embodiments described herein provide a query-focused summarization model that employs a single or dual encoder model. A two-step approach may be adopted that first extracts parts of the source document and then synthesizes the extracted segments into a final summary. In another embodiment, an end-to-end approach may be adopted that splits the source document into overlapping segments, and then concatenates encodings into a single embedding sequence for the decoder to output a summary.
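
    The same abstract also describes an end-to-end variant that encodes overlapping segments of the source and concatenates the encodings into one sequence for the decoder. The sketch below illustrates that encode-then-concatenate idea with a generic BART encoder-decoder; the window size, stride, query handling, and generation call are illustrative assumptions rather than the claimed method.

```python
# Illustrative end-to-end variant: encode overlapping, query-prefixed segments
# separately, concatenate encoder states, and decode over the combined sequence.
# Segmentation parameters and the model are assumptions for illustration.
import torch
from transformers import BartForConditionalGeneration, BartTokenizerFast
from transformers.modeling_outputs import BaseModelOutput

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def overlapping_windows(ids: list[int], size: int = 512, stride: int = 256):
    # Overlapping token windows over the source document.
    return [ids[i:i + size] for i in range(0, max(len(ids) - stride, 1), stride)]

def summarize(query: str, document: str, max_length: int = 128) -> str:
    query_ids = tokenizer(query, add_special_tokens=False).input_ids
    doc_ids = tokenizer(document, add_special_tokens=False).input_ids
    encoder = model.get_encoder()
    states = []
    for window in overlapping_windows(doc_ids):
        # Prefix the query so every segment encoding is query-aware.
        ids = torch.tensor([query_ids + window])
        states.append(encoder(input_ids=ids).last_hidden_state)
    combined = torch.cat(states, dim=1)  # single embedding sequence
    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=combined),
        max_length=max_length,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)
```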

    SYSTEMS AND METHODS FOR TEXT SIMPLIFICATION WITH DOCUMENT-LEVEL CONTEXT

    Publication Number: US20240249082A1

    Publication Date: 2024-07-25

    Application Number: US18460373

    Filing Date: 2023-09-01

    CPC classification number: G06F40/40 G06F40/166 G06F40/284 G06F40/30

    Abstract: A method of training a text simplification model is provided. A training dataset including a first set of original textual samples and original revision histories and a second set of simplified textual samples and simplified revision histories is received via a data interface. A training pair is identified, comprising an original textual sample and its corresponding original revision history from the first set and a counterpart simplified textual sample and its corresponding simplified revision history from the second set. An alignment label for a first revision in the corresponding original revision history and a second revision in the corresponding simplified revision history is generated from a score using a neural network-based alignment model. A revision category label for each of the first revision and the second revision is generated using a neural network-based classification model. A neural network-based text simplification model is trained based on the updated training dataset, i.e., the training dataset augmented with the generated labels.
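
    The alignment step described in this abstract (scoring a revision from the original history against a revision from the simplified history and turning that score into a label) can be illustrated with a cross-encoder style pair scorer. The model, threshold, and label names in the sketch below are assumptions for illustration, not components named in the application.

```python
# Illustrative revision-alignment step: a neural pair scorer produces a score
# for (original revision, simplified revision), which is mapped to a label.
# The cross-encoder model and the 0.5 threshold are assumptions.
from sentence_transformers import CrossEncoder

aligner = CrossEncoder("cross-encoder/stsb-roberta-base")  # assumed alignment model

def alignment_label(original_revision: str, simplified_revision: str,
                    threshold: float = 0.5) -> str:
    score = aligner.predict([(original_revision, simplified_revision)])[0]
    return "aligned" if score >= threshold else "not_aligned"

print(alignment_label(
    "The committee deliberated extensively before reaching a verdict.",
    "The committee talked for a long time before deciding.",
))
```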

    SYSTEMS AND METHODS FOR TEXT SUMMARIZATION
    Invention Publication

    Publication Number: US20230419017A1

    Publication Date: 2023-12-28

    Application Number: US17938572

    Filing Date: 2022-10-06

    CPC classification number: G06F40/166 G06F40/284 G06N20/00

    Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
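
    The editor step described in this abstract (removing information from the perturbed summary while conditioning on the source document) complements the perturber sketch shown earlier and can likewise be illustrated as a sequence-to-sequence mapping. The model, input separator, and example texts below are assumptions for illustration, not details from the patent.

```python
# Illustrative editor step: given a perturbed summary plus the source document,
# learn to produce a summary with the unsupported information removed.
# Model name, separator, and example texts are assumptions for illustration.
import torch
from transformers import BartForConditionalGeneration, BartTokenizerFast

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
editor = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(editor.parameters(), lr=3e-5)

perturbed_summary = "Acme Corp reported higher quarterly profits in Q3 2021."
source_document = "The company said profits rose compared with the prior quarter."
reference_summary = "The company reported higher quarterly profits."

# Condition on the source document by concatenating it after the perturbed summary.
inputs = tokenizer(perturbed_summary + " || " + source_document,
                   return_tensors="pt", truncation=True)
labels = tokenizer(reference_summary, return_tensors="pt", truncation=True).input_ids

# Second training objective: cross-entropy against the reference summary.
loss = editor(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```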
