AUTOMATED DATA EXTRACTION PIPELINE FOR LARGE LANGUAGE MODEL TRAINING

    Publication No.: US20250060944A1

    Publication Date: 2025-02-20

    Application No.: US18449498

    Filing Date: 2023-08-14

    Abstract: An automated data extraction pipeline for large language model (LLM) training may include extracting a set of code segments from a set of natural language question-answer (Q&A) combinations that each include a provided input, a provided output, and a provided code segment formatted to transform the provided input into the provided output. The data extraction pipeline may then generate a predicted output from a question portion of a first natural language Q&A combination using a first LLM. A first extracted code segment from the extracted set of code segments may then be executed to generate a first actual output of the first extracted code segment. One or more data samples may then be generated for training a second LLM based on a comparison of the first actual output to the predicted output. The second LLM may then be trained using the one or more data samples.
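
The steps in this abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. It assumes that each answer embeds its code segment in a fenced block, that the segment defines a function named `transform`, and that the first LLM is available as a plain callable `predict`. The names `QAPair`, `extract_code`, `run_segment`, `build_samples`, and the `match`/`mismatch` labels are all hypothetical. The abstract does not say how the comparison shapes the data samples, so the labeling here is only one possible choice.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
CODE_BLOCK = re.compile(FENCE + r"(?:python)?\n(.*?)" + FENCE, re.DOTALL)


@dataclass
class QAPair:
    question: str  # natural language question portion
    answer: str    # natural language answer containing a fenced code segment


def extract_code(answer: str) -> Optional[str]:
    """Return the first fenced code segment in the answer, or None."""
    match = CODE_BLOCK.search(answer)
    return match.group(1) if match else None


def run_segment(code: str, provided_input: Any) -> Any:
    """Execute the segment and call its `transform` function (assumed convention)."""
    namespace: dict = {}
    exec(code, namespace)  # illustration only: untrusted code needs a sandbox
    return namespace["transform"](provided_input)


def build_samples(
    pairs: list,
    provided_inputs: list,
    predict: Callable[[str], Any],  # stand-in for the first LLM (assumption)
) -> list:
    """Compare each actual output to the predicted output and emit samples."""
    samples = []
    for pair, provided_input in zip(pairs, provided_inputs):
        code = extract_code(pair.answer)
        if code is None:
            continue  # no code segment to execute
        predicted = predict(pair.question)
        actual = run_segment(code, provided_input)
        samples.append({
            "question": pair.question,
            "code": code,
            "predicted_output": predicted,
            "actual_output": actual,
            # hypothetical labeling scheme based on the comparison
            "label": "match" if predicted == actual else "mismatch",
        })
    return samples
```

The resulting list of samples would then feed whatever training procedure is used for the second LLM, which the abstract leaves unspecified.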

    INTEGRATION FLOW GENERATION USING LARGE LANGUAGE MODELS

    Publication No.: US20250086212A1

    Publication Date: 2025-03-13

    Application No.: US18530026

    Filing Date: 2023-12-05

    Abstract: Methods, systems, apparatuses, and computer program products are described. A system may receive, via a cloud-based platform, user input comprising a request for generation of an integration flow. The system may generate a query based on the request and a query template including one or more example integration flows and a request to generate a natural language description of the integration flow. The system may transmit the query to a large language model (LLM) and may receive, from the LLM, a response including the integration flow and the natural language description. The system may extract the integration flow and the natural language description from the response. The system may perform a validation process on the integration flow based at least in part on one or more integration flow validation rules.
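
The flow described in this abstract (build a query from a template, send it to an LLM, extract the result, validate it) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the claimed system. It assumes that the LLM is a plain callable `llm`, that the response carries a JSON object with the keys `flow` and `description`, and that a flow is a dictionary with a `steps` list. The template wording, the two rules in `RULES`, and every function name are hypothetical.

```python
import json
from typing import Callable

# Hypothetical query template: example flows plus a request for a description.
QUERY_TEMPLATE = (
    "Example integration flows:\n{examples}\n\n"
    "Request: {request}\n"
    "Return a JSON object with the keys 'flow' and 'description', where "
    "'description' is a natural language description of the flow."
)


def build_query(request: str, examples: list) -> str:
    """Fill the template with the user request and the example flows."""
    rendered = "\n".join(json.dumps(example) for example in examples)
    return QUERY_TEMPLATE.format(examples=rendered, request=request)


def extract(response: str) -> tuple:
    """Pull the flow and its description out of the raw LLM response."""
    start, end = response.find("{"), response.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in response")
    payload = json.loads(response[start:end + 1])
    return payload["flow"], payload["description"]


# Hypothetical validation rules: (check, message shown when the check fails).
RULES = [
    (lambda flow: bool(flow.get("steps")), "flow must contain at least one step"),
    (lambda flow: all("type" in step for step in flow.get("steps", [])),
     "every step needs a type"),
]


def validate(flow: dict, rules: list) -> list:
    """Return the messages of all rules the flow violates."""
    return [message for check, message in rules if not check(flow)]


def generate_flow(request: str, examples: list, llm: Callable[[str], str]) -> tuple:
    """Run the whole sequence: query, LLM call, extraction, validation."""
    response = llm(build_query(request, examples))
    flow, description = extract(response)
    return flow, description, validate(flow, RULES)
```

Returning the list of violated rules, rather than raising on the first failure, lets a caller decide whether to reject the flow or retry the query with the errors attached.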
