SYNTHETIC DATA GENERATION FOR MACHINE LEARNING MODELS

    Publication No.: US20240420453A1

    Publication date: 2024-12-19

    Application No.: US18216271

    Filing date: 2023-06-29

    Abstract: Techniques for generating synthetic data for machine learning (ML) models are described. A system includes a language model that processes a task and a corresponding set of example inputs to generate another input, referred to herein as machine-generated data. The machine-generated data is processed using the ML model for which the data is being generated to determine a model output, and the model output is analyzed to determine whether it corresponds to a target output. If the model output corresponds to the target output, then the machine-generated data is added to the set of example inputs and one of the original example inputs is removed to generate an updated set of example inputs. The updated set can be used for various training techniques.
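The loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `refresh_examples`, `generate`, and `model` are hypothetical names, "corresponds to the target output" is approximated by exact equality, and removing the oldest example is an assumption (the abstract only says that one original example is removed).

```python
from typing import Callable, List


def refresh_examples(
    task: str,
    examples: List[str],
    generate: Callable[[str, List[str]], str],  # stand-in for the language model
    model: Callable[[str], str],                # stand-in for the ML model under training
    target_output: str,
    rounds: int = 5,
) -> List[str]:
    """Return an updated example set after `rounds` generate-and-check steps."""
    current = list(examples)  # do not mutate the caller's list
    for _ in range(rounds):
        # The language model sees the task and the current examples
        # and proposes one new input.
        candidate = generate(task, current)
        # Assumption: "corresponds to the target output" means equality.
        if model(candidate) == target_output:
            # Assumption: drop the oldest example so the set size stays fixed.
            current.pop(0)
            current.append(candidate)
    return current
```

Because each accepted candidate replaces an existing example, later generation rounds are prompted with a set that gradually shifts toward machine-generated inputs the model already handles as intended.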

    Multi-device speech processing
    5.
    Granted patent

    Publication No.: US11900921B1

    Publication date: 2024-02-13

    Application No.: US17080189

    Filing date: 2020-10-26

    CPC classification number: G10L15/16 G10L15/22

    Abstract: Techniques for partially processing an input on a device and completing processing at a remote system are provided. The device may process an input using an on-device machine learning (ML) model, and determine to cease processing at an intermediary node of the ML model based on the output of the intermediary node. Based on the output of the intermediary node satisfying a condition, the device may use the output of the intermediary node to generate an output responsive to the input. Conversely, if the output of the intermediary node does not satisfy the condition, the device may send the output of the intermediary node to the remote system, so the remote system can use another machine learning model to complete processing with respect to the input.
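The device-side decision in this abstract can be sketched as below. Everything here is an assumption made for illustration: the function and parameter names are hypothetical, the "condition" is modeled as a confidence score meeting a threshold, and the remote call is represented by a plain callable rather than a network request.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def process_on_device(
    x: Vector,
    layers: Sequence[Callable[[Vector], Vector]],  # on-device model, layer by layer
    exit_index: int,                               # index of the intermediary node
    confidence: Callable[[Vector], float],         # hypothetical condition score
    threshold: float,
    decode_locally: Callable[[Vector], str],
    send_to_remote: Callable[[Vector], str],
) -> Tuple[str, str]:
    """Run the model up to the intermediary node, then finish locally or remotely."""
    hidden = x
    for layer in layers[: exit_index + 1]:
        hidden = layer(hidden)
    # Assumption: the condition is "confidence at the intermediary node >= threshold".
    if confidence(hidden) >= threshold:
        return "device", decode_locally(hidden)
    # Otherwise hand the intermediary output (not the raw input) to the remote system.
    return "remote", send_to_remote(hidden)
```

Note that the remote path receives the intermediary output, matching the abstract's statement that the remote system completes processing rather than restarting from the original input.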

    GENERATIVE LANGUAGE MODELS
    7.
    Patent application

    Publication No.: US20240428783A1

    Publication date: 2024-12-26

    Application No.: US18341412

    Filing date: 2023-06-26

    Abstract: Systems and techniques for moderating responses of a generative language model are described herein. Some user inputs to a generative language model may include biases, misinformation, and other references to moderated content. To prevent the generative language model from generating responses that promote these forms of moderated content, the techniques described determine a policy corresponding to the determined moderated content category of the user input. The determined policy may correspond to a template of instructions for how the generative language model is to respond to such moderated content. The output of the generative language model may also be moderated before being presented to the user.
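One way to picture the policy lookup described in this abstract is the sketch below. The category names, template wording, fallback message, and all function names are hypothetical; the classifier, the generative model, and the output check are passed in as plain callables because the abstract does not specify them.

```python
from typing import Callable, Dict, Optional

# Hypothetical policy templates, keyed by moderated content category.
POLICIES: Dict[str, str] = {
    "misinformation": "Respond with verified facts only and note any uncertainty.\nUser: {user_input}",
    "bias": "Respond neutrally and avoid stereotypes.\nUser: {user_input}",
}


def moderated_reply(
    user_input: str,
    classify: Callable[[str], Optional[str]],  # returns a category or None
    generate: Callable[[str], str],            # stand-in for the generative model
    output_ok: Callable[[str], bool],          # stand-in for output moderation
    fallback: str = "Sorry, I can't help with that.",
) -> str:
    """Apply an input-side policy template, then moderate the model output."""
    category = classify(user_input)
    if category in POLICIES:
        # Wrap the input in the instruction template for that category.
        prompt = POLICIES[category].format(user_input=user_input)
    else:
        prompt = user_input
    reply = generate(prompt)
    # The output is checked again before it is shown to the user.
    return reply if output_ok(reply) else fallback
```

The two checks are independent: the template steers how the model responds to a moderated input, while the output check catches responses that still fall outside policy.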

    MULTI-DEVICE SPEECH PROCESSING
    8.
    Published application

    Publication No.: US20240221730A1

    Publication date: 2024-07-04

    Application No.: US18420937

    Filing date: 2024-01-24

    CPC classification number: G10L15/16 G10L15/22

    Abstract: Techniques for partially processing an input on a device and completing processing at a remote system are provided. The device may process an input using an on-device machine learning (ML) model, and determine to cease processing at an intermediary node of the ML model based on the output of the intermediary node. Based on the output of the intermediary node satisfying a condition, the device may use the output of the intermediary node to generate an output responsive to the input. Conversely, if the output of the intermediary node does not satisfy the condition, the device may send the output of the intermediary node to the remote system, so the remote system can use another machine learning model to complete processing with respect to the input.

    User data processing
    10.
    Granted patent

    Publication No.: US11645468B2

    Publication date: 2023-05-09

    Application No.: US17009026

    Filing date: 2020-09-01

    CPC classification number: G06F40/295 G06F40/30 G10L15/08

    Abstract: Techniques for determining attributable data in a natural language user input that can be used to identify a specific user are described. A system may use various data signals determined using different components. The system may process the various signals to make a final determination on whether the input includes attributable data. The system may use a first component to detect user-identifiable data in the input. The system may use a second component to determine whether the input is potentially attributable to a particular user.
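A toy version of the two-component flow in this abstract might look like the following. The regular expressions, the first-person-word heuristic, the threshold, and the rule that both signals must agree are all assumptions chosen for illustration; the abstract does not say how the components work or how their signals are combined.

```python
import re

# Component 1 (hypothetical): pattern-based detection of user-identifiable data.
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")

FIRST_PERSON = {"i", "me", "my", "mine"}


def detect_identifiable(text: str) -> bool:
    """Return True if the text contains a phone-number-like or email-like string."""
    return bool(PHONE.search(text) or EMAIL.search(text))


def attribution_score(text: str) -> float:
    """Component 2 (hypothetical): share of first-person words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in FIRST_PERSON for t in tokens) / len(tokens)


def contains_attributable_data(text: str, score_threshold: float = 0.1) -> bool:
    """Final determination. Assumption: both signals must indicate attribution."""
    return detect_identifiable(text) and attribution_score(text) >= score_threshold
```

Under these assumptions, "call my phone at 555-123-4567" is flagged, while "the store number is 555-123-4567" is not, because the second component finds nothing tying the number to the speaker.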
