On-device learning in a hybrid speech processing system

    Publication number: US11676575B2

    Publication date: 2023-06-13

    Application number: US17386078

    Filing date: 2021-07-27

    CPC classification number: G10L15/063 G10L15/18 G10L15/30 G10L2015/0635

    Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.
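
    The delta-and-aggregate flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a model is a plain mapping from parameter names to lists of floats, and that the remote system combines device reports with an unweighted mean. The names `model_delta`, `aggregate_deltas`, and `apply_delta` are hypothetical.

```python
from typing import Dict, List

# Assumption: a "model" is just named lists of float parameters.
Weights = Dict[str, List[float]]


def model_delta(original: Weights, updated: Weights) -> Weights:
    """On the device: difference between the locally updated and the original model."""
    return {
        name: [u - o for u, o in zip(updated[name], original[name])]
        for name in original
    }


def aggregate_deltas(deltas: List[Weights]) -> Weights:
    """On the remote system: unweighted mean of the deltas sent by the devices."""
    count = len(deltas)
    return {
        name: [sum(values) / count for values in zip(*(d[name] for d in deltas))]
        for name in deltas[0]
    }


def apply_delta(original: Weights, delta: Weights) -> Weights:
    """On the remote system: produce the improved model from the aggregated delta."""
    return {
        name: [o + d for o, d in zip(original[name], delta[name])]
        for name in original
    }
```

    Each device would call `model_delta` after its local training step and upload only the result, so that the remote system sees the changes to the model rather than the full updated model.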

    ON-DEVICE LEARNING IN A HYBRID SPEECH PROCESSING SYSTEM

    Publication number: US20220020357A1

    Publication date: 2022-01-20

    Application number: US17386078

    Filing date: 2021-07-27

    Abstract: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.

    Voice profile updating
    24.
    Granted patent

    Publication number: US11004454B1

    Publication date: 2021-05-11

    Application number: US16182021

    Filing date: 2018-11-06

    Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.
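
    The cluster-then-update idea can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the patent: stored audio is assumed to be already reduced to fixed-length feature vectors, clustering is a simple greedy pass using cosine similarity, and the updated profile is a weighted blend of the old profile and the matching cluster centroid. All function names, the threshold, and the weight are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cluster(vectors: List[Vector], threshold: float = 0.9) -> List[List[Vector]]:
    """Greedy clustering: a vector joins the first cluster whose centroid is similar enough."""
    clusters: List[List[Vector]] = []
    for vector in vectors:
        for members in clusters:
            if cosine(vector, centroid(members)) >= threshold:
                members.append(vector)
                break
        else:
            clusters.append([vector])
    return clusters


def update_profile(profile: Vector, stored: List[Vector],
                   threshold: float = 0.9, weight: float = 0.5) -> Vector:
    """Blend the profile with the first cluster that is substantially similar to it."""
    for members in cluster(stored, threshold):
        center = centroid(members)
        if cosine(profile, center) >= threshold:
            return [(1 - weight) * p + weight * c for p, c in zip(profile, center)]
    return profile  # no matching cluster: keep the existing profile
```

    A periodic job could call `update_profile` once per existing voice profile over the vectors recalled from storage.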

    CONTENT OUTPUT MANAGEMENT BASED ON SPEECH QUALITY

    Publication number: US20200251104A1

    Publication date: 2020-08-06

    Application number: US16786629

    Filing date: 2020-02-10

    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.

    System command processing
    28.
    Granted patent

    Publication number: US10600419B1

    Publication date: 2020-03-24

    Application number: US15712676

    Filing date: 2017-09-22

    Abstract: Techniques for performing command processing are described. A system receives, from a device, input data corresponding to a command. The system determines NLU processing results associated with multiple applications. The system also determines NLU confidences for the NLU processing results for each application. The system sends NLU processing results to a portion of the multiple applications, and receives output data or instructions from the portion of the applications. The system ranks the portion of the applications based at least in part on the NLU processing results associated with the portion of the applications as well as the output data or instructions provided by the portion of the applications. The system may also rank the portion of the applications using other data. The system causes content corresponding to output data or instructions provided by the highest ranked application to be output to a user.
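
    The two-stage flow described here (shortlist applications by NLU confidence, collect their output, then rank) can be sketched as below. The scoring rule is an assumption made for illustration: the abstract does not say how NLU results and application output are combined, so this sketch simply adds a fixed bonus when an application returns output. `Candidate`, `shortlist`, `rank`, and `output_bonus` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Candidate:
    app: str
    nlu_confidence: float
    output: Optional[str] = None  # filled in after the application responds


def shortlist(candidates: List[Candidate], top_n: int) -> List[Candidate]:
    """Keep only the top_n applications by NLU confidence."""
    ordered = sorted(candidates, key=lambda c: c.nlu_confidence, reverse=True)
    return ordered[:top_n]


def rank(candidates: List[Candidate],
         handlers: Dict[str, Callable[[], Optional[str]]],
         top_n: int = 2,
         output_bonus: float = 0.5) -> List[Candidate]:
    """Query the shortlisted applications, then rank by confidence plus an output bonus."""
    selected = shortlist(candidates, top_n)
    for candidate in selected:
        candidate.output = handlers[candidate.app]()

    def score(candidate: Candidate) -> float:
        # Assumption: an application that produced output earns a fixed bonus.
        return candidate.nlu_confidence + (output_bonus if candidate.output else 0.0)

    return sorted(selected, key=score, reverse=True)
```

    The content to output to the user would then be the `output` of the first element of the ranked list.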

    Generation of automated message responses

    Publication number: US10339925B1

    Publication date: 2019-07-02

    Application number: US15276316

    Filing date: 2016-09-26

    Abstract: Systems, methods, and devices for computer-generating responses and sending responses to communications when the recipient of the communication is unavailable are disclosed. An individual may send a message (either audio or text) to a recipient. The recipient may be unavailable to contemporaneously respond to the message (e.g., the recipient may be performing an action that makes it difficult or impractical for the recipient to contemporaneously respond to the audio message). When the recipient is unavailable, a response to the message is generated and sent without receiving an instruction from the recipient to do so. The response may be sent to the message originating individual, and content of the response may thereafter be sent to the recipient to receive feedback regarding the correctness of the response. Alternatively, the response content may first be sent to the recipient to receive the feedback, and thereafter the response may be sent to the message originating individual.

    Text orientation estimation in camera captured OCR
    30.
    Granted patent (in force)

    Publication number: US09224061B1

    Publication date: 2015-12-29

    Application number: US14464365

    Filing date: 2014-08-20

    CPC classification number: G06K9/3208 G06K9/3258 G06K2209/01

    Abstract: A system estimates text orientation in images captured using a handheld camera prior to detecting text in the image. Text orientation is estimated based on edges detected within the image, and the image is rotated based on the estimated orientation. Text detection and processing is then performed on the rotated image. Non-text features along a periphery of the image may be sampled to assure that clutter will not undermine the estimation of orientation.
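
    One way to estimate orientation from edges is a magnitude-weighted histogram of edge directions. The sketch below takes that approach as an assumption; the abstract does not state which edge-based estimator the patent uses. It treats the image as a 2D list of grayscale values, uses central differences for the gradient, and returns the dominant edge angle in degrees in the range [0, 180). The function name `dominant_edge_angle` and the bin count are hypothetical.

```python
import math
from typing import List, Sequence


def dominant_edge_angle(image: Sequence[Sequence[float]], bins: int = 36) -> float:
    """Return the dominant edge direction in degrees, quantized to 180/bins steps."""
    height, width = len(image), len(image[0])
    bin_width = 180.0 / bins
    histogram: List[float] = [0.0] * bins

    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference gradient at (x, y).
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # An edge runs perpendicular to the gradient; fold into [0, 180).
            edge_angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            # Nearest-bin assignment, wrapping 180 degrees back to 0.
            index = int(round(edge_angle / bin_width)) % bins
            histogram[index] += magnitude

    best = max(range(bins), key=histogram.__getitem__)
    return best * bin_width
```

    Rotating the image by the negative of the returned angle would bring the dominant edges to horizontal before text detection runs. Sampling of non-text features along the image periphery, which the abstract mentions, is not modeled here.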
