SPEECH RECOGNITION FOR MULTIPLE USERS USING SPEECH PROFILE COMBINATION

    Publication number: US20230386478A1

    Publication date: 2023-11-30

    Application number: US17939805

    Application date: 2022-09-07

    Applicant: Apple Inc.

    CPC classification number: G10L17/22 G10L17/06 G10L17/02

    Abstract: Systems and processes for speech recognition for multiple users are provided. For example, in response to receiving speech input from a user, a combined speech profile is obtained from a plurality of speech profiles. The speech input is interpreted based on the combined speech profile to obtain a plurality of speech recognition results. The plurality of speech recognition results includes a first speech recognition result corresponding to a first speech profile of the plurality of speech profiles, wherein the first speech profile corresponds to a first user, and a second speech recognition result corresponding to a second speech profile of the plurality of speech profiles, wherein the second speech profile corresponds to a second user different from the first user. A respective speech recognition result based on an identified voice profile is then selected from the plurality of speech recognition results.
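
The flow in this abstract (combine profiles, produce one recognition result per profile, then select the result for the identified speaker) can be illustrated with a minimal sketch. The abstract does not say what a speech profile contains or how profiles are combined, so everything below is an assumption made for illustration: `SpeechProfile`, `combine_profiles`, `recognize`, and `select_result` are hypothetical names, and a per-user phrase dictionary stands in for a real acoustic or language model.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechProfile:
    """Hypothetical profile: a user name plus user-specific phrase interpretations."""
    user: str
    vocabulary: dict = field(default_factory=dict)


def combine_profiles(profiles):
    """Assumed combination step: keep every profile, keyed by its user."""
    return {p.user: p for p in profiles}


def recognize(speech_input, combined):
    """Interpret the input once per profile, yielding one result per user.

    Falls back to the raw input when a profile has no specific interpretation.
    """
    return {
        user: profile.vocabulary.get(speech_input, speech_input)
        for user, profile in combined.items()
    }


def select_result(results, identified_user):
    """Pick the result that corresponds to the identified speaker's profile."""
    return results[identified_user]


profiles = [
    SpeechProfile("first_user", {"call mom": "call Maria"}),
    SpeechProfile("second_user", {"call mom": "call Susan"}),
]
combined = combine_profiles(profiles)
results = recognize("call mom", combined)
chosen = select_result(results, "second_user")
```

In this toy version the speaker is identified after all per-profile results exist, which mirrors the order of steps in the abstract; how the voice profile is actually identified is outside the sketch.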

    UNIT-SELECTION TEXT-TO-SPEECH SYNTHESIS USING CONCATENATION-SENSITIVE NEURAL NETWORKS

    Publication number: US20170092259A1

    Publication date: 2017-03-30

    Application number: US14961370

    Application date: 2015-12-07

    Applicant: Apple Inc.

    Inventor: Woojay JEON

    CPC classification number: G10L13/07 G10L13/047 G10L13/08

    Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
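
The scoring step in this abstract can be sketched as follows: predict acoustic model parameters for the second target unit from the first candidate segment's acoustic features and the second unit's linguistic features, then score each second-unit candidate by its likelihood under those parameters. The abstract does not specify the model, so this sketch makes loud assumptions: `predict_params` is a hypothetical fixed rule standing in for the neural network named in the title, the predicted parameters are taken to be the mean and variance of a diagonal Gaussian, and all feature vectors are toy values.

```python
import math


def predict_params(prev_acoustic, next_linguistic):
    """Hypothetical stand-in for the network: returns (mean, variance) per dimension.

    Assumption: the mean is the previous segment's features shifted by the
    next unit's linguistic features, with unit variance.
    """
    mean = [a + l for a, l in zip(prev_acoustic, next_linguistic)]
    var = [1.0] * len(mean)
    return mean, var


def log_likelihood(features, mean, var):
    """Log density of `features` under a diagonal Gaussian (assumed model form)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, mean, var)
    )


def select_candidate(prev_acoustic, next_linguistic, candidates):
    """Return the second-unit candidate with the highest likelihood score."""
    mean, var = predict_params(prev_acoustic, next_linguistic)
    return max(candidates, key=lambda c: log_likelihood(c, mean, var))


first_segment = [1.0, 2.0]           # acoustic features of the first candidate
second_unit_linguistic = [0.5, -0.5]  # linguistic features of the second target unit
second_candidates = [[1.4, 1.6], [3.0, 0.0]]
best = select_candidate(first_segment, second_unit_linguistic, second_candidates)
```

Because the score depends on the first candidate segment, the same second-unit candidate can rank differently after a different preceding segment, which is the concatenation sensitivity the title refers to.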
