-
Publication No.: US20170092278A1
Publication Date: 2017-03-30
Application No.: US15163392
Filing Date: 2016-05-24
Applicant: Apple Inc.
Inventor: Gunnar EVERMANN, Donald R. MCALLASTER
Abstract: A non-transitory computer-readable storage medium stores one or more programs including instructions which, when executed by an electronic device, cause the electronic device to receive natural-language speech input from one of a plurality of users, the natural-language speech input having a set of acoustic properties, and to determine whether the natural-language speech input corresponds to both a user-customizable lexical trigger and a set of acoustic properties associated with the user. In accordance with a determination that the natural-language speech input corresponds to both the user-customizable lexical trigger and the set of acoustic properties associated with the user, the electronic device invokes a virtual assistant. In accordance with a determination that the natural-language speech input either fails to correspond to the user-customizable lexical trigger or fails to have the set of acoustic properties associated with the user, the electronic device forgoes invocation of the virtual assistant.
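Illustrative sketch (not taken from the patent): the two-condition gate described in the abstract can be pictured roughly as follows. The names UserProfile, cosine_similarity, and should_invoke, the prefix-based text match, the fixed-length "voiceprint" vector, and the 0.8 threshold are all assumptions made for illustration; the abstract does not specify how either comparison is performed.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class UserProfile:
    # Hypothetical per-user record; field names are assumptions.
    trigger_phrase: str      # the user-customizable lexical trigger
    voiceprint: List[float]  # stand-in for the user's acoustic properties


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_invoke(transcript: str, acoustic_features: List[float],
                  profile: UserProfile, threshold: float = 0.8) -> bool:
    """Return True only when BOTH the lexical and the acoustic check pass."""
    # Lexical check (assumed): the utterance starts with the trigger phrase.
    lexical_match = transcript.strip().lower().startswith(
        profile.trigger_phrase.lower())
    # Acoustic check (assumed): similarity to the stored voiceprint.
    acoustic_match = cosine_similarity(
        acoustic_features, profile.voiceprint) >= threshold
    # Failing either condition means the assistant is not invoked.
    return lexical_match and acoustic_match


profile = UserProfile(trigger_phrase="hey computer", voiceprint=[1.0, 0.0, 0.5])
same_voice = [0.9, 0.1, 0.5]
other_voice = [0.0, 1.0, 0.0]

invoked = should_invoke("Hey computer, what's the weather", same_voice, profile)
wrong_speaker = should_invoke("Hey computer, what's the weather", other_voice, profile)
wrong_phrase = should_invoke("What's the weather", same_voice, profile)
```

Here only the first call yields True; the second fails the acoustic check and the third fails the lexical check, mirroring the "forgo invocation" branch of the abstract.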
-
Publication No.: US20190278841A1
Publication Date: 2019-09-12
Application No.: US16024425
Filing Date: 2018-06-29
Applicant: Apple Inc.
Inventor: Ernest J. PUSATERI, Bharat Ram AMBATI, Elizabeth S. BROOKS, Donald R. MCALLASTER, Venkatesh NAGESHA, Ondrej PLATEK
Abstract: Techniques for inverse text normalization are provided. In some examples, speech input is received and a spoken-form text representation of the speech input is generated. The spoken-form text representation includes a token sequence. A feature representation is determined for the spoken-form text representation and a sequence of labels is determined based on the feature representation. The sequence of labels is assigned to the token sequence and specifies a plurality of edit operations to perform on the token sequence. Each edit operation of the plurality of edit operations corresponds to one of a plurality of predetermined types of edit operations. A written-form text representation of the speech input is generated by applying the plurality of edit operations to the token sequence in accordance with the sequence of labels. A task responsive to the speech input is performed using the generated written-form text representation.
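Illustrative sketch (not taken from the patent): the step of applying a label sequence to a spoken-form token sequence might look roughly like the code below. The label set (KEEP, DELETE, REWRITE), the join_prev flag, and the names Label and apply_labels are assumptions chosen for illustration; the abstract only states that each edit operation belongs to one of a plurality of predetermined types. The labels are written by hand here, whereas in the described technique they would be determined from a feature representation of the spoken-form text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Label:
    # Hypothetical label format; one label is assigned to each token.
    op: str                     # "KEEP", "DELETE", or "REWRITE"
    text: Optional[str] = None  # replacement text, used by REWRITE only
    join_prev: bool = False     # attach output to the previous piece, no space


def apply_labels(tokens: List[str], labels: List[Label]) -> str:
    """Turn spoken-form tokens into written-form text, one edit per token."""
    if len(tokens) != len(labels):
        raise ValueError("expected exactly one label per token")
    pieces: List[str] = []
    for token, label in zip(tokens, labels):
        if label.op == "DELETE":
            continue  # token contributes nothing to the output
        if label.op == "KEEP":
            out = token
        elif label.op == "REWRITE":
            if label.text is None:
                raise ValueError("REWRITE label needs replacement text")
            out = label.text
        else:
            raise ValueError(f"unknown edit operation: {label.op}")
        if label.join_prev and pieces:
            pieces[-1] += out  # e.g. "3" followed by ":30" becomes "3:30"
        else:
            pieces.append(out)
    return " ".join(pieces)


tokens = ["meet", "at", "three", "thirty", "p", "m"]
labels = [
    Label("KEEP"),
    Label("KEEP"),
    Label("REWRITE", "3"),
    Label("REWRITE", ":30", join_prev=True),
    Label("REWRITE", "PM"),
    Label("DELETE"),
]
written = apply_labels(tokens, labels)  # "meet at 3:30 PM"
```

Restricting the output to a small, fixed set of per-token edit types is what lets the problem be treated as sequence labeling rather than free-form text generation.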
-