Methods and systems for performing end-to-end spoken language analysis

    公开(公告)号:US11107462B1

    公开(公告)日:2021-08-31

    申请号:US16175086

    申请日:2018-10-30

    Applicant: Facebook, Inc.

    Abstract: Exemplary embodiments relate to improvements in spoken language understanding (SLU) systems. Conventionally, SLU systems include an automatic speech recognition (ASR) component configured to receive an input of audio data and to generate a textual representation of the audio data. Conventional SLU systems also include a natural language understanding (NLU) component configured to receive a text-based transcript and perform language-based tasks such as domain classification, intent determination, and slot-filling. However, these two components are typically trained separately based on different metrics. In real-world situations, errors in the ASR component propagate to the NLU component, which degrades the performance of the overall system. Exemplary embodiments described herein perform SLU in an end-to-end manner that infers semantic meaning directly from audio features without an intermediate text representation. This may allow for more a more accurate translation performed in a more resource-efficient manner (particularly in terms of processing resources).

Patent Agency Ranking