Voice recognition method, apparatus, device and storage medium

    Publication No.: US11087763B2

    Publication date: 2021-08-10

    Application No.: US16236295

    Filing date: 2018-12-28

    Abstract: Embodiments of the present application provide a voice recognition method. The method includes: obtaining a voice signal to be recognized; and recognizing a current frame in the voice signal using a pre-trained causal acoustic model, according to the current frame and the frames within a preset time period before the current frame, the causal acoustic model being obtained by training a causal convolutional neural network. In the method provided by the embodiments, only information from the current frame and the frames before it is used when recognizing the current frame. This solves a problem in prior-art voice recognition based on convolutional neural networks, where a hard delay arises from having to wait for frames after the current frame, and thereby improves the timeliness of voice recognition.
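
The causality constraint described in this abstract can be illustrated with a one-dimensional causal convolution. This is a minimal sketch, not the patented model: the function name `causal_conv1d`, the scalar frame values, and the zero left-padding are assumptions made for illustration only.

```python
from typing import List, Sequence


def causal_conv1d(frames: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Causal 1-D convolution (illustrative, hypothetical helper).

    output[t] depends only on frames[t], frames[t-1], ..., never on later frames.
    kernel[0] weights the current frame; kernel[j] weights the frame j steps earlier.
    Positions before the start of the signal are treated as zeros.
    """
    out: List[float] = []
    for t in range(len(frames)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            if t - j >= 0:  # skip positions before the first frame
                acc += weight * frames[t - j]
        out.append(acc)
    return out
```

Because each output uses only the current and earlier inputs, a frame can be scored as soon as it arrives; changing a later frame leaves all earlier outputs unchanged.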

    INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD

    Publication No.: US20210225363A1

    Publication date: 2021-07-22

    Application No.: US16972420

    Filing date: 2019-03-12

    Abstract: The present invention addresses the problem of effectively reducing the input load related to a voice trigger. There is provided an information processing device comprising a registration control unit that dynamically controls registration of startup phrases used as start triggers of a voice interaction session, in which the registration control unit temporarily and additionally registers at least one of the startup phrases based on input voice. There is also provided an information processing method comprising dynamically controlling, by a processor, registration of startup phrases used as start triggers of a voice interaction session, in which the controlling further includes temporarily and additionally registering at least one of the startup phrases based on input voice.
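
The idea of temporarily registering an extra startup phrase can be sketched as a registry whose temporary entries expire. This is only an illustration under stated assumptions: the class `StartupPhraseRegistry`, the time-to-live expiry rule, and the injectable clock are hypothetical and do not come from the abstract, which does not say how or when a temporary phrase is removed.

```python
import time
from typing import Callable, Dict, Set


class StartupPhraseRegistry:
    """Hypothetical registry of permanent and temporary startup phrases."""

    def __init__(self, permanent: Set[str],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._permanent = {p.lower() for p in permanent}
        self._temporary: Dict[str, float] = {}  # phrase -> expiry timestamp
        self._clock = clock

    def register_temporarily(self, phrase: str, ttl_seconds: float) -> None:
        """Add a phrase that acts as a trigger until ttl_seconds have passed."""
        self._temporary[phrase.lower()] = self._clock() + ttl_seconds

    def is_trigger(self, phrase: str) -> bool:
        """Return True if the phrase currently starts a voice interaction session."""
        key = phrase.lower()
        if key in self._permanent:
            return True
        expiry = self._temporary.get(key)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._temporary[key]  # assumed policy: drop expired phrases lazily
            return False
        return True
```

Passing the clock as a parameter keeps the expiry behaviour testable without waiting in real time.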

    Speaker Assembly in a Display Assistant Device

    Publication No.: US20210191456A1

    Publication date: 2021-06-24

    Application No.: US17196060

    Filing date: 2021-03-09

    Applicant: Google LLC

    Abstract: In a display assistant device, a speaker is mounted in a waveguide structure which is at least partially disposed beneath a display screen. The waveguide structure is mounted in an exterior housing which includes speaker grills distributed on a plurality of surfaces of the exterior housing, permitting sound waves from the speaker to be projected outside the exterior housing. A cover structure is disposed on top of the waveguide structure to conceal the waveguide structure and speaker within the exterior housing. The cover structure has a tilted bottom surface configured to be suspended above the waveguide structure and to be separated by a first space. Sound waves projected from an upper portion of the speaker are reflected by the tilted bottom surface and are guided through the first space to exit the device from a speaker grill portion located on a rear side of the exterior housing.

    MULTI-COMMAND SINGLE UTTERANCE INPUT METHOD

    Publication No.: US20210151041A1

    Publication date: 2021-05-20

    Application No.: US17127394

    Filing date: 2020-12-18

    Applicant: Apple Inc.

    Abstract: Systems and processes are disclosed for handling a multi-part voice command for a virtual assistant. Speech input can be received from a user that includes multiple actionable commands within a single utterance. A text string can be generated from the speech input using a speech transcription process. The text string can be parsed into multiple candidate substrings based on domain keywords, imperative verbs, predetermined substring lengths, or the like. For each candidate substring, a probability can be determined indicating whether the candidate substring corresponds to an actionable command. Such probabilities can be determined based on semantic coherence, similarity to user request templates, querying services to determine manageability, or the like. If the probabilities exceed a threshold, the user intent of each substring can be determined, processes associated with the user intents can be executed, and an acknowledgment can be provided to the user.
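
The parse, score, and threshold flow in this abstract can be sketched as follows. It is a toy illustration, not the disclosed process: the verb list, the conjunction handling, the fixed scores in `command_probability`, and the 0.5 threshold are all assumptions chosen to keep the example short.

```python
from typing import List

# Hypothetical vocabulary; a real system would use a far richer model.
IMPERATIVE_VERBS = {"play", "call", "send", "set", "open"}
CONJUNCTIONS = {"and", "then"}


def split_candidates(text: str) -> List[str]:
    """Start a new candidate substring at each imperative verb."""
    chunks: List[List[str]] = []
    for word in text.lower().split():
        if word in IMPERATIVE_VERBS or not chunks:
            chunks.append([])
        chunks[-1].append(word)
    candidates = []
    for chunk in chunks:
        # Drop a trailing conjunction left over from the split.
        if len(chunk) > 1 and chunk[-1] in CONJUNCTIONS:
            chunk = chunk[:-1]
        candidates.append(" ".join(chunk))
    return candidates


def command_probability(candidate: str) -> float:
    """Toy score: high if the candidate is an imperative verb plus arguments."""
    words = candidate.split()
    if len(words) >= 2 and words[0] in IMPERATIVE_VERBS:
        return 0.9
    return 0.1


def actionable_commands(text: str, threshold: float = 0.5) -> List[str]:
    """Return the candidate substrings only if every one exceeds the threshold."""
    candidates = split_candidates(text)
    if candidates and all(command_probability(c) > threshold for c in candidates):
        return candidates
    return []
```

For example, `actionable_commands("play some jazz and set a timer for ten minutes")` yields two candidates, one per imperative verb, each of which would then be passed on for intent determination.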

    VOCALLY ACTUATED SURGICAL CONTROL SYSTEM

    Publication No.: US20210141597A1

    Publication date: 2021-05-13

    Application No.: US17064549

    Filing date: 2020-10-06

    Abstract: The invention is a vocally activated control system for controlling an apparatus in a surgical setting. The system comprises: a. a voice sensor configured to detect vocal commands generated by surgeons during surgery; b. a signal transmitter connected to the voice sensor, the transmitter being configured to convert a vocal command into a transmittable signal and to transmit it; c. a processor connected to the signal transmitter and configured to receive the transmittable vocal signal, the processor being configured to convert the vocal signal into a predetermined set of operative instructions associated with the apparatus, the predetermined set of operative instructions comprising at least one instruction; and d. control means connected to the processor and the apparatus, the control means being configured to receive the predetermined set of operative instructions and to cause the apparatus to operate accordingly. The voice sensor and the transmitter are integrated within a wearable element.

    Speaker assembly in a display assistant device

    Publication No.: US10996717B2

    Publication date: 2021-05-04

    Application No.: US16597745

    Filing date: 2019-10-09

    Applicant: GOOGLE LLC

    Abstract: In a display assistant device, a speaker is mounted in a waveguide structure which is at least partially disposed beneath a display screen. The waveguide structure is mounted in an exterior housing which includes speaker grills distributed on a plurality of surfaces of the exterior housing, permitting sound waves from the speaker to be projected outside the exterior housing. A cover structure is disposed on top of the waveguide structure to conceal the waveguide structure and speaker within the exterior housing. The cover structure has a tilted bottom surface configured to be suspended above the waveguide structure and to be separated by a first space. Sound waves projected from an upper portion of the speaker are reflected by the tilted bottom surface and are guided through the first space to exit the device from a speaker grill portion located on a rear side of the exterior housing.

    ELECTRONIC DEVICE AND SCREEN CONTROL METHOD FOR PROCESSING USER INPUT BY USING SAME

    Publication No.: US20210109703A1

    Publication date: 2021-04-15

    Application No.: US16499197

    Filing date: 2018-03-27

    Abstract: Various embodiments of the present invention relate to an electronic device and a screen control method for processing a user input by using the same. According to the various embodiments, the electronic device comprises: a housing; a touchscreen display located inside the housing and exposed through a first part of the housing; a microphone located inside the housing and exposed through a second part of the housing; at least one speaker located inside the housing and exposed through a third part of the housing; a wireless communication circuit located inside the housing; a processor located inside the housing and electrically connected to the touchscreen display, the microphone, the at least one speaker, and the wireless communication circuit; and a memory located inside the housing and electrically connected to the processor, wherein the memory stores a first application program including a first user interface and a second application program including a second user interface, and wherein the memory stores instructions that, when executed, cause the processor to: display the first user interface on the display; while displaying the first user interface, receive a user input through at least one of the display or the microphone, wherein the user input includes a request for performing a task using the second application program; transmit data associated with the user input to an external server via the communication circuit; receive a response from the external server via the communication circuit, wherein the response includes information on a sequence of states of the electronic device to perform the task; and after receiving the response, display the second user interface on a first region of the display, based on the sequence of the states, while displaying a portion of the first user interface on a second region of the display. Various other embodiments, in addition to those disclosed in the present invention, are possible.
