Abstract:
A method for providing object information for a scene in a wearable computer is disclosed. In this method, an image of the scene is captured. Further, the method includes determining a current location of the wearable computer and a view direction of an image sensor of the wearable computer, and extracting at least one feature from the image indicative of at least one object. Based on the current location, the view direction, and the at least one feature, information on the at least one object is determined. Then, the determined information is output.
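A minimal Python sketch of the lookup step described above, assuming a toy landmark table, a pre-computed compass bearing for the view direction, and an already-extracted feature tag; the field-of-view matching rule and every name here are illustrative, not the patented algorithm:

```python
import math

# Toy landmark database: name, position (x, y in meters), visual tag.
LANDMARKS = [
    {"name": "clock tower", "pos": (120.0, 40.0), "tag": "tower"},
    {"name": "fountain",    "pos": (30.0, -15.0), "tag": "fountain"},
]

def lookup_objects(cur_pos, view_bearing_deg, image_features, fov_deg=60.0):
    """Return landmarks that lie inside the camera's field of view and
    match a feature extracted from the captured image."""
    hits = []
    for lm in LANDMARKS:
        dx, dy = lm["pos"][0] - cur_pos[0], lm["pos"][1] - cur_pos[1]
        bearing = math.degrees(math.atan2(dy, dx))
        # Smallest angular offset between landmark bearing and view direction.
        off_axis = abs((bearing - view_bearing_deg + 180) % 360 - 180)
        if off_axis <= fov_deg / 2 and lm["tag"] in image_features:
            hits.append(lm["name"])
    return hits

# Features would come from an image classifier; hard-coded here.
print(lookup_objects((0.0, 0.0), 15.0, {"tower"}))  # ['clock tower']
```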
Abstract:
A method for controlling voice activation by a target keyword in a mobile device is disclosed. The method includes receiving an input sound stream. When the input sound stream indicates speech, a voice activation unit is activated to detect the target keyword, and at least one sound feature is extracted from the input sound stream. Further, the method includes deactivating the voice activation unit when the at least one sound feature indicates a non-target keyword.
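The gating logic can be sketched as below; the energy-based speech check and the keyword scorer are crude stand-ins (assumptions) for the real voice activity detector and sound-feature classifier:

```python
import numpy as np

def is_speech(frame, energy_thresh=0.01):
    # Crude stand-in for a voice activity detector.
    return float(np.mean(frame ** 2)) > energy_thresh

def keyword_score(frame):
    # Stand-in for a keyword/non-keyword classifier score in [0, 1];
    # a real system would score MFCC-style sound features instead.
    return float(np.clip(np.max(np.abs(frame)), 0.0, 1.0))

def process_stream(frames, off_thresh=0.2):
    active = False
    for frame in frames:
        if not active and is_speech(frame):
            active = True       # speech detected: activate the voice activation unit
        if active and keyword_score(frame) < off_thresh:
            active = False      # features indicate a non-target keyword: deactivate
    return active

rng = np.random.default_rng(0)
frames = [rng.normal(0, 0.3, 160) for _ in range(5)]
print(process_stream(frames))
```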
Abstract:
Certain aspects of the present disclosure provide a method of processing video data. In one example, the method includes receiving input video data; sampling a first subset of clips from the input video data; providing the first subset of clips to a first component of a machine learning model to generate a first output; sampling a second subset of clips from the input video data, wherein the second subset of clips comprises fewer clips than the first subset of clips; providing the second subset of clips to a second component of the machine learning model to generate a second output; aggregating the first output from the first component of the machine learning model with the second output from the second component of the machine learning model to generate an aggregated output; and determining a characteristic of the input video data based on the aggregated output.
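This dense-branch/sparse-branch structure echoes two-pathway video architectures. A minimal numpy sketch under that assumption, with linear projections standing in for the two model components:

```python
import numpy as np

def sample_clips(video, num_clips, clip_len=8):
    # Uniformly sample `num_clips` fixed-length clips from the video.
    starts = np.linspace(0, len(video) - clip_len, num_clips, dtype=int)
    return [video[s:s + clip_len] for s in starts]

def component(clips, proj):
    # Stand-in for a model branch: average-pool each clip, project, average.
    feats = np.stack([clip.mean(axis=0) for clip in clips])
    return feats.mean(axis=0) @ proj

video = np.random.rand(64, 32)        # 64 frames, 32-dim features per frame
proj_a = np.random.rand(32, 10)
proj_b = np.random.rand(32, 10)

first = component(sample_clips(video, num_clips=8), proj_a)   # larger subset
second = component(sample_clips(video, num_clips=2), proj_b)  # fewer clips
aggregated = first + second           # aggregation by summation, as one option
print("predicted characteristic:", int(np.argmax(aggregated)))
```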
Abstract:
Embodiment systems and methods for presenting a facial expression in a virtual meeting may include detecting a user facial expression of a user based on information received from a sensor of the computing device, determining whether the detected user facial expression is approved for presentation on an avatar in a virtual meeting, generating an avatar exhibiting a facial expression consistent with the detected user facial expression in response to determining that the detected user facial expression is approved for presentation on an avatar in the virtual meeting, generating an avatar exhibiting a facial expression that is approved for presentation in response to determining that the detected user facial expression is not approved for presentation on an avatar in the virtual meeting, and presenting the generated avatar in the virtual meeting.
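The approve-or-substitute decision reduces to a lookup against approved expressions; a minimal sketch, with the allow-list and fallback expression as assumptions:

```python
APPROVED = {"neutral", "smile", "nod"}
FALLBACK = "neutral"   # an expression that is always approved for presentation

def avatar_expression(detected: str) -> str:
    """Mirror the user's detected expression when it is approved;
    otherwise substitute an approved one before presenting the avatar."""
    return detected if detected in APPROVED else FALLBACK

print(avatar_expression("smile"))     # smile
print(avatar_expression("eye-roll"))  # neutral
```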
Abstract:
A method for controlling an electronic device in response to speech spoken by a user is disclosed. The method may include receiving an input sound by a sound sensor. The method may also include detecting the speech spoken by the user in the input sound, determining first characteristics of a first frequency range and second characteristics of a second frequency range of the speech in response to detecting the speech in the input sound, and determining whether a direction of departure of the speech spoken by the user is toward the electronic device based on the first and second characteristics.
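The abstract does not say how the two frequency ranges are compared; one plausible reading (purely an assumption here) is that high-frequency speech energy radiates more directionally from the mouth than low-frequency energy, so a strong high-band to low-band power ratio suggests the speech departed toward the device:

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spec[(freqs >= lo) & (freqs < hi)].sum()

def speech_toward_device(signal, fs=16000, ratio_thresh=0.5):
    low = band_power(signal, fs, 100, 1000)     # first frequency range
    high = band_power(signal, fs, 2000, 8000)   # second frequency range
    # High frequencies attenuate more when the speaker faces away, so a
    # high ratio suggests the direction of departure is toward the device.
    return (high / (low + 1e-12)) > ratio_thresh

fs = 16000
t = np.arange(fs) / fs
toward = np.sin(2*np.pi*300*t) + 0.9*np.sin(2*np.pi*3000*t)
away = np.sin(2*np.pi*300*t) + 0.1*np.sin(2*np.pi*3000*t)
print(speech_toward_device(toward), speech_toward_device(away))  # True False
```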
Abstract:
An electronic device for generating video data is disclosed. The electronic device may include a communication unit configured to wirelessly receive a video stream captured by a camera, wherein the camera is located in an unmanned aerial vehicle. The electronic device may also include at least one sound sensor configured to receive an input sound stream. In addition, the electronic device may include an audio control unit configured to generate an audio stream associated with the video stream based on the input sound stream. Further, the electronic device may include a synthesizer unit configured to generate the video data based on the video stream and the audio stream.
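A sketch of the synthesizer step, pairing wirelessly received frames with locally sensed audio by nearest timestamp; the data layout and the alignment rule are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    t: float        # capture timestamp (s)
    pixels: bytes   # encoded frame received wirelessly from the UAV camera

@dataclass
class AudioChunk:
    t: float        # capture timestamp (s)
    samples: bytes  # audio captured by the device's sound sensor

def synthesize(video, audio):
    """Pair each video frame with the closest-in-time audio chunk,
    yielding the combined video data."""
    out = []
    for frame in video:
        nearest = min(audio, key=lambda a: abs(a.t - frame.t))
        out.append((frame, nearest))
    return out

video = [Frame(t=i / 30, pixels=b"") for i in range(3)]
audio = [AudioChunk(t=i / 50, samples=b"") for i in range(5)]
print(len(synthesize(video, audio)))  # 3 combined (frame, audio) pairs
```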
Abstract:
A method, which is performed by an electronic device, for obtaining a speaker-independent keyword model of a keyword designated by a user is disclosed. The method may include receiving at least one sample sound from the user indicative of the keyword. The method may also include generating a speaker-dependent keyword model for the keyword based on the at least one sample sound, sending a request for the speaker-independent keyword model of the keyword to a server in response to generating the speaker-dependent keyword model, and receiving, from the server, the speaker-independent keyword model adapted for detecting the keyword spoken by a plurality of users.
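The client-side flow might look like the following; the recording, training, and server-request helpers are placeholders, and the endpoint URL is hypothetical:

```python
def record_samples(n=3):
    # Stand-in for capturing n utterances of the user's chosen keyword.
    return [f"sample_{i}.wav" for i in range(n)]

def train_speaker_dependent(samples):
    # Stand-in for fitting a keyword model to this one user's voice.
    return {"type": "speaker-dependent", "samples": samples}

def request_speaker_independent(keyword, server="https://example.com/models"):
    # Placeholder: a real client would send the request to the server and
    # download a model trained on many speakers.
    return {"type": "speaker-independent", "keyword": keyword, "from": server}

samples = record_samples()
sd_model = train_speaker_dependent(samples)           # built on-device first
si_model = request_speaker_independent("hey device")  # requested in response
print(sd_model["type"], "->", si_model["type"])
```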
Abstract:
According to an aspect of the present disclosure, a method for adjusting a view mirror in a vehicle is disclosed. The method includes obtaining a first angle of the view mirror, capturing an image of a head of a driver, determining, from the captured image, a viewing distance and a perpendicular distance between a location in the head and the view mirror, calculating a view angle based on the viewing distance and the perpendicular distance, wherein the view angle is an angle between a direction orthogonal to the view mirror and a view direction associated with the location in the head and the view mirror, determining a second angle of the view mirror based on the first angle and the view angle, and adjusting the view mirror from the first angle to the second angle.
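From the definitions in the abstract, the perpendicular distance is the projection of the viewing distance onto the mirror normal, so cos(view angle) = perpendicular distance / viewing distance. A worked sketch of that arithmetic; the half-angle update for the second mirror angle is an assumption based on the mirror law (rotating a mirror by x turns a reflected ray by 2x):

```python
import math

def view_angle_deg(viewing_dist, perp_dist):
    # perpendicular_distance = viewing_distance * cos(view_angle)
    return math.degrees(math.acos(perp_dist / viewing_dist))

def second_angle_deg(first_angle, view_angle):
    # Assumed update rule: rotating a mirror by x turns the reflected
    # ray by 2x, so half the view angle re-centers the reflection.
    return first_angle + view_angle / 2.0

va = view_angle_deg(viewing_dist=0.80, perp_dist=0.60)  # ~41.4 degrees
print(round(va, 1), "->", round(second_angle_deg(10.0, va), 1))
```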