Retrieval and management of spoken language understanding personalization data
    3.
    Granted patent
    Status: in force

    Publication number: US09361289B1

    Publication date: 2016-06-07

    Application number: US14015697

    Application date: 2013-08-30

    CPC classification number: G10L15/18 G06F17/30684 G10L15/07

    Abstract: Features are disclosed for maintaining data that can be used to personalize spoken language processing, such as automatic speech recognition (“ASR”), natural language understanding (“NLU”), natural language processing (“NLP”), etc. The data may be obtained from various data sources, such as applications or services used by the user. User-specific data maintained by the data sources can be retrieved and stored for use in generating personal models. Updates to data at the data sources may be reflected by separate data sets in the personalization data, such that other processes can obtain the update data sets separate from other data.
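
The idea of keeping updates as separate data sets can be illustrated with a minimal sketch. This is not the patented implementation; the class name `PersonalizationStore`, its methods, and the "contacts" source are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalizationStore:
    """Per-user data pulled from data sources, with updates kept as separate sets."""
    base: dict = field(default_factory=dict)     # source name -> list of items
    updates: list = field(default_factory=list)  # list of (source, items) tuples

    def load_initial(self, source, items):
        # Full retrieval from a data source (hypothetical example: a contacts service).
        self.base[source] = list(items)

    def record_update(self, source, items):
        # Store the change as its own data set instead of merging it into base.
        self.updates.append((source, list(items)))

    def pending_updates(self):
        # Other processes (e.g. a model builder) can read only the update sets.
        return list(self.updates)

    def merge_updates(self):
        # Fold the update sets into the base data once they have been consumed.
        for source, items in self.updates:
            self.base.setdefault(source, []).extend(items)
        self.updates.clear()


store = PersonalizationStore()
store.load_initial("contacts", ["Alice", "Bob"])
store.record_update("contacts", ["Carol"])
```

Keeping `updates` apart from `base` lets a consumer rebuild a personal model from the new data alone, which is the separation the abstract describes.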

    Systems and methods for activating subtitles

    Publication number: US09852773B1

    Publication date: 2017-12-26

    Application number: US14313723

    Application date: 2014-06-24

    CPC classification number: G11B27/22

    Abstract: According to one or more embodiments of the disclosure, a method is provided. The method may include executing playback of a video. The method may also include receiving user input to rewind at least one portion of the video. Further, the method may include restarting playback of the video at a previous position before the at least one portion of the video. The method may also include activating subtitles associated with the video during playback of the video from the previous position, wherein the subtitles are displayed during playback of the at least one portion of the video. Additionally, the method may include deactivating subtitles during playback of the video after a predetermined amount of time.
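
The rewind-then-subtitle behavior in this abstract can be sketched as a small state machine. This is an illustrative toy, not the claimed method: the `Player` class, its method names, and the choice to measure the "predetermined amount of time" in playback seconds from the restart position are all assumptions.

```python
class Player:
    """Toy playback model: rewinding turns subtitles on for a limited time."""

    def __init__(self, subtitle_timeout=10.0):
        self.position = 0.0                       # current playback position, seconds
        self.subtitles_on = False
        self.subtitle_timeout = subtitle_timeout  # assumed "predetermined amount of time"
        self._subtitles_off_at = None             # position at which subtitles turn off

    def rewind(self, seconds):
        # Restart playback at an earlier position and activate subtitles.
        self.position = max(0.0, self.position - seconds)
        self.subtitles_on = True
        self._subtitles_off_at = self.position + self.subtitle_timeout

    def advance(self, seconds):
        # Play forward; deactivate subtitles once the timeout has elapsed.
        self.position += seconds
        if self.subtitles_on and self.position >= self._subtitles_off_at:
            self.subtitles_on = False
            self._subtitles_off_at = None
```

For example, with a 10-second timeout, rewinding 5 seconds from the 60-second mark shows subtitles from 55 s until playback reaches 65 s.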

    Voice-assisted scanning
    7.
    Granted patent

    Publication number: US09767501B1

    Publication date: 2017-09-19

    Application number: US14074346

    Application date: 2013-11-07

    CPC classification number: G06Q30/0623

    Abstract: In some cases, a handheld device that includes a microphone and a scanner may be used for voice-assisted scanning. For example, a user may provide a voice input via the microphone and may activate the scanner to scan an item identifier (e.g., a barcode). The handheld device may communicate voice data and item identifier information to a remote system for voice-assisted scanning. The remote system may perform automatic speech recognition (ASR) operations on the voice data and may perform item identification operations based on the scanned identifier. Natural language understanding (NLU) processing may be improved by combining ASR information with item information obtained based on the scanned identifier. An action may be executed based on the likely user intent.
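
One simple way to picture how item information could help interpret ASR output is to re-rank the recognizer's candidate transcripts by their overlap with words describing the scanned item. The abstract does not specify this mechanism; the function `rerank`, the `boost` weight, and the example hypotheses below are hypothetical.

```python
def rerank(asr_hypotheses, item_words, boost=0.2):
    """Re-score (text, score) ASR hypotheses by word overlap with the scanned item."""
    item_words = {w.lower() for w in item_words}
    rescored = []
    for text, score in asr_hypotheses:
        overlap = len(item_words & set(text.lower().split()))
        rescored.append((text, score + boost * overlap))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical n-best list and item description looked up from a scanned barcode.
hypotheses = [("add two cans of suit", 0.50), ("add two cans of soup", 0.45)]
best_text, _ = rerank(hypotheses, ["tomato", "soup"])[0]
```

Here the lower-scored transcript wins after re-ranking because it mentions a word from the scanned item's description.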

    Maximum likelihood channel normalization
    8.
    Granted patent
    Status: in force

    Publication number: US09378729B1

    Publication date: 2016-06-28

    Application number: US13797662

    Application date: 2013-03-12

    Abstract: Features are disclosed for applying maximum likelihood methods to channel normalization in automatic speech recognition (“ASR”). Feature vectors computed from an audio input of a user utterance can be compared to a Gaussian mixture model. The Gaussian that corresponds to each feature vector can be determined, and statistics (e.g., constrained maximum likelihood linear regression statistics) can then be accumulated for each feature vector. Using these statistics, or some subset thereof, offsets and/or a diagonal transform matrix can be computed for each feature vector. The offsets and/or diagonal transform matrix can be applied to the corresponding feature vector to generate a feature vector normalized based on maximum likelihood methods. The ASR process can then proceed using the transformed feature vectors.
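
A simplified sketch of the flow described above follows. It covers only the offset term, assumes a diagonal-covariance Gaussian mixture model and hard assignment of each feature vector to its best Gaussian, and estimates one offset over all supplied frames; the diagonal transform matrix and the full constrained maximum likelihood linear regression statistics mentioned in the abstract are omitted. The function names are hypothetical. Under these assumptions, the offset that maximizes the likelihood of x + b is, per dimension, the sum of (mean - x) / variance divided by the sum of 1 / variance.

```python
import math


def log_gauss(x, mean, var):
    # Log density of a diagonal-covariance Gaussian.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def ml_offset(frames, gmm):
    """Estimate a per-dimension offset b that maximizes the likelihood of x + b.

    gmm is a list of (weight, mean, var) tuples with diagonal variances.
    """
    dim = len(frames[0])
    num = [0.0] * dim
    den = [0.0] * dim
    for x in frames:
        # Determine the Gaussian that best explains this frame (hard assignment).
        _, mean, var = max(
            gmm, key=lambda g: math.log(g[0]) + log_gauss(x, g[1], g[2]))
        # Accumulate the sufficient statistics for the offset.
        for d in range(dim):
            num[d] += (mean[d] - x[d]) / var[d]
            den[d] += 1.0 / var[d]
    return [n / d for n, d in zip(num, den)]


def normalize(frames, offset):
    # Apply the offset to every feature vector.
    return [[xi + b for xi, b in zip(x, offset)] for x in frames]
```

The normalized frames returned by `normalize` would then be passed on to the rest of the recognizer in place of the raw features.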

    Automatic volume attenuation for speech enabled devices
    9.
    Granted patent
    Status: in force

    Publication number: US09324322B1

    Publication date: 2016-04-26

    Application number: US13920446

    Application date: 2013-06-18

    Abstract: A speech recognition system that also automatically recognizes and acts in response to significant audio interruptions. Received audio is compared with stored acoustic signatures of noises which may trigger a change in device operation, such as pausing, loudening or attenuating of content playback after hearing a certain audio interruption, such as a doorbell, etc. If the received audio matches a stored acoustic model, the system alters an operational state of one or more devices, which may or may not include itself.
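
The matching step can be pictured as comparing a feature vector from the received audio against stored signatures and triggering the associated action when the similarity is high enough. The abstract does not say how the comparison is done; cosine similarity, the `SIGNATURES` table, the threshold, and the action names here are assumptions for illustration.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical stored signatures: name -> (feature vector, action to take).
SIGNATURES = {
    "doorbell": ([0.9, 0.1, 0.0], "pause"),
    "phone_ring": ([0.1, 0.9, 0.2], "attenuate"),
}


def action_for(features, threshold=0.9):
    """Return the action of the best-matching signature, or None if nothing matches."""
    best_name = max(SIGNATURES, key=lambda n: cosine(features, SIGNATURES[n][0]))
    vector, action = SIGNATURES[best_name]
    return action if cosine(features, vector) >= threshold else None
```

A caller would feed `action_for` the features of each incoming audio window and, on a non-`None` result, change the playback state of the target device.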

    SMART CIRCULAR AUDIO BUFFER
    10.
    Patent application
    Status: in force

    Publication number: US20150066494A1

    Publication date: 2015-03-05

    Application number: US14016403

    Application date: 2013-09-03

    Abstract: An audio buffer is used to capture audio in anticipation of a user command to do so. Sensors and processor activity may be monitored, looking for indicia suggesting that the user command may be forthcoming. Upon detecting such indicia, a circular buffer is activated. Audio correction may be applied to the audio stored in the circular buffer. After receiving the user command instructing the device to process or record audio, at least a portion of the audio that was stored in the buffer before the command is combined with audio received after the command. The combined audio may then be processed, transmitted or stored.
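
The buffering scheme above can be sketched with a fixed-size `collections.deque`, which discards the oldest frames automatically once full. This is a minimal illustration rather than the application's design: the `PreRollRecorder` class and its method names are hypothetical, frames are treated as opaque objects, and the audio correction step is left out.

```python
from collections import deque


class PreRollRecorder:
    """Keeps recent audio frames so a recording can include pre-command audio."""

    def __init__(self, max_frames):
        self.buffer = deque(maxlen=max_frames)  # circular buffer of recent frames
        self.buffering = False
        self.recording = None                   # becomes a list once a command arrives

    def on_indicia(self):
        # Some signal suggests a command may be coming: activate the buffer.
        self.buffering = True

    def on_audio(self, frame):
        if self.recording is not None:
            self.recording.append(frame)        # audio received after the command
        elif self.buffering:
            self.buffer.append(frame)           # oldest frame dropped when full

    def on_command(self):
        # Seed the recording with the audio captured before the command.
        self.recording = list(self.buffer)
        self.buffer.clear()
```

After `on_command()`, `recording` holds the buffered pre-command frames followed by every frame that arrives afterwards, which is the combined audio the abstract refers to.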
