-
Publication No.: US20240363101A1
Publication date: 2024-10-31
Application No.: US18771489
Filing date: 2024-07-12
Inventors: Newton Jain, Sameer Syed Zaheer
CPC classification: G10L15/063, G06F8/41, G10L15/16, G10L2015/088
Abstract: A server supports multiple virtual assistants. It receives requests that include wake phrase audio and an identification of the source of the request, such as a virtual assistant device. Based on the identification, the server searches a database for a wake phrase detector appropriate for the identified source. The server then applies the wake phrase detector to the received wake phrase audio. If the wake phrase audio triggers the wake phrase detector, the server provides an appropriate response to the source.
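Illustrative sketch (not part of the patent text): the lookup-then-detect flow in the abstract could look roughly like the Python below. The class name `WakePhraseServer`, the in-memory dictionary standing in for the database, and the byte-prefix "detectors" are all assumptions made for illustration; a real detector would run an acoustic model.

```python
from typing import Callable, Dict, Optional

# A detector takes wake phrase audio and reports whether it is triggered.
Detector = Callable[[bytes], bool]


class WakePhraseServer:
    """Hypothetical server that keeps one wake phrase detector per source."""

    def __init__(self) -> None:
        # Stand-in for the database of detectors, keyed by source identification.
        self._detectors: Dict[str, Detector] = {}

    def register(self, source_id: str, detector: Detector) -> None:
        self._detectors[source_id] = detector

    def handle_request(self, source_id: str, audio: bytes) -> Optional[str]:
        # Search for the detector appropriate for the identified source.
        detector = self._detectors.get(source_id)
        if detector is None:
            return None
        # Apply the detector; respond only if the audio triggers it.
        if detector(audio):
            return f"response for {source_id}"
        return None


server = WakePhraseServer()
# Toy detectors: a byte-prefix check replaces real wake phrase detection.
server.register("assistant-a", lambda audio: audio.startswith(b"hey-a"))
server.register("assistant-b", lambda audio: audio.startswith(b"hey-b"))
```

With this setup, `server.handle_request("assistant-a", b"hey-a ...")` returns a response, while the same audio sent with the identification `"assistant-b"` returns `None`.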
-
Publication No.: US20240346031A1
Publication date: 2024-10-17
Application No.: US18665264
Filing date: 2024-05-15
IPC classification: G06F16/2457, G06F16/951, G06Q30/0251
CPC classification: G06F16/24578, G06F16/951, G06Q30/0256
Abstract: The technology disclosed relates to natural language understanding-based search engines, ranking sponsored search results, and simulated ranking of sponsored search results. Tools and methods describe how to simulate the ranking of sponsored search results. The tools further identify instances of user queries within the scope of trigger patterns, optionally providing examples both of user queries for which a sponsored search result is likely to be displayed and examples for which the sponsored search result will not rank highly enough to be displayed, at least on the first page of search results.
-
Publication No.: US20240331702A1
Publication date: 2024-10-03
Application No.: US18743562
Filing date: 2024-06-14
Inventors: Kiersten L. BRADLEY, Ethan COEYTAUX, Ziming YIN
IPC classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/06, G10L15/07
CPC classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/063, G10L15/07, G10L2015/0631
Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
-
Publication No.: US20240276138A1
Publication date: 2024-08-15
Application No.: US18642492
Filing date: 2024-04-22
Inventors: Karl Stahl
CPC classification: H04R1/1083, G10L15/22, G10L21/0316, G10L25/06, G10L25/51, H04R1/08, G10L2015/223, H04R5/0335, H04R2420/07
Abstract: A method for processing an audio signal involves receiving sound waves at a microphone, converting them into a first audio signal, and extracting a second audio signal from an electromagnetic signal received at a receiver. The first audio signal is correlated with the second audio signal to calculate a correlation value. If the correlation value exceeds a threshold, the first audio signal is processed using the second audio signal to reduce unwanted sound contributions, resulting in a processed audio signal. Further processing is then performed on the processed audio signal to determine a characteristic of the desired sound.
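Illustrative sketch (not part of the patent text): the correlate-then-process step could be approximated as below. The use of a Pearson correlation, the default threshold of 0.5, and the least-squares subtraction of the second signal are assumptions chosen for the example; the abstract does not specify the correlation measure or the reduction technique.

```python
import math
from typing import List, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length signals (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def reduce_unwanted(
    mic: Sequence[float], ref: Sequence[float], threshold: float = 0.5
) -> List[float]:
    """Subtract a scaled copy of `ref` from `mic` when the two are correlated.

    `mic` plays the role of the first audio signal (microphone) and `ref`
    the second audio signal (extracted from the electromagnetic signal).
    """
    if correlation(mic, ref) <= threshold:
        return list(mic)  # not correlated enough: leave the signal untouched
    energy = sum(r * r for r in ref)
    if energy == 0:
        return list(mic)
    # Assumed reduction step: least-squares gain that best fits ref to mic.
    gain = sum(m * r for m, r in zip(mic, ref)) / energy
    return [m - gain * r for m, r in zip(mic, ref)]


ref = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
desired = [0.1, 0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1]
mic = [d + 2.0 * r for d, r in zip(desired, ref)]
cleaned = reduce_unwanted(mic, ref)
```

In this toy input the microphone signal is dominated by the reference, so the correlation is high and `cleaned` carries far less energy than `mic`. A production system would also need to align the two signals in time before correlating them.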
-
Publication No.: US20240046918A1
Publication date: 2024-02-08
Application No.: US18474853
Filing date: 2023-09-26
IPC classification: G10L15/06, G10L15/16, G10L15/18, G10L13/02, G10L15/197, G10L15/22, G10L15/187
CPC classification: G10L15/063, G10L15/16, G10L15/1815, G10L13/02, G10L15/197, G10L15/22, G10L15/187
Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
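Illustrative sketch (not part of the patent text): the threshold and timing condition in the last sentence of the abstract can be expressed as a scan over per-frame probabilities. The frame layout, the function name `invocation_time`, and the default thresholds and time window are assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple

# (time in seconds, intent probability, variable value probability)
Frame = Tuple[float, float, float]


def invocation_time(
    frames: Iterable[Frame],
    intent_threshold: float = 0.8,
    variable_threshold: float = 0.7,
    max_gap: float = 2.0,
) -> Optional[float]:
    """Return the time at which the action would be invoked, or None.

    The action fires once the intent probability has exceeded its threshold
    and the variable value probability exceeds its own threshold no later
    than `max_gap` seconds afterwards.
    """
    intent_time: Optional[float] = None
    for t, p_intent, p_variable in frames:
        if intent_time is None and p_intent > intent_threshold:
            intent_time = t  # first moment the intent threshold is crossed
        if intent_time is not None and p_variable > variable_threshold:
            return t if t - intent_time <= max_gap else None
    return None


on_time = [(0.0, 0.2, 0.1), (0.5, 0.9, 0.3), (1.0, 0.95, 0.8)]
too_late = [(0.0, 0.9, 0.1), (3.0, 0.9, 0.9)]
```

Here `invocation_time(on_time)` returns `1.0`, because the variable value crosses its threshold half a second after the intent does, while `invocation_time(too_late)` returns `None` because the three-second gap exceeds the window.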
-
Publication No.: US20240347055A1
Publication date: 2024-10-17
Application No.: US18752481
Filing date: 2024-06-24
Inventors: Karl Stahl
IPC classification: G10L15/19, G06F16/242, G06F40/253, G10L15/07, G10L15/22, G10L15/30
CPC classification: G10L15/19, G06F16/243, G06F40/253, G10L15/07, G10L15/22, G10L15/30, G10L2015/223
Abstract: [Object] Technology is provided to enable a mobile terminal to function as a digital assistant even when the mobile terminal is in a state where it cannot communicate with a server apparatus. [Solution] When a user terminal 200 receives a query A from a user, user terminal 200 sends query A to a server 100. Server 100 interprets the meaning of query A using a grammar A. Server 100 obtains a response to query A based on the meaning of query A and sends the response to user terminal 200. Server 100 further sends grammar A to user terminal 200. That is, server 100 sends to user terminal 200 a grammar used to interpret the query received from user terminal 200.
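Illustrative sketch (not part of the patent text): one way to picture the exchange is a terminal that caches every grammar the server sends back and falls back to that cache when offline. The `Server` and `Terminal` classes, the dictionary-based toy grammar, and the response strings are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Toy grammar: maps a query string to its interpreted meaning.
Grammar = Dict[str, str]


class Server:
    """Hypothetical server 100: interprets queries and returns the grammar used."""

    def __init__(self, grammars: Dict[str, Grammar]) -> None:
        self._grammars = grammars

    def handle(self, query: str) -> Optional[Tuple[str, str, Grammar]]:
        for name, grammar in self._grammars.items():
            if query in grammar:
                # Send the response together with the grammar that parsed it.
                return f"response to {grammar[query]}", name, grammar
        return None


class Terminal:
    """Hypothetical user terminal 200: caches grammars for offline use."""

    def __init__(self, server: Server) -> None:
        self._server = server
        self._cache: Dict[str, Grammar] = {}
        self.online = True

    def ask(self, query: str) -> Optional[str]:
        if self.online:
            result = self._server.handle(query)
            if result is None:
                return None
            response, name, grammar = result
            self._cache[name] = grammar  # keep the grammar for later
            return response
        # Offline: interpret the query locally with any cached grammar.
        for grammar in self._cache.values():
            if query in grammar:
                return f"offline: {grammar[query]}"
        return None


weather = {"what is the weather": "get_weather", "will it rain": "get_rain"}
terminal = Terminal(Server({"weather": weather}))
first = terminal.ask("what is the weather")
terminal.online = False
second = terminal.ask("will it rain")
```

After the first online query, the terminal holds the weather grammar, so the second query is interpreted locally even though the terminal can no longer reach the server.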
-
Publication No.: US20240331697A1
Publication date: 2024-10-03
Application No.: US18739011
Filing date: 2024-06-10
Inventors: Utku Yabas, Philipp Hubert, Karl Stahl
IPC classification: G10L15/22, G06F40/211, G06F40/284, G10L15/183, G10L15/26
CPC classification: G10L15/22, G06F40/211, G06F40/284, G10L15/183, G10L15/26, G10L2015/223
Abstract: A user specifies a natural language command to a device. Software on the device generates contextual metadata about the user interface of the device, such as data about all visible elements of the user interface, and sends the contextual metadata along with the natural language command to a natural language understanding engine. The natural language understanding engine parses the natural language query using a stored grammar (e.g., a grammar provided by a maker of the device) and as a result of the parsing identifies information about the command (e.g., the user interface elements referenced by the command) and provides that information to the device. The device uses that provided information to respond to the command.
-
Publication No.: US20240046044A1
Publication date: 2024-02-08
Application No.: US18381593
Filing date: 2023-10-18
Abstract: Support for natural language expressions is provided by the use of semantic grammars that describe the structure of expressions in that grammar and that construct the meaning of a corresponding natural language expression. A semantic grammar extension mechanism is provided, which allows one semantic grammar to be used in the place of another semantic grammar. This enriches the expressivity of semantic grammars in a simple, natural, and decoupled manner.
-
Publication No.: US20240296844A1
Publication date: 2024-09-05
Application No.: US18637771
Filing date: 2024-04-17
Inventors: Tim Stonehocker, Zizo Gowayyed, Mijad Emami, Matthias Eichstaedt, Evelyn JIANG, Ryan BERRYHILL, Mathieu RAMONA, Neil VEIRA
Abstract: A data processing system includes a queue manager receiving data processing requests and determining a queue depth representing the number of pending requests. A load supervisor assigns a service level to each request based on the queue depth when the request is at the head of the queue. The system offers two service levels, with the second level requiring fewer computing resources than the first. This dynamic management system optimizes resource allocation by adjusting service levels based on the workload, ensuring efficient processing of data requests.
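Illustrative sketch (not part of the patent text): the queue-depth rule could be modelled as below. The class name `QueueManager`, the level labels `FULL` and `REDUCED`, and the depth limit of 3 are assumptions made for the example; the abstract only states that the second level uses fewer computing resources than the first.

```python
from collections import deque
from typing import Deque, Optional, Tuple

FULL = "full"        # first service level
REDUCED = "reduced"  # second service level, assumed cheaper to compute


class QueueManager:
    """Hypothetical queue manager combined with a simple load supervisor."""

    def __init__(self, depth_limit: int = 3) -> None:
        self._queue: Deque[str] = deque()
        self._depth_limit = depth_limit

    def submit(self, request: str) -> None:
        self._queue.append(request)

    def depth(self) -> int:
        # Queue depth: the number of pending requests.
        return len(self._queue)

    def next_request(self) -> Optional[Tuple[str, str]]:
        if not self._queue:
            return None
        # The level is chosen when the request reaches the head of the
        # queue, using the depth at that moment rather than at submit time.
        level = FULL if self.depth() <= self._depth_limit else REDUCED
        return self._queue.popleft(), level


manager = QueueManager(depth_limit=3)
for i in range(5):
    manager.submit(f"request-{i}")
levels = [manager.next_request()[1] for _ in range(5)]
```

With five pending requests and a limit of three, the first two requests are served at the reduced level and the remaining three at the full level, since the backlog shrinks as requests are dequeued.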
-
Publication No.: US20240296197A1
Publication date: 2024-09-05
Application No.: US18662973
Filing date: 2024-05-13
Inventors: Masaki NAITO, Keisuke TSUCHIDA, Jun YONEYAMA, Kaku SAWADA
IPC classification: G06F16/955, G06F16/33, G06F40/40, G10L15/26
CPC classification: G06F16/9566, G06F16/3344, G06F40/40, G10L15/26
Abstract: As audio (1) is input to an extension of a browser, the extension transmits the audio (1) to a language processing server. A speech recognition unit obtains a text (1) corresponding to the audio (1), and transmits the text (1) to a natural language understanding unit. In the natural language understanding unit, an information processing unit identifies a URL (1) corresponding to the text (1), and transmits the URL (1) to the browser. The extension passes the URL (1) to a browsing function. The browsing function uses the URL (1) to access a web server. The web server transmits a web page (1) corresponding to the URL (1) to the browser. The browsing function shows a screen corresponding to the web page (1) on a display.