-
Publication No.: US20230062201A1
Publication date: 2023-03-02
Application No.: US17537104
Filing date: 2021-11-29
Applicant: GOOGLE LLC
Inventor: Akshay Goel , Nitin Khandelwal , Richard Park , Brian Chatham , Jonathan Eccles , David Sanchez , Dmytro Lapchuk
Abstract: Implementations described herein are directed to enabling collaborative ranking of interpretations of spoken utterances based on data that is available to an automated assistant and third-party agent(s), respectively. The automated assistant can determine first-party interpretation(s) of a spoken utterance provided by a user, and can cause the third-party agent(s) to determine third-party interpretation(s) of the spoken utterance provided by the user. In some implementations, the automated assistant can select a given interpretation, from the first-party interpretation(s) and the third-party interpretation(s), of the spoken utterance, and can cause a given third-party agent to satisfy the spoken utterance based on the given interpretation. In additional or alternative implementations, an independent third-party agent can obtain the first-party interpretation(s) and the third-party interpretation(s), select the given interpretation, and then transmit the given interpretation to the automated assistant and/or the given third-party agent.
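The selection step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the `Interpretation` record, its `score` field, and the rule "pick the highest score" are hypothetical and are not taken from the patent, which does not specify here how interpretations are ranked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    """Hypothetical record for one candidate reading of a spoken utterance."""
    source: str   # "first_party" or an illustrative third-party agent name
    intent: str   # illustrative intent label
    slots: tuple  # tuple of (name, value) pairs
    score: float  # assumed confidence in [0, 1]


def select_interpretation(first_party, third_party):
    """Pool first-party and third-party candidates and return the top-scoring one."""
    candidates = list(first_party) + list(third_party)
    if not candidates:
        raise ValueError("no interpretations to rank")
    return max(candidates, key=lambda c: c.score)


# Illustrative data only.
first = [
    Interpretation("first_party", "play_music", (("artist", "Example Artist"),), 0.62),
]
third = [
    Interpretation("agent_a", "play_music", (("song", "Example Song"),), 0.81),
    Interpretation("agent_b", "play_video", (("title", "Example Song"),), 0.40),
]
chosen = select_interpretation(first, third)
print(chosen.source, chosen.intent)
```

In the alternative implementation mentioned in the abstract, the same function could run at an independent third-party agent, which would then transmit `chosen` to the assistant and/or the selected agent.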
-
Publication No.: US12244568B2
Publication date: 2025-03-04
Application No.: US17893728
Filing date: 2022-08-23
Applicant: GOOGLE LLC
Inventor: Akshay Goel , Jonathan Eccles , Nitin Khandelwal , Sarvjeet Singh , David Sanchez , Ashwin Ram
IPC: H04L9/40
Abstract: Implementations described herein utilize an independent server for facilitating secure exchange of data between multiple disparate parties. The independent server receives client data, via an automated assistant application executing at least in part at a client device, that is to be transmitted to a given third-party application. The independent server processes the client data, using a first encoder-decoder model, to generate opaque client data, and transmits the opaque client data to the given third-party application without transmitting any of the client data. Further, the independent server receives response data, via the given third-party application, that is generated based on the opaque client data and that is to be transmitted back to the client device. The independent server processes the response data, using a second encoder-decoder model, to generate opaque response data, and transmits the opaque response data to the client device without transmitting any of the response data.
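The relay pattern in this abstract can be sketched as below. Note the substitution: the patent describes encoder-decoder models, whereas this sketch uses a keyed HMAC-SHA256 digest as a simple stand-in for "produce an opaque representation". The class name `IndependentServer`, its method names, and the sample strings are all hypothetical.

```python
import hashlib
import hmac
import secrets


class IndependentServer:
    """Hypothetical relay that forwards only opaque representations of data."""

    def __init__(self):
        # One secret per direction, standing in for the first and second
        # encoder-decoder models (an assumption, not the patented method).
        self._client_key = secrets.token_bytes(32)
        self._response_key = secrets.token_bytes(32)

    @staticmethod
    def _opaque(key: bytes, data: str) -> str:
        # Keyed digest: the raw data cannot be read from the output.
        return hmac.new(key, data.encode("utf-8"), hashlib.sha256).hexdigest()

    def to_third_party(self, client_data: str) -> str:
        """Return opaque client data; the raw client data is never forwarded."""
        return self._opaque(self._client_key, client_data)

    def to_client(self, response_data: str) -> str:
        """Return opaque response data; the raw response is never forwarded."""
        return self._opaque(self._response_key, response_data)


server = IndependentServer()
client_data = "user@example.com wants a table for two"
opaque_client = server.to_third_party(client_data)

# The third-party application only ever sees the opaque value.
response_data = f"booking confirmed for request {opaque_client}"
opaque_response = server.to_client(response_data)
```

A keyed digest is one-way, so this sketch only shows the "raw data never leaves the server" property; a real encoder-decoder pair would produce representations the receiving side can still act on.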
-
Publication No.: US11935530B2
Publication date: 2024-03-19
Application No.: US17515901
Filing date: 2021-11-01
Applicant: Google LLC
Inventor: April Pufahl , Jared Strawderman , Harry Yu , Adriana Olmos Antillon , Jonathan Livni , Okan Kolak , James Giangola , Nitin Khandelwal , Jason Kearns , Andrew Watson , Joseph Ashear , Valerie Nygaard
CPC classification number: G10L15/22 , G06F1/1694 , G06F3/167 , G06F2203/0381 , G10L2015/223 , G10L2015/225 , H04M2203/253
Abstract: Systems, methods, and apparatus for using a multimodal response in the dynamic generation of client device output that is tailored to a current modality of a client device are disclosed herein. Multimodal client devices can engage in a variety of interactions across the multimodal spectrum, including voice only interactions, voice forward interactions, multimodal interactions, visual forward interactions, visual only interactions, etc. A multimodal response can include a core message to be rendered for all interaction types as well as one or more modality dependent components to provide a user with additional information.
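One way to picture the "core message plus modality dependent components" structure is the sketch below. The `Modality` enum mirrors the interaction types named in the abstract; the `MultimodalResponse` class, its `render` method, and the sample content are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum


class Modality(Enum):
    VOICE_ONLY = "voice_only"
    VOICE_FORWARD = "voice_forward"
    MULTIMODAL = "multimodal"
    VISUAL_FORWARD = "visual_forward"
    VISUAL_ONLY = "visual_only"


@dataclass
class MultimodalResponse:
    """Hypothetical container: one core message plus per-modality extras."""
    core_message: str
    components: dict = field(default_factory=dict)  # Modality -> list of str

    def render(self, modality: Modality) -> list:
        # The core message is always rendered; extras depend on the modality.
        return [self.core_message] + list(self.components.get(modality, []))


response = MultimodalResponse(
    core_message="It is 18 degrees and sunny.",
    components={
        Modality.VOICE_ONLY: ["Tomorrow will be cooler."],
        Modality.VISUAL_ONLY: ["[card: 5-day forecast]"],
    },
)
print(response.render(Modality.VOICE_ONLY))
print(response.render(Modality.MULTIMODAL))
```

Keeping the core message separate from the extras means a device that changes modality mid-interaction can re-render the same response without regenerating it.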
-
Publication No.: US20200082173A1
Publication date: 2020-03-12
Application No.: US16687118
Filing date: 2019-11-18
Applicant: Google LLC
Inventor: Balakrishnan Varadarajan , George Dan Toderici , Apostol Natsev , Nitin Khandelwal , Sudheendra Vijayanarasimhan , Weilong Yang , Sanketh Shetty
Abstract: A system and methodology provide for annotating videos with entities and associated probabilities of existence of the entities within video frames. A computer-implemented method identifies an entity from a plurality of entities identifying characteristics of video items. The computer-implemented method selects a set of features correlated with the entity based on a value of a feature of a plurality of features, determines a classifier for the entity using the set of features, and determines an aggregation calibration function for the entity based on the set of features. The computer-implemented method selects a video frame from a video item, where the video frame has associated features, and determines a probability of existence of the entity based on the associated features using the classifier and the aggregation calibration function.
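The pipeline in this abstract (select correlated features, score a frame with a classifier, then calibrate the score into a probability) can be sketched as follows. Everything concrete here is an assumption for illustration: the correlation threshold, the linear classifier, the logistic function standing in for the aggregation calibration function, and the feature names.

```python
import math


def select_features(correlations, threshold):
    """Keep features whose (assumed) correlation with the entity meets the threshold."""
    return sorted(name for name, value in correlations.items() if value >= threshold)


def classifier_score(frame_features, weights):
    """Hypothetical linear classifier over the selected features."""
    return sum(weights[name] * frame_features.get(name, 0.0) for name in weights)


def calibrate(raw_score, slope=1.0, offset=0.0):
    """Logistic stand-in for the calibration function: maps a raw score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(slope * raw_score + offset)))


# Illustrative correlations between features and the entity "dog".
correlations = {"fur_texture": 0.8, "four_legs": 0.6, "sky_blue": 0.1}
selected = select_features(correlations, threshold=0.5)

# For the sketch, reuse the correlations as classifier weights.
weights = {name: correlations[name] for name in selected}

frame = {"fur_texture": 1.0, "four_legs": 1.0, "sky_blue": 0.3}
probability = calibrate(classifier_score(frame, weights))
print(selected, round(probability, 3))
```

The `sky_blue` feature is dropped at the selection step, so it has no effect on the frame's probability even though the frame carries a value for it.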
-
Publication No.: US12014542B2
Publication date: 2024-06-18
Application No.: US17120525
Filing date: 2020-12-14
Applicant: Google LLC
Inventor: Sanketh Shetty , Tomas Izo , Min-Hsuan Tsai , Sudheendra Vijayanarasimhan , Apostol Natsev , Sami Abu-El-Haija , George Dan Toderici , Susana Ricco , Balakrishnan Varadarajan , Nicola Muscettola , WeiHsin Gu , Weilong Yang , Nitin Khandelwal , Phuong Le
IPC: G06K9/00 , G06F16/783 , G06V20/40
CPC classification number: G06V20/41 , G06F16/7834 , G06V20/46 , G06V20/47 , G06V20/49 , G06V2201/10
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
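The segment-then-score procedure in this abstract can be sketched as below. The fixed-length segmentation, the scoring formula (strongest semantic likelihood plus a small weight on a `sharpness` frame-based feature), and the sample frames are all hypothetical; the abstract does not specify how segments are formed or how scores are computed.

```python
def split_into_segments(frames, segment_length):
    """Split frames into chronological, fixed-length segments (an assumed policy)."""
    return [frames[i:i + segment_length] for i in range(0, len(frames), segment_length)]


def frame_score(frame):
    """Hypothetical score: strongest semantic likelihood plus a small sharpness bonus."""
    return max(frame["semantic"].values()) + 0.1 * frame["sharpness"]


def representative_frames(frames, segment_length):
    """Return the index of the highest-scoring frame in each segment."""
    return [
        max(segment, key=frame_score)["index"]
        for segment in split_into_segments(frames, segment_length)
    ]


# Illustrative frames: "sharpness" is a frame-based feature, "semantic" holds
# likelihoods of semantic concepts being present in the frame.
frames = [
    {"index": 0, "sharpness": 0.2, "semantic": {"dog": 0.1, "beach": 0.3}},
    {"index": 1, "sharpness": 0.9, "semantic": {"dog": 0.7, "beach": 0.2}},
    {"index": 2, "sharpness": 0.5, "semantic": {"dog": 0.4, "beach": 0.8}},
    {"index": 3, "sharpness": 0.4, "semantic": {"dog": 0.2, "beach": 0.6}},
]
picked = representative_frames(frames, segment_length=2)
print(picked)
```

With these sample values, frame 1 is picked for the first segment and frame 2 for the second.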
-
Publication No.: US20240169989A1
Publication date: 2024-05-23
Application No.: US18430253
Filing date: 2024-02-01
Applicant: GOOGLE LLC
Inventor: April Pufahl , Jared Strawderman , Harry Yu , Adriana Olmos Antillon , Jonathan Livni , Okan Kolak , James Giangola , Nitin Khandelwal , Jason Kearns , Andrew Watson , Joseph Ashear , Valerie Nygaard
CPC classification number: G10L15/22 , G06F1/1694 , G06F3/167 , G06F2203/0381 , G10L2015/223 , G10L2015/225 , H04M2203/253
Abstract: Systems, methods, and apparatus for using a multimodal response in the dynamic generation of client device output that is tailored to a current modality of a client device are disclosed herein. Multimodal client devices can engage in a variety of interactions across the multimodal spectrum, including voice only interactions, voice forward interactions, multimodal interactions, visual forward interactions, visual only interactions, etc. A multimodal response can include a core message to be rendered for all interaction types as well as one or more modality dependent components to provide a user with additional information.
-
Publication No.: US20240031339A1
Publication date: 2024-01-25
Application No.: US17893728
Filing date: 2022-08-23
Applicant: GOOGLE LLC
Inventor: Akshay Goel , Jonathan Eccles , Nitin Khandelwal , Sarvjeet Singh , David Sanchez , Ashwin Ram
IPC: H04L9/40
CPC classification number: H04L63/04 , H04L63/0853
Abstract: Implementations described herein utilize an independent server for facilitating secure exchange of data between multiple disparate parties. The independent server receives client data, via an automated assistant application executing at least in part at a client device, that is to be transmitted to a given third-party application. The independent server processes the client data, using a first encoder-decoder model, to generate opaque client data, and transmits the opaque client data to the given third-party application without transmitting any of the client data. Further, the independent server receives response data, via the given third-party application, that is generated based on the opaque client data and that is to be transmitted back to the client device. The independent server processes the response data, using a second encoder-decoder model, to generate opaque response data, and transmits the opaque response data to the client device without transmitting any of the response data.
-
Publication No.: US20180239964A1
Publication date: 2018-08-23
Application No.: US15959858
Filing date: 2018-04-23
Applicant: Google LLC
Inventor: Sanketh Shetty , Tomas Izo , Min-Hsuan Tsai , Sudheendra Vijayanarasimhan , Apostol Natsev , Sami Abu-El-Haija , George Dan Toderici , Susanna Ricco , Balakrishnan Varadarajan , Nicola Muscettola , WeiHsin Gu , Weilong Yang , Nitin Khandelwal , Phuong Le
CPC classification number: G06K9/00718 , G06F16/7834 , G06K9/00744 , G06K9/00751 , G06K9/00765 , G06K2209/27
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
-
Publication No.: US09953222B2
Publication date: 2018-04-24
Application No.: US14848216
Filing date: 2015-09-08
Applicant: Google LLC
Inventor: Sanketh Shetty , Tomas Izo , Min-Hsuan Tsai , Sudheendra Vijayanarasimhan , Apostol Natsev , Sami Abu-El-Haija , George Dan Toderici , Susanna Ricco , Balakrishnan Varadarajan , Nicola Muscettola , WeiHsin Gu , Weilong Yang , Nitin Khandelwal , Phuong Le
CPC classification number: G06K9/00718 , G06F17/30787 , G06K9/00744 , G06K9/00751 , G06K9/00765 , G06K2209/27
Abstract: A computer-implemented method for selecting representative frames for videos is provided. The method includes receiving a video and identifying a set of features for each of the frames of the video. The features include frame-based features and semantic features. The semantic features identify likelihoods of semantic concepts being present as content in the frames of the video. A set of video segments for the video is subsequently generated. Each video segment includes a chronological subset of frames from the video and each frame is associated with at least one of the semantic features. The method generates a score for each frame of the subset of frames for each video segment based at least on the semantic features, and selects a representative frame for each video segment based on the scores of the frames in the video segment. The representative frame represents and summarizes the video segment.
-
Publication No.: US12254886B2
Publication date: 2025-03-18
Application No.: US18590549
Filing date: 2024-02-28
Applicant: GOOGLE LLC
Inventor: Akshay Goel , Nitin Khandelwal , Richard Park , Brian Chatham , Jonathan Eccles , David Sanchez , Dmytro Lapchuk
Abstract: Implementations described herein are directed to enabling collaborative ranking of interpretations of spoken utterances based on data that is available to an automated assistant and third-party agent(s), respectively. The automated assistant can determine first-party interpretation(s) of a spoken utterance provided by a user, and can cause the third-party agent(s) to determine third-party interpretation(s) of the spoken utterance provided by the user. In some implementations, the automated assistant can select a given interpretation, from the first-party interpretation(s) and the third-party interpretation(s), of the spoken utterance, and can cause a given third-party agent to satisfy the spoken utterance based on the given interpretation. In additional or alternative implementations, an independent third-party agent can obtain the first-party interpretation(s) and the third-party interpretation(s), select the given interpretation, and then transmit the given interpretation to the automated assistant and/or the given third-party agent.