Patent search ap:("Roku Page Inc.") AND inv:"Kapil Kumar"

1.

发明公开
PERSONALIZED RETRIEVAL SYSTEM 审中-公开

公开(公告)号：US20240346084A1

公开(公告)日：2024-10-17

申请号：US18398495

申请日：2023-12-28

Applicant: Roku, Inc.

Inventor： Kapil Kumar , Abhishek Majumdar , Danish Shaikh , Nitish Aggarwal , Srimaruti Manoj Nimmagadda , Aniruddha Das

IPC: G06F16/9035 , G06F16/9038 , G06F40/40

CPC classification number: G06F16/9035 , G06F16/9038 , G06F40/40

Abstract: Disclosed are system, method and/or computer program product embodiments that retrieve items for a user based on a query using a two-tower deep machine learning model. An example embodiment provides input to a context tower, wherein the input includes the query and one or more of a query embedding corresponding to the query or a graph user embedding corresponding to the user. The context tower generates a context embedding in a vector space based on the input. The model determines a measure of similarity between the context embedding and each of a plurality of item embeddings in the vector space that are generated by an item tower and represent a plurality of candidate items. A relevancy score is calculated for each candidate item based on the measure of similarity between the context embedding and the corresponding item embedding. The relevancy scores are used for item retrieval and/or ranking.

2.

发明申请
RETRIEVAL OPTIMIZATION USING REINFORCEMENT LEARNING 有权

公开(公告)号：US20250103894A1

公开(公告)日：2025-03-27

申请号：US18423825

申请日：2024-01-26

Applicant: Roku, Inc.

Inventor： Abhishek Majumdar , Yuxi Liu , Kapil Kumar , Nitish Aggarwal , Manasi Deshmukh , Danish Nasir Shaikh , Ravi Tiwari

IPC: G06N3/092 , G06F16/2457 , G06N3/0455

Abstract: Retrieving content items in response to a query in a way that increases user satisfaction and increases chances of users consuming a retrieved content item is not trivial. One retrieval strategy may include dividing the content items into buckets according to a dimension about the content items and retrieving a top K number of items from different buckets to balance semantic affinity and the dimension. Choosing an optimal K for different buckets for a given query can be a challenge. Reinforcement learning can be used to train and implement an agent model that can choose the optimal K for different buckets.

3.

发明申请
ENHANCING TRANSFER LEARNING FOR LARGE LANGUAGE MODELS 有权

公开(公告)号：US20250045575A1

公开(公告)日：2025-02-06

申请号：US18423802

申请日：2024-01-26

Applicant: Roku, Inc.

Inventor： Abhishek Majumdar , Kapil Kumar , Nitish Aggarwal , Danish Nasir Shaikh , Manasi Deshmukh , Apoorva Jakalannanavar Halappa Manjula

IPC: G06N3/08

Abstract: Pre-trained large language models may be trained on a large data set which may not necessarily align with specific tasks, business goals, and requirements. Pre-trained large language models can solve generic semantic relationship or question-answering type problems but may not be suited for content item retrieval or recommendation of content items that are semantically relevant to a query. It is possible to build a machine learning model while using transfer learning to learn from pre-trained large language models. Training data can significantly impact the performance of machine learning models, especially machine learning models developed using transfer learning. The training data can impact a model's performance, generalization, fairness, and adaptation to specific domains. To address some of these concerns, a popularity bucketing strategy can be implemented to debias training data. Optionally, an ensemble of models can be used to generate diverse training data.

4.

发明公开
HETEROGENEOUS GRAPH NEURAL NETWORK USING OFFSET TEMPORAL LEARNING FOR SEARCH PERSONALIZATION 审中-公开

公开(公告)号：US20240346309A1

公开(公告)日：2024-10-17

申请号：US18582249

申请日：2024-02-20

Applicant: Roku, Inc.

Inventor： Abhishek Majumdar , Kapil Kumar , Nitish Aggarwal , Srimaruti Manoj Nimmagadda

IPC: G06N3/08 , G06N3/042

CPC classification number: G06N3/08 , G06N3/042

Abstract: Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for training a heterogenous graph neural network (GNN) to generate user embeddings corresponding to users and item embeddings corresponding to items. An example embodiment generates a first user interaction graph for a first time window and a second user interaction graph for a second time window, wherein each graph represents users and items as nodes and user-item interactions within the respective time window as edges, samples user-item node pairs from the second user interaction graph, and trains the heterogeneous GNN based on user-item node pairs from the first user interaction graph that correspond to the sampled user-item node pairs from the second user interaction graph. User and item embeddings generated by the trained GNN may be used to determine a relevancy of a given item with respect to a given user.

5.

发明授权
Multimodal analysis for content item semantic retrieval and identification 有权

公开(公告)号：US12153588B2

公开(公告)日：2024-11-26

申请号：US18167724

申请日：2023-02-10

Applicant: ROKU, INC.

Inventor： Peter Martigny , Fedor Bartosh , Danish Shaikh , Vinh Nguyen , Manasi Deshmukh , Ratul Ray , Nitish Aggarwal , Srimaruti Manoj Nimmagadda , Kapil Kumar , Sameer Girolkar

IPC: G06F16/2457 , G06F16/242 , G06F16/9535

Abstract: A content retrieval system may receive a query associated with a plurality of content items in a repository. For each content item of the plurality of content items: a respective first and second similarity score may be generated based on a similarity between embeddings indicative of a first and second data type generated from the query and for the content item; and a respective normalized similarity score may be generated based on a combination of the respective first and second similarity scores. A set of content items with respective normalized similarity scores that satisfy a similarity score threshold may be identified. An exact-match (lexical) search may yield respective mapping scores for content items that may also be ranked. An output indicative of content items that are identified in the set of content items with high-ranking similarity scores and identified in the set of content items with high-ranking mapping scores.

6.

发明申请
USING A LARGE LANGUAGE MODEL TO IMPROVE TRAINING DATA 有权

公开(公告)号：US20250045535A1

公开(公告)日：2025-02-06

申请号：US18423789

申请日：2024-01-26

Applicant: Roku, Inc.

Inventor： Kapil Kumar , Abhishek Majumdar , Nitish Aggarwal , Srimaruti Manoj Nimmagadda

IPC: G06F40/40

Abstract: Training data can significantly impact the performance of machine learning models. Its impact may be more significant in transfer learning. Different data sources can be used to generate training data used in transfer learning. The training data originating from user interaction logs may be subject to presentation bias. The training data originating from model generated labeled data may have false positives. Poor quality training data may cause the machine learning model to perform poorly. To address some of these concerns, a checker having one or more models can check for false positives and for labeled data entries that may have been subject to presentation bias. Such entries may be removed or modified. In some cases, the checker can generate a test that can be used to test the machine learning model and penalize the machine learning model if the model generates an incorrect prediction.

7.

发明申请
MULTIMODAL ANALYSIS FOR CONTENT ITEM SEMANTIC RETRIEVAL AND IDENTIFICATION 有权

公开(公告)号：US20250036638A1

公开(公告)日：2025-01-30

申请号：US18911887

申请日：2024-10-10

Applicant: ROKU, INC.

Inventor： Peter Martigny , Fedor Bartosh , Danish Shaikh , Vinh Nguyen , Manasi Deshmukh , Ratul Ray , Nitish Aggarwal , Srimaruti Manoj Nimmagadda , Kapil Kumar , Sameer Girolkar

IPC: G06F16/2457 , G06F16/242 , G06F16/9535

Abstract: A content retrieval system may receive a query associated with a plurality of content items in a repository. For each content item of the plurality of content items: a respective first and second similarity score may be generated based on a similarity between embeddings indicative of a first and second data type generated from the query and for the content item; and a respective normalized similarity score may be generated based on a combination of the respective first and second similarity scores. A set of content items with respective normalized similarity scores that satisfy a similarity score threshold may be identified. An exact-match (lexical) search may yield respective mapping scores for content items that may also be ranked. An output indicative of content items that are identified in the set of content items with high-ranking similarity scores and identified in the set of content items with high-ranking mapping scores.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification