Patent search ap:("International Business Machines Corporation") AND inv:"Ritesh Kumar Gupta"

1.

Invention application
INSIGHT EXPANSION IN SMART DATA RETENTION SYSTEMS (status: in force)

Publication (announcement) number: US20220222265A1

Publication (announcement) date: 2022-07-14

Application number: US17145458

Application date: 2021-01-11

Applicant: International Business Machines Corporation

Inventor: Namit Kabra, Ritesh Kumar Gupta, Ron Reuben, Vijay Ekambaram, Smitkumar Narotambhai Marvaniya

IPC: G06F16/25, G06F16/23, G06F16/951, G06F16/215, G06N20/00

Abstract: A computer-implemented method applies insights from a variety of data sources to each of the data sources. The method includes identifying a set of data sources, wherein each of the data sources is associated with a domain. The method includes analyzing documentation for each of the data sources. The method further includes extracting a set of attributes for each data source, and determining a data schema associated with each data source. The method includes mapping each data schema to a common domain schema. The method also includes linking, based on the mapping and on the set of attributes for each data source, common features across each data source. The method includes generating, in response to the linking, a knowledge graph. The method further includes preparing a visual display for a set of domain insights, and forking the set of domain insights into a first data source.
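
The schema-mapping and linking steps in this abstract can be illustrated with a minimal sketch. It is not the patented method: the source names, column names, and the hand-written mapping table are all hypothetical, and the "knowledge graph" is reduced to a list of triples.

```python
from collections import defaultdict

# Hypothetical source schemas: source name -> list of column names.
SOURCE_SCHEMAS = {
    "crm": ["cust_id", "email_addr", "signup_dt"],
    "billing": ["customer_id", "email", "invoice_total"],
}

# Hypothetical mapping of each source column to a common domain field.
COMMON_SCHEMA_MAP = {
    "cust_id": "customer_id",
    "customer_id": "customer_id",
    "email_addr": "email",
    "email": "email",
    "signup_dt": "signup_date",
    "invoice_total": "invoice_total",
}


def link_common_features(schemas, mapping):
    """Return {common_field: sorted sources} for fields shared by 2+ sources."""
    sources_by_field = defaultdict(set)
    for source, columns in schemas.items():
        for column in columns:
            sources_by_field[mapping[column]].add(source)
    return {f: sorted(s) for f, s in sources_by_field.items() if len(s) > 1}


def build_graph_edges(links):
    """Emit (source, 'has_field', common_field) triples as a tiny graph."""
    return sorted(
        (src, "has_field", field) for field, srcs in links.items() for src in srcs
    )


links = link_common_features(SOURCE_SCHEMAS, COMMON_SCHEMA_MAP)
edges = build_graph_edges(links)
```

Here `links` contains only the fields that both hypothetical sources share after mapping, which is the information a graph of cross-source features would be built from.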

2.

Invention grant
Targeted data acquisition for model training (status: in force)

Publication (announcement) number: US12217195B2

Publication (announcement) date: 2025-02-04

Application number: US18392342

Application date: 2023-12-21

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Namit Kabra, Ritesh Kumar Gupta, Vijay Ekambaram, Smitkumar Narotambhai Marvaniya

IPC: G06F15/16, G06N5/04, G06N20/00

Abstract: Targeted acquisition of data for model training includes identifying attributes of classified samples of a collection of samples classified by a classification model, and generating at least one query based on the identified attributes, the at least one query tailored, based on the attributes, to retrieve additional training data for training the classification model to more accurately classify samples and avoid incorrect sample classification.

3.

Invention grant
Targeted data acquisition for model training (status: in force)

Publication (announcement) number: US11907860B2

Publication (announcement) date: 2024-02-20

Application number: US17935341

Application date: 2022-09-26

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Namit Kabra, Ritesh Kumar Gupta, Vijay Ekambaram, Smitkumar Narotambhai Marvaniya

IPC: G06F15/16, G06N5/04, G06N20/00

CPC classification number: G06N5/04, G06N20/00

Abstract: Targeted acquisition of data for model training includes automatically generating metadata describing samples, of an initial dataset, in neighborhoods of an embedding space in which the samples are embedded. The samples described by the automatically generated metadata are classified by a classification model, and include both correctly classified samples in the neighborhoods and incorrectly classified samples in the neighborhoods. Additionally, attributes of one or more correctly classified samples of the collection of samples and one or more incorrectly classified samples of the collection of samples are identified, and queries are generated based on the identified attributes, the queries tailored, based on the attributes, to retrieve additional training data for training the classification model to more accurately classify samples and avoid incorrect sample classification.
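
As a rough illustration of generating queries from the attributes of misclassified samples, the sketch below counts attributes per label and emits a query for each attribute that is over-represented among errors. The sample data, the attribute tags, the `min_errors` threshold, and the query format are assumptions made for the example. The embedding-space neighborhoods mentioned in the abstract are not modeled.

```python
from collections import Counter

# Hypothetical classified samples: metadata attributes plus whether the
# classifier's prediction matched the true label.
samples = [
    {"label": "cat", "attrs": {"night", "outdoor"}, "correct": False},
    {"label": "cat", "attrs": {"night", "indoor"}, "correct": False},
    {"label": "cat", "attrs": {"day", "indoor"}, "correct": True},
    {"label": "dog", "attrs": {"day", "outdoor"}, "correct": True},
]


def generate_queries(samples, min_errors=2):
    """Build one query per (label, attribute) pair that appears in at least
    `min_errors` misclassified samples and more often in misclassified
    samples than in correctly classified ones."""
    wrong, right = Counter(), Counter()
    for s in samples:
        bucket = right if s["correct"] else wrong
        for attr in s["attrs"]:
            bucket[(s["label"], attr)] += 1
    return sorted(
        f"{label} {attr}"
        for (label, attr), n in wrong.items()
        if n >= min_errors and n > right[(label, attr)]
    )


queries = generate_queries(samples)
```

With this toy data the only query produced targets night-time cat images, the one attribute that appears repeatedly among the errors and never among the correct predictions.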

4.

Invention grant
Self-learning selection of information-analysis runtimes (status: in force)

Publication (announcement) number: US11288601B2

Publication (announcement) date: 2022-03-29

Application number: US16360118

Application date: 2019-03-21

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Ritesh Kumar Gupta, Namit Kabra, Eric Allen Jacobson, Scott Louis Brokaw, Jo Arao Ramos

IPC: G06N20/20, G06N5/04, G06N5/02, G06K9/62

Abstract: A self-learning computer-based system has access to multiple runtime modules that are each capable of performing a particular algorithm. Each runtime module implements the algorithm with different code or runs in a different runtime environment. The system responds to a request to run the algorithm by selecting the runtime module or runtime environment that the system predicts will provide the most desirable results based on parameters like accuracy, performance, cost, resource-efficiency, or policy compliance. The system learns how to make such predictions through training sessions conducted by a machine-learning component. This training teaches the system that previous module selections produced certain types of results in the presence of certain conditions. After determining whether similar conditions currently exist, the system uses rules inferred from the training sessions to select the runtime module most likely to produce desired results.
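
The selection idea can be sketched with a much simpler stand-in for the machine-learning component: average the scores observed for each runtime under each condition, then pick the best-scoring runtime when a similar condition recurs. The condition labels, runtime names, and scores are hypothetical, and a per-condition average is an assumed substitute for the inferred rules the abstract describes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical training history: (condition, runtime module, observed score).
history = [
    ("small_input", "in_process", 0.9),
    ("small_input", "cluster", 0.4),
    ("large_input", "in_process", 0.2),
    ("large_input", "cluster", 0.8),
    ("large_input", "cluster", 0.9),
]


def learn_rules(history):
    """Return {condition: runtime with the highest mean observed score}."""
    scores = defaultdict(list)
    for condition, runtime, score in history:
        scores[(condition, runtime)].append(score)
    best = {}
    for (condition, runtime), values in scores.items():
        avg = mean(values)
        if condition not in best or avg > best[condition][1]:
            best[condition] = (runtime, avg)
    return {condition: runtime for condition, (runtime, _) in best.items()}


def select_runtime(rules, condition, default="in_process"):
    """Pick the learned runtime for a condition, or a default if unseen."""
    return rules.get(condition, default)


rules = learn_rules(history)
```

A real system would score several criteria at once (accuracy, cost, policy compliance); the single numeric score here only keeps the example short.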

5.

Invention grant
Handling expiration of resources allocated by a resource manager running a data integration job (status: in force)

Publication (announcement) number: US11194629B2

Publication (announcement) date: 2021-12-07

Application number: US16211534

Application date: 2018-12-06

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Krishna Kishore Bonagiri, Eric Allen Jacobson, Ritesh Kumar Gupta, Indrani Ghatare, Scott Louis Brokaw

IPC: G06F9/50, G06F9/48, H04L29/08

Abstract: A method includes: receiving, by a computer device, a resource request for a data integration job, wherein the resource request is received from a job executor module and defines processes of the data integration job; allocating, by the computer device, containers for the processes of the data integration job; launching, by the computer device, a respective wrapper script on each respective one of the containers after allocating the respective one of the containers; and transmitting, by the computer device and in response to the allocating, node details to the job executor module. In embodiments, the wrapper script running on the container is configured to repeatedly check a predefined location for process commands from a job executor. After the resource manager allocates all the containers for a data integration job according to a resource request, the job executor writes the process commands to the predefined location. Each wrapper script continues to check the predefined location for the process command that it is assigned to run, and runs the process command as soon as it is available at the predefined location. The process commands may be indexed with index values matching those assigned to respective ones of the wrapper scripts.
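
The polling behaviour of the wrapper script can be sketched as follows. This is an illustration only: the shared directory, the `command_<index>.txt` file naming, the polling interval, and the timeout are assumptions, and the sketch returns the command text rather than executing it.

```python
import tempfile
import time
from pathlib import Path


def write_commands(location, commands):
    """Job-executor side: write one indexed command file per process."""
    for index, command in enumerate(commands):
        (Path(location) / f"command_{index}.txt").write_text(command)


def wait_for_command(location, index, poll_seconds=0.1, timeout=5.0):
    """Wrapper side: poll `location` until the command file for this
    wrapper's index appears, then return its contents. Raises TimeoutError
    if the file never shows up."""
    command_file = Path(location) / f"command_{index}.txt"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if command_file.exists():
            return command_file.read_text().strip()
        time.sleep(poll_seconds)
    raise TimeoutError(f"no command for index {index} in {location}")


# Demo: a temporary directory stands in for the predefined location.
with tempfile.TemporaryDirectory() as shared_dir:
    write_commands(shared_dir, ["stage_a --part 0", "stage_a --part 1"])
    assigned = wait_for_command(shared_dir, index=1)
```

Matching the file index to the wrapper's index mirrors the indexed process commands in the abstract. A production version would also need to guard against reading a file that is only partially written.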

6.

Invention publication
TARGETED DATA ACQUISITION FOR MODEL TRAINING (status: under examination, published)

Publication (announcement) number: US20240127085A1

Publication (announcement) date: 2024-04-18

Application number: US18392342

Application date: 2023-12-21

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Namit Kabra, Ritesh Kumar Gupta, Vijay Ekambaram, Smitkumar Narotambhai MARVANIYA

IPC: G06N5/04, G06N20/00

CPC classification number: G06N5/04, G06N20/00

Abstract: Targeted acquisition of data for model training includes identifying attributes of classified samples of a collection of samples classified by a classification model, and generating at least one query based on the identified attributes, the at least one query tailored, based on the attributes, to retrieve additional training data for training the classification model to more accurately classify samples and avoid incorrect sample classification.

7.

Invention publication
AUTOMATICALLY ORCHESTRATING A COMPUTERIZED WORKFLOW (status: under examination, published)

Publication (announcement) number: US20230409386A1

Publication (announcement) date: 2023-12-21

Application number: US17840698

Application date: 2022-06-15

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Anton Zorin, Manish Kesarwani, Niels Dominic Pardon, Ritesh Kumar Gupta, Sameep Mehta

IPC: G06F9/48, G06N7/00

CPC classification number: G06F9/4881, G06F9/4856, G06N7/005

Abstract: The method is performed at an orchestration interface, which receives update information, including changes to tasks of a workflow, from a task manager system (TMS). The workflow includes a set of tasks, inputs to the tasks, and outputs from the tasks. The inputs and outputs determine runtime dependencies between the tasks. Based on the update information received, the orchestration interface populates a topology of nodes and edges as a directed acyclic graph (DAG) that maps nodes to tasks and edges to runtime dependencies between tasks, based on node inputs and outputs. The orchestration interface instructs the execution of the tasks and handles dependencies by interacting with a task execution system (TES). By traversing the DAG, the orchestration interface identifies tasks that depend on completed tasks as per the runtime dependencies and instructs the TES to execute the dependent tasks identified.
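
A minimal sketch of deriving a DAG from task inputs and outputs, then walking it to find tasks that become runnable as their dependencies complete. The task names and artifacts are hypothetical, and the TMS and TES interactions are replaced by plain function calls.

```python
# Hypothetical workflow: each task lists the artifacts it reads and writes.
tasks = {
    "extract": {"inputs": [], "outputs": ["raw"]},
    "clean": {"inputs": ["raw"], "outputs": ["tidy"]},
    "stats": {"inputs": ["raw"], "outputs": ["summary"]},
    "report": {"inputs": ["tidy", "summary"], "outputs": ["pdf"]},
}


def build_dag(tasks):
    """Map each task to the set of tasks that produce its inputs."""
    producer = {out: name for name, t in tasks.items() for out in t["outputs"]}
    return {name: {producer[i] for i in t["inputs"]} for name, t in tasks.items()}


def ready_tasks(dag, completed):
    """Tasks not yet completed whose upstream dependencies are all completed."""
    return sorted(
        t for t, deps in dag.items() if t not in completed and deps <= completed
    )


def run_order(dag):
    """Traverse the DAG in waves, treating each ready batch as executed."""
    completed, order = set(), []
    while len(completed) < len(dag):
        batch = ready_tasks(dag, completed)
        if not batch:
            raise ValueError("cycle detected: no runnable task remains")
        order.append(batch)
        completed.update(batch)
    return order


dag = build_dag(tasks)
order = run_order(dag)
```

Each inner list in `order` is a batch whose members could run concurrently, since none of them depends on another task in the same batch.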

8.

Invention application
TARGETED DATA ACQUISITION FOR MODEL TRAINING (status: in force)

Publication (announcement) number: US20230016082A1

Publication (announcement) date: 2023-01-19

Application number: US17935341

Application date: 2022-09-26

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Namit Kabra, Ritesh Kumar Gupta, Vijay Ekambaram, Smitkumar Narotambhai MARVANIYA

IPC: G06N5/04, G06N20/00

Abstract: Targeted acquisition of data for model training includes automatically generating metadata describing samples, of an initial dataset, in neighborhoods of an embedding space in which the samples are embedded. The samples described by the automatically generated metadata are classified by a classification model, and include both correctly classified samples in the neighborhoods and incorrectly classified samples in the neighborhoods. Additionally, attributes of one or more correctly classified samples of the collection of samples and one or more incorrectly classified samples of the collection of samples are identified, and queries are generated based on the identified attributes, the queries tailored, based on the attributes, to retrieve additional training data for training the classification model to more accurately classify samples and avoid incorrect sample classification.

9.

Invention application
RESOLVING CONTAINER PREEMPTION (status: under examination, published)

Publication (announcement) number: US20200371839A1

Publication (announcement) date: 2020-11-26

Application number: US16417678

Application date: 2019-05-21

Applicant: International Business Machines Corporation

Inventor: Krishna Kishore Bonagiri, Eric A. Jacobson, Ritesh Kumar Gupta, Scott Louis Brokaw

IPC: G06F9/50, G06F9/48, G06F9/46

Abstract: A set of resources required to process a data integration job is determined. In response to determining that the set of resources is not available, queue occupation, for each queue in the computing environment, is predicted. Queue occupation is a workload of queue resources for a future time based on a previous workload. A best queue is selected based on the predicted queue occupation. The best queue is the queue or queues in the computing environment available to be assigned to process the data integration job without preemption. The data integration job is processed using the best queue. It is determined whether a preemption event occurred causing the removal of resources from the best queue. A checkpoint is created in response to determining that a preemption event occurred. The checkpoint indicates the last successful operation completed and provides a point where processing can resume when resources become available.
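
The queue-selection and checkpoint steps might look like the sketch below. The prediction model is an assumption (a plain mean of recent usage stands in for whatever forecast the system uses), and the queue names, capacities, and checkpoint fields are hypothetical.

```python
from statistics import mean

# Hypothetical queues: total capacity and recent usage samples (in containers).
queues = {
    "default": {"capacity": 10, "recent_usage": [9, 8, 9]},
    "batch": {"capacity": 20, "recent_usage": [5, 7, 6]},
    "adhoc": {"capacity": 8, "recent_usage": [1, 2, 3]},
}


def predict_occupation(recent_usage):
    """Assumed predictor: future workload equals the mean of recent workload."""
    return mean(recent_usage)


def select_best_queue(queues, required):
    """Return the queue with the most predicted free capacity that can hold
    `required` containers without preemption, or None if no queue can."""
    free = {
        name: q["capacity"] - predict_occupation(q["recent_usage"])
        for name, q in queues.items()
    }
    candidates = {name: f for name, f in free.items() if f >= required}
    return max(candidates, key=candidates.get) if candidates else None


def make_checkpoint(completed_ops):
    """Record the last successful operation and where processing can resume."""
    return {
        "last_successful_op": completed_ops[-1] if completed_ops else None,
        "resume_from": len(completed_ops),
    }


best = select_best_queue(queues, required=6)
checkpoint = make_checkpoint(["read", "transform"])
```

If a preemption event removed resources after the `transform` step, the checkpoint tells the job to resume at operation index 2 once resources return.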

10.

Invention grant
Dynamically modifying the parallelism of a task in a pipeline (status: in force)

Publication (announcement) number: US11461135B2

Publication (announcement) date: 2022-10-04

Application number: US16663428

Application date: 2019-10-25

Applicant: International Business Machines Corporation

Inventor: Yannick Saillet, Namit Kabra, Ritesh Kumar Gupta

IPC: G06F9/46, G06F9/48, G06N20/00

Abstract: In an approach to dynamically identifying and modifying the parallelism of a particular task in a pipeline, the optimal execution time of each stage in a dynamic pipeline is calculated. The actual execution time of each stage in the dynamic pipeline is measured. Whether the actual time of completion of the data processing job will exceed a threshold is determined. If it is determined that the actual time of completion of the data processing job will exceed the threshold, then additional instances of the stages are created.
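
The scaling decision can be sketched under two simplifying assumptions that are not taken from the patent: stages run one after another, so the projected completion time is the sum of measured stage times, and a stage's time shrinks in proportion to its instance count. The stage names and timings are hypothetical.

```python
import math


def plan_instances(stages, deadline_seconds):
    """Return {stage name: instance count}. Stages slower than their optimal
    time are scaled up only when the projected total exceeds the deadline."""
    projected = sum(s["actual"] for s in stages)
    plan = {s["name"]: s["instances"] for s in stages}
    if projected <= deadline_seconds:
        return plan  # on track: leave parallelism unchanged
    for s in stages:
        if s["actual"] > s["optimal"]:
            # Assumed linear scaling: enough instances to reach the optimal time.
            plan[s["name"]] = math.ceil(s["instances"] * s["actual"] / s["optimal"])
    return plan


# Hypothetical pipeline: optimal vs. measured seconds per stage.
stages = [
    {"name": "read", "optimal": 10, "actual": 10, "instances": 1},
    {"name": "transform", "optimal": 20, "actual": 50, "instances": 2},
    {"name": "write", "optimal": 10, "actual": 12, "instances": 1},
]

plan = plan_instances(stages, deadline_seconds=60)
```

With a 60-second threshold the projected 72 seconds triggers scaling, so the slow `transform` and `write` stages gain instances while `read` is left alone.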
