Patent search: ap:("INTERNATIONAL BUSINESS MACHINES CORPORATION") AND inv:"Rajesh Bordawekar" (page 1)

1.

Published invention application
CLUSTERING NUMERICAL VALUES USING LOGARITHMIC BINNING (legal status: under examination, published)

Publication number: US20240202214A1

Publication date: 2024-06-20

Application number: US18067770

Filing date: 2022-12-19

Applicant: International Business Machines Corporation

Inventor: Rajesh Bordawekar

IPC classification: G06F16/28, G06F16/242

CPC classification: G06F16/285, G06F16/2433

Abstract: Clustering data points of a relational database having special data types is performed by establishing logarithmic bins in which the data is collected. Special data types include (i) zero; (ii) positive and negative values; (iii) infinity (positive and negative); (iv) not-a-number values (NaNs); (v) out-of-range values; and (vi) IEEE DECFloat (decimal floating-point) values. The numerical data is mapped to bins according to their values and redistributed among the bins based on median bin value. An occupancy-based partitioning process assures each bin has no more than a pre-defined threshold percentage of the data. Assigning data bins to clusters facilitates prediction of placement of input values into a particular cluster for response to database queries.

2.

Published invention application
SCALABLE COUNT BASED INTERPRETABILITY FOR DATABASE ARTIFICIAL INTELLIGENCE (AI) (legal status: under examination, published)

Publication number: US20240045866A1

Publication date: 2024-02-08

Application number: US17817428

Filing date: 2022-08-04

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Prabhakar Kudva

IPC classification: G06F16/2453, G06F16/22, G06F16/248

CPC classification: G06F16/24549, G06F16/2255, G06F16/248

Abstract: Systems, computer-implemented methods or computer program products to facilitate receiving results of a semantic structured query language (SQL) query and employing sparse hash-table based sketches to interpret a semantic structured query language (SQL) query result. A computing component stores a first space-efficient structure sketch in a compressed serialized form. The computing component can load a second space-efficient data structure sketch along with the first space-efficient data structure sketch and can compute one or more interpretability scores by extracting co-occurrence information from the first space-efficient data structure sketch. The second space-efficient data structure sketch can include a sketch for containment check.

3.

Granted invention patent
Comparing time series data using context-based similarity (legal status: in force)

Publication number: US11244224B2

Publication date: 2022-02-08

Application number: US15926109

Filing date: 2018-03-20

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho

IPC classification: G06N3/04, G06K9/62, G06N5/04, G06N20/00

Abstract: A first observation window in a first time series is identified. The first observation window is preceded by a first portion of the first time series. A neural network is trained using the first portion of the first time series and the first observation window, and weights are extracted from the middle layers of the neural network. A first feature vector is generated based on the weights. A second observation window in a second time series is identified, where the second observation window is preceded by a first portion of the second time series. A second feature vector associated with the second observation window is determined. The second feature vector is based at least in part on the first set of weights. A similarity between the first and second observation windows is determined based on comparing the first feature vector and the second feature vector.

4.

Invention application
BUILDING A WORD EMBEDDING MODEL TO CAPTURE RELATIONAL DATA SEMANTICS (legal status: in force)

Publication number: US20210124724A1

Publication date: 2021-04-29

Application number: US16665364

Filing date: 2019-10-28

Applicant: International Business Machines Corporation

Inventor: Rajesh Bordawekar

IPC classification: G06F16/22, G06F16/242, G06F16/28

Abstract: A computer-implemented method according to one embodiment includes identifying a relational database; determining columns of interest within the relational database; creating an unordered group of string tokens for each row of the relational database, utilizing the determined columns of interest; assigning weights for one or more columns within the relational database to one or more string tokens within each unordered group of string tokens to create a plurality of weighted unordered groups of string tokens; and determining a meaning vector for an identifier of each row of the relational database, utilizing the plurality of weighted unordered groups of string tokens.

5.

Invention application
RECORD CORRECTION AND COMPLETION USING DATA SOURCED FROM CONTEXTUALLY SIMILAR RECORDS (legal status: under examination, published)

Publication number: US20200159853A1

Publication date: 2020-05-21

Application number: US16197137

Filing date: 2018-11-20

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho

IPC classification: G06F17/30, G06F17/27

Abstract: From a first attribute-value pair in a record, new data comprising a first token is created. From each token using a processor and a memory, new data including a corresponding vector is computed. From the record, a target row is selected, wherein a target attribute-value pair in the target row includes a value requiring correction. Using a similarity measure, a set of most similar rows to the target row is determined, wherein each row in the set of most similar rows to the target row has a corresponding similarity measure above a threshold similarity measure and wherein each row in the set of most similar rows includes the target attribute. From values corresponding to the target attribute in the set of most similar rows, a replacement value is determined. The value requiring correction in the target row is replaced with the replacement value.

6.

Granted invention patent
Provisioning service requests in a computer system (legal status: in force)

Publication number: US10217053B2

Publication date: 2019-02-26

Application number: US14747062

Filing date: 2015-06-23

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Ashish Kundu, Oded Shmueli

IPC classification: G06N5/04, G06N7/00, G06N99/00, G06F9/50

Abstract: Disclosed is a system, computer program product, and method for provisioning a new service request. The computer-implemented method begins with receiving a new service request for computational resources in a computing system. The required computational resources are memory usage, storage usage, processor usage, or a combination thereof to fulfill the new service request. Next a sandbox computing environment is used to operate the new service request. The sandbox computing environment is used to isolate the computing system. The sandbox computing environment produces a current computational resources usage data to fulfill the new service request in the sandbox computing environment. The current sandbox computational resources usage data and historical computational resources usage data are both used by a machine learning module to create a prediction of the computational resources that will be required in the computing system to fulfill the new service request.

7.

Granted invention patent
Parallelized in-place radix sorting (legal status: in force)

Publication number: US09892149B2

Publication date: 2018-02-13

Application number: US14750363

Filing date: 2015-06-25

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Daniel Brand, Minsik Cho, Ulrich Finkler, Ruchir Puri

IPC classification: G06F17/30

CPC classification: G06F17/30345, G06F17/30324, G06F17/30445, G06F17/30598

Abstract: Methods for sorting a data set. A data storage is divided into a plurality of buckets that is each associated with a respective key value. A plurality of stripes is identified in each bucket. At least one data stripe set is defined that has one stripe within each respective bucket. An in-place partial bucket radix sort is performed on data items contained within one data stripe set with a first processor using an initial radix. Incorrectly sorted data items are then grouped in each bucket into a respective incorrect data item group within each bucket. A radix sort is then performed using the initial radix on the items within the respective incorrect data item group. A first level sorted output is produced.

8.

Invention application
INTERPRETATION OF RESULTS OF A SEMANTIC QUERY OVER A STRUCTURED DATABASE (legal status: in force)

Publication number: US20220269686A1

Publication date: 2022-08-25

Application number: US17184303

Filing date: 2021-02-24

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Apoorva Nitsure

IPC classification: G06F16/2457, G06F16/2455, G06N5/04, G06F16/2453, G06F11/34

Abstract: Systems, computer-implemented methods and/or computer program products to facilitate interpretation of a result of execution of a query over a structured database are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a determination component that determines a result of execution of a query over a structured database. The computer executable components also can comprise an interpretation component that interprets data underlying the result of execution of the query to determine one or more reasons that the result is provided in response to the query.

9.

Invention application
COMMUNICATION-EFFICIENT DATA PARALLEL ENSEMBLE BOOSTING (legal status: in force)

Publication number: US20220180253A1

Publication date: 2022-06-09

Application number: US17114644

Filing date: 2020-12-08

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho

IPC classification: G06N20/20, G06F9/52, G06N5/00

Abstract: Data-parallel ensemble training using gradient boosted trees includes training an ensemble of trees. The training includes splitting a training dataset into several data portions. Each data portion is assigned to each thread group from a set of thread groups. The training further includes executing a stage, in which each thread group, in parallel, trains a respective ensemble of decision trees. Executing the stage includes performing, by each thread group, in parallel, machine learning operations for the respective ensemble of decision trees using the data portion assigned to each thread group. Further, each thread group validates, in parallel, the respective ensemble of decision trees using a data portion assigned to another thread group. Execution of the stage is repeated until a predetermined threshold is satisfied. Further, a prediction is inferenced using the ensemble of decision trees that is formed using the respective ensemble of trees from each of the thread groups.

10.

Invention application
VECTOR EMBEDDING MODELS FOR RELATIONAL TABLES WITH NULL OR EQUIVALENT VALUES (legal status: in force)

Publication number: US20210294794A1

Publication date: 2021-09-23

Application number: US16825509

Filing date: 2020-03-20

Applicant: International Business Machines Corporation

Inventors: Rajesh Bordawekar, Tin Kam Ho

IPC classification: G06F16/242, G06F16/22, G06F40/30, G06N3/08, G06K9/62, G06F9/38

Abstract: Structured and semi-structured databases and files are processed using natural language processing techniques to impute data for null value tokens in database records from other records that have non-null values for the same attributes. Vector embedding techniques are used, including, in some cases, appropriately tagging null value tokens to reduce or eliminate their undue impact on semantic vectors generated using a neural network.
