METHODS AND SYSTEMS FOR ANALYZING HEALTHCARE DATA
    1.
    Invention application
    Status: Under examination (published)

    Publication number: US20150227691A1

    Publication date: 2015-08-13

    Application number: US14179752

    Filing date: 2014-02-13

    Applicant: Xerox Corporation

    IPC classification: G06F19/00 G06N99/00 G06N7/00

    Abstract: Disclosed are the embodiments for creating a model capable of identifying one or more clusters in a healthcare dataset. An input is received pertaining to a range of numbers. Each number in the range of numbers is representative of a number of clusters in the healthcare dataset. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the healthcare dataset is determined. The one or more first parameters are updated to generate one or more second parameters based on the estimated inverse cumulative distribution. A model is created for each number in the range of numbers based on the one or more second parameters.

    METHODS AND SYSTEMS FOR PREDICTING A HEALTH CONDITION OF A HUMAN SUBJECT
    2.
    Invention application
    Status: Under examination (published)

    Publication number: US20160306935A1

    Publication date: 2016-10-20

    Application number: US14687128

    Filing date: 2015-04-15

    Applicant: XEROX CORPORATION

    Abstract: Disclosed are embodiments of methods and systems for predicting a health condition of a first human subject. The method comprises extracting a historical data including physiological parameters of one or more second human subjects. A latent variable is determined based on an inverse cumulative distribution of a transformed historical data, determined by ranking of the historical data. Further, one or more parameters of a first distribution, deterministic of health conditions in the historical data, are determined based on the latent variable. For each physiological parameter, a random variable is sampled from a second distribution of the physiological parameter based on the one or more parameters. Further, based on the random variable, the latent variable is updated. Thereafter, the one or more parameters are re-estimated based on the updated latent variable. Based on the first distribution a classifier is trained to predict the health condition of the first human subject.
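
Illustrative note: the abstract above derives a latent variable from the inverse cumulative distribution of rank-transformed data. The sketch below shows one common way to do that (normal scores), and it is not taken from the patent. The function name `normal_scores`, the `n + 1` rescaling, the choice of a standard normal target, and the sample values are all assumptions made for illustration.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map each value to a standard-normal score via its rank (assumed transform)."""
    n = len(values)
    # Rank positions 1..n; ties are broken by input order in this sketch.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Dividing by n + 1 keeps every pseudo-observation strictly inside (0, 1),
    # where the inverse CDF is defined.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]

heart_rates = [72.0, 110.0, 64.0, 95.0, 88.0]  # made-up illustrative values
latent = normal_scores(heart_rates)
```

Because only ranks are used, the resulting scores do not depend on the original measurement scale of each physiological parameter.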

    SYSTEM AND METHOD FOR PREDICTING HEALTH CONDITION OF A PATIENT
    3.
    Invention application
    Status: Under examination (published)

    Publication number: US20160300034A1

    Publication date: 2016-10-13

    Application number: US14632117

    Filing date: 2015-02-26

    Applicant: XEROX CORPORATION

    IPC classification: G06F19/00

    CPC classification: G16H50/30 G16H10/60

    Abstract: According to embodiments illustrated herein, there is provided a system for predicting a health condition of a first patient. The system includes a document processor configured to extract one or more headings from one or more medical records of the first patient based on one or more predefined rules. The document processor is further configured to extract one or more words from one or more phrases written under each of the extracted one or more headings, wherein the one or more phrases correspond to documentation of the observation of the first patient by a medical attender. The system further includes one or more processors configured to predict the health condition of the first patient based on a count of the one or more words in historical medical records and the one or more medical records.
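
Illustrative note: the abstract above describes rule-based heading extraction followed by counting the words written under each heading. A minimal sketch of that preprocessing step might look like the following. The heading rule (`HEADING_RULE`), the helper `words_by_heading`, and the sample record are hypothetical; the patent's actual rules and the prediction step are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical rule: a heading is a line of words starting with a capital
# letter and ending in a colon.
HEADING_RULE = re.compile(r"^([A-Z][A-Za-z ]*):\s*$")

def words_by_heading(record):
    """Return {heading: Counter of lowercase words written under it}."""
    counts = {}
    current = None
    for line in record.splitlines():
        match = HEADING_RULE.match(line.strip())
        if match:
            current = match.group(1)
            counts[current] = Counter()
        elif current is not None:
            counts[current].update(re.findall(r"[a-z]+", line.lower()))
    return counts

# Made-up record text for illustration only.
record = """Chief Complaint:
chest pain and shortness of breath
Assessment:
chest pain likely cardiac, monitor pain overnight
"""
counts = words_by_heading(record)
```

The per-heading counts could then be compared against counts from historical records, which is the input the abstract names for the prediction step.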

    METHODS AND SYSTEMS FOR PREDICTING HEALTH CONDITION OF HUMAN SUBJECTS
    4.
    Invention application
    Status: Granted

    Publication number: US20160246931A1

    Publication date: 2016-08-25

    Application number: US14629766

    Filing date: 2015-02-24

    Applicant: XEROX CORPORATION

    IPC classification: G06F19/00 G06N7/00 G06N99/00

    Abstract: Disclosed are methods and systems for classifying one or more human subjects in one or more categories indicative of a health condition of the one or more human subjects. The method includes categorizing one or more parameters of each of the one or more human subjects in one or more data views based on a data type of each of the one or more parameters. A data view corresponds to a first data structure storing a set of parameters categorized in the data view, associated with each of the one or more human subjects. The one or more data views are transformed to a second data structure representative of the set of parameters across the one or more data views. Thereafter, a classifier is trained based on the second data structure, wherein the classifier classifies the one or more human subjects in the one or more categories.
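
Illustrative note: the abstract above groups each subject's parameters into data views by data type and then transforms the views into a single structure. The sketch below is one simple reading of that idea, assuming the Python type name identifies a view and that the "second data structure" is a plain per-subject merge. The patent may use a different transformation; `build_views`, `merge_views`, and the sample subjects are hypothetical.

```python
from collections import defaultdict

def build_views(subjects):
    """Group each subject's parameters into per-type views (first data structures)."""
    views = defaultdict(dict)  # view name -> {subject id -> {parameter: value}}
    for subject_id, params in subjects.items():
        for name, value in params.items():
            view = type(value).__name__  # e.g. "float", "str", "bool"
            views[view].setdefault(subject_id, {})[name] = value
    return dict(views)

def merge_views(views):
    """Combine all views into one row per subject (assumed second data structure)."""
    merged = defaultdict(dict)
    for view_name in sorted(views):
        for subject_id, params in views[view_name].items():
            for name, value in params.items():
                merged[subject_id][f"{view_name}:{name}"] = value
    return dict(merged)

# Made-up subjects for illustration only.
subjects = {
    "s1": {"age": 54.0, "smoker": True, "diagnosis": "copd"},
    "s2": {"age": 61.0, "smoker": False, "diagnosis": "asthma"},
}
views = build_views(subjects)
merged = merge_views(views)
```

Each row of `merged` could then be encoded numerically and passed to a classifier, which is the training input the abstract describes.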

    METHODS AND SYSTEMS FOR PREDICTING MORTALITY OF A PATIENT
    5.
    Invention application
    Status: Under examination (published)

    Publication number: US20170055916A1

    Publication date: 2017-03-02

    Application number: US14841812

    Filing date: 2015-09-01

    Applicant: XEROX CORPORATION

    IPC classification: A61B5/00 A61B5/145 A61B5/0205

    Abstract: Disclosed are embodiments of methods and systems for predicting mortality of a first patient. The method comprises categorizing a historical data into a first category and a second category. The method further comprises determining a first test parameter and a second test parameter based on at least one of a sample data of a first patient and the historical data corresponding to at least one of the first category and the second category. The method further comprises determining a probability score based on a cumulative distribution of at least one of the first test parameter and the second test parameter. The method further comprises categorizing the sample data in one of the first category and the second category based on the probability score. Further, the method comprises predicting the mortality of the first patient based on at least the categorization of the sample data of the first patient.

    METHODS AND SYSTEMS FOR DETERMINING INTER-DEPENDENICES BETWEEN APPLICATIONS AND COMPUTING INFRASTRUCTURES
    6.
    Invention application
    Status: Granted

    Publication number: US20150324695A1

    Publication date: 2015-11-12

    Application number: US14273566

    Filing date: 2014-05-09

    Applicant: Xerox Corporation

    IPC classification: G06N5/04 H04L29/08

    CPC classification: G06N5/04 G06N99/005 H04L67/10

    Abstract: Methods and systems for creating one or more statistical classifiers. A first set of performance parameters, corresponding to the one or more applications and the one or more computing infrastructures, is extracted from a historical data pertaining to the execution of the one or more applications on the one or more computing infrastructures. Further, a set of application-specific and a set of infrastructure-specific parameters are selected, from the first set of performance parameters, based on one or more statistical techniques. A similarity between each pair of the applications, each pair of the computing infrastructures, and each pair of possible combinations of an application and a computing infrastructure is determined. One or more statistical classifiers are created, based on the determined similarity.

    METHODS AND SYSTEMS FOR MODELING CLOUD USER BEHAVIOR
    7.
    Invention application
    Status: Under examination (published)

    Publication number: US20150294230A1

    Publication date: 2015-10-15

    Application number: US14250407

    Filing date: 2014-04-11

    Applicant: Xerox Corporation

    Abstract: Some embodiments are directed to a system for identifying clusters from a plurality of users using cloud services. A behavior collection module is configured to obtain user preferences for the plurality of users, and an EM module is configured to estimate at least one parameter of a distance-based model by the Expectation-Maximization (EM) algorithm for various values of G (number of clusters). A selection module is configured to compute Bayesian Information Criteria (BIC) with the at least one estimated parameter obtained from the EM module for various values of G, compare BICs obtained for various values of G, select the model with the highest BIC as the best model (best model including the plurality of clusters) and use estimated latent variables of the best model to build a classifier. A characterization module is configured to classify each user into a cluster of the best model using the classifier, and to determine ranking preference of each cluster.
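
Illustrative note: the abstract above selects the number of clusters G by comparing BIC values and keeping the highest. "Highest is best" corresponds to the sign convention BIC = 2 log L - k ln(n), and the sketch below assumes that convention. The function names, the log-likelihoods, and the parameter counts are made up; in practice they would come from the EM fits.

```python
import math

def bic(log_likelihood, num_params, num_obs):
    """BIC in the 'larger is better' convention: 2*logL - k*ln(n)."""
    return 2.0 * log_likelihood - num_params * math.log(num_obs)

def select_best(fits, num_obs):
    """fits maps G -> (log_likelihood, num_params); return the G with highest BIC."""
    scores = {g: bic(ll, k, num_obs) for g, (ll, k) in fits.items()}
    best_g = max(scores, key=scores.get)
    return best_g, scores

# Made-up fit summaries for G = 1..4 clusters on 200 users.
fits = {1: (-520.0, 3), 2: (-455.0, 7), 3: (-440.0, 11), 4: (-437.0, 15)}
best_g, scores = select_best(fits, num_obs=200)
```

With these illustrative numbers the likelihood gain from G = 3 to G = 4 is too small to offset the penalty for four extra parameters, so G = 3 is selected.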

    METHODS AND SYSTEMS FOR ANALYZING FINANCIAL DATASET
    8.
    Invention application
    Status: Under examination (published)

    Publication number: US20150228015A1

    Publication date: 2015-08-13

    Application number: US14179775

    Filing date: 2014-02-13

    Applicant: Xerox Corporation

    IPC classification: G06Q40/02

    CPC classification: G06Q40/025

    Abstract: Disclosed are the embodiments for creating a model capable of identifying one or more clusters in a financial data. An input is received pertaining to a range of numbers. Each number in the range of numbers is representative of a number of clusters in the financial data. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the financial data is determined. The one or more first parameters are updated to generate one or more second parameters based on the estimated inverse cumulative distribution. A model is created for each number in the range of numbers based on the one or more second parameters.

    METHODS AND SYSTEMS FOR SCHEDULING A BATCH OF TASKS
    9.
    Invention application
    Status: Under examination (published)

    Publication number: US20150220871A1

    Publication date: 2015-08-06

    Application number: US14171793

    Filing date: 2014-02-04

    Applicant: XEROX CORPORATION

    IPC classification: G06Q10/06

    CPC classification: G06Q10/063112

    Abstract: The disclosed embodiments illustrate methods and systems for scheduling a batch of tasks on one or more crowdsourcing platforms. The method includes generating one or more forecast models for each of the one or more crowdsourcing platforms based on historical data associated with each of the one or more crowdsourcing platforms and a robustness parameter. Thereafter, for a forecast model, from the one or more forecast models, associated with each of the one or more crowdsourcing platforms, a schedule is generated based on the forecast model and one or more parameters associated with the batch of tasks. Further, the schedule is executed on each of the one or more forecast models associated with the one or more crowdsourcing platforms to determine a performance score of the schedule on each of the one or more forecast models. Finally, the schedule is recommended to a requestor based on the performance score.

    CLUSTERING HIGH DIMENSIONAL DATA USING GAUSSIAN MIXTURE COPULA MODEL WITH LASSO BASED REGULARIZATION

    Publication number: US20170293856A1

    Publication date: 2017-10-12

    Application number: US15093302

    Filing date: 2016-04-07

    Applicant: Xerox Corporation

    IPC classification: G06N99/00 G06F17/30

    CPC classification: G06F16/285 G06N7/005

    Abstract: LASSO constraints can lead to a Gaussian mixture copula model that is more robust, better conditioned, and more reflective of the actual clusters in the training data. These qualities of the GMCM have been shown with data obtained from: digital images of fine needle aspirates of breast tissue for detecting cancer; email for detecting spam; two dimensional terrain data for detecting hills and valleys; and video sequences of hand movements to detect gestures. Using training data, a GMCM estimate can be produced and iteratively refined to maximize a penalized log likelihood estimate until sequential iterations are within a threshold value of one another. The GMCM estimate can then be used to classify further samples. The LASSO constraints help keep the analysis tractable such that useful results can be found and used while the result is still useful.
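
Illustrative note: the abstract above refers to maximizing a penalized log likelihood until successive iterations fall within a threshold of one another. A generic LASSO-penalized objective and stopping rule can be written as below. The symbols are assumptions introduced here, not the patent's notation: \theta stands for whichever GMCM parameters are penalized, \ell for the unpenalized log likelihood, \lambda \ge 0 for the penalty weight, and \varepsilon for the convergence threshold.

```latex
\ell_{\lambda}(\theta) = \ell(\theta) - \lambda \sum_{j} \lvert \theta_j \rvert,
\qquad
\text{stop when }
\bigl\lvert \ell_{\lambda}(\theta^{(t+1)}) - \ell_{\lambda}(\theta^{(t)}) \bigr\rvert < \varepsilon
```

Larger values of \lambda push more penalized parameters to exactly zero, which is the usual way an L1 penalty yields a sparser and better-conditioned estimate.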