Abstract:
Disclosed are embodiments for creating a model capable of identifying one or more clusters in a healthcare dataset. An input pertaining to a range of numbers is received. Each number in the range of numbers is representative of a number of clusters in the healthcare dataset. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the healthcare dataset is determined. The one or more first parameters are updated to generate one or more second parameters based on the determined inverse cumulative distribution. A model is created for each number in the range of numbers based on the one or more second parameters.
Abstract:
Disclosed are embodiments of methods and systems for predicting a health condition of a first human subject. The method comprises extracting historical data including physiological parameters of one or more second human subjects. A latent variable is determined based on an inverse cumulative distribution of transformed historical data, where the transformed historical data is obtained by ranking the historical data. Further, one or more parameters of a first distribution, deterministic of health conditions in the historical data, are determined based on the latent variable. For each physiological parameter, a random variable is sampled from a second distribution of the physiological parameter based on the one or more parameters. Further, the latent variable is updated based on the random variable. Thereafter, the one or more parameters are re-estimated based on the updated latent variable. Based on the first distribution, a classifier is trained to predict the health condition of the first human subject.
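The rank-based step in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the latent variable is obtained by passing scaled ranks through the inverse cumulative distribution of a standard normal; the abstract does not name the distribution, and the function name `rank_to_latent` and the `n + 1` scaling are hypothetical choices, not taken from the source.

```python
from statistics import NormalDist


def rank_to_latent(values):
    """Map raw values to latent scores via ranks (ties share an average rank).

    Assumption: the inverse CDF is that of a standard normal distribution.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # Divide by n + 1 so the inverse CDF never receives exactly 0 or 1.
    return [standard_normal.inv_cdf(r / (n + 1)) for r in ranks]
```

Because only ranks enter the transformation, the latent scores are unchanged by any strictly increasing rescaling of a physiological parameter, such as a change of units.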
Abstract:
According to embodiments illustrated herein, there is provided a system for predicting a health condition of a first patient. The system includes a document processor configured to extract one or more headings from one or more medical records of the first patient based on one or more predefined rules. The document processor is further configured to extract one or more words from one or more phrases written under each of the extracted one or more headings, wherein the one or more phrases correspond to documentation of the observation of the first patient by a medical attendant. The system further includes one or more processors configured to predict the health condition of the first patient based on a count of the one or more words in historical medical records and in the one or more medical records.
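The heading and word extraction described here might be sketched as follows. The rule used, that a heading is a capitalized line ending in a colon, is a hypothetical stand-in for the "predefined rules", which the abstract does not specify. The names `HEADING_RULE` and `words_by_heading` are likewise illustrative.

```python
import re
from collections import Counter

# Hypothetical predefined rule: a heading is a capitalized line ending in ":".
HEADING_RULE = re.compile(r"^([A-Z][A-Za-z ]*):\s*$")


def words_by_heading(record_text):
    """Return {heading: Counter of lowercase words written under it}."""
    counts = {}
    current_heading = None
    for line in record_text.splitlines():
        match = HEADING_RULE.match(line.strip())
        if match:
            current_heading = match.group(1)
            counts.setdefault(current_heading, Counter())
        elif current_heading is not None:
            # Text before the first heading is ignored in this sketch.
            counts[current_heading].update(re.findall(r"[a-z]+", line.lower()))
    return counts
```

The per-heading counts could then serve as features for whatever predictor is used; the abstract does not say which one.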
Abstract:
Disclosed are methods and systems for classifying one or more human subjects into one or more categories indicative of a health condition of the one or more human subjects. The method includes categorizing one or more parameters of each of the one or more human subjects into one or more data views based on a data type of each of the one or more parameters. A data view corresponds to a first data structure that stores the set of parameters categorized in the data view for each of the one or more human subjects. The one or more data views are transformed to a second data structure representative of the set of parameters across the one or more data views. Thereafter, a classifier is trained based on the second data structure, wherein the classifier classifies the one or more human subjects into the one or more categories.
Abstract:
Disclosed are embodiments of methods and systems for predicting mortality of a first patient. The method comprises categorizing historical data into a first category and a second category. The method further comprises determining a first test parameter and a second test parameter based on at least one of sample data of the first patient and the historical data corresponding to at least one of the first category and the second category. The method further comprises determining a probability score based on a cumulative distribution of at least one of the first test parameter and the second test parameter. The method further comprises categorizing the sample data into one of the first category and the second category based on the probability score. Further, the method comprises predicting the mortality of the first patient based on at least the categorization of the sample data of the first patient.
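One possible reading of the probability-score step is sketched below. Every specific in it is an assumption made for illustration: the test parameter is taken to be a z-statistic of a single sample value against each category, the cumulative distribution is taken to be standard normal, and the names `tail_probability` and `categorize` are hypothetical. The abstract itself fixes none of these.

```python
from statistics import NormalDist, mean, stdev


def tail_probability(sample_value, category_values):
    """Two-sided tail probability of the sample under one category.

    Assumption: the test parameter is a z-statistic and the cumulative
    distribution is that of a standard normal.
    """
    z = (sample_value - mean(category_values)) / stdev(category_values)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def categorize(sample_value, first_category, second_category):
    """Assign the sample to the category with the larger probability score."""
    p_first = tail_probability(sample_value, first_category)
    p_second = tail_probability(sample_value, second_category)
    label = "first" if p_first >= p_second else "second"
    return label, p_first, p_second
```

A larger score means the sample is less extreme relative to that category's historical values, so the sample is assigned to the category under which it looks most typical.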
Abstract:
Disclosed are methods and systems for creating one or more statistical classifiers. A first set of performance parameters, corresponding to one or more applications and one or more computing infrastructures, is extracted from historical data pertaining to the execution of the one or more applications on the one or more computing infrastructures. Further, a set of application-specific parameters and a set of infrastructure-specific parameters are selected from the first set of performance parameters based on one or more statistical techniques. A similarity is determined between each pair of the applications, each pair of the computing infrastructures, and each pair of possible combinations of an application and a computing infrastructure. One or more statistical classifiers are created based on the determined similarity.
Abstract:
Some embodiments are directed to a system for identifying clusters from a plurality of users using cloud services. A behavior collection module is configured to obtain user preferences for the plurality of users, and an EM module is configured to estimate at least one parameter of a distance-based model by the Expectation-Maximization (EM) algorithm for various values of G (the number of clusters). A selection module is configured to compute the Bayesian Information Criterion (BIC) with the at least one estimated parameter obtained from the EM module for various values of G, compare the BICs obtained for the various values of G, select the model with the highest BIC as the best model (the best model including the plurality of clusters), and use estimated latent variables of the best model to build a classifier. A characterization module is configured to classify each user into a cluster of the best model using the classifier, and to determine the ranking preference of each cluster.
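The selection step can be sketched as below, assuming the EM fits have already produced a log-likelihood and a free-parameter count for each candidate G. Selecting the highest BIC implies the sign convention BIC = 2 ln L - k ln n, under which larger values are better; that convention is assumed here, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC in the larger-is-better convention: 2*logL - k*ln(n)."""
    return 2.0 * log_likelihood - n_params * math.log(n_obs)


def select_best_g(fits, n_obs):
    """Pick the number of clusters G with the highest BIC.

    fits: dict mapping G -> (log_likelihood, n_params), as would be
    produced by running EM once per candidate G (not shown here).
    """
    scores = {g: bic(ll, k, n_obs) for g, (ll, k) in fits.items()}
    best_g = max(scores, key=scores.get)
    return best_g, scores
```

Under the opposite convention (k ln n - 2 ln L), the same model would be chosen by taking the minimum instead of the maximum.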
Abstract:
Disclosed are embodiments for creating a model capable of identifying one or more clusters in financial data. An input pertaining to a range of numbers is received. Each number in the range of numbers is representative of a number of clusters in the financial data. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the financial data is determined. The one or more first parameters are updated to generate one or more second parameters based on the determined inverse cumulative distribution. A model is created for each number in the range of numbers based on the one or more second parameters.
Abstract:
The disclosed embodiments illustrate methods and systems for scheduling a batch of tasks on one or more crowdsourcing platforms. The method includes generating one or more forecast models for each of the one or more crowdsourcing platforms based on historical data associated with each of the one or more crowdsourcing platforms and a robustness parameter. Thereafter, for a forecast model, from the one or more forecast models, associated with each of the one or more crowdsourcing platforms, a schedule is generated based on the forecast model and one or more parameters associated with the batch of tasks. Further, the schedule is executed on each of the one or more forecast models associated with the one or more crowdsourcing platforms to determine a performance score of the schedule on each of the one or more forecast models. Finally, the schedule is recommended to a requestor based on the performance score.
Abstract:
LASSO constraints can lead to a Gaussian mixture copula model (GMCM) that is more robust, better conditioned, and more reflective of the actual clusters in the training data. These qualities of the GMCM have been shown with data obtained from: digital images of fine needle aspirates of breast tissue for detecting cancer; email for detecting spam; two-dimensional terrain data for detecting hills and valleys; and video sequences of hand movements for detecting gestures. Using training data, a GMCM estimate can be produced and iteratively refined to maximize a penalized log-likelihood until sequential iterations are within a threshold value of one another. The GMCM estimate can then be used to classify further samples. The LASSO constraints help keep the analysis tractable, so that results can be found and used while they are still useful.
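The refine-until-converged loop can be illustrated on a deliberately tiny problem. The sketch below is not a GMCM: it fits a single mean under a Gaussian log-likelihood with an L1 (LASSO-style) penalty, using proximal gradient steps, and stops once successive penalized log-likelihood values differ by less than a threshold. The toy objective, the step size, and all function names are assumptions chosen only to make the stopping rule concrete.

```python
def soft_threshold(value, amount):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def penalized_loglik(mu, data, lam):
    """Toy objective: Gaussian log-likelihood (up to a constant) minus lam*|mu|."""
    return -0.5 * sum((x - mu) ** 2 for x in data) - lam * abs(mu)


def fit_penalized_mean(data, lam, tol=1e-10, max_iter=1000):
    """Refine mu until successive objective values are within tol."""
    step = 0.5 / len(data)  # illustrative step size, small enough to converge
    mu = 0.0
    previous = penalized_loglik(mu, data, lam)
    for iteration in range(1, max_iter + 1):
        gradient = sum(x - mu for x in data)  # gradient of the smooth part
        mu = soft_threshold(mu + step * gradient, step * lam)
        current = penalized_loglik(mu, data, lam)
        if abs(current - previous) < tol:
            break
        previous = current
    return mu, iteration
```

For this toy objective the maximizer is the sample mean shrunk toward zero by `lam / n`, and a large enough penalty sets the estimate exactly to zero, which is the sparsity-inducing behavior that LASSO constraints are known for.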