Abstract:
A method, a computing system and a computer program product are provided. A computing system identifies elements within a collection of medical documents. The elements include patients, adverse events and medical drugs. The medical documents are analyzed by the computing system to determine associations between the identified medical drugs and corresponding identified adverse events. The identified elements and the determined associations may be encoded as features by the computing system. The computing system identifies portions of the medical documents as containing the identified elements and the determined associations. The computing system generates a classification model based at least on the encoded features associated with the identified portions for identifying medical case safety reports within medical documents. The classification model is applied to a new document to determine a classification of the new document with respect to a medical case safety report.
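The feature-encoding step described above can be illustrated with a minimal sketch. It assumes simple substring matching against hypothetical drug and event term lists, and it treats co-occurrence within one portion as an "association"; the abstract does not specify either choice, and the function and feature names are illustrative only.

```python
def encode_features(portion, drugs, events):
    """Encode one document portion as binary features (illustrative only)."""
    text = portion.lower()
    found_drugs = [d for d in drugs if d in text]
    found_events = [e for e in events if e in text]
    return {
        "has_patient": int("patient" in text),
        "has_drug": int(bool(found_drugs)),
        "has_event": int(bool(found_events)),
        # Assumed association rule: a drug and an adverse event
        # appear in the same portion of the document.
        "drug_event_association": int(bool(found_drugs) and bool(found_events)),
    }


features = encode_features(
    "The patient developed a rash after taking aspirin.",
    drugs=["aspirin"],
    events=["rash"],
)
```

Feature dictionaries of this kind could then be fed to any standard classifier to separate portions that resemble case safety reports from those that do not.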
Abstract:
Mechanisms are provided for implementing a context-aware abbreviation detection and annotation operation. An instance of a full name of an entity is identified in received content, and a context window associated with that instance is analyzed to identify the presence of a pattern of content representative of an abbreviation. An abbreviation is identified as being present in association with the instance of the full name based on the results of the context window analysis, and a mapping data structure that maps the full name of the entity to the abbreviation is generated. The received content is annotated based on the mapping data structure to thereby generate abbreviation annotations for the received content. The annotated received content is output for use by a cognitive system to perform a cognitive operation based on the annotated received content.
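The detection and annotation flow above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the "pattern of content representative of an abbreviation" is taken to be a short parenthesized token directly after the full name, and the window size, function names, and annotation tuple layout are all hypothetical rather than taken from the abstract.

```python
import re


def detect_abbreviations(text, full_names, window=20):
    """Map each full name to an abbreviation found in the context window after it."""
    mapping = {}
    for name in full_names:
        for match in re.finditer(re.escape(name), text, flags=re.IGNORECASE):
            # Context window: the characters immediately following the full name.
            context = text[match.end():match.end() + window]
            # Assumed pattern: a short parenthesized token, e.g. "(CKD)".
            found = re.match(r"\s*\(([A-Za-z][A-Za-z0-9.\-]{1,9})\)", context)
            if found:
                mapping[name] = found.group(1)
                break
    return mapping


def annotate(text, mapping):
    """Return sorted (start, end, abbreviation, full_name) spans for each abbreviation use."""
    annotations = []
    for name, abbr in mapping.items():
        for match in re.finditer(r"\b" + re.escape(abbr) + r"\b", text):
            annotations.append((match.start(), match.end(), abbr, name))
    return sorted(annotations)


text = "The patient had chronic kidney disease (CKD). CKD was stable."
mapping = detect_abbreviations(text, ["chronic kidney disease"])
annotations = annotate(text, mapping)
```

Restricting the search to a window next to the full name is what makes the detection context-aware in this sketch: a parenthesized token elsewhere in the text is not treated as an abbreviation of that entity.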
Abstract:
Comparing document contents is provided. An ontological concept is extracted from a text snippet of a corpus document. One or more feature vectors are constructed that include associative information that describes an ontology that includes the focused concept. A topic model is trained using the one or more feature vectors. First and second topic sets are respectively extracted from first and second documents using the topic model. One or more topics from the first topic set are matched, using the topic model, with one or more topics from the second topic set to construct a matched topic set. Semantic analyses are respectively performed on first and second text snippet sets, wherein the first and second text snippet sets are chosen based, at least in part, on the matched topic set. Text snippets are matched based, at least in part, on the first and second semantic analyses.
Abstract:
The embodiments relate to generating hierarchical patterns based on a corpus of text. The corpus is analyzed, which includes extracting a set of features of the corpus. A set of grammatical patterns is generated based on the extracted features. The set of grammatical patterns includes at least one grammatical pattern generated from an internal pattern and at least one grammatical pattern generated from an external pattern. The grammatical patterns of the set are organized into a hierarchy and/or are ranked. The hierarchy and/or ranking are visually displayed.
Abstract:
A method and system for emotional text-to-speech (TTS) are provided. The method includes: receiving text data; generating an emotion tag for the text data per rhythm piece; and performing TTS on the text data in accordance with the emotion tag, where the emotion tags are expressed as a set of emotion vectors, and where each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories. A system for the same includes: a text data receiving module; an emotion tag generating module; and a TTS module for performing TTS, where the emotion tag is expressed as a set of emotion vectors, and where each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories.
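The emotion tag representation described above can be illustrated with a short sketch. It assumes that a rhythm piece is a single word, that the category set is the four labels shown, and that scores come from a small hypothetical lexicon; none of these details are stated in the abstract.

```python
from dataclasses import dataclass

# Hypothetical category set; the abstract only says "a plurality of emotion categories".
CATEGORIES = ("neutral", "happy", "sad", "angry")


@dataclass
class EmotionVector:
    scores: tuple  # one score per entry in CATEGORIES

    def dominant(self):
        """Return the category with the highest score."""
        best = max(range(len(CATEGORIES)), key=lambda i: self.scores[i])
        return CATEGORIES[best]


def tag_rhythm_pieces(pieces, lexicon):
    """Assign one emotion vector per rhythm piece, defaulting to fully neutral."""
    tags = []
    for piece in pieces:
        scores = lexicon.get(piece.lower(), (1.0, 0.0, 0.0, 0.0))
        tags.append(EmotionVector(scores))
    return tags


lexicon = {"wonderful": (0.1, 0.8, 0.05, 0.05)}  # illustrative scores only
tags = tag_rhythm_pieces(["What", "wonderful", "news"], lexicon)
labels = [tag.dominant() for tag in tags]
```

Keeping the full score vector rather than a single label would let a downstream TTS stage blend or smooth emotions across neighboring rhythm pieces instead of switching abruptly.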
Abstract:
A system and method to perform context aware sentiment analysis on a project that includes two or more aspects are described. The method includes identifying one or more inputs related to the project. The method also includes decomposing each of the one or more inputs, based on a content of the one or more comments, into at least one of the two or more aspects to generate one or more comment-aspect sets, each of the two or more aspects representing a context within the project, extracting opinions from each of the comment-aspect sets, and generating a disruptive argument based on the opinions.