-
Publication No.: US20240104128A1
Publication date: 2024-03-28
Application No.: US18275134
Filing date: 2021-02-03
Applicant: NEC Corporation
Inventor: Masafumi Oyamada
Abstract: In order to make it possible to correctly merge dissimilarly expressed character strings, an information processing apparatus (1) includes: a data acquisition section (11) that acquires a data set including a plurality of character string pairs in each of which whether or not character strings therein indicate the same object is known; and a conversion pattern decision section (12) that decides, based on results of trials to convert each of the plurality of character string pairs included in the data set, a conversion pattern that heightens accuracy in determining whether or not the character string pair included in the data set indicates the same object.
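Illustrative sketch (not taken from the patent): one simple way to read the abstract above is to try several candidate conversions on labeled string pairs and keep the one that gives the highest accuracy when "same object" is predicted by exact equality after conversion. The candidate conversions, the equality-based decision rule, and all names below are assumptions made for illustration only.

```python
import string

# Hypothetical candidate conversion patterns; the patent does not list these.
_STRIP = str.maketrans("", "", string.punctuation + " ")
CANDIDATES = {
    "identity": lambda s: s,
    "lowercase": str.lower,
    "lowercase_no_punct": lambda s: s.lower().translate(_STRIP),
}


def accuracy(convert, labeled_pairs):
    """Share of pairs whose predicted label (equal after conversion) matches the known label."""
    correct = sum((convert(a) == convert(b)) == same for a, b, same in labeled_pairs)
    return correct / len(labeled_pairs)


def decide_conversion_pattern(labeled_pairs, candidates=CANDIDATES):
    """Return the name of the candidate conversion with the highest accuracy."""
    return max(candidates, key=lambda name: accuracy(candidates[name], labeled_pairs))


# Each tuple: (string A, string B, whether both are known to indicate the same object).
pairs = [
    ("NEC Corp.", "nec corp", True),
    ("N.E.C.", "NEC", True),
    ("NEC", "NTT", False),
]
best = decide_conversion_pattern(pairs)
```

On this toy data set only the conversion that lowercases and strips punctuation and spaces classifies all three pairs correctly, so it is the one selected.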
-
Publication No.: US11934535B2
Publication date: 2024-03-19
Application No.: US18169627
Filing date: 2023-02-15
Applicant: PROOFPOINT, INC.
Inventor: Daniel Clark Salo
IPC: G06F16/00 , G06F16/338 , G06F16/35 , G06F16/36 , G06F21/57
CPC classification number: G06F21/577 , G06F16/338 , G06F16/355 , G06F16/36
Abstract: A cyberthreat detection system queries a content database for unstructured content that contains a set of keywords, clusters the unstructured content into clusters based on topics, and determines a cybersecurity cluster utilizing a list of vetted cybersecurity phrases. The set of keywords represents a target of interest such as a newly discovered cyberthreat, an entity, a brand, or a combination thereof. The cybersecurity cluster thus determined is composed of unstructured content that has the set of keywords as well as some percentage of the vetted cybersecurity phrases. If the size of the cybersecurity cluster, as compared to the amount of unstructured content queried from the content database, meets or exceeds a predetermined threshold, the query is saved as a new classifier rule that can then be used by a cybersecurity classifier to automatically, dynamically and timely identify the target of interest in unclassified unstructured content.
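Illustrative sketch (not taken from the patent): the threshold test described in the abstract above can be approximated as follows. The topic-clustering step is skipped; instead, the "cybersecurity cluster" is assumed to be every query result that contains at least a given fraction of the vetted phrases. The function names, the substring matching, and the default thresholds are all assumptions.

```python
def cybersecurity_cluster(results, vetted_phrases, min_phrase_fraction):
    """Keep results that contain at least min_phrase_fraction of the vetted phrases."""
    cluster = []
    for text in results:
        lowered = text.lower()
        hits = sum(1 for phrase in vetted_phrases if phrase.lower() in lowered)
        if vetted_phrases and hits / len(vetted_phrases) >= min_phrase_fraction:
            cluster.append(text)
    return cluster


def should_save_as_rule(results, vetted_phrases,
                        min_phrase_fraction=0.5, size_threshold=0.3):
    """True when the cluster is large enough relative to all queried content."""
    if not results:
        return False
    cluster = cybersecurity_cluster(results, vetted_phrases, min_phrase_fraction)
    return len(cluster) / len(results) >= size_threshold


# Hypothetical results returned by a keyword query against a content database.
results = [
    "New phishing campaign delivers malware",
    "Brand X launches a shoe",
    "Malware and phishing kit sold online",
    "Quarterly earnings report",
]
vetted = ["phishing", "malware"]
save = should_save_as_rule(results, vetted)
```

Here two of the four results contain the vetted phrases, so the ratio is 0.5 and the query would be saved as a rule under the default 0.3 threshold.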
-
Publication No.: US11907662B2
Publication date: 2024-02-20
Application No.: US16958888
Filing date: 2018-12-27
Applicant: Robert Bosch GmbH
Inventor: Haibo Ding , Yifan He , Lin Zhao , Kui Xu , Zhe Feng
IPC: G06F17/00 , G06F40/30 , G06F16/36 , G06F40/247 , G06F40/169 , G06N5/022 , G06N5/04 , G06N7/01
CPC classification number: G06F40/30 , G06F16/36 , G06F40/169 , G06F40/247 , G06N5/022 , G06N5/04 , G06N7/01
Abstract: An automatic terminology linking system includes a candidate generator configured to identify candidate nodes for each terminology that is to be linked to a node of the knowledge base. A pseudo-candidate generator is configured to identify pseudo-candidate nodes for candidate-less terminologies. A candidate scorer is configured to respectively score the candidate nodes and the pseudo-candidate nodes by collective inference using occurrence statistics and co-occurrence statistics for these nodes. The pseudo-candidate generator is configured to identify knowledge base nodes that are semantically-related to candidate-less terminology as the pseudo-candidate nodes for the candidate-less terminology.
-
Publication No.: US11874864B2
Publication date: 2024-01-16
Application No.: US17290444
Filing date: 2019-11-26
Applicant: KONINKLIJKE PHILIPS N.V. , TRUSTEES OF BOSTON UNIVERSITY
Inventor: Henghui Zhu , Amir Mohammad Tahmasebi Maraghoosh , Ioannis Paschalidis
IPC: G06F40/20 , G06F16/33 , G06F16/36 , G06N20/00 , G06F40/284 , G06F40/205 , G06F18/214 , G06F18/10 , G06F40/211 , G06F40/279 , G06F40/295
CPC classification number: G06F16/3344 , G06F16/367 , G06F18/214 , G06F40/205 , G06F40/284 , G06N20/00 , G06F16/36 , G06F18/10 , G06F40/211 , G06F40/279 , G06F40/295
Abstract: A method (100) for generating a domain-specific training set, comprising: generating (130) a generic corpus comprising a plurality of tokenized documents, comprising: (i) parsing (132) a document retrieved from the generic corpus; (ii) preprocessing (134) the parsed document; (iii) tokenizing (136) the preprocessed document; and (iv) storing (138) the tokenized document in the generic corpus; generating (140) an ontology database of tokenized entries, comprising: (i) parsing (142) an ontology entry retrieved from an ontology; (ii) preprocessing (144) the parsed entry; (iii) tokenizing (146) the preprocessed entry; and (iv) storing (148) the tokenized entry in the ontology database; querying (150), using domain-specific tokenized entries from the ontology database, the tokenized documents in the generic corpus; identifying (160), based on the query, a plurality of tokenized documents specific to the domain; and storing (170), in a training set database, the identified tokenized documents as a training set specific to the domain.
-
Publication No.: US11836193B2
Publication date: 2023-12-05
Application No.: US16318066
Filing date: 2017-07-14
Applicant: Albert Einstein College of Medicine , Franz, Inc.
Inventor: Parsa Mirhaji , Jannes Aasman
IPC: G06F16/93 , G06F16/36 , G06F16/901
CPC classification number: G06F16/94 , G06F16/36 , G06F16/9024 , G06F2216/11
Abstract: Persistence and linking of analytic products is provided. Information regarding a plurality of analytic methods is collected. A first process node is generated in a network. The first process node corresponds to a first analytic method. Information is collected regarding a plurality of executions of the first analytic method. A plurality of session nodes is generated in the network corresponding to the plurality of executions. Each of the plurality of session nodes is linked to the first process node. Metadata regarding the plurality of executions is associated with the plurality of session nodes. At least one product node is generated corresponding to a product. The product integrates a result value of at least one of the plurality of executions. The at least one product node is linked to the session node of the plurality of session nodes corresponding to the at least one of the plurality of executions.
-
Publication No.: US11816434B2
Publication date: 2023-11-14
Application No.: US17406238
Filing date: 2021-08-19
Applicant: entigenlogic LLC
Inventor: Frank John Williams , Stephen Emerson Sundberg , Ameeta Vasant Reed , Dennis Arlen Roberson , Thomas James MacTavish , Karl Olaf Knutson , Jessy Thomas , Niklas Josiah MacTavish , David Michael Corns, II , Andrew Chu , Kyle Edward Alberth , Ali Fattahian , Zachary John McCord , Ahmad Abdelqader Abunaser , Gary W. Grube
IPC: G06F40/289 , G06F16/33 , G06F16/36 , G06F40/237 , G06F40/247
CPC classification number: G06F40/289 , G06F16/3344 , G06F16/367 , G06F40/237 , G06F40/247 , G06F16/36
Abstract: A method executed by a computing device includes determining a set of identigens for each phrase word of a phrase to produce sets of identigens. A set of identigens of the sets of identigens represents one or more different meanings of a phrase word of the phrase. The method further includes obtaining inflection information for one or more phrase words of the phrase. The method further includes selecting an identigen of a first set of identigens based on the inflection information to produce a first identigen selection for the first set of identigens having a selected meaning of one or more different meanings of the first phrase word. The method further includes interpreting remaining sets of identigens of the sets of identigens to produce an entigen group so that the entigen group represents a most likely meaning interpretation of the phrase.
-
Publication No.: US11797593B2
Publication date: 2023-10-24
Application No.: US17855685
Filing date: 2022-06-30
Applicant: Intuit Inc.
IPC: G06F16/35 , G06F16/33 , G06F16/36 , G06F16/34 , G06F16/338
CPC classification number: G06F16/35 , G06F16/338 , G06F16/3331 , G06F16/34 , G06F16/36
Abstract: The invention relates to a method for mapping topics. The method includes obtaining terms, obtaining tokens from each term, and identifying a first and a second set of topics. Each of the topics represents one or more of the terms. The method further includes identifying first and second topic names for the first and the second sets of topics. For each topic, the tokens associated with the terms assigned to the topic are analyzed for relevance, and a token with a high relevance is selected as the topic name. The method also includes selecting one of the first and one of the second sets of topics to obtain first and second selected topics, determining, based on the one or more terms, a similarity value between each of the first and the second selected topics, and establishing a mapping between similar first and second selected topics.
-
Publication No.: US20230334079A1
Publication date: 2023-10-19
Application No.: US18138195
Filing date: 2023-04-24
Applicant: cortical.io AG
Inventor: Francisco De Sousa Webber
CPC classification number: G06F16/36 , G06F16/334
Abstract: A method for using distributed representations of data items within a first set of data documents clustered in a first two-dimensional metric space to generate a cluster of distributed representations in a second two-dimensional metric space includes clustering in a first two-dimensional metric space, by a reference map generator, a set of data documents, generating a semantic map. A parser generates an enumeration of data items occurring in the set of data documents. A representation generator generates a distributed representation using occurrence information about each data item. A sparsifying module receives an identification of a maximum level of sparsity and reduces a total number of set bits within the distributed representation. The reference map generator clusters, in a second two-dimensional metric space, a set of SDRs retrieved from the SDR database and selected according to a second at least one criterion, generating a second semantic map.
-
Publication No.: US20230306052A1
Publication date: 2023-09-28
Application No.: US18317563
Filing date: 2023-05-15
Applicant: YAHOO ASSETS LLC
Inventor: Sanika Shirwadkar , Daozheng Chen , Guillaume Le Chenadec , Ralph Rabbat , Prateeksha Uday Chandraghatgi
IPC: G06F16/36
CPC classification number: G06F16/36
Abstract: The present teaching relates to entity extraction and disambiguation. In one example, an entity name extracted from a data source associated with a user is obtained. One or more entity types associated with the entity name are determined. One or more entity candidates are identified with respect to each of the one or more entity types. An entity candidate is selected with respect to one of the one or more entity types to be an individual associated with the entity name.
-
Publication No.: US11698920B2
Publication date: 2023-07-11
Application No.: US17347815
Filing date: 2021-06-15
Applicant: Open Text S.A. ULC
Inventor: Pascal Dimassimo , Steve Pettigrew , Martin Brousseau , Charles-Olivier Simard , Eric Williams , Francis Lacroix , Alex Dowgailenko , Agostino Deligia , Jean-Michel Texier
IPC: G06F16/00 , G06F16/31 , G06F16/36 , G06F16/248 , G06F16/338 , G06F16/951 , G06F16/332 , G06F16/33
CPC classification number: G06F16/316 , G06F16/248 , G06F16/338 , G06F16/3325 , G06F16/3344 , G06F16/36 , G06F16/951
Abstract: Methods, systems and computer-readable media enable various techniques related to semantic navigation. One aspect is a technique for displaying semantically derived facets in the search engine interface. Each of the facets comprises faceted search results. Each of the faceted search results is displayed in association with user interface elements for including or excluding the faceted search result as additional search terms to subsequently refine the search query. Another aspect automatically infers new metadata from the content and from existing metadata and then automatically annotates the content with the new metadata to improve recall and navigation. Another aspect identifies semantic annotations by determining semantic connections between the semantic annotations and then dynamically generating a topic page based on the semantic connections.