-
Publication No.: US20240242032A1
Publication date: 2024-07-18
Application No.: US17317110
Filing date: 2021-05-11
Inventors: Yong Yi Bay, Qianhui Rong, Yang Angelina Yang, Menglin Cao
IPC classification: G06F40/295, G06K9/62, G06N3/08
CPC classification: G06F40/295, G06K9/6215, G06K9/6223, G06K9/6256, G06N3/088
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for unsupervised named entity recognition. An example method includes receiving, by a communications circuitry, a reference named entity list, the reference named entity list identifying a set of named entities and an entity type of each identified named entity. The example method further includes generating, by a vectorizer, vectors from the named entities identified in the reference named entity list, and consolidating, by a synthesizer, the generated vectors into a set of representative vectors, wherein each representative vector is associated with a particular entity type. Finally, the example method includes receiving, by an analysis engine, a set of text, and performing, by the analysis engine, named entity recognition on the set of text using the set of representative vectors to generate a tagged set of text.
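The flow in this abstract (vectorize the reference entities, consolidate them into one representative vector per entity type, then tag text against those vectors) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how vectors are generated, consolidated, or compared, so the hand-made `EMBED` table, the per-type averaging, the cosine similarity, and the `threshold` value are all hypothetical choices.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def representative_vectors(reference, embed):
    """Consolidate entity vectors into one mean vector per entity type (assumed rule)."""
    groups = defaultdict(list)
    for name, etype in reference:
        groups[etype].append(embed[name])
    return {t: [sum(col) / len(vs) for col in zip(*vs)] for t, vs in groups.items()}

def tag(tokens, reps, embed, threshold=0.8):
    """Tag each token with the most similar entity type, or "O" below the threshold."""
    tagged = []
    for tok in tokens:
        vec = embed.get(tok)
        best_type, best_sim = "O", threshold
        if vec is not None:
            for etype, rep in reps.items():
                sim = cosine(vec, rep)
                if sim >= best_sim:
                    best_type, best_sim = etype, sim
        tagged.append((tok, best_type))
    return tagged

# Hypothetical toy embeddings standing in for the vectorizer's output.
EMBED = {
    "Paris": [0.9, 0.1, 0.0],
    "Berlin": [0.8, 0.2, 0.0],
    "Alice": [0.1, 0.9, 0.0],
    "Bob": [0.0, 1.0, 0.1],
    "Tokyo": [0.85, 0.15, 0.05],
    "visited": [0.0, 0.1, 0.9],
}
# Reference named entity list: (named entity, entity type).
REFERENCE = [("Paris", "LOC"), ("Berlin", "LOC"), ("Alice", "PER"), ("Bob", "PER")]

reps = representative_vectors(REFERENCE, EMBED)
result = tag(["Alice", "visited", "Tokyo"], reps, EMBED)
print(result)
```

No labeled training text is needed in this sketch: "Tokyo" is not in the reference list but is still tagged because its vector lies close to the location representative.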
-
Publication No.: US20240086495A1
Publication date: 2024-03-14
Application No.: US17942234
Filing date: 2022-09-12
Applicant: ThoughtSpot, Inc.
Inventors: Divesh Gandhi, Atul Mangat, Vidya Priyadarshini Narayanan, Shubham Jaiswal, Anand Kumar Ganesh, Saurabh Kakran
IPC classification: G06K9/62, G06F3/0484, G06F8/71
CPC classification: G06K9/6201, G06F3/0484, G06F8/71, G06K9/6223
Abstract: First images that are screenshots from a first version of a software component are obtained. Second images that are screenshots from a second version are obtained. A collection of image deviations that includes pair-wise image deviations between pairs of images is identified. A pair of images includes a first image from the first images and a corresponding second image from the second images. An image deviation indicates a portion of the second image identified as differing from a spatially corresponding portion of the first image. The image deviations are grouped into deviation groups. At least some of the second images are associated with at least some of the deviation groups. A subset of the second images corresponding to a deviation group is output responsive to a selection of an indication of the deviation group.
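The pair-wise comparison and grouping described in this abstract might look roughly like the sketch below. It assumes screenshots are small 2D grids of pixel values, takes the bounding box of differing pixels as the "image deviation", and groups screenshots whose deviations share the same box. The abstract specifies neither the deviation representation nor the grouping rule, so `deviation_box`, `group_deviations`, and the screen names are hypothetical.

```python
from collections import defaultdict

def deviation_box(first, second):
    """Return (top, left, bottom, right) of the differing region, or None if identical."""
    diffs = [
        (r, c)
        for r, row in enumerate(first)
        for c, px in enumerate(row)
        if px != second[r][c]
    ]
    if not diffs:
        return None
    rows = [r for r, _ in diffs]
    cols = [c for _, c in diffs]
    return (min(rows), min(cols), max(rows), max(cols))

def group_deviations(pairs):
    """Group screenshot names by the bounding box of their deviation (assumed rule)."""
    groups = defaultdict(list)
    for name, (first, second) in pairs.items():
        box = deviation_box(first, second)
        if box is not None:
            groups[box].append(name)
    return dict(groups)

BASE = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
pairs = {
    # name: (screenshot from version 1, screenshot from version 2)
    "home": (BASE, [[0, 0, 0], [0, 1, 1], [0, 0, 0]]),
    "settings": (BASE, [[0, 0, 0], [0, 5, 7], [0, 0, 0]]),
    "about": (BASE, [[0, 0, 0], [0, 0, 0], [0, 0, 0]]),
    "login": (BASE, [[9, 0, 0], [0, 0, 0], [0, 0, 0]]),
}

groups = group_deviations(pairs)
print(groups)
```

Selecting one group, for example `groups[(1, 1, 1, 2)]`, then yields the subset of second-version screenshots that changed in the same region.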
-
Publication No.: US20230388346A1
Publication date: 2023-11-30
Application No.: US17752987
Filing date: 2022-05-25
CPC classification: H04L63/20, H04L63/1433, G06K9/6223
Abstract: A system of one embodiment provides proactive security policy suggestions for applications based on the applications' software composition and runtime behavior. The system includes a memory and a processor. The system is operable to access data that represents one or more features of an application. The application is running on one or more nodes in a computer network, and a feature indicates an application library of the node. The system is operable to apply a clustering algorithm to the data to generate a plurality of cluster sets. The system is operable to determine a security policy to apply to a cluster set of the plurality of cluster sets and apply the security policy to an application whose features are represented by the data in the cluster set.
-
Publication No.: US20230334122A1
Publication date: 2023-10-19
Application No.: US17719724
Filing date: 2022-04-13
Applicant: Dell Products L.P.
Inventors: Min GONG, Qi BAO, Qicheng QIU, Chunxi CHEN
IPC classification: G06K9/62, G06F16/901, G06F16/25, G06N5/02
CPC classification: G06K9/6223, G06F16/9024, G06F16/25, G06N5/025
Abstract: This disclosure provides systems, methods, and media for creating a data graph database from various structured and unstructured data items for use by various services. The method comprises the operations of identifying unstructured data items in data subjects; recognizing regions of interest (ROIs) in the unstructured data items; and extracting the ROIs from the unstructured data items. The method further comprises encoding the extracted ROIs into ROI vectors; creating a data graph to represent the data subjects, the data items, and the ROI vectors; and storing the data graph into a graph database. The various embodiments can manage data items of different data formats together rather than separately, thus creating a data management system for managing data across data formats. The data management system can also store structured data items into the graph database, thus complementing the existing ETL procedure for structured data items.
-
Publication No.: US20230281094A1
Publication date: 2023-09-07
Application No.: US17840788
Filing date: 2022-06-15
Inventors: Yi-Ju LIAO, JEN-YUAN CHANG, PO-HSIU CHEN, Hsieh-Liang TSAI
CPC classification: G06F11/3058, G06F11/3409, G06K9/6223, G06N5/003
Abstract: A method of creating a classification model for a hard disk efficiency problem comprises, by an analyzing device: obtaining a plurality of pieces of measurement data of a plurality of hard disk devices, each of which comprises a plurality of values of a plurality of vibration parameters; binarizing the plurality of pieces of measurement data based on a plurality of preset conditions respectively corresponding to the plurality of vibration parameters; and obtaining the classification model for the hard disk efficiency problem based on the plurality of pieces of binarized measurement data and a decision tree algorithm.
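The two steps named in this abstract, binarizing vibration measurements against preset conditions and then fitting a decision tree, can be illustrated with a small pure-Python sketch. The parameter names (`rms_g`, `peak_hz`), the threshold conditions, the labels, and the use of an ID3-style information-gain split are all assumptions made for illustration; the abstract says only "a decision tree algorithm".

```python
import math
from collections import Counter

# Hypothetical vibration parameters and preset conditions (one condition per parameter).
PARAMS = ["rms_g", "peak_hz"]
CONDITIONS = {"rms_g": lambda v: v > 0.5, "peak_hz": lambda v: v > 120.0}

def binarize(sample):
    """Map each vibration value to 1 if its preset condition holds, else 0."""
    return [int(CONDITIONS[p](sample[p])) for p in PARAMS]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """ID3-style tree over binary features; leaves are class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority

    def gain(f):
        g = entropy(labels)
        for v in (0, 1):
            sub = [l for r, l in zip(rows, labels) if r[f] == v]
            if sub:
                g -= len(sub) / len(labels) * entropy(sub)
        return g

    best = max(features, key=gain)
    rest = [f for f in features if f != best]
    node = {"feature": best}
    for v in (0, 1):
        idx = [i for i, r in enumerate(rows) if r[best] == v]
        node[v] = (
            build_tree([rows[i] for i in idx], [labels[i] for i in idx], rest)
            if idx else majority
        )
    return node

def predict(tree, row):
    """Walk the tree until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        tree = tree[row[tree["feature"]]]
    return tree

measurements = [
    {"rms_g": 0.2, "peak_hz": 100.0},
    {"rms_g": 0.7, "peak_hz": 100.0},
    {"rms_g": 0.3, "peak_hz": 150.0},
    {"rms_g": 0.9, "peak_hz": 200.0},
]
labels = ["ok", "slow", "ok", "slow"]  # made-up efficiency labels

rows = [binarize(m) for m in measurements]
tree = build_tree(rows, labels, list(range(len(PARAMS))))
prediction = predict(tree, binarize({"rms_g": 0.8, "peak_hz": 90.0}))
print(tree, prediction)
```

In this toy data the `rms_g` condition alone separates the two classes, so the learned tree has a single split.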
-
Publication No.: US20230274126A1
Publication date: 2023-08-31
Application No.: US17682953
Filing date: 2022-02-28
Applicant: PayPal, Inc.
CPC classification: G06N3/0445, G06N3/08, G06K9/6223, G06K9/6256, G06F40/20
Abstract: A plurality of first entities have been previously associated with a predefined activity. By performing a clustering algorithm on the first entities, a subset of the first entities is identified that have met a predefined criterion. Via a Natural Language Processing (NLP) technique, a multi-dimensional matrix is generated. The matrix has a plurality of vectors associated with attributes of the subset of the first entities. A neural network model is trained with the multi-dimensional matrix. A plurality of second entities are on a list that contains entities that have been flagged for engaging in, or having engaged in, the predefined activity. Based on the trained neural network model, a prediction is made whether scanning the second entities against a plurality of third entities for matches will cause a number of alerts having a predefined characteristic to exceed a predefined threshold. The alerts correspond to matches that need further investigation.
-
Publication No.: US20230222334A1
Publication date: 2023-07-13
Application No.: US17572459
Filing date: 2022-01-10
CPC classification: G06N3/08, G06N3/04, G06K9/6223, G06K9/6228
Abstract: A deep learning model is quantized during its training to perform a target software engineering task. During training, a portion of the full-precision floating point weights is quantized into INT4 or INT8 data types through scalar quantization or product quantization to make the model more resilient to quantization and to reduce the noise between the quantized and full-precision model outputs. In scalar quantization, each sub-block consists of a single weight that is mapped into a codeword of a codebook. In product quantization, an identity matrix and a codebook of centroids are used to map a quantized weight into its original value.
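As a rough illustration of scalar quantization to INT8, the sketch below maps each floating point weight to one of the integer codewords in [-127, 127] and back. It assumes a symmetric, uniform codebook with a single per-tensor scale, which the abstract does not specify; `quantize_int8` and `dequantize` are hypothetical helper names, and the training-time aspect (quantizing only a portion of the weights at each step) is not modeled.

```python
def quantize_int8(weights):
    """Map each weight to an integer codeword in [-127, 127] (symmetric, uniform)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate full-precision weights from the codewords."""
    return [c * scale for c in codes]

weights = [0.4, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The gap between original and restored weights is the quantization noise.
noise = [abs(w - r) for w, r in zip(weights, restored)]
print(codes, max(noise))
```

With uniform rounding, the noise on each weight is bounded by half the step size (`scale / 2`), which is the quantity the training procedure in the abstract aims to make the model tolerate.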
-
Publication No.: US20230205740A1
Publication date: 2023-06-29
Application No.: US17560688
Filing date: 2021-12-23
Applicant: SOFTWARE AG
Inventor: Mohamed ABDELAAL
IPC classification: G06F16/215, G06F16/28, G06K9/62, G06N20/00
CPC classification: G06F16/215, G06F16/284, G06K9/6223, G06N20/00
Abstract: Certain example embodiments relate to meta-learning based error detection. Base classifiers are provided for historical attributes in historical datasets. Each is trained to indicate dirtiness of a value for the associated historical attribute. Clusters and a clustering model are generated using historical clustering features determined for each historical attribute, which are then associated with the clusters. For each dirty attribute in a dirty dataset, corresponding dirty clustering features are determined. The dirty attributes are assigned to the clusters using the corresponding determined dirty clustering features and the clustering model. The base classifiers associated with the clusters to which the dirty attributes were assigned are retrieved. Dirty features are extracted from the dirty dataset, and selectively modified. The extracted dirty features are applied to the retrieved base classifiers to determine meta-features. A meta-classifier is trained using labeled meta-features. Predictions about the dirty dataset's dirtiness can be made using the meta-classifier.
-
Publication No.: US11657080B2
Publication date: 2023-05-23
Application No.: US16881747
Filing date: 2020-05-22
Applicant: Rovi Guides, Inc.
Inventor: Mohammed Yasir
IPC classification: G06F16/00, H04N21/466, G06N20/00, H04N21/45, H04N21/482, G06F16/906, G06F16/9535, G06K9/62
CPC classification: H04N21/4668, G06F16/906, G06F16/9535, G06K9/6223, G06N20/00, H04N21/4532, H04N21/4662, H04N21/4826
Abstract: Systems and methods are described for generating and presenting content recommendations to new users during or immediately after the onboarding process, before any history of the new user's viewed content is available. A machine learning or other model may be trained to determine clusters of content genre values corresponding to genres of content watched by viewers. Clusters are thus associated with popular groupings of content genres viewed by many users. Clusters representing popular groupings of content genres may be selected for new users, and content corresponding to the selected clusters may be recommended to the new users as part of their onboarding process. A sufficient amount of content may be selected to fully populate any content recommendation portion of a new user onboarding page.
-
Publication No.: US20190244132A1
Publication date: 2019-08-08
Application No.: US16329303
Filing date: 2017-08-18
Applicant: SONY CORPORATION
Inventor: NAOKI IDE
CPC classification: G06N20/00, G06F17/18, G06K9/6223, G06K9/6256, G06K9/6267, G06N99/00
Abstract: [Object] To predict learning performance in advance in accordance with the labeling status of learning data. [Solution] Provided is an information processing device including: a data distribution presentation unit configured to perform dimensionality reduction on input learning data to generate a data distribution diagram related to the learning data; a learning performance prediction unit configured to predict learning performance on the basis of the data distribution diagram and a labeling status related to the learning data; and a display control unit configured to control a display related to the data distribution diagram and the learning performance. The data distribution diagram includes overlap information about clusters including the learning data and information about the number of pieces of the learning data belonging to each of the clusters.