Detecting screenshot images for protecting against loss of sensitive screenshot-borne data

    Publication No.: US10949961B1

    Publication Date: 2021-03-16

    Application No.: US16891678

    Filing Date: 2020-06-03

    Applicant: Netskope, Inc.

    Abstract: Disclosed is a method of detecting screenshot images and protecting against loss of sensitive screenshot image data. The method collects examples of screenshot images and non-screenshot images, creates labelled ground-truth data for the examples, and re-renders collected images to represent variations that may contain sensitive information. The method includes training a deep learning (DL) stack by forward inference and back propagation using the labelled ground-truth data for the screenshots, storing parameters of the trained DL stack for inference from production images, and using a production DL stack with the stored parameters to classify at least one production image by inference as containing a screenshot image. Further, the DL stack includes a first set of layers closer to an input layer, which is pre-trained to perform image recognition before a second set of layers further from the input layer is exposed to the labelled ground-truth data for the screenshot images and non-screenshot images.
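
    Example: A minimal sketch of the two-part stack described in the abstract, assuming a toy 4-value input in place of a real image. The fixed matrix `FROZEN_W`, the logistic head, and all function names are hypothetical stand-ins chosen for illustration; the patent does not specify them. The first set of layers is held fixed, and only the second set is trained by forward inference and back propagation on labelled examples.

```python
import math

# Hypothetical stand-in for the pre-trained first set of layers: a fixed
# projection followed by ReLU. These weights are never updated below.
FROZEN_W = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

def frozen_features(x):
    """Forward pass through the frozen layers (4 inputs -> 3 features)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(params, x):
    """Forward inference: probability that x is a screenshot."""
    w, b = params
    f = frozen_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)

def train_head(examples, epochs=200, lr=0.5):
    """Train only the second set of layers (a logistic head) on labelled data."""
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in examples:
            f = frozen_features(x)
            p = predict((w, b), x)
            grad = p - label  # gradient of cross-entropy loss w.r.t. the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b

# Toy labelled ground truth: 1 = screenshot, 0 = non-screenshot.
examples = [
    ([1.0, 0.0, 0.0, 0.0], 1),
    ([0.9, 0.1, 0.0, 0.0], 1),
    ([0.0, 1.0, 0.0, 0.0], 0),
    ([0.1, 0.9, 0.0, 0.0], 0),
]
stored_params = train_head(examples)  # parameters stored for production use
is_screenshot = predict(stored_params, [0.95, 0.05, 0.0, 0.0]) >= 0.5
```

    In a real system the frozen part would be a pre-trained image-recognition network and the input would be pixel data; the split between fixed and trainable parameters is the only point this sketch is meant to show.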

    Deep learning-based detection and data loss prevention of image-borne sensitive documents

    Publication No.: US11537745B2

    Publication Date: 2022-12-27

    Application No.: US17116862

    Filing Date: 2020-12-09

    Applicant: Netskope, Inc.

    IPC Classification: G06F21/60 G06F21/62 G06N20/00

    Abstract: The technology disclosed relates to distributing a trained master deep learning (DL) stack with stored parameters to a plurality of organizations, to detect organization-sensitive data in images, referred to as image-borne organization-sensitive documents, and protecting against loss of the image-borne organization-sensitive documents. Disclosed is providing organizations with a DL stack update trainer, under the organizations' control, configured to allow the organizations to perform update training to generate updated DL stacks, without the organizations forwarding images of organization-sensitive training examples, and to save non-invertible features derived from the images, ground truth labels for the images, and parameters of the updated DL stacks. In particular, the technology disclosed relates to receiving, from a plurality of the DL stack update trainers, organization-specific examples including the non-invertible features of the organization-sensitive training examples and the ground truth labels, and using the received organization-specific examples to update the trained master DL stack.
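
    Example: A minimal sketch of the data flow described in the abstract, assuming images are flat lists of pixel values. The block-averaging feature map, the `UpdateTrainer` class, and `update_master` are hypothetical names and choices made for illustration; the abstract does not say how the non-invertible features are computed. The point shown is that each organization exports only features and labels, and the master classifier is updated from those alone.

```python
import math

def non_invertible_features(pixels, block=4):
    """Hypothetical many-to-one feature map: mean brightness per block.
    Different images map to the same output, so pixels cannot be recovered."""
    return [sum(pixels[i:i + block]) / block
            for i in range(0, len(pixels), block)]

class UpdateTrainer:
    """Runs under the organization's control; raw images never leave it."""

    def __init__(self):
        self._examples = []

    def add_image(self, pixels, label):
        # Only derived features and the label are kept; pixels are discarded.
        self._examples.append((non_invertible_features(pixels), label))

    def export_examples(self):
        return list(self._examples)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def update_master(weights, bias, examples, epochs=300, lr=0.5):
    """Update a master classifier head from received (features, label) pairs."""
    w, b = list(weights), bias
    for _ in range(epochs):
        for features, label in examples:
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, features)) + b)
            grad = p - label  # gradient of cross-entropy loss w.r.t. the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, features)]
            b -= lr * grad
    return w, b

# Two organizations label toy 8-pixel images: 1 = sensitive, 0 = not sensitive.
org_a = UpdateTrainer()
org_a.add_image([1, 1, 1, 1, 0, 0, 0, 0], 1)
org_a.add_image([0, 0, 0, 0, 1, 1, 1, 1], 0)
org_b = UpdateTrainer()
org_b.add_image([1, 1, 1, 0, 0, 0, 0, 0], 1)
org_b.add_image([0, 0, 0, 1, 1, 1, 1, 1], 0)

# The master side receives features and labels only, never the images.
received = org_a.export_examples() + org_b.export_examples()
master_w, master_b = update_master([0.0, 0.0], 0.0, received)
```

    Block averaging is used here only because it is visibly many-to-one; a production system would more plausibly derive the features from intermediate layers of the DL stack.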

    Detecting image-borne identification documents for protecting sensitive information

    Publication No.: US10990856B1

    Publication Date: 2021-04-27

    Application No.: US16891647

    Filing Date: 2020-06-03

    Applicant: Netskope, Inc.

    IPC Classification: G06K9/62 G06N5/04 G06N3/08

    Abstract: Disclosed is detecting identification documents in images (image-borne identification documents) and protecting against loss of the image-borne identification documents. The method includes training a deep learning (DL) stack by forward inference and back propagation using labelled ground truth data for the image-borne identification documents and the examples of other image documents. The DL stack includes a first set of layers closer to an input layer and a second set of layers further from the input layer, the first set of layers being pre-trained to perform image recognition before exposing the second set of layers of the DL stack to the labelled ground truth data for the image-borne identification documents and the examples of other image documents. Also included is storing parameters of the trained DL stack for inference from production images, and using a production DL stack with the stored parameters to classify at least one production image by inference as containing a sensitive image-borne identification document.

    Deep learning stack used in production to prevent exfiltration of image-borne identification documents

    Publication No.: US11574151B2

    Publication Date: 2023-02-07

    Application No.: US17229768

    Filing Date: 2021-04-13

    Applicant: Netskope, Inc.

    Abstract: Disclosed is detecting identification documents in images (image-borne identification documents) and protecting against loss of the image-borne identification documents. A trained deep learning (DL) stack is used to classify production images by inference as containing a sensitive image-borne identification document, with the trained stack configured with parameters determined using labelled ground truth data for the identification documents and examples of other image documents. The trained DL stack is configured to include a first set of layers closer to an input layer and a second set of layers further from the input layer, with the first set pre-trained to perform image recognition before exposing the second set of layers of the stack to the labelled ground truth data for the image-borne identification documents and examples of other image documents. The inferred classification of the sensitive image-borne identification document is then used in a data loss prevention (DLP) system to protect against loss by image exfiltration.
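
    Example: A minimal sketch of how an inferred classification might feed a DLP decision, as the last sentence of the abstract describes. The `Inference` record, the label string "id_document", the 0.8 threshold, and the "block"/"allow" actions are all hypothetical; the abstract names no specific policy format.

```python
from dataclasses import dataclass

@dataclass
class Inference:
    """Output of the production DL stack for one image (hypothetical shape)."""
    label: str    # e.g. "id_document" or "other"
    score: float  # classifier confidence in [0, 1]

def dlp_action(inference, threshold=0.8):
    """Return the DLP action for an outbound image based on its classification."""
    if inference.label == "id_document" and inference.score >= threshold:
        return "block"  # stop the transfer to prevent exfiltration
    return "allow"

# An upload classified as an identification document with high confidence.
decision = dlp_action(Inference(label="id_document", score=0.95))
```

    A real policy would likely weigh additional context such as the destination and the user, but the classification result is the input this abstract is concerned with.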