-
1.
Publication No.: US20170372135A1
Publication Date: 2017-12-28
Application No.: US15193660
Application Date: 2016-06-27
Applicant: Xerox Corporation
Inventors: Jutta Katharina Willamowski, Jerome Pouyadou, Yves Hoppenot, Matthieu Mazzega, Emmanuel Rado, Asma Bennani, Julien Soler, Michel Langlais, Juan-Pablo Suarez
CPC Classification: G06F3/1273, G06F3/1207, G06F3/1285, G06Q10/0633, G06Q10/10
Abstract: A computer-implemented method gathers knowledge within an organization to support the preparation, facilitation, and execution of a collaborative workshop for fast and efficient document management and labeling. Print jobs submitted within the organization are tracked over a specified period to acquire print job information. Based on the documents retrieved, a list of users is determined, and those users are invited to review and annotate the list of documents. The list is then narrowed to an optimized set for ease of labeling and clustering. Users can annotate each submitted print job with a classification label, including a reason for printing it, and annotations are received for at least some of the submitted print jobs. The print jobs may be clustered based on the print job representations and annotations. A representation of the set of print jobs is then generated that reflects the agreed-upon labels, derived from the user-provided labels, for a set of documents with similar traits in at least one of the clusters.
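The last step of the abstract, deriving an agreed-upon label per cluster from user-provided labels, can be illustrated with a minimal sketch. The abstract does not say how agreement is reached, so simple majority voting is an assumption here, and the names `agreed_labels`, `cluster_of`, and `annotations` are hypothetical rather than taken from the patent.

```python
from collections import Counter, defaultdict


def agreed_labels(cluster_of, annotations):
    """Map each cluster id to its most frequent user-provided label.

    cluster_of:  {job_id: cluster_id}    (hypothetical clustering output)
    annotations: [(job_id, label), ...]  (hypothetical user annotations)
    Majority voting is an assumed stand-in for "agreed upon".
    """
    votes = defaultdict(Counter)
    for job_id, label in annotations:
        if job_id in cluster_of:  # skip annotations for jobs with no cluster
            votes[cluster_of[job_id]][label] += 1
    # most_common(1) yields the top label; ties go to the label seen first
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}


# Illustrative data only
cluster_of = {"job1": 0, "job2": 0, "job3": 1}
annotations = [
    ("job1", "invoice"),
    ("job2", "invoice"),
    ("job2", "contract"),
    ("job3", "report"),
]
labels = agreed_labels(cluster_of, annotations)
```

Clusters that receive no annotations are simply absent from the result, which matches the abstract's statement that annotations arrive for only some of the print jobs.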
-
2.
Publication No.: US20170357909A1
Publication Date: 2017-12-14
Application No.: US15181714
Application Date: 2016-06-14
Applicant: Xerox Corporation
Inventors: Jutta Katharina Willamowski, Yves Hoppenot, Jerome Pouyadou, Michel Langlais, Juan-Pablo Suarez
CPC Classification: G06N20/00, G06F16/285, G06F16/93
Abstract: A system and method are disclosed that support the efficient interactive identification of the most paper-intensive document categories, so that a maximum number of the documents belonging to those categories can be correctly categorized with minimum effort and in minimum time. The method is iterative and combines automatic grouping mechanisms with human labelling. The system and method are configured to run the automatic machine labelling iteratively to generate improved document clustering and categorization.
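The iterative loop described above (group automatically, have a human label the group that matters most, repeat) might be sketched as follows. Everything here is an assumption for illustration: the function `label_by_paper_volume`, the `group_key` and `ask_label` callbacks, and the use of total page count as the measure of "paper intensive" are hypothetical, and the fixed grouping key stands in for whatever clustering the actual system uses.

```python
from collections import defaultdict


def label_by_paper_volume(docs, group_key, ask_label, max_rounds):
    """Label the most paper-intensive groups first, one group per round.

    docs:      list of dicts with 'id' and 'pages' (assumed record shape)
    group_key: function doc -> group id (stand-in for automatic grouping)
    ask_label: function (group id, members) -> label (stand-in for a human)
    """
    labels = {}
    for _ in range(max_rounds):
        # Regroup only the documents that are still unlabeled
        groups = defaultdict(list)
        for doc in docs:
            if doc["id"] not in labels:
                groups[group_key(doc)].append(doc)
        if not groups:
            break
        # Pick the group that accounts for the most printed pages
        key, members = max(
            groups.items(), key=lambda kv: sum(d["pages"] for d in kv[1])
        )
        label = ask_label(key, members)
        for doc in members:
            labels[doc["id"]] = label
    return labels


# Illustrative data only
docs = [
    {"id": 1, "title": "Invoice March", "pages": 40},
    {"id": 2, "title": "Invoice April", "pages": 35},
    {"id": 3, "title": "Memo staff", "pages": 5},
    {"id": 4, "title": "Report Q1", "pages": 20},
]
result = label_by_paper_volume(
    docs,
    group_key=lambda d: d["title"].split()[0].lower(),
    ask_label=lambda key, members: key.upper(),  # simulated human answer
    max_rounds=2,
)
```

With two rounds of human effort, the two invoices (75 pages) and the report (20 pages) get labeled while the 5-page memo is left for later, which is the effort-versus-coverage trade-off the abstract aims for. A fuller version would presumably feed the collected labels back into the grouping step instead of using a fixed key.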
-