-
Publication No.: US20230282019A1
Publication date: 2023-09-07
Application No.: US18314618
Filing date: 2023-05-09
Applicant: Open Text Corporation
Inventor: Jeroen Mattijs van Rotterdam, Michael T. Mohen, Chao Chen, Kun Zhao
IPC: G06V30/418, G06F16/93, G06F16/335, G06V30/416, G06V30/40
CPC classification number: G06V30/418, G06F16/93, G06F16/335, G06V30/416, G06V30/40, G06V2201/10
Abstract: Systems and methods for assessing similarity of documents are provided. Embodiments of the systems and methods include extracting a reference document text from a reference document, extracting an archived document text from an archived document, and quantifying the reference document and the archived document. The systems and methods may also include determining a document similarity value of the quantified reference document and the archived document. Determining the document similarity value includes calculating a set of vector similarity values for a set of combinations of a reference document text vector and an archived document text vector, and calculating the document similarity value, including a sum of the plurality of vector similarity values.
-
Publication No.: US20210192204A1
Publication date: 2021-06-24
Application No.: US17192498
Filing date: 2021-03-04
Applicant: Open Text Corporation
Inventor: Jeroen Mattijs van Rotterdam, Michael T. Mohen, Chao Chen, Kun Zhao
IPC: G06K9/00, G06F16/93, G06F16/335
Abstract: Systems and methods for assessing similarity of documents are provided. Embodiments of the systems and methods include extracting a reference document text from a reference document, extracting an archived document text from an archived document, and quantifying the reference document and the archived document. The systems and methods may also include determining a document similarity value of the quantified reference document and the archived document. Determining the document similarity value includes calculating a set of vector similarity values for a set of combinations of a reference document text vector and an archived document text vector, and calculating the document similarity value, including a sum of the plurality of vector similarity values.
-
Publication No.: US20200183986A1
Publication date: 2020-06-11
Application No.: US16791628
Filing date: 2020-02-14
Applicant: Open Text Corporation
Inventor: Lei Zhang, Chao Chen, Kun Zhao, Jingjing Liu, Ying Teng
IPC: G06F16/93, G06F16/2457, G06F16/2455, G06F16/22
Abstract: A method for document similarity analysis. The method includes generating a reference document content identifier for a reference document, including identifying frequently occurring terms in reference document content, encoding each frequently occurring term in a term identifier and combining the term identifiers to form the reference document content identifier associated with the reference document. The method also includes obtaining at least one document similarity value by comparing the reference document content identifier to a set of archived document content identifiers stored in a document repository.
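As a rough illustration of the abstract above, the following Python sketch builds a content identifier from the most frequent terms and compares two identifiers. The abstract does not say how terms are encoded or compared, so the truncated SHA-1 term identifiers, the `top_k` cutoff, the `-` separator, and the Jaccard overlap are all assumptions made for this example, and the function names are hypothetical.

```python
import hashlib
import re
from collections import Counter


def content_identifier(text, top_k=5):
    """Build an identifier from the top_k most frequent terms (top_k is assumed)."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    # Most frequent first; ties broken alphabetically so the result is deterministic.
    frequent = sorted(counts, key=lambda term: (-counts[term], term))[:top_k]
    # Assumed encoding: first 8 hex digits of the SHA-1 digest of each term.
    term_ids = sorted(
        hashlib.sha1(term.encode("utf-8")).hexdigest()[:8] for term in frequent
    )
    # Assumed combination step: join the sorted term identifiers.
    return "-".join(term_ids)


def identifier_similarity(id_a, id_b):
    """Assumed comparison: Jaccard overlap of the term identifiers."""
    a = {part for part in id_a.split("-") if part}
    b = {part for part in id_b.split("-") if part}
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

In this sketch a reference identifier would be compared against each stored archived identifier in turn, yielding one similarity value per archived document.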
-
Publication No.: US10162742B2
Publication date: 2018-12-25
Application No.: US15915866
Filing date: 2018-03-08
Applicant: Open Text Corporation
Inventor: Hong Yuan, Xiaochen Nie, Tingxian Cheng, Yujia Wang, Xia Liu, Chao Chen
Abstract: Software testing techniques based on image recognition are disclosed. In various embodiments, a programmatically implemented image classifier is trained to recognize a screen shot image as being associated with a transaction end condition of a transaction. A test script configured to initiate an iteration of the transaction is run. A start time of the iteration of the transaction is recorded. Screen shot images are generated during performance of the iteration of the transaction to capture a series of screen shot images of at least a portion of a user interface display associated with the iteration of the transaction. The image classifier is used to find an earliest-captured image that matches the transaction end condition. A time associated with the matched image is used as a transaction end time to compute an end-to-end time to perform the iteration of the transaction.
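The timing step in the abstract above can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the trained image classifier is stood in for by a hypothetical callable `matches_end_condition`, and screenshots are assumed to be `(capture_time, image)` pairs.

```python
def end_to_end_time(start_time, screenshots, matches_end_condition):
    """Return the elapsed time from start_time to the earliest matching screenshot.

    screenshots: iterable of (capture_time, image) pairs (assumed layout).
    matches_end_condition: hypothetical stand-in for the trained image classifier.
    Returns None when no screenshot matches the transaction end condition.
    """
    # Walk the screenshots in capture order so the first hit is the earliest one.
    for captured_at, image in sorted(screenshots, key=lambda shot: shot[0]):
        if matches_end_condition(image):
            return captured_at - start_time
    return None
```

For example, with a start time of 10.0 and screenshots captured at 10.5, 11.0, 12.0 and 13.0 where only the last two match, the function returns 2.0, because the image captured at 12.0 is the earliest match.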
-
Publication No.: US20180196742A1
Publication date: 2018-07-12
Application No.: US15915866
Filing date: 2018-03-08
Applicant: Open Text Corporation
Inventor: Hong Yuan, Xiaochen Nie, Tingxian Cheng, Yujia Wang, Xia Liu, Chao Chen
CPC classification number: G06F11/3696, G06F9/466, G06F11/3672, G06F17/30256, G06K9/00624, G06K9/4642, G06K9/6267, G06K9/6271, G06K9/66, G06N99/005
Abstract: Software testing techniques based on image recognition are disclosed. In various embodiments, a programmatically implemented image classifier is trained to recognize a screen shot image as being associated with a transaction end condition of a transaction. A test script configured to initiate an iteration of the transaction is run. A start time of the iteration of the transaction is recorded. Screen shot images are generated during performance of the iteration of the transaction to capture a series of screen shot images of at least a portion of a user interface display associated with the iteration of the transaction. The image classifier is used to find an earliest-captured image that matches the transaction end condition. A time associated with the matched image is used as a transaction end time to compute an end-to-end time to perform the iteration of the transaction.
-
Publication No.: US09852337B1
Publication date: 2017-12-26
Application No.: US14871501
Filing date: 2015-09-30
Applicant: Open Text Corporation
Inventor: Jeroen Mattijs van Rotterdam, Michael T. Mohen, Chao Chen, Kun Zhao
CPC classification number: G06K9/00483, G06F17/30011, G06F17/30699, G06K9/00469
Abstract: A method for assessing similarity of documents. The method includes extracting a reference document text from a reference document, extracting an archived document text from an archived document, and quantifying the reference document and the archived document. Quantifying the reference and archived documents includes tokenizing sentences of the reference document and archived document, respectively, and vectorizing the tokenized sentences to obtain a reference document text vector and an archived document text vector for each sentence of the reference and archived document, respectively. The method also includes determining a document similarity value of the quantified reference document and the quantified archived document. Determining the document similarity value includes calculating a set of vector similarity values for a set of combinations of a reference document text vector and an archived document text vector, and calculating the document similarity value, including a sum of the plurality of vector similarity values.
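The steps named in the abstract above (tokenize sentences, vectorize them, compute vector similarities for combinations of reference and archived vectors, then sum) can be sketched in Python. The abstract does not specify the vectorization or the similarity measure, so the bag-of-words counts, the cosine similarity, and the use of every reference/archived sentence pair are assumptions made for this example; the function names are hypothetical.

```python
import math
import re
from collections import Counter


def sentence_vectors(text):
    """Split text into sentences and vectorize each as bag-of-words counts (assumed)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [Counter(re.findall(r"\w+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (assumed similarity measure)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def document_similarity(reference_text, archived_text):
    """Sum the vector similarity values over all reference/archived sentence pairs."""
    reference = sentence_vectors(reference_text)
    archived = sentence_vectors(archived_text)
    return sum(cosine(r, a) for r in reference for a in archived)
```

Because the value is a plain sum, it grows with the number of sentence pairs; a real system would presumably normalize or threshold it, which this sketch leaves out.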
-
Publication No.: US12189693B2
Publication date: 2025-01-07
Application No.: US18345886
Filing date: 2023-06-30
Applicant: Open Text Corporation
Inventor: Lei Zhang, Chao Chen, Kun Zhao, Jingjing Liu, Ying Teng
IPC: G06F16/93, G06F16/22, G06F16/2455, G06F16/2457
Abstract: A method for document similarity analysis. The method includes generating a reference document content identifier for a reference document, including identifying frequently occurring terms in reference document content, encoding each frequently occurring term in a term identifier and combining the term identifiers to form the reference document content identifier associated with the reference document. The method also includes obtaining at least one document similarity value by comparing the reference document content identifier to a set of archived document content identifiers stored in a document repository.
-
Publication No.: US10970536B2
Publication date: 2021-04-06
Application No.: US16692005
Filing date: 2019-11-22
Applicant: Open Text Corporation
Inventor: Jeroen Mattijs van Rotterdam, Michael T. Mohen, Chao Chen, Kun Zhao
IPC: G06K9/00, G06F16/93, G06F16/335
Abstract: Systems and methods for assessing similarity of documents are provided. Embodiments of the systems and methods include extracting a reference document text from a reference document, extracting an archived document text from an archived document, and quantifying the reference document and the archived document. The systems and methods may also include determining a document similarity value of the quantified reference document and the archived document. Determining the document similarity value includes calculating a set of vector similarity values for a set of combinations of a reference document text vector and an archived document text vector, and calculating the document similarity value, including a sum of the plurality of vector similarity values.
-
Publication No.: US20200089947A1
Publication date: 2020-03-19
Application No.: US16692005
Filing date: 2019-11-22
Applicant: Open Text Corporation
Inventor: Jeroen Mattijs van Rotterdam, Michael T. Mohen, Chao Chen, Kun Zhao
IPC: G06K9/00, G06F16/93, G06F16/335
Abstract: Systems and methods for assessing similarity of documents are provided. Embodiments of the systems and methods include extracting a reference document text from a reference document, extracting an archived document text from an archived document, and quantifying the reference document and the archived document. The systems and methods may also include determining a document similarity value of the quantified reference document and the archived document. Determining the document similarity value includes calculating a set of vector similarity values for a set of combinations of a reference document text vector and an archived document text vector, and calculating the document similarity value, including a sum of the plurality of vector similarity values.
-
Publication No.: US20250124079A1
Publication date: 2025-04-17
Application No.: US18929730
Filing date: 2024-10-29
Applicant: Open Text Corporation
Inventor: Chao Chen, Kunwu Huang, Hongtao Dai, Jingjing Liu
IPC: G06F16/583, G06F16/9038, G06F16/93, G06F40/129, G06F40/166, G06V10/20, G06V10/70, G06V10/98, G06V30/28, G06V30/32
Abstract: Ideogram character analysis includes partitioning an original ideogram character into strokes and mapping each stroke to a corresponding stroke identifier (id) to create an original stroke id sequence that includes stroke identifiers. A candidate ideogram character that has a candidate stroke id sequence within a threshold distance to the original stroke id sequence is selected. One or more embodiments may create a new phrase by replacing the original ideogram character with the candidate ideogram character in a search phrase. One or more embodiments perform a search using the search phrase and the new phrase to obtain a result and present the result. One or more embodiments may replace an original ideogram character in a character recognized document with the candidate ideogram character and store the character recognized document.
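The candidate selection and phrase expansion described in the abstract above can be illustrated with a small Python sketch. The abstract does not name a distance measure or a stroke inventory, so the Levenshtein distance, the five-class stroke ids, and the tiny `STROKE_IDS` table below are assumptions chosen for illustration; the function names are hypothetical.

```python
# Illustrative stroke id table (assumed): 1 = horizontal, 2 = vertical,
# 3 = left-falling, 4 = dot or right-falling, 5 = turning.
STROKE_IDS = {
    "大": [1, 3, 4],
    "太": [1, 3, 4, 4],
    "天": [1, 1, 3, 4],
    "木": [1, 2, 3, 4],
    "口": [2, 5, 1],
}


def edit_distance(a, b):
    """Levenshtein distance between two stroke id sequences (assumed measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def candidates(original, threshold=1):
    """Characters whose stroke id sequence is within threshold of the original's."""
    target = STROKE_IDS[original]
    return [
        char
        for char, strokes in STROKE_IDS.items()
        if char != original and edit_distance(target, strokes) <= threshold
    ]


def expand_phrase(phrase, position, threshold=1):
    """New phrases formed by replacing the character at position with each candidate."""
    original = phrase[position]
    return [
        phrase[:position] + char + phrase[position + 1:]
        for char in candidates(original, threshold)
    ]
```

With this table, expanding the search phrase "大阳" at position 0 yields "太阳" among the new phrases, so a search could be run on both the original phrase and the likely intended one.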