Abstract:
Methods and systems for testing base text direction (BTD) include comparing one or more images from an end-user system to a respective reference image associated with a respective text test case. Each of the one or more images includes respective text test case information. It is determined whether the end-user system produces BTD errors based on the comparison in accordance with one or more BTD error rules.
Abstract:
A method and an electronic device for inputting handwritten characters are provided. The electronic device comprises a touch screen, a memory, and a processor configured to perform the method. The method comprises the steps of: adding a handwriting input on the touch screen; detecting a position of an initial point of the handwriting input; determining an input area for the handwriting input from among a plurality of input areas of the touch screen based on the position of the initial point; determining an operation of the handwriting input based on the position of the initial point and performing the determined operation; and, upon completion of the handwriting input, recognizing the input as a character and displaying the recognized character in the determined input area on the touch screen.
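The area-selection step can be illustrated with a minimal sketch. It assumes rectangular, non-overlapping input areas and a stroke given as a list of (x, y) points; the names `InputArea` and `area_for_stroke` are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class InputArea:
    """Hypothetical rectangular input area; right and bottom are exclusive."""
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def area_for_stroke(stroke: Sequence[Point],
                    areas: Sequence[InputArea]) -> Optional[InputArea]:
    """Pick the input area from the stroke's initial point only.

    Later points are ignored, so a stroke that drifts into a neighbouring
    area still belongs to the area where it started.
    """
    if not stroke:
        return None
    x, y = stroke[0]
    for area in areas:
        if area.contains(x, y):
            return area
    return None
```

Under these assumptions, a stroke that starts at (90, 50) inside a left-hand area and ends inside a right-hand one is still assigned to the left-hand area.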
Abstract:
Methods, systems, and computer-readable media relate to a technique for providing handwriting input functionality on a user device. A handwriting recognition module is trained to have a repertoire comprising multiple non-overlapping scripts and is capable of recognizing tens of thousands of characters using a single handwriting recognition model. The handwriting input module provides real-time handwriting recognition that is independent of stroke order and stroke direction. User interfaces for providing the handwriting input functionality are also disclosed.
Abstract:
Methods and systems for grapheme splitting of text input for recognition are provided. A method may include receiving a text input in a script and segmenting the text input into one or more graphemes. Each of the one or more graphemes may be split into one or more recognition units based on one or more recognition unit identification criteria associated with the script. A text recognition system may then be trained using the recognition units. The text input may be handwritten text received from a user or a scanned image of text.
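The two stages, segmenting into graphemes and then splitting each grapheme into recognition units, can be sketched as follows. This is a simplification, not the method claimed: grapheme segmentation here only attaches combining marks (Unicode general category M*) to the preceding base character rather than implementing full Unicode grapheme cluster rules, and the splitting criterion (spacing marks become their own unit, non-spacing marks stay with the base) is a hypothetical stand-in for a script-specific criterion.

```python
import unicodedata
from typing import List


def segment_graphemes(text: str) -> List[str]:
    """Simplified segmentation: each combining mark joins the preceding base."""
    graphemes: List[str] = []
    for ch in text:
        if graphemes and unicodedata.category(ch).startswith("M"):
            graphemes[-1] += ch
        else:
            graphemes.append(ch)
    return graphemes


def split_into_units(grapheme: str) -> List[str]:
    """Hypothetical criterion: non-spacing marks (Mn) stay attached to the
    preceding unit; every other code point starts a new recognition unit."""
    units: List[str] = []
    for ch in grapheme:
        if units and unicodedata.category(ch) == "Mn":
            units[-1] += ch
        else:
            units.append(ch)
    return units


def recognition_units(text: str) -> List[str]:
    """Segment the text, then split every grapheme into recognition units."""
    return [unit for g in segment_graphemes(text) for unit in split_into_units(g)]
```

With this criterion, the Devanagari grapheme U+0915 U+093F (a consonant followed by a spacing vowel sign) yields two recognition units, while "e" followed by the non-spacing U+0301 stays a single unit.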
Abstract:
Systems and methods are provided for optimizing a glyph-based file. Individual components may be identified within glyphs of a file. Each identified component within a glyph may be a portion of the glyph, and may be a joint component or a disjoint component. Groupings of components may then be determined, based at least in part on identifying similarly shaped components. A representative component may then be selected from each grouping. Composite glyphs may be generated and stored in an optimized file, where each composite glyph includes a reference to at least one representative component.
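A minimal sketch of the grouping and composite-glyph idea follows. It assumes each component is a set of pixel coordinates and treats two components as "similarly shaped" only when they are identical after translation to the origin, which is a much stricter test than a real similarity measure. The names `normalize` and `optimize` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Pixel = Tuple[int, int]
Component = FrozenSet[Pixel]
Reference = Tuple[int, Pixel]  # (index of representative, placement offset)


def normalize(component: Component) -> Tuple[Component, Pixel]:
    """Translate a component to the origin; return the shape and its offset."""
    min_x = min(x for x, _ in component)
    min_y = min(y for _, y in component)
    shape = frozenset((x - min_x, y - min_y) for x, y in component)
    return shape, (min_x, min_y)


def optimize(glyphs: Dict[str, List[Component]]
             ) -> Tuple[List[Component], Dict[str, List[Reference]]]:
    """Store each distinct shape once and rewrite glyphs as references to it."""
    representatives: List[Component] = []
    index_of: Dict[Component, int] = {}
    composites: Dict[str, List[Reference]] = {}
    for name, components in glyphs.items():
        refs: List[Reference] = []
        for component in components:
            shape, offset = normalize(component)
            if shape not in index_of:
                # The first component seen with this shape is the representative.
                index_of[shape] = len(representatives)
                representatives.append(shape)
            refs.append((index_of[shape], offset))
        composites[name] = refs
    return representatives, composites
```

In this sketch the saving comes from storing a shared shape, such as a dot used by several glyphs, once in `representatives` while each composite glyph keeps only an index and an offset.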
Abstract:
An electronic device and method identify a block of text in a portion of an image of the real world captured by a camera of a mobile device, slice sub-blocks from the block, identify characters in the sub-blocks that form a first sequence, and compare the first sequence to a predetermined set of sequences to identify a second sequence therein. The second sequence may be identified as recognized (as a modifier-absent word) when it is not associated with additional information. When the second sequence is associated with additional information, a check is made on pixels in the image, based on a test specified in the additional information. When the test is satisfied, a copy of the second sequence in combination with the modifier is identified as recognized (as a modifier-present word). Storing and using modifier information in addition to a set of sequences of characters enables recognition of words with or without modifiers.
Abstract:
Systems, apparatus, and methods are presented for extracting lower modifiers from a word image before performing optical character recognition (OCR), based on a plurality of tests comprising a first test, a second test, and a third test. The method obtains the word image and performs the plurality of tests. The first test determines whether a vertical line spanning the height of the word image exists. The second test determines whether a jump in the number of components in the lower portion of the word image exists. The third test determines sparseness in the lower portion of the word image. The plurality of tests may run sequentially and/or in parallel. Results from the plurality of tests are compared and accumulated to decide whether a lower modifier exists.
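The three tests and the accumulation of their results can be sketched as below. Everything specific here is an assumption: the word image is a non-empty binary grid (1 = ink), the lower portion is the bottom third, "components" are approximated by horizontal ink runs per row, and the vote polarity (a full-height line counts against a lower modifier, a run-count jump and a sparse lower portion count for one) is one plausible reading of the abstract, not a confirmed rule.

```python
from typing import List

Image = List[List[int]]  # non-empty binary grid, 1 = ink


def lower_portion(img: Image) -> Image:
    """Assumed definition: the bottom third of the rows (at least one row)."""
    return img[-max(1, len(img) // 3):]


def has_full_height_line(img: Image) -> bool:
    """Test 1: some column is ink in every row."""
    return any(all(row[c] for row in img) for c in range(len(img[0])))


def count_runs(row: List[int]) -> int:
    """Number of maximal horizontal ink runs in one row."""
    return sum(1 for i, v in enumerate(row) if v and (i == 0 or not row[i - 1]))


def has_run_jump(img: Image, min_jump: int = 2) -> bool:
    """Test 2: the run count changes abruptly between adjacent lower rows."""
    counts = [count_runs(row) for row in lower_portion(img)]
    return any(abs(a - b) >= min_jump for a, b in zip(counts, counts[1:]))


def is_sparse_lower(img: Image, max_density: float = 0.25) -> bool:
    """Test 3: the ink density of the lower portion is at most max_density."""
    rows = lower_portion(img)
    ink = sum(sum(row) for row in rows)
    return ink / sum(len(row) for row in rows) <= max_density


def has_lower_modifier(img: Image, min_votes: int = 2) -> bool:
    """Accumulate the three results as votes (assumed polarity, see lead-in)."""
    votes = (0 if has_full_height_line(img) else 1) \
        + (1 if has_run_jump(img) else 0) \
        + (1 if is_sparse_lower(img) else 0)
    return votes >= min_votes
```

The three test functions are independent of one another, so they could equally be evaluated in parallel before the votes are summed.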
Abstract:
A method for automatically recognizing Arabic text includes building an Arabic corpus comprising Arabic text files written in different writing styles and ground truths corresponding to each of the Arabic text files, storing writing-style indices in association with the Arabic text files, digitizing a line of Arabic characters to form an array of pixels, dividing the line of Arabic characters into line images, forming a text feature vector from the line images, training a Hidden Markov Model using the Arabic text files and ground truths in the Arabic corpus in accordance with the writing-style indices, and feeding the text feature vector into the Hidden Markov Model to recognize the line of Arabic characters.
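The steps of dividing a digitized line into line images and forming a feature vector can be illustrated with a small sketch. It assumes a binary pixel array (1 = ink), fixed-width vertical strips as the line images, and ink density per strip as the only feature; the abstract does not specify any of these, and the Hidden Markov Model training and decoding are left out.

```python
from typing import List

Image = List[List[int]]  # rows of binary pixels, 1 = ink


def split_into_line_images(line: Image, width: int) -> List[Image]:
    """Cut the line into consecutive vertical strips of `width` columns.

    The last strip is narrower when the line width is not a multiple of width.
    """
    n_cols = len(line[0])
    return [[row[start:start + width] for row in line]
            for start in range(0, n_cols, width)]


def feature_vector(line: Image, width: int) -> List[float]:
    """One feature per strip: the fraction of pixels that are ink."""
    features: List[float] = []
    for strip in split_into_line_images(line, width):
        ink = sum(sum(row) for row in strip)
        total = sum(len(row) for row in strip)
        features.append(ink / total)
    return features
```

A sequence of per-strip features like this is the kind of observation sequence an HMM consumes, one observation per strip along the writing direction.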
Abstract:
An electronic device and method receive a block sliced from a rectangular portion of an image of a real-world scene captured by a camera and use a property of the block to operate one of multiple optical character recognition (OCR) decoders. In an illustrative aspect, a first OCR decoder is configured to recognize characters whose property satisfies a test based on a first limit, the first limit being obtained by reducing a predetermined limit by an overlap amount. In this illustrative aspect, a second OCR decoder is configured to recognize characters whose property does not satisfy the test based on a second limit, the second limit being obtained by increasing the predetermined limit by the overlap amount. When the property of the block satisfies the test, the first OCR decoder is operated; otherwise, the second OCR decoder is operated. In either case, candidates for a character are identified.
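The overlapping limits can be made concrete with a short sketch. It assumes the test is "property greater than limit" (the reading under which the two decoders' ranges overlap) and that the run-time choice between decoders uses the predetermined limit itself; both are assumptions, and all function names are hypothetical.

```python
from typing import Tuple


def make_limits(predetermined_limit: float, overlap: float) -> Tuple[float, float]:
    """First limit is reduced by the overlap, second limit is increased by it."""
    return predetermined_limit - overlap, predetermined_limit + overlap


def first_decoder_covers(prop: float, first_limit: float) -> bool:
    """Range the first decoder is configured for: test satisfied at first limit."""
    return prop > first_limit


def second_decoder_covers(prop: float, second_limit: float) -> bool:
    """Range the second decoder is configured for: test NOT satisfied at second limit."""
    return not (prop > second_limit)


def choose_decoder(prop: float, predetermined_limit: float) -> str:
    """Run-time dispatch (assumed to use the predetermined limit itself)."""
    return "first" if prop > predetermined_limit else "second"
```

With a predetermined limit of 10 and an overlap of 2, both decoders cover property values above 8 and up to 12, so a block whose measured property lands slightly on the wrong side of 10 is still within the range of whichever decoder is chosen.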