Abstract:
Methods, devices, and apparatuses are provided to facilitate positioning of an item of virtual content in an extended reality environment. For example, a first user may access the extended reality environment through a display of a mobile device, and in some examples, the methods may determine positions and orientations of the first user and a second user within the extended reality environment. The methods may also determine a position for placement of the item of virtual content in the extended reality environment based on the determined positions and orientations of the first user and the second user, and perform operations that insert the item of virtual content into the extended reality environment at the determined placement position.
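The abstract does not specify the placement rule, but one plausible sketch consistent with it is to place the item at the midpoint between the two users, nudged along the average of their facing directions so both can see it. All names and the rule itself are illustrative assumptions, not the patented method.

```python
# Hedged sketch: place a virtual item between two users, offset toward
# the average of their facing directions. Positions are 2-D (x, y) and
# yaw angles are in radians; all of this is illustrative.
import math

def placement(pos_a, yaw_a, pos_b, yaw_b, offset=1.0):
    # Midpoint between the two users.
    mid_x = (pos_a[0] + pos_b[0]) / 2
    mid_y = (pos_a[1] + pos_b[1]) / 2
    # Average facing direction, normalized to a unit vector.
    dx = (math.cos(yaw_a) + math.cos(yaw_b)) / 2
    dy = (math.sin(yaw_a) + math.sin(yaw_b)) / 2
    norm = math.hypot(dx, dy) or 1.0
    return mid_x + offset * dx / norm, mid_y + offset * dy / norm
```

For instance, two users standing on the x-axis and both facing along +x would get the item placed one unit ahead of their midpoint.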
Abstract:
Systems and methods are provided for merging multiple images to produce a single fused image having desirable image characteristics derived from the multiple images. For example, the system may determine image characteristics for first and second images. The image characteristics may be related to contrast, exposure, color saturation, and the like. Based on the image characteristics, the system may generate a combined luma weight map. The system may decompose the first and second images and the combined luma weight map. In an example, the first image, the second image, and the combined luma weight map may be represented as scale-space representations having multiple scales or levels. The system may merge the decomposed representations of the first and second images and the combined luma weight map to form a decomposed representation of the fused image. The system may generate the actual fused image from the decomposed representation of the fused image.
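The weighting idea can be sketched in a single-scale form: per-pixel quality measures (here only well-exposedness, as one of the characteristics the abstract mentions) are combined into a luma weight map that blends the two images. The patent describes a multi-scale (scale-space) merge; this flat version only illustrates the weight-map step, and the function names are assumptions.

```python
# Hedged, single-scale sketch of weight-map image fusion. Each pixel's
# weight favors mid-tone (well-exposed) luma values; the two images are
# then blended by their normalized weights. The patented method merges
# decomposed (multi-scale) representations instead.
import math

def well_exposedness(luma, mid=0.5, sigma=0.2):
    # Gaussian curve peaking at mid-tone luma (values in [0, 1]).
    return math.exp(-((luma - mid) ** 2) / (2 * sigma ** 2))

def fuse(img_a, img_b):
    # img_a, img_b: 2-D lists of luma values in [0, 1].
    fused = []
    for row_a, row_b in zip(img_a, img_b):
        out_row = []
        for a, b in zip(row_a, row_b):
            wa = well_exposedness(a)
            wb = well_exposedness(b)
            total = wa + wb or 1.0      # guard against divide-by-zero
            out_row.append((wa * a + wb * b) / total)
        fused.append(out_row)
    return fused
```

A dark pixel fused with a well-exposed one is pulled strongly toward the well-exposed value, which is the intended effect of the combined luma weight map.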
Abstract:
An electronic device and method identify regions that are likely to be text in a natural image or video frame, followed by processing as follows: lines that are nearly vertical are automatically identified in a selected text region, oriented relative to the vertical axis within a predetermined range −max_theta to +max_theta, followed by determination of an angle θ of the identified lines, followed by use of the angle θ to perform perspective correction by warping the selected text region. After perspective correction in this manner, each text region is processed further, to recognize text therein, by performing OCR on each block among a sequence of blocks obtained by slicing the potential text region. Thereafter, the result of text recognition is used to display to the user either the recognized text or other information obtained by use of the recognized text.
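The angle-estimation and warping steps can be sketched as follows, assuming the near-vertical lines have already been detected and are given as segment endpoints; the detection itself, and the full perspective warp (shown here as a simple horizontal shear), are simplified.

```python
# Hedged sketch of skew estimation and correction for a text region:
# keep lines within +/- max_theta of vertical, average their angle
# theta, then shear pixel coordinates so those strokes become vertical.
import math

def estimate_theta(lines, max_theta_deg=30.0):
    # lines: list of ((x0, y0), (x1, y1)) endpoints of detected strokes.
    angles = []
    for (x0, y0), (x1, y1) in lines:
        # Angle measured from the vertical axis, in degrees.
        theta = math.degrees(math.atan2(x1 - x0, y1 - y0))
        if abs(theta) <= max_theta_deg:
            angles.append(theta)
    return sum(angles) / len(angles) if angles else 0.0

def shear_point(x, y, theta_deg):
    # Horizontal shear mapping a stroke tilted by theta back to vertical.
    return x - y * math.tan(math.radians(theta_deg)), y
```

After shearing by the estimated θ, the top endpoint of a tilted stroke lands directly above its bottom endpoint.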
Abstract:
Certain aspects of the present disclosure relate to techniques for low-complexity encoding (compression) of a broad class of signals that are typically not well modeled as sparse in either the time domain or the frequency domain. First, the signal can be split into time segments that are either sparse in the time domain or sparse in the frequency domain, for example by applying an absolute second-order differential operator to the input signal. Next, a different encoding strategy can be applied to each of these time segments depending on the domain in which the sparsity is present.
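The segmentation idea can be sketched as follows: the absolute second-order difference |x[n−1] − 2x[n] + x[n+1]| is small over smooth (frequency-sparse) stretches and large near isolated spikes (time-sparse stretches), so thresholding it labels each sample. The threshold value and labels here are illustrative.

```python
# Hedged sketch: label each sample as belonging to a spiky
# (time-sparse) or smooth (frequency-sparse) stretch by thresholding
# the absolute second-order difference of the signal.

def classify_segments(x, threshold=0.5):
    # Returns a per-sample label: 'time' (spiky) or 'freq' (smooth).
    labels = ['freq'] * len(x)
    for n in range(1, len(x) - 1):
        d2 = abs(x[n - 1] - 2 * x[n] + x[n + 1])
        if d2 > threshold:
            labels[n] = 'time'
    return labels
```

An impulse embedded in a flat signal gets a 'time' label around the spike, while the flat portions stay 'freq', so each run can then be handed to its own encoder.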
Abstract:
A difference in intensities of a pair of pixels in an image is repeatedly compared to a threshold, with the pair of pixels being separated by at least one pixel (“skipped pixel”). When the threshold is found to be exceeded, a selected position of a selected pixel in the pair, and at least one additional position adjacent to the selected position are added to a set of positions. The comparing and adding are performed multiple times to generate multiple such sets, each set identifying a region in the image, e.g. an MSER. Sets of positions, identifying regions whose attributes satisfy a test, are merged to obtain a merged set. Intensities of pixels identified in the merged set are used to generate binary values for the region, followed by classification of the region as text/non-text. Regions classified as text are supplied to an optical character recognition (OCR) system.
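The pair-comparison step can be sketched in a simplified 1-D form: compare pixels separated by one skipped pixel along a row, and when the difference exceeds the threshold, add the selected position and the adjacent skipped position to a candidate set. The region merging, MSER extraction, binarization, and text classification described above are omitted.

```python
# Hedged, simplified sketch of the skipped-pixel comparison that seeds
# candidate region positions. Real usage would scan 2-D coordinates and
# feed the resulting sets into region merging and MSER-style grouping.

def candidate_positions(row, threshold):
    # row: list of pixel intensities; each pair is (row[i], row[i + 2]),
    # i.e. the two pixels straddle one skipped pixel.
    positions = set()
    for i in range(len(row) - 2):
        if abs(row[i] - row[i + 2]) > threshold:
            positions.add(i)        # selected pixel of the pair
            positions.add(i + 1)    # adjacent (skipped) position
    return positions
```

A sharp step edge in the row produces a small cluster of positions around the transition, which is the seed a region-identification pass would then grow.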
Abstract:
An electronic device and method receive (for example, from a memory) a grayscale image of a scene of the real world captured by a camera of a mobile device. The electronic device and method also receive a color image from which the grayscale image is generated, wherein each color pixel is stored as a tuple of multiple components. The electronic device and method determine a new intensity for at least one grayscale pixel in the grayscale image, based on at least one component of a tuple of a color pixel located in correspondence to the at least one grayscale pixel. The determination may be done conditionally, by checking whether a local variance of intensities is below a predetermined threshold in a subset of grayscale pixels located adjacent to the at least one grayscale pixel, and selecting the component that provides the most local variance of intensities.
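The conditional replacement can be sketched as follows, with the 2-D neighborhood simplified to a 1-D window for brevity; the variance check and component selection follow the abstract, while the window shape and helper names are assumptions.

```python
# Hedged sketch: where the grayscale neighborhood is nearly flat (low
# local variance), replace the pixel's intensity with the value of the
# color component (R, G, or B) whose neighborhood varies the most.

def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def enhance_pixel(gray_window, color_windows, center_gray, threshold):
    # gray_window: grayscale intensities around the pixel.
    # color_windows: {'r': [...], 'g': [...], 'b': [...]} neighborhoods.
    if variance(gray_window) >= threshold:
        return center_gray               # enough local contrast already
    # Pick the component whose neighborhood has the most variance.
    best = max(color_windows, key=lambda c: variance(color_windows[c]))
    mid = len(color_windows[best]) // 2
    return color_windows[best][mid]      # new intensity for the pixel
```

A region that is flat in grayscale but varies in the green channel, say, would thus recover contrast from green, which is the point of the conditional check.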
Abstract:
Certain aspects of the present disclosure relate to a method for quantizing signals and reconstructing signals, and/or encoding or decoding data for storage or transmission. Points of a signal may be determined as local extrema or points where an absolute rise of the signal is greater than a threshold. The tread and value of the points may be quantized, and certain of the quantizations may be discarded before the quantizations are transmitted. After being received, the signal may be reconstructed from the quantizations using an iterative process.
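The point-selection step can be sketched as follows: keep samples that are local extrema, or whose absolute rise from the last kept sample exceeds a threshold, and store each kept point as a (tread, value) pair, i.e. its time index and quantized amplitude. The iterative reconstruction and the discarding of certain quantizations are omitted, and the quantization step size is illustrative.

```python
# Hedged sketch: select local extrema and large-rise points of a signal
# and quantize each kept point's value; the tread is its sample index.

def select_points(x, rise_threshold, step=1.0):
    points = []
    for n, v in enumerate(x):
        is_extremum = (
            0 < n < len(x) - 1
            and (x[n - 1] < v > x[n + 1] or x[n - 1] > v < x[n + 1])
        )
        big_rise = points and abs(v - points[-1][1]) > rise_threshold
        if n == 0 or is_extremum or big_rise:
            # Quantize the value to the nearest multiple of step.
            points.append((n, round(v / step) * step))
    return points
```

A signal with one spike keeps only a few points (the start, the peak, and the large drop after it), which is the compression the abstract aims at.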
Abstract:
An image of the real world is processed to identify blocks as candidates to be recognized. Each block is subdivided into sub-blocks, and each sub-block is traversed to obtain counts, in a group for each sub-block. Each count in the group records either the presence of transitions between intensity values of pixels or the absence of such transitions. Hence, each pixel in a sub-block contributes to at least one of the counts in each group. The counts in a group for a sub-block are normalized, based at least on a total number of pixels in the sub-block. Vector(s) for each sub-block including such normalized counts may be compared with multiple predetermined vectors of corresponding symbols in a set, using any metric of divergence between probability density functions (e.g. the Jensen-Shannon divergence metric). Whichever symbol has a predetermined vector that most closely matches the vector(s) is identified and stored.
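The feature-and-matching pipeline can be sketched as follows: count transitions and non-transitions along the rows of a binarized sub-block, normalize so the counts form a probability vector, and compare against stored per-symbol vectors with the Jensen-Shannon divergence. The traversal pattern and stored vectors here are simplified assumptions.

```python
# Hedged sketch: normalized transition counts as a sub-block feature
# vector, matched to symbol templates by Jensen-Shannon divergence.
import math

def transition_vector(sub_block):
    # sub_block: 2-D list of 0/1 pixel values; counts horizontal
    # transitions (a != b) and non-transitions (a == b).
    trans = no_trans = 0
    for row in sub_block:
        for a, b in zip(row, row[1:]):
            if a != b:
                trans += 1
            else:
                no_trans += 1
    total = trans + no_trans or 1   # normalize to a probability vector
    return [trans / total, no_trans / total]

def js_divergence(p, q):
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def best_symbol(vec, symbol_vectors):
    # symbol_vectors: {symbol: stored normalized-count vector}.
    return min(symbol_vectors,
               key=lambda s: js_divergence(vec, symbol_vectors[s]))
```

A high-transition sub-block (alternating pixels) matches the stored symbol whose template also has a high transition fraction.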
Abstract:
An electronic device and method identify a block of text in a portion of an image of the real world captured by a camera of a mobile device, slice sub-blocks from the block, identify characters in the sub-blocks that form a first sequence, and compare the first sequence to a predetermined set of sequences to identify a second sequence therein. The second sequence may be identified as recognized (as a modifier-absent word) when not associated with additional information. When the second sequence is associated with additional information, a check is made on pixels in the image, based on a test specified in the additional information. When the test is satisfied, a copy of the second sequence in combination with the modifier is identified as recognized (as a modifier-present word). Storage and use of modifier information in addition to a set of sequences of characters enables recognition of words with or without modifiers.
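The lookup logic can be sketched as follows, with the pixel test abstracted into a callable and the word set, modifier mark, and combination rule invented purely for illustration.

```python
# Hedged sketch of modifier-aware word lookup: a matched sequence whose
# entry carries a modifier triggers a pixel test on the image; if the
# test passes, the modifier-present word is reported instead.

def recognize(sequence, word_set, pixel_test):
    # word_set: {sequence: modifier-or-None}. pixel_test(modifier)
    # stands in for the image check specified by the additional info.
    if sequence not in word_set:
        return None                      # not a known word
    modifier = word_set[sequence]
    if modifier is None:
        return sequence                  # modifier-absent word
    if pixel_test(modifier):
        return sequence + modifier       # modifier-present word
    return sequence
```

Storing the modifier and its test separately from the base sequences keeps the dictionary small while still distinguishing, e.g., an accented word from its unaccented base form.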