APPARATUS AND METHODS FOR IMAGE SEGMENTATION USING MACHINE LEARNING PROCESSES

    Publication number: US20240078679A1

    Publication date: 2024-03-07

    Application number: US17901429

    Application date: 2022-09-01

    CPC classification number: G06T7/11 G06T7/74 G06T2207/20112

    Abstract: Methods, systems, and apparatuses for image segmentation are provided. For example, a computing device may obtain an image, and may apply a process to the image to generate input image feature data and input image segmentation data. Further, the computing device may obtain reference image feature data and reference image classification data for a plurality of reference images. The computing device may generate reference image segmentation data based on the reference image feature data, the reference image classification data, and the input image feature data. The computing device may further blend the input image segmentation data and the reference image segmentation data to generate blended image segmentation data. The computing device may store the blended image segmentation data within a data repository. In some examples, the computing device provides the blended image segmentation data for display.
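The blending step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the dot-product similarity used to match input features to reference features, and the fixed blend weight `alpha` are all assumptions introduced here.

```python
import numpy as np

def blend_segmentations(input_feats, input_seg, ref_feats, ref_labels,
                        alpha=0.5, num_classes=3):
    """Derive a reference-based segmentation from feature similarity,
    then blend it with the input segmentation (hypothetical helper)."""
    # Dot-product similarity between each input pixel feature (P, D)
    # and each reference image feature (R, D).
    sim = input_feats @ ref_feats.T                     # (P, R)
    nearest = sim.argmax(axis=1)                        # best-matching reference per pixel
    # One-hot class map taken from the reference classification labels.
    ref_seg = np.eye(num_classes)[ref_labels[nearest]]  # (P, C)
    # Blend the two per-pixel class distributions.
    return alpha * input_seg + (1.0 - alpha) * ref_seg
```

The blended result is itself a valid per-pixel class distribution whenever both inputs are, which is what makes it suitable for storage or display as segmentation data.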

    DISTANCE-BASED BOUNDARY AWARE SEMANTIC SEGMENTATION

    Publication number: US20220156528A1

    Publication date: 2022-05-19

    Application number: US17528141

    Application date: 2021-11-16

    Abstract: A method applies a distance-based loss function to a boundary recognition model. The method classifies boundaries of an input with the boundary recognition model. The method also performs semantic segmentation based on the classifying of the boundaries, and outputs a segmentation map showing different classes of objects from the input, based on the semantic segmentation. The method may train an inverse transforming artificial neural network to predict a perspective transformation of an image so that the trained artificial neural network represents the distance-based loss function. The method may freeze weights of the inverse transforming artificial neural network, after training, to obtain the distance-based loss function. Training of the inverse transforming artificial neural network may include generating shifted, translated, and scaled versions of the image such that a ground truth comprises values corresponding to the amounts of shifting, translating, and scaling.
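The training-data generation described in the last sentence can be sketched as below. This is an illustrative assumption throughout: the parameter ranges, the roll-based translation, and the intensity multiplication standing in for spatial scaling are not details from the patent.

```python
import numpy as np

def make_transform_sample(image, rng):
    """Generate one training sample for the inverse-transforming network:
    a transformed image plus the ground-truth transform amounts.
    (Hypothetical helper; ranges and warping are illustrative.)"""
    dy, dx = rng.integers(-4, 5, size=2)   # shift/translation amounts in pixels
    scale = rng.uniform(0.8, 1.2)          # scaling amount
    # Translate via np.roll as a simple stand-in for a full affine warp;
    # a real pipeline would also spatially rescale the image.
    warped = np.roll(np.roll(image, int(dy), axis=0), int(dx), axis=1)
    warped = warped * scale                # crude proxy for the scaling component
    # The ground truth is the vector of applied amounts, as the abstract states.
    ground_truth = np.array([dy, dx, scale], dtype=float)
    return warped, ground_truth
```

A network trained to recover `ground_truth` from `warped` can then be frozen, and its prediction error on model outputs used as the distance-based loss.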

    HARDWARE-AWARE EFFICIENT ARCHITECTURES FOR TEXT-TO-IMAGE DIFFUSION MODELS

    Publication number: US20250131606A1

    Publication date: 2025-04-24

    Application number: US18492572

    Application date: 2023-10-23

    Abstract: A processor-implemented method includes receiving a text-semantic input at a first stage of a neural network, the first stage including a first convolutional block and no attention layers. The method receives, at a second stage, a first output from the first stage. The second stage comprises a first down sampling block including a first attention layer and a second convolutional block. The method receives, at a third stage, a second output from the second stage. The third stage comprises a first up sampling block including a second attention layer and a first set of convolutional blocks. The method receives, at a fourth stage, the first output from the first stage and a third output from the third stage. The fourth stage comprises a second up sampling block including no attention layers and a second set of convolutional blocks. The method generates an image at the fourth stage, based on the text-semantic input.
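The four-stage dataflow above can be sketched with placeholder operations. The resolution factors, the concatenation-based skip connection, and the stand-in functions below are illustrative assumptions; real stages would use learned convolution and attention blocks.

```python
import numpy as np

# Placeholder ops: these stand-ins only model the dataflow and resolutions,
# not learned behavior.
def conv_block(x): return x
def attention(x):  return x
def down(x):       return x[:, ::4, ::4]                         # down-sample by 4
def up(x):         return x.repeat(2, axis=1).repeat(2, axis=2)  # up-sample by 2

def forward(text_semantic):
    """Four-stage layout with attention only in the middle stages
    (hypothetical sketch; input is a (C, H, W) array, H and W divisible by 4)."""
    s1 = conv_block(text_semantic)             # stage 1: conv only, no attention
    s2 = conv_block(attention(down(s1)))       # stage 2: down-sampling + first attention
    s3 = conv_block(attention(up(s2)))         # stage 3: up-sampling + second attention
    s4 = up(s3)                                # stage 4: second up-sampling, no attention
    merged = np.concatenate([s1, s4], axis=0)  # stage 4 also receives the stage-1 output
    return conv_block(merged)
```

Keeping attention out of the first and last (highest-resolution) stages is what makes the layout hardware-efficient: attention cost grows quadratically with the number of spatial positions, so it is confined to the reduced-resolution middle stages.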
