Differencing based self-supervised scene change detection (D-SSCD) with temporal consistency

    Publication No.: US12062188B2

    Publication date: 2024-08-13

    Application No.: US17691397

    Application date: 2022-03-10

    CPC classification number: G06T7/20 G06N3/088 G06T2207/20081 G06T2207/20084

    Abstract: A computer-implemented network for executing a self-supervised scene change detection method in which image pairs (T0, T1) from different time instances are subjected to random photometric transformations to obtain two pairs of augmented images (T0→T0′, T0″; T1→T1′, T1″), which augmented images are passed into an encoder (fθ) and a projection head (gϕ) to provide corresponding feature representations. Absolute feature differencing is applied over the outputs of the projection head (gϕ) to obtain difference representations (d1, d2) of changed features between the pair of images, and a self-supervised objective function (LSSL) is applied on the difference representations d1 and d2 to maximize a cross-correlation of the changed features, wherein d1 and d2 are defined as

    $$d_1 = \left| g(f(T_0')) - g(f(T_1')) \right|, \qquad d_2 = \left| g(f(T_0'')) - g(f(T_1'')) \right| \tag{1}$$

    Furthermore, an invariant prediction and change consistency loss is applied in the D-SSCD Network to reduce the effects of differences in the lighting conditions or camera viewpoints by enhancing the image alignment between the temporal images in the decision and feature space.
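
    The differencing step and the cross-correlation objective described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not give the form of LSSL, so the sketch assumes a Barlow Twins style loss that pushes the cross-correlation matrix of d1 and d2 toward the identity. The function names (`abs_diff`, `cross_correlation`, `ssl_loss`) and the weight `lam` are hypothetical, and the inputs stand in for projection-head outputs g(f(·)).

```python
import math

def abs_diff(a, b):
    """Element-wise |a - b| for two batches of feature vectors (lists of lists)."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def standardize(batch, eps=1e-9):
    """Return the features column by column, each standardized across the batch."""
    n, dim = len(batch), len(batch[0])
    cols = []
    for j in range(dim):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        cols.append([(v - mean) / (std + eps) for v in col])
    return cols  # cols[j][i] is feature j of sample i

def cross_correlation(d1, d2):
    """Cross-correlation matrix between the feature dimensions of d1 and d2."""
    z1, z2 = standardize(d1), standardize(d2)
    n = len(d1)
    return [[sum(a * b for a, b in zip(c1, c2)) / n for c2 in z2] for c1 in z1]

def ssl_loss(d1, d2, lam=0.005):
    """Assumed Barlow Twins style objective: diagonal toward 1, off-diagonal toward 0."""
    c = cross_correlation(d1, d2)
    dim = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(dim))
    off_diag = sum(c[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
    return on_diag + lam * off_diag

# Toy projected features for a batch of three image pairs (made-up numbers).
g_t0_a = [[0.2, 0.9], [0.5, 0.1], [0.8, 0.4]]   # stands in for g(f(T0'))
g_t1_a = [[0.1, 0.3], [0.5, 0.6], [0.2, 0.4]]   # stands in for g(f(T1'))
d1 = abs_diff(g_t0_a, g_t1_a)                   # difference representation d1
```

    Under this assumed loss, two identical difference representations score close to zero, while representations whose changed features disagree across the batch are penalized.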

    SPARSE CODING IN A DUAL MEMORY SYSTEM FOR LIFELONG LEARNING

    Publication No.: US20240135169A1

    Publication date: 2024-04-25

    Application No.: US18148257

    Application date: 2022-12-29

    CPC classification number: G06N3/08 G06N3/0442 G06N3/048

    Abstract: A computer-implemented method that encourages sparse coding in deep neural networks and mimics the interplay of multiple memory systems for maintaining a balance between stability and plasticity. To this end, the method includes a multi-memory experience replay mechanism that employs sparse coding. Activation sparsity is enforced along with a complementary dropout mechanism, which encourages the model to activate similar neurons for semantically similar inputs while reducing the overlap with activation patterns of semantically dissimilar inputs. The semantic dropout provides an efficient mechanism for balancing reusability and interference of features depending on the similarity of classes across tasks. Furthermore, the method includes the step of maintaining an additional long-term semantic memory that aggregates the information encoded in the synaptic weights of the working memory.
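
    Two of the ingredients named in this abstract, activation sparsity and a long-term memory that aggregates the working memory's weights, can be illustrated with a short sketch. The abstract does not say how either is implemented, so both choices here are assumptions: sparsity is modeled as k-winner-take-all, and aggregation as an exponential moving average. The names `k_winner_take_all`, `ema_update`, and `decay` are hypothetical.

```python
def k_winner_take_all(activations, k):
    """Keep the k largest activations of a layer and zero out the rest."""
    if k <= 0:
        return [0.0] * len(activations)
    order = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    winners = set(order[:k])
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]

def ema_update(semantic, working, decay=0.999):
    """Move the long-term (semantic) weights slightly toward the working weights."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(semantic, working)]

# Only the two strongest of four units stay active.
sparse = k_winner_take_all([0.1, 0.9, 0.5, 0.3], k=2)

# The semantic memory drifts slowly toward the working memory's weights.
semantic = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.5)
```

    A decay close to 1 makes the semantic memory change slowly, which is one plausible way to favor stability, while the working memory remains free to adapt.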

    Method and System for Multi-Task Structural Learning

    Publication No.: US20240037455A1

    Publication date: 2024-02-01

    Application No.: US17894401

    Application date: 2022-08-24

    CPC classification number: G06N20/10 G06K9/6256 G06K9/6215 G06N3/063 G06N5/022

    Abstract: A computer-implemented method for multi-task structural learning in an artificial neural network in which both the architecture and its parameters are learned simultaneously. The method utilizes two neural operators, namely, neuron creation and neuron removal, to aid in structural learning. The method creates excess neurons by starting from a disparate network for each task. Through the progress of training, corresponding task neurons in a layer pave the way for a specialized group neuron leading to a structural change. In the task learning phase of training, different neurons specialize in different tasks. In the interleaved structural learning phase, locally similar task neurons, before being removed, transfer their knowledge to a newly created group neuron. The training is completed with a final fine-tuning phase where only the multi-task loss is used.
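
    The structural learning step, in which similar task neurons give way to one group neuron, might look roughly like the sketch below. The abstract specifies neither the similarity measure nor the knowledge-transfer mechanism, so cosine similarity between weight vectors and simple weight averaging are stand-ins chosen purely for illustration. `merge_similar_neurons` and `threshold` are hypothetical names.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two weight vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def merge_similar_neurons(task_neurons, threshold=0.9):
    """Return a group neuron if all corresponding task neurons are similar, else None.

    task_neurons maps a task name to the weight vector of that task's neuron
    at the same position in a layer. Averaging stands in for knowledge transfer.
    """
    vectors = list(task_neurons.values())
    if any(cosine(u, v) < threshold for u, v in combinations(vectors, 2)):
        return None  # tasks disagree: keep the separate task neurons
    dim = len(vectors[0])
    return [sum(vec[j] for vec in vectors) / len(vectors) for j in range(dim)]

group = merge_similar_neurons({"segmentation": [1.0, 0.0], "depth": [1.0, 0.0]})
kept_apart = merge_similar_neurons({"segmentation": [1.0, 0.0], "depth": [0.0, 1.0]})
```

    When a group neuron is returned, the task neurons it replaces would be removed; when `None` is returned, the layer keeps its task-specific neurons.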

    Differencing Based Self-Supervised Scene Change Detection (D-SSCD) with Temporal Consistency

    Publication No.: US20230289977A1

    Publication date: 2023-09-14

    Application No.: US17691397

    Application date: 2022-03-10

    CPC classification number: G06T7/20 G06N3/088 G06T2207/20081 G06T2207/20084

    Abstract: A computer-implemented network for executing a self-supervised scene change detection method in which image pairs (T0, T1) from different time instances are subjected to random photometric transformations to obtain two pairs of augmented images (T0→T0′, T0″; T1→T1′, T1″), which augmented images are passed into an encoder (fθ) and a projection head (gϕ) to provide corresponding feature representations. Absolute feature differencing is applied over the outputs of the projection head (gϕ) to obtain difference representations (d1, d2) of changed features between the pair of images, and a self-supervised objective function (LSSL) is applied on the difference representations d1 and d2 to maximize a cross-correlation of the changed features, wherein d1 and d2 are defined as

    $$d_1 = \left| g(f(T_0')) - g(f(T_1')) \right|, \qquad d_2 = \left| g(f(T_0'')) - g(f(T_1'')) \right| \tag{1}$$

    Furthermore, an invariant prediction and change consistency loss is applied in the D-SSCD Network to reduce the effects of differences in the lighting conditions or camera viewpoints by enhancing the image alignment between the temporal images in the decision and feature space.

    Method and System for Generating Ground-Truth Annotations of Roadside Objects in Video Data

    Publication No.: US20220092320A1

    Publication date: 2022-03-24

    Application No.: US17482339

    Application date: 2021-09-22

    Abstract: A method and system for generating ground-truth annotations for the detection and classification of roadside objects in video data. The method combines: an object detector that detects object instances of roadside objects in each frame of a video; a visual object tracker that detects and tracks each roadside object across the remaining video frames in which it appears and clusters the detected instances of the same roadside object into an object track; a trajectory analyzer that filters out object tracks unlikely to originate from roadside objects; a classification model that classifies each object instance in the object track into a predefined roadside object class, after which the object track as a whole is classified by seeking consensus among the individual object instance classifications; and a classification consistency check that determines whether the resulting roadside object class can be assigned automatically to the object track as a ground-truth annotation or whether the annotation should be verified manually by an operator. The invention thereby converts model prediction labels into ground-truth annotations automatically, yielding annotations of similar reliability to manual annotation while significantly reducing the manual effort involved.
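
    The track-level consensus and consistency check described in this abstract can be illustrated with a small sketch. It assumes that consensus means a majority vote over the per-instance labels and that consistency is the share of instances agreeing with the winning label; the abstract states neither. The function `annotate_track` and the threshold `min_consistency` are hypothetical.

```python
from collections import Counter

def annotate_track(instance_labels, min_consistency=0.8):
    """Classify an object track from the labels of its individual instances.

    Returns (label, consistency, auto_accept). auto_accept is False when the
    track should be sent to an operator for manual verification.
    """
    if not instance_labels:
        raise ValueError("an object track needs at least one instance")
    label, votes = Counter(instance_labels).most_common(1)[0]
    consistency = votes / len(instance_labels)
    return label, consistency, consistency >= min_consistency

# Nine of ten instances agree, so the label is accepted automatically.
confident = annotate_track(["sign"] * 9 + ["pole"])

# No clear majority, so the track is flagged for manual verification.
uncertain = annotate_track(["sign", "pole", "sign", "pole", "lamp"])
```

    Raising `min_consistency` sends more tracks to the operator and lowers the risk of a wrong automatic annotation; lowering it does the opposite.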
