APPARATUS AND METHOD FOR 3D DYNAMIC SPARSE CONVOLUTION

    Publication No.: US20250148761A1

    Publication date: 2025-05-08

    Application No.: US18717894

    Filing date: 2022-03-03

    Abstract: The disclosure provides an apparatus, method, device and medium for 3D dynamic sparse convolution. The method includes: receiving an input feature map of a 3D data sample; performing input feature map partition to divide the input feature map into a plurality of disjoint input feature map groups; performing a shared 3D dynamic sparse convolution to the plurality of disjoint input feature map groups respectively to obtain a plurality of output feature maps corresponding to the plurality of disjoint input feature map groups, wherein the shared 3D dynamic sparse convolution comprises a shared 3D dynamic sparse convolutional kernel; and performing output feature map grouping to sequentially stack the plurality of output feature maps to obtain an output feature map corresponding to the input feature map. (FIG. 2).
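The partition, shared-kernel, and stacking steps named in this abstract can be illustrated with a minimal pure-Python sketch. Everything below is an assumption made for illustration: the names (`partition`, `sparse_conv3d`, `shared_group_conv`), the dictionary-based sparse layout, the choice of contiguous channel groups, and the rule that outputs are computed only at voxels active in the group. The kernel is taken as given, so the sketch does not model how the disclosure makes it dynamic.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
SparseChannel = Dict[Voxel, float]        # active voxel -> feature value
# kernel[o][i] maps a spatial offset (dz, dy, dx) to a weight (hypothetical layout)
Kernel = List[List[Dict[Voxel, float]]]


def partition(channels: List[SparseChannel], groups: int) -> List[List[SparseChannel]]:
    """Split the channel list into `groups` disjoint, equally sized groups."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by the group count")
    size = len(channels) // groups
    return [channels[g * size:(g + 1) * size] for g in range(groups)]


def sparse_conv3d(group: List[SparseChannel], kernel: Kernel) -> List[SparseChannel]:
    """Apply `kernel` to one group, producing values only at the group's active voxels."""
    active = set().union(*(ch.keys() for ch in group))
    out: List[SparseChannel] = []
    for per_input in kernel:              # one entry per output channel
        result: SparseChannel = {}
        for (z, y, x) in active:
            acc = 0.0
            for ch, taps in zip(group, per_input):
                for (dz, dy, dx), w in taps.items():
                    # Inactive neighbours contribute zero.
                    acc += w * ch.get((z + dz, y + dy, x + dx), 0.0)
            result[(z, y, x)] = acc
        out.append(result)
    return out


def shared_group_conv(channels: List[SparseChannel], kernel: Kernel,
                      groups: int) -> List[SparseChannel]:
    """Partition, convolve every group with the same kernel, then stack in order."""
    outputs: List[SparseChannel] = []
    for group in partition(channels, groups):
        outputs.extend(sparse_conv3d(group, kernel))
    return outputs
```

Because every group reuses one kernel, the kernel only needs as many input channels as a single group holds, which is where the parameter saving of this kind of design would come from.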

    APPARATUS AND METHOD FOR DYNAMIC QUADRUPLE CONVOLUTION IN 3D CNN

    Publication No.: US20240312196A1

    Publication date: 2024-09-19

    Application No.: US18565967

    Filing date: 2021-11-30

    CPC classification number: G06V10/82 G06N3/0464 G06V20/42

    Abstract: An apparatus, method, device and medium for dynamic quadruple convolution in a 3-dimensional (3D) convolutional neural network (CNN) are provided. The apparatus includes: a multi-dimensional attention block configured to: receive an input feature map of a video data sample; and dynamically generate convolutional kernel scalars along four dimensions of a 3-dimensional convolution kernel space based on the input feature map, the four dimensions comprising an output channel number, an input channel number, a temporal size and a spatial size; and a convolution block configured to sequentially multiply the generated convolutional kernel scalars with a static 3D convolution kernel in a matrix-vector product way to obtain a dynamic kernel of dynamic quadruple convolution.
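As a rough illustration of scaling a static kernel along its four axes, the sketch below multiplies each weight by one scalar per axis index. It is a simplification under stated assumptions: the function name `modulate_kernel` and the nested-list layout `[out_ch][in_ch][t][s]` are hypothetical, the four scalar vectors are taken as inputs rather than generated by an attention block, and the element-wise product stands in for the "matrix-vector product way" the abstract mentions without specifying.

```python
from typing import List, Sequence

Kernel4D = List[List[List[List[float]]]]  # indexed [out_ch][in_ch][t][s]


def modulate_kernel(static: Kernel4D,
                    a_out: Sequence[float],
                    a_in: Sequence[float],
                    a_t: Sequence[float],
                    a_s: Sequence[float]) -> Kernel4D:
    """Scale a static kernel by one scalar per index along each of its four axes."""
    return [
        [
            [
                [w * a_out[o] * a_in[i] * a_t[t] * a_s[s] for s, w in enumerate(row)]
                for t, row in enumerate(plane)
            ]
            for i, plane in enumerate(block)
        ]
        for o, block in enumerate(static)
    ]
```

The static kernel is left untouched, so the same weights can be re-modulated with new scalars for each input sample.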

    ENHANCED TECHNIQUES FOR REAL-TIME MULTI-PERSON THREE-DIMENSIONAL POSE TRACKING USING A SINGLE CAMERA

    Publication No.: US20240312055A1

    Publication date: 2024-09-19

    Application No.: US18569996

    Filing date: 2021-12-10

    Abstract: This disclosure describes systems, methods, and devices related to real-time multi-person three-dimensional pose tracking using a single camera. A method may include receiving, by a device, two-dimensional image data from a camera, the two-dimensional image data representing a first person and a second person; generating, based on the two-dimensional image data, two-dimensional positions of body parts represented by the first person; generating, using a deep neural network, based on the two-dimensional positions, a three-dimensional pose regression of the body parts represented by the first person; identifying, based on the two-dimensional positions and the three-dimensional pose regression, contact between a ground plane and a foot of the first person; generating an absolute three-dimensional position of the contact between the ground plane and the foot of the first person; and generating, based on the absolute three-dimensional position, a three-dimensional pose of the body parts represented by the first person.

    DYNAMIC CONDITIONAL POOLING FOR NEURAL NETWORK PROCESSING

    Publication No.: US20240013047A1

    Publication date: 2024-01-11

    Application No.: US18252231

    Filing date: 2020-12-24

    CPC classification number: G06N3/08 G06V10/7715

    Abstract: Dynamic conditional pooling for neural network processing is disclosed. An example of a storage medium includes instructions for receiving an input at a convolutional layer of a convolutional neural network (CNN); receiving an input sample at a pooling stage of the convolutional layer; generating a plurality of soft weights based on the input sample; performing conditional aggregation on the input sample utilizing the plurality of soft weights to generate an aggregated value; and performing conditional normalization on the aggregated value to generate an output for the convolutional layer.
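The three stages in this abstract (soft weights, conditional aggregation, conditional normalization) can be sketched for a single pooling window as follows. The abstract does not say how any stage is computed, so every concrete choice here is an assumption: a temperature-scaled softmax for the soft weights, a weighted sum for the aggregation, and a standardization against the window's own mean and variance with hypothetical `gamma` and `beta` parameters for the normalization.

```python
import math
from typing import List, Sequence


def soft_weights(window: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over the window values (assumed weighting rule), stabilised by the max."""
    peak = max(window)
    exps = [math.exp((v - peak) / temperature) for v in window]
    total = sum(exps)
    return [e / total for e in exps]


def conditional_aggregate(window: Sequence[float], temperature: float = 1.0) -> float:
    """Weighted sum of the window using input-dependent soft weights."""
    weights = soft_weights(window, temperature)
    return sum(w * v for w, v in zip(weights, window))


def conditional_pool(window: Sequence[float], gamma: float = 1.0, beta: float = 0.0,
                     eps: float = 1e-5, temperature: float = 1.0) -> float:
    """Aggregate, then normalise against the window's own statistics (assumed rule)."""
    agg = conditional_aggregate(window, temperature)
    mean = sum(window) / len(window)
    var = sum((v - mean) ** 2 for v in window) / len(window)
    return gamma * (agg - mean) / math.sqrt(var + eps) + beta
```

With this weighting rule, a large temperature pushes the aggregation toward average pooling and a small one toward max pooling, so the result varies between the two depending on the input.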

    OMNI-SCALE CONVOLUTION FOR CONVOLUTIONAL NEURAL NETWORKS

    Publication No.: US20230410496A1

    Publication date: 2023-12-21

    Application No.: US18252164

    Filing date: 2020-12-23

    CPC classification number: G06V10/82

    Abstract: Omni-scale convolution for convolutional neural networks is disclosed. An example of an apparatus includes one or more processors to process data, including processing for a convolutional neural network (CNN); and a memory to store data, including CNN data, wherein processing of input data by the CNN includes implementing omni-scale convolution in one or more convolutional layers of the CNN, implementation of the omni-scale convolution into a convolutional layer of the one or more convolutional layers including at least applying multiple dilation rates in a plurality of kernels of a kernel lattice of the convolutional layer, and applying a cyclic pattern for the multiple dilation rates in the plurality of kernels of the convolutional layer.
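One way to picture a cyclic pattern of dilation rates over a kernel lattice is the sketch below, which assigns a rate to each (output channel, input channel) kernel and lists the sampling offsets that rate implies. The indexing rule `(o + i) % len(rates)`, the function names, and the 2D square kernel are assumptions for illustration; the abstract states only that multiple dilation rates are applied in a cyclic pattern.

```python
from typing import List, Sequence, Tuple


def dilation_lattice(c_out: int, c_in: int, rates: Sequence[int]) -> List[List[int]]:
    """Assign a dilation rate to every kernel, cycling along both channel axes."""
    return [[rates[(o + i) % len(rates)] for i in range(c_in)] for o in range(c_out)]


def tap_offsets(dilation: int, size: int = 3) -> List[Tuple[int, int]]:
    """Sampling offsets (dy, dx) of a size x size kernel at the given dilation."""
    half = size // 2
    return [(dy * dilation, dx * dilation)
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1)]
```

For example, `dilation_lattice(2, 4, [1, 2, 4])` yields the rows `[1, 2, 4, 1]` and `[2, 4, 1, 2]`, so each output channel mixes several receptive-field sizes without adding parameters.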

    A GENERIC MODULAR SPARSE THREE-DIMENSIONAL (3D) CONVOLUTION DESIGN UTILIZING SPARSE 3D GROUP CONVOLUTION

    Publication No.: US20220147791A1

    Publication date: 2022-05-12

    Application No.: US17435657

    Filing date: 2019-06-21

    Abstract: Embodiments are generally directed to sparse 3D convolution acceleration in a convolutional layer of an artificial neural network model. An embodiment of an apparatus includes one or more processors including a graphics processor to process data; and a memory for storage of data, including feature maps. The one or more processors are to provide for sparse 3D convolution acceleration by applying a shared 3D convolutional kernel/filter to an input feature map to produce an output feature map, including increasing sparsity of the input feature map by partitioning it into multiple disjoint input groups; generation of multiple disjoint output groups corresponding to the input groups by performing a convolution calculation represented by the shared 3D convolutional kernel/filter on all feature values associated with active/valid voxels of each input group to produce corresponding feature values within corresponding output groups; and outputting the output feature map by sequentially stacking the output groups.
