-
Publication No.: US11189075B2
Publication Date: 2021-11-30
Application No.: US16893107
Filing Date: 2020-06-04
Applicant: NVIDIA CORPORATION
Inventor: Samuli Laine , Timo Aila , Tero Karras , Gregory Muthler , William P. Newhall, Jr. , Ronald C. Babich, Jr. , Craig Kolb , Ignacio Llamas , John Burgess
Abstract: Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
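The query-specific behavior described in this abstract can be sketched as a toy model. Everything named here (the `OPCODES` table, the `Action` values, the `Query` and `Node` fields) is hypothetical and chosen only for illustration; the abstract does not list the opcodes or actions, and real ray/box intersection tests are left out so that only the test-and-map step is visible.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    DEFAULT = "default"  # keep the normal traversal behavior
    CULL = "cull"        # skip this node and everything below it


# Hypothetical opcode table: each test combines the two parameters carried
# by the query with one parameter stored in the node.
OPCODES = {
    "ALWAYS": lambda a, b, node_param: True,
    "EQUAL": lambda a, b, node_param: (node_param & a) == b,
    "LESS": lambda a, b, node_param: (node_param & a) < b,
}


@dataclass
class Query:
    opcode: str       # which test to run at each node
    param_a: int      # test parameter carried by the query
    param_b: int      # test parameter carried by the query
    on_pass: Action   # action taken when the test passes
    on_fail: Action   # action taken when the test fails


@dataclass
class Node:
    name: str
    param: int                      # test parameter stored in the node
    children: list = field(default_factory=list)


def traverse(node, query, visited=None):
    """Depth-first walk in which the query decides, per node, what happens."""
    if visited is None:
        visited = []
    passed = OPCODES[query.opcode](query.param_a, query.param_b, node.param)
    action = query.on_pass if passed else query.on_fail
    if action is Action.CULL:
        return visited
    visited.append(node.name)
    for child in node.children:
        traverse(child, query, visited)
    return visited
```

Two queries walking the same tree can therefore visit different nodes purely because they carry different opcodes, parameters, or result-to-action mappings, without the tree being rebuilt.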
-
Publication No.: US20210366177A1
Publication Date: 2021-11-25
Application No.: US16880821
Filing Date: 2020-05-21
Applicant: NVIDIA Corporation
Inventor: Timo Tapani Viitanen , Tero Tapani Karras , Samuli Laine
Abstract: In examples, a list of elements may be divided into spans and each span may be allocated a respective memory range for output based on a worst-case compression ratio of a compression algorithm that will be used to compress the span. Worker threads may output compressed versions of the spans to the memory ranges. To ensure placement constraints of a data structure will be satisfied, boundaries of the spans may be adjusted prior to compression. The size allocated to a span (e.g., each span) may be increased (or decreased) to avoid padding blocks while allowing for the span's compressed data to use a block allocated to an adjacent span. Further aspects of the disclosure provide for compaction of the portions of compressed data in memory in order to free up space which may have been allocated to account for the memory gaps which may result from variable compression ratios.
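A minimal sketch of the allocate, compress, and compact steps follows. It assumes `zlib` as a stand-in compressor and a deliberately generous `worst_case_size` bound; both are illustrative choices, not taken from the patent. The span-boundary adjustment and the sharing of blocks between adjacent spans mentioned in the abstract are not modeled.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def worst_case_size(n):
    # Assumed conservative upper bound on zlib output for n input bytes.
    return n + n // 8 + 64


def compress_spans(data, span_len):
    """Compress fixed-length spans into disjoint worst-case-sized ranges."""
    spans = [data[i:i + span_len] for i in range(0, len(data), span_len)]
    # A running sum of worst-case sizes gives each span its own output range,
    # so workers never need to coordinate about where to write.
    offsets, total = [], 0
    for span in spans:
        offsets.append(total)
        total += worst_case_size(len(span))
    out = bytearray(total)
    sizes = [0] * len(spans)

    def work(i):
        packed = zlib.compress(spans[i])
        assert len(packed) <= worst_case_size(len(spans[i]))
        out[offsets[i]:offsets[i] + len(packed)] = packed
        sizes[i] = len(packed)

    with ThreadPoolExecutor() as pool:
        list(pool.map(work, range(len(spans))))
    return out, offsets, sizes


def compact(out, offsets, sizes):
    """Close the gaps left where spans compressed better than the worst case."""
    packed = bytearray()
    new_offsets = []
    for off, size in zip(offsets, sizes):
        new_offsets.append(len(packed))
        packed += out[off:off + size]
    return packed, new_offsets
```

Sizing every range for the worst case is what lets the workers run independently; the compaction pass then reclaims the slack that this pessimistic sizing leaves behind.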
-
Publication No.: US11157414B2
Publication Date: 2021-10-26
Application No.: US16101109
Filing Date: 2018-08-10
Applicant: NVIDIA Corporation
Inventor: Greg Muthler , Timo Aila , Tero Karras , Samuli Laine , William Parsons Newhall, Jr. , Ronald Charles Babich, Jr. , John Burgess , Ignacio Llamas
IPC: G06F12/00 , G06F12/0875 , G06T15/06 , G06F16/901
Abstract: In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
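The grouping idea can be illustrated with a small software model. The function name `serve_grouped`, the `(ray_id, address)` request shape, and the dictionary standing in for memory are all assumptions made for this sketch; the abstract does not describe the hardware data path at this level.

```python
from collections import defaultdict


def serve_grouped(requests, memory):
    """Serve (ray_id, address) requests with one fetch per distinct address."""
    groups = defaultdict(list)
    for ray_id, address in requests:
        groups[address].append(ray_id)

    fetches = 0
    delivered = {}
    for address, ray_ids in groups.items():
        line = memory[address]  # fetched once for the whole group
        fetches += 1
        for ray_id in ray_ids:
            delivered[ray_id] = line  # same data sent to every ray in the group
    return delivered, fetches
```

When many rays ask for the same node, the number of fetches tracks the number of distinct addresses rather than the number of requests.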
-
Publication No.: US11138009B2
Publication Date: 2021-10-05
Application No.: US16101247
Filing Date: 2018-08-10
Applicant: NVIDIA Corporation
Inventor: Ronald Charles Babich, Jr. , John Burgess , Jack Choquette , Tero Karras , Samuli Laine , Ignacio Llamas , Gregory Muthler , William Parsons Newhall, Jr.
Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
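The write, operate, read sequence in this abstract can be modeled as below. `ToyCoprocessor`, its slot layout, and the `"sum"` and `"minmax"` operations are hypothetical stand-ins for illustration; they are not the instruction set the patent describes.

```python
class ToyCoprocessor:
    """Hypothetical stand-in for an acceleration coprocessor."""

    def __init__(self, slots=4):
        self.input_slots = [0] * slots    # coprocessor-accessible inputs
        self.result_slots = [0] * slots   # coprocessor-accessible results

    def write(self, slot, value):
        self.input_slots[slot] = value

    def execute(self, operation):
        # Operates on every input slot, including any left at zero.
        if operation == "sum":
            self.result_slots[0] = sum(self.input_slots)
        elif operation == "minmax":
            self.result_slots[0] = min(self.input_slots)
            self.result_slots[1] = max(self.input_slots)
        else:
            raise ValueError(f"unknown operation: {operation}")

    def read(self, slot):
        return self.result_slots[slot]


def accelerate(coprocessor, operation, inputs, num_results):
    """Multiprocessor side: a series of writes, one operation, a series of reads."""
    for slot, value in enumerate(inputs):
        coprocessor.write(slot, value)
    coprocessor.execute(operation)
    return [coprocessor.read(slot) for slot in range(num_results)]
```

Splitting the exchange into three plain instruction kinds keeps the interface independent of how many inputs or results a given operation needs.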
-
Publication No.: US10825230B2
Publication Date: 2020-11-03
Application No.: US16101148
Filing Date: 2018-08-10
Applicant: NVIDIA Corporation
Inventor: Samuli Laine , Tero Karras , Timo Aila , Robert Ohannessian , William Parsons Newhall, Jr. , Greg Muthler , Ian Kwong , Peter Nelson , John Burgess
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to properly handle numerically challenging computations at or near edges and/or vertices of primitives and/or ensure that a single intersection is reported when a ray intersects a surface formed by primitives at or near edges and/or vertices of the primitives.
-
Publication No.: US12198253B2
Publication Date: 2025-01-14
Application No.: US18239876
Filing Date: 2023-08-30
Applicant: NVIDIA Corporation
Inventor: Samuli Laine , Tero Karras , Greg Muthler , William Parsons Newhall, Jr. , Ronald Charles Babich, Jr. , Ignacio Llamas , John Burgess
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
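One way to obtain an order-independent result while discarding alpha hits behind the nearest opaque hit is sketched below. The `(t, triangle_id, is_alpha)` tuple layout and the tie-breaking by triangle id are assumptions of this sketch; the abstract does not say how the hardware achieves determinism.

```python
def resolve_hits(hits):
    """Reduce hits arriving in any order to one deterministic result.

    Each hit is (t, triangle_id, is_alpha), where t is distance along the ray.
    Returns the closest opaque hit and the sorted alpha hits in front of it.
    """
    closest_opaque = None
    alphas = []
    for t, tri, is_alpha in hits:
        if is_alpha:
            # Keep an alpha hit only if no known opaque hit is at or before it.
            if closest_opaque is None or t < closest_opaque[0]:
                alphas.append((t, tri))
        elif closest_opaque is None or (t, tri) < closest_opaque:
            closest_opaque = (t, tri)
            # A closer opaque hit lets us drop alpha hits at or behind it.
            alphas = [a for a in alphas if a[0] < t]
    alphas.sort()
    return closest_opaque, alphas
```

Alpha hits are culled as soon as a closer opaque hit is known, and the final sort removes any dependence on arrival order.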
-
Publication No.: US11966737B2
Publication Date: 2024-04-23
Application No.: US17465234
Filing Date: 2021-09-02
Applicant: NVIDIA Corporation
Inventor: Ronald Charles Babich, Jr. , John Burgess , Jack Choquette , Tero Karras , Samuli Laine , Ignacio Llamas , Gregory Muthler , William Parsons Newhall, Jr.
CPC classification number: G06F9/3004 , G06F9/3877 , G06F9/4843 , G06F15/163 , G06T1/20 , G06T1/60 , G06T2200/28
Abstract: Systems and methods for an efficient and robust multiprocessor-coprocessor interface that may be used between a streaming multiprocessor and an acceleration coprocessor in a GPU are provided. According to an example implementation, in order to perform an acceleration of a particular operation using the coprocessor, the multiprocessor: issues a series of write instructions to write input data for the operation into coprocessor-accessible storage locations, issues an operation instruction to cause the coprocessor to execute the particular operation; and then issues a series of read instructions to read result data of the operation from coprocessor-accessible storage locations to multiprocessor-accessible storage locations.
-
Publication No.: US11704863B2
Publication Date: 2023-07-18
Application No.: US17716599
Filing Date: 2022-04-08
Applicant: NVIDIA Corporation
Inventor: Samuli Laine , Tero Karras , Timo Aila , Robert Ohannessian , William Parsons Newhall, Jr. , Greg Muthler , Ian Kwong , Peter Nelson , John Burgess
CPC classification number: G06T15/06 , G06T15/005
Abstract: A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to properly handle numerically challenging computations at or near edges and/or vertices of primitives and/or ensure that a single intersection is reported when a ray intersects a surface formed by primitives at or near edges and/or vertices of the primitives.
-
Publication No.: US11675704B2
Publication Date: 2023-06-13
Application No.: US17483133
Filing Date: 2021-09-23
Applicant: NVIDIA Corporation
Inventor: Greg Muthler , Timo Aila , Tero Karras , Samuli Laine , William Parsons Newhall, Jr. , Ronald Charles Babich, Jr. , John Burgess , Ignacio Llamas
IPC: G06F12/00 , G06F12/0875 , G06T15/06 , G06F16/901
CPC classification number: G06F12/0875 , G06F16/9027 , G06T15/06 , G06T2207/20021
Abstract: In a ray tracer, a cache for streaming workloads groups ray requests for coherent successive bounding volume hierarchy traversal operations by sending common data down an attached data path to all ray requests in the group at the same time or about the same time. Grouping the requests provides good performance with a smaller number of cache lines.
-
Publication No.: US11270495B2
Publication Date: 2022-03-08
Application No.: US16880821
Filing Date: 2020-05-21
Applicant: NVIDIA Corporation
Inventor: Timo Tapani Viitanen , Tero Tapani Karras , Samuli Laine
IPC: G06T15/00 , G06T15/06 , G06T1/20 , G06T1/60 , G06T9/00 , H03M7/30 , G06F9/50 , G06F3/06 , G06F9/52
Abstract: In examples, a list of elements may be divided into spans and each span may be allocated a respective memory range for output based on a worst-case compression ratio of a compression algorithm that will be used to compress the span. Worker threads may output compressed versions of the spans to the memory ranges. To ensure placement constraints of a data structure will be satisfied, boundaries of the spans may be adjusted prior to compression. The size allocated to a span (e.g., each span) may be increased (or decreased) to avoid padding blocks while allowing for the span's compressed data to use a block allocated to an adjacent span. Further aspects of the disclosure provide for compaction of the portions of compressed data in memory in order to free up space which may have been allocated to account for the memory gaps which may result from variable compression ratios.
-