-
Publication No.: US11010954B2
Publication date: 2021-05-18
Application No.: US16438446
Filing date: 2019-06-11
Applicant: Samsung Electronics Co., Ltd.
Abstract: A computer-implemented redundant-coverage discard method and apparatus for reducing pixel shader work in a tile-based graphics rendering pipeline are disclosed. A coverage block information (CBI) FIFO buffer is disposed within an early coverage discard (ECD) logic section. The FIFO buffer receives and buffers coverage blocks in FIFO order, so that it holds a moving window of the coverage blocks. A tile coverage-primitive map (TCPM) table stores per-pixel primitive coverage information. At least one coverage block that matches the block position within the TCPM is updated. Incoming primitive information associated with the coverage blocks is compared with the per-pixel primitive coverage information stored in the TCPM table at the corresponding positions, for the live coverages only. Any preceding overlapping coverage within the moving window of the coverage blocks is rejected. An alternate embodiment uses a doubly linked list rather than a FIFO buffer.
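The moving-window idea in this abstract can be illustrated with a small Python sketch. It is not the patented logic: the class name `EarlyCoverageDiscard`, the `window` parameter, the dictionary standing in for the TCPM table, and above all the assumption that a later primitive always occludes an earlier one at the same pixel (no depth test, no blending) are simplifications introduced here for illustration.

```python
from collections import deque


class EarlyCoverageDiscard:
    """Toy moving-window coverage discard (hypothetical, simplified)."""

    def __init__(self, window):
        self.window = window   # max number of coverage blocks held in the FIFO
        self.fifo = deque()    # coverage blocks in arrival order
        self.tcpm = {}         # pixel -> block that currently owns the pixel

    def push(self, prim_id, pixels):
        """Buffer a coverage block; return blocks that fall out of the window."""
        entry = {"prim": prim_id, "live": set(pixels)}
        for p in entry["live"]:
            older = self.tcpm.get(p)
            if older is not None:
                # ASSUMPTION: the newer primitive hides the older one,
                # so the older coverage at this pixel is rejected.
                older["live"].discard(p)
            self.tcpm[p] = entry
        self.fifo.append(entry)

        emitted = []
        while len(self.fifo) > self.window:
            oldest = self.fifo.popleft()
            for p in oldest["live"]:
                del self.tcpm[p]   # a live pixel is always owned by its block
            emitted.append((oldest["prim"], sorted(oldest["live"])))
        return emitted


ecd = EarlyCoverageDiscard(window=2)
out1 = ecd.push(0, [(0, 0), (1, 0)])
out2 = ecd.push(1, [(1, 0)])    # overlaps primitive 0 at pixel (1, 0)
out3 = ecd.push(2, [(5, 5)])    # pushes primitive 0 out of the window
```

In this run, primitive 0 leaves the window with only pixel `(0, 0)` still live, so only that pixel would be handed on for shading. A block whose coverage was entirely rejected is still emitted here with an empty pixel list; a real design would presumably drop it.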
-
Publication No.: US10691455B2
Publication date: 2020-06-23
Application No.: US15684573
Filing date: 2017-08-23
Applicant: Samsung Electronics Co., Ltd.
Inventor: Tejash M. Shah, Srinivasan S. Iyer, David C. Tannenbaum
IPC: G06F9/30, G06F9/38, G06F1/3234, G06F9/32, G06T1/20, G06F30/34, G05B19/4097, G06F30/327, G06F119/06
Abstract: A method and apparatus are provided. The method includes executing a plurality of threads in a temporal dimension, executing a plurality of threads in a spatial dimension, determining a branch target address for each of the plurality of threads in the temporal dimension and the plurality of threads in the spatial dimension, and comparing each of the branch target addresses to determine a minimum branch target address, wherein the minimum branch target address is a minimum value among branch target addresses of each of the plurality of threads.
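The comparison step described in this abstract can be sketched in a few lines of Python. The grid layout (`targets[t][s]` for temporal step `t` and spatial lane `s`), the function name, and the example addresses are assumptions made for illustration. The `active` mask at the end is also an assumption: the abstract says only that the minimum is determined, not what is done with it.

```python
def minimum_branch_target(targets):
    """Return the smallest branch target address over all threads.

    targets[t][s] is the (hypothetical) branch target of the thread at
    temporal step t and spatial lane s.
    """
    return min(addr for row in targets for addr in row)


# Hypothetical example: 2 temporal steps x 4 spatial lanes.
targets = [
    [0x40, 0x58, 0x40, 0x7C],
    [0x58, 0x40, 0x34, 0x58],
]

min_target = minimum_branch_target(targets)

# ASSUMPTION (not stated in the abstract): threads whose target equals the
# minimum are the ones that would execute next.
active = [[addr == min_target for addr in row] for row in targets]
```

With these example addresses the minimum is `0x34`, held by a single thread at temporal step 1, spatial lane 2.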
-
Publication No.: US10282889B2
Publication date: 2019-05-07
Application No.: US15432782
Filing date: 2017-02-14
Applicant: Samsung Electronics Co., Ltd.
Inventor: David C. Tannenbaum, Manshila Adlakha, Vikash Kumar, Abhinav Golas
IPC: G06T15/00
Abstract: One or more embodiments of the present disclosure provide an apparatus used in source data compression, comprising a memory and at least one processor. The memory is configured to store vertex attribute data and a set of instructions. The processor is coupled to the memory. The processor is configured to receive a source data stream that includes one or more values corresponding to the vertex attribute data. The processor is also configured to provide a dictionary for the one or more values in the source data stream, wherein the dictionary includes a plurality of index values corresponding to the one or more values in the source data stream. The processor is also configured to replace at least some of the one or more values in the source data stream with corresponding index values of the plurality of index values.
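Dictionary substitution of this kind can be sketched in Python as follows. This is a generic dictionary coder, not the patented hardware scheme: the function names and the sample stream are hypothetical, and the sketch replaces every value with an index, whereas the abstract only requires that at least some values be replaced.

```python
def build_dictionary(values):
    """Assign each distinct value an index, in order of first appearance."""
    dictionary = {}
    for v in values:
        if v not in dictionary:
            dictionary[v] = len(dictionary)
    return dictionary


def encode(values, dictionary):
    """Replace each value with its dictionary index."""
    return [dictionary[v] for v in values]


def decode(indices, dictionary):
    """Recover the values; dict insertion order gives index -> value."""
    table = list(dictionary)
    return [table[i] for i in indices]


# Hypothetical vertex attribute stream with many repeated values.
stream = [1.0, 0.0, 1.0, 0.5, 0.0, 1.0]
dictionary = build_dictionary(stream)
indices = encode(stream, dictionary)
restored = decode(indices, dictionary)
```

The saving comes from repeated values: six floats collapse to three dictionary entries plus six small indices, and `restored` matches the original stream.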
-
Publication No.: US20180341489A1
Publication date: 2018-11-29
Application No.: US15684573
Filing date: 2017-08-23
Applicant: Samsung Electronics Co., Ltd.
Inventor: Tejash M. Shah, Srinivasan S. Iyer, David C. Tannenbaum
Abstract: A method and apparatus are provided. The method includes executing a plurality of threads in a temporal dimension, executing a plurality of threads in a spatial dimension, determining a branch target address for each of the plurality of threads in the temporal dimension and the plurality of threads in the spatial dimension, and comparing each of the branch target addresses to determine a minimum branch target address, wherein the minimum branch target address is a minimum value among branch target addresses of each of the plurality of threads.
-
Publication No.: US11971949B2
Publication date: 2024-04-30
Application No.: US17173203
Filing date: 2021-02-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Christopher P. Frascati, Simon Waters, Rama S. B Harihara, David C. Tannenbaum
CPC classification number: G06F17/16, G06F9/30101, G06F17/15, G06N3/08
Abstract: A graphics processing unit (GPU) and a method are disclosed that perform a convolution operation recast as a matrix multiplication operation. The GPU includes a register file, a processor and a state machine. The register file stores data of an input feature map and data of a filter weight kernel. The processor performs a convolution operation on data of the input feature map and data of the filter weight kernel as a matrix multiplication operation. The state machine facilitates performance of the convolution operation by unrolling the data of the input feature map and the data of the filter weight kernel in the register file. The state machine includes control registers that determine movement of data through the register file to perform the matrix multiplication operation on the data in the register file in an unrolled manner.
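The mathematical recasting that this abstract relies on, convolution expressed as a matrix product over unrolled data, can be shown with a small pure-Python sketch. It covers only the arithmetic, using the widely known im2col-style unrolling; the register file, state machine, and control registers of the abstract are not modeled. The function names and sample data are hypothetical, and the sketch assumes a single-channel 2-D input, stride 1, no padding, and no kernel flip (the cross-correlation convention common in neural networks).

```python
def im2col(fmap, kh, kw):
    """Unroll every kh x kw window of a 2-D feature map into one row."""
    h, w = len(fmap), len(fmap[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([fmap[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_as_matmul(fmap, kernel):
    """Convolve by multiplying the unrolled patches with the flat kernel."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [x for row in kernel for x in row]
    patches = im2col(fmap, kh, kw)          # (num_windows) x (kh * kw)
    out_w = len(fmap[0]) - kw + 1
    # Matrix-vector product: one dot product per unrolled window.
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel))
                for patch in patches]
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


# Hypothetical 3 x 3 input feature map and 2 x 2 filter weight kernel.
fmap = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_as_matmul(fmap, kernel)
```

Each output element is the top-left value of a window minus its bottom-right value, which is -4 for every window of this input. With several filters, `flat_kernel` becomes a matrix with one column per filter and the same unrolled patches are reused.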
-
Publication No.: US11798218B2
Publication date: 2023-10-24
Application No.: US17503259
Filing date: 2021-10-15
Applicant: Samsung Electronics Co., Ltd.
Inventor: Keshavan Varadarajan, Veynu Narasiman, David C. Tannenbaum
IPC: G06T15/00
CPC classification number: G06T15/005, G06T2210/21
Abstract: A method of packing coverage in a graphics processing unit (GPU) may include receiving an indication for a portion of an image, determining, based on the indication, a packing technique for the portion of the image, and packing coverage for the portion of the image based on the packing technique. The indication may include one or more of: an importance, a quality, a level of interest, a level of detail, or a variable-rate shading (VRS) level. The indication may be received from an application. The packing technique may include array merging. The array merging may include quad merging. The packing technique may include pixel piling. The packing technique may be a first packing technique, and the method may further include determining, based on the indication, a second packing technique for the portion of the image, and packing coverage for the portion of the image based on the second packing technique.
-
Publication No.: US11763521B2
Publication date: 2023-09-19
Application No.: US17495804
Filing date: 2021-10-06
Applicant: Samsung Electronics Co., Ltd.
Inventor: Gabriel T. Dagani, Gregory Bergschneider, David C. Tannenbaum
CPC classification number: G06T15/80, G06T1/20, G06T15/005, G06T15/503
Abstract: A system and a method are disclosed for varying a pixel-rate functionality of a GPU as an optional feature without an explicit implementation from within an application. User interface (UI) content may be detected in a draw call of an application and a variable-rate shader lookup map may be generated based on the detected UI content. A pixel rate of 3D content may be increased using the variable-rate shader lookup map. Additionally or alternatively, other conditions may be detected for increasing the pixel rate, such as using information in an application profile, detecting high or low luminance values, detecting motion and/or detecting temporal anti-aliasing.
-
Publication No.: US11715252B2
Publication date: 2023-08-01
Application No.: US17168168
Filing date: 2021-02-04
Applicant: Samsung Electronics Co., Ltd.
Inventor: Keshavan Varadarajan, David C. Tannenbaum, F N U Gurupad
CPC classification number: G06T15/005, G06T1/20, G06T15/80
Abstract: A GPU includes shader cores and a shader warp packer unit. The shader warp packer unit may receive a first primitive associated with a first partially covered quad, and a second primitive associated with a second partially covered quad. The shader warp packer unit may determine that the first partially covered quad and the second partially covered quad have non-overlapping coverage. The shader warp packer unit may pack the first partially covered quad and the second partially covered quad into a packed quad. The shader warp packer unit may send the packed quad to the shader cores. The first partially covered quad and the second partially covered quad may be spatially disjoint from each other. The shader cores may receive and process the packed quad with no loss of information relative to the shader cores individually processing the first partially covered quad and the second partially covered quad.
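The non-overlap test and the packing step in this abstract can be sketched in Python with 4-bit coverage masks, one bit per pixel lane of a 2 x 2 quad. The mask encoding, the function names, and the per-lane owner list are assumptions made for illustration. The sketch also ignores the case where the two quads sit at different screen positions, which a real packer would have to track per lane.

```python
def can_pack(mask_a, mask_b):
    """Two quads can share lanes only if no lane is covered by both."""
    return (mask_a & mask_b) == 0


def pack_quads(prim_a, mask_a, prim_b, mask_b):
    """Pack two partially covered quads into one.

    Returns (packed_mask, owners), where owners[lane] names the primitive
    that covers that lane (or None), or returns None if coverage overlaps.
    """
    if not can_pack(mask_a, mask_b):
        return None
    owners = []
    for lane in range(4):
        bit = 1 << lane
        if mask_a & bit:
            owners.append(prim_a)
        elif mask_b & bit:
            owners.append(prim_b)
        else:
            owners.append(None)
    return mask_a | mask_b, owners


# Hypothetical primitives: P0 covers lanes 0 and 1, P1 covers lane 2.
packed = pack_quads("P0", 0b0011, "P1", 0b0100)

# Overlapping coverage (both cover lane 1) cannot be packed.
rejected = pack_quads("P0", 0b0011, "P1", 0b0010)
```

Here `packed` carries three active lanes in one quad instead of two quads with five idle lanes between them, and keeping the per-lane owner is what lets the results be attributed back to the right primitive.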
-
Publication No.: US11416960B2
Publication date: 2022-08-16
Application No.: US17110284
Filing date: 2020-12-02
Applicant: Samsung Electronics Co., Ltd.
Inventor: David C. Tannenbaum, Keshavan Varadarajan, Veynu Narasiman
Abstract: A binning subsystem of a GPU includes a storage subsystem, a shader core to output first data via a first path, a selector to receive the first data via the first path, and to receive second data from the storage subsystem via a second path. The storage subsystem includes a binner unit and a control logic unit. The control logic unit causes the selector to transfer the first data or the second data to the binner unit. The binner unit may transfer binner output data to the shader core via a third path. The binner unit may transfer the binner output data to one or more subsequent stages of a graphics pipeline via a fourth path. The binner unit may transfer the binner output data to the storage subsystem via a fifth path. The control logic unit may control the binner unit such that the binner unit can be used for general purpose computation.
-
Publication No.: US20190384600A1
Publication date: 2019-12-19
Application No.: US16127104
Filing date: 2018-09-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Mitchell K. Alsup, David C. Tannenbaum, Derek Lentz, Srinivasan S. Iyer, Christopher J. Goodman
Abstract: A system and method for binding instructions to a graphical processing unit (GPU) include a GPU configured to receive a bindlessly compiled instruction and interpret the bindlessly compiled instruction at runtime to identify a needed conversion. The GPU generates conversion information based on the bindlessly compiled instruction and the needed conversion, and converts the bindlessly compiled instruction according to the conversion information to generate a bound format instruction. The GPU may then execute the bound format instruction.