摘要:
One embodiment sets forth a technique for ensuring relaxed coherency between different caches. Two different execution units may be configured to access different caches that may store one or more cache lines corresponding to the same memory address. During time periods between memory barrier instructions relaxed coherency is maintained between the different caches. More specifically, writes to a cache line in a first cache that corresponds to a particular memory address are not necessarily propagated to a cache line in a second cache before the second cache receives a read or write request that also corresponds to the particular memory address. Therefore, the first cache and the second are not necessarily coherent during time periods of relaxed coherency. Execution of a memory barrier instruction ensures that the different caches will be coherent before a new period of relaxed coherency begins.
摘要:
In a graphics pipeline, a rasterizer circuit generates fragments for an image having multiple surfaces that have been tessellated into primitive objects, such as triangles. First and second fragments are associated with the same pixel. A merge buffer merges the first fragment with the second fragment when the two fragments belong to the same tessellated surface, the first fragment's primitive is adjacent to the second fragment's primitive, both fragments face either toward or away from the viewer, and the first and second fragment are sufficiently similar that merging is unlikely to introduce visually objectionable artifacts. A frame buffer receives fragments from the merge buffer, stores the fragments, combines the fragments into pixels, and outputs the pixels to a display.
摘要:
The invention includes an apparatus and method for buffering data transmitted by a processor and received by an I/O device via a memory and buses. The memory arranged at a plurality of levels includes a lower level of the memory operating faster than a higher level of the memory. A plurality of ring buffers are allocated at different levels of the memory and available buffers at a lowest possible level of the memory are preferentially selected as write buffers to store data transmitted by the processor. The apparatus includes a first level of the memory arranged on an integrated circuit with the processor, a second level of the memory arranged in an off-chip cache, and a third level of the memory arranged in a dynamic random access memory. Read buffers are selected to store data to be received by the I/O device. Stored control values indicate the order for selecting the read buffers and are used by the processor to select the write buffer. Control values are stored in sets of registers located in the I/O device and in a software-based set of registers located in the dynamic random access memory. A selection register located in the I/O device indicates the selected read buffer.
摘要:
One embodiment sets forth a technique for ensuring relaxed coherency between different caches. Two different execution units may be configured to access different caches that may store one or more cache lines corresponding to the same memory address. During time periods between memory barrier instructions relaxed coherency is maintained between the different caches. More specifically, writes to a cache line in a first cache that corresponds to a particular memory address are not necessarily propagated to a cache line in a second cache before the second cache receives a read or write request that also corresponds to the particular memory address. Therefore, the first cache and the second are not necessarily coherent during time periods of relaxed coherency. Execution of a memory barrier instruction ensures that the different caches will be coherent before a new period of relaxed coherency begins.
摘要:
A method and apparatus for arranging fragments in a graphics memory. Each pixel of a display has a corresponding list of fragments in the graphics memory. Each fragment describes a three-dimensional surface at a plurality of sample points associated with the pixel. A predetermined number of fragments are statically allocated to each pixel. Additional space for fragment data is dynamically allocated and deallocated. Each dynamically allocated unit of memory contains fragment data for a plurality of pixels. Fragment data are arranged to exploit modem DRAM capabilities by increasing locality of reference within a single DRAM page, by putting other fragments likely to be referenced soon in pages that belong to non-conflicting banks, and by maintaining bookkeeping structures that allow the relevant DRAM precharge and row activate operations to be scheduled far in advance of access to fragment data.
摘要:
A method and apparatus for visiting all productive stamp positions for a two-dimensional convex polygonal object. The object is visited with a stamp that has a stamp rectangle, and one or more discrete sample points. A productive location is one for which the object contains at least one of the stamp's sample points when the stamp is placed at that location. An unproductive location is one for which the object contains none of the stamp's sample points when the stamp is placed at that location. Stamp locations are discrete points that are separated vertically by the stamp rectangle's height, and horizontally by the stamp rectangle's width. The stamp may move to a nearby position, or to a previously saved position, as it traverses the object. The stamp moves in such a way as to visit all productive locations for an object while avoiding most of the unproductive locations.
摘要:
A method and apparatus for visiting all stamps that are relevant to a two-dimensional convex polygonal object. The object is visited with a rectangular stamp, which contains one or more discrete sample points. A relevant location is one for which the object contains at least one of the stamp's sample points when the stamp is placed at that location. Stamp locations are discrete points that are separated vertically by the stamp's height, and horizontally by the stamp's width. The stamp may move to a nearby position, or to a previously saved position, as it traverses the object. The plane in which the object lies is partitioned into rectangular tiles, which are at least as wide and high as the stamp. The invention visits stamp locations in an order that respects tile boundaries—that is, it visits all locations within one tile before visiting any locations within another tile. The invention may also be used with further partitioning of the plane (metatiles), so that it will visit all locations within a metatile before visiting any locations within another metatile, and further visit all locations within a portion of a tile within the current metatile before visiting any locations within a portion of a different tile within the current metatile.