摘要:
A method for managing hardware resources and threads within a data processing system is disclosed. Compilation attributes of a function are collected during and after the compilation of the function. The pre-processing attributes of the function are also collected before the execution of the function. The collected attributes of the function are then analyzed, and a runtime configuration is assigned to the function based of the result of the attribute analysis. The runtime configuration may include, for example, the designation of the function to be executed under either a single-threaded mode or a simultaneous multi-threaded mode. During the execution of the function, real-time attributes of the function are being continuously collected. If necessary, the runtime configuration under which the function is being executed can be changed based on the real-time attributes collected during the execution of the function.
摘要:
A method efficiently dispatches/completes a work element within a multi-node, data processing system that has a global command queue (GCQ) and at least one high latency node. The method comprises: at the high latency processor node, work scheduling logic establishing a local command/work queue (LCQ) in which multiple work items for execution by local processing units can be staged prior to execution; a first local processing unit retrieving via a work request a larger chunk size of work than can be completed in a normal work completion/execution cycle by the local processing unit; storing the larger chunk size of work retrieved in a local command/work queue (LCQ); enabling the first local processing unit to locally schedule and complete portions of the work stored within the LCQ; and transmitting a next work request to the GCQ only when all the work within the LCQ has been dispatched by the local processing units.
摘要:
A method comprises receiving scene model data including a scene geometry model and a plurality of pixel data describing objects arranged in a scene. The method generates a primary ray based on a selected first pixel data. In the event the primary ray intersects an object in the scene, the method determines primary hit color data and generates a plurality of secondary rays. The method groups the secondary packets and arranges the packets in a queue based on the octant of each direction vector in the secondary ray packet. The method generates secondary color data based on the secondary ray packets in the queue and generates a pixel color based on the primary hit color data, and the secondary color data. The method generates an image based on the pixel color for the pixel data.
摘要:
A cache manager receives a request for data, which includes a requested effective address. The cache manager determines whether the requested effective address matches a most recently used effective address stored in a mapped tag vector. When the most recently used effective address matches the requested effective address, the cache manager identifies a corresponding cache location and retrieves the data from the identified cache location. However, when the most recently used effective address fails to match the requested effective address, the cache manager determines whether the requested effective address matches a subsequent effective address stored in the mapped tag vector. When the cache manager determines a match to a subsequent effective address, the cache manager identifies a different cache location corresponding to the subsequent effective address and retrieves the data from the different cache location.
摘要:
A system and method for cache optimized data formatting is presented. A processor generates images by calculating a plurality of image point values using height data, color data, and normal data. Normal data is computed for a particular image point using pixel data adjacent to the image point. The computed normalized data, along with corresponding height data and color data, are included in a limited space data stream and sent to a processor to generate an image. The normalized data may be computed using adjacent pixel data at any time prior to inserting the normalized data in the limited space data stream.
摘要:
An approach to hiding memory latency in a multi-thread environment is presented. Branch Indirect and Set Link (BISL) and/or Branch Indirect and Set Link if External Data (BISLED) instructions are placed in thread code during compilation at instances that correspond to a prolonged instruction. A prolonged instruction is an instruction that instigates latency in a computer system, such as a DMA instruction. When a first thread encounters a BISL or a BISLED instruction, the first thread passes control to a second thread while the first thread's prolonged instruction executes. In turn, the computer system masks the latency of the first thread's prolonged instruction. The system can be optimized based on the memory latency by creating more threads and further dividing a register pool amongst the threads to further hide memory latency in operations that are highly memory bound.
摘要:
Adaptive span computation when ray casting is presented. A processor uses start point fractional values during view screen segment computations that start a view screen segment's computations a particular distance away from a down point. This prevents an excessive sampling density during image generation without wasting processor resources. The processor identifies a start point fractional value for each view screen segment based upon each view screen segment's identifier, and computes a view screen segment start point for each view screen segment using the start point fractional value. View screen segment start points are “tiered” and are a particular distance away from the down point. This stops the view screen segments from converging to a point of severe over sampling while, at the same time, providing a pseudo-uniform sampling density.
摘要:
An image that includes ray traced pixel data and rasterized pixel data is generated. A synergistic processing unit (SPU) uses a rendering algorithm to generate ray traced data for objects that require high-quality image rendering. The ray traced data is fragmented, whereby each fragment includes a ray traced pixel depth value and a ray traced pixel color value. A rasterizer compares ray traced pixel depth values to corresponding rasterized pixel depth values, and overwrites ray traced pixel data with rasterized pixel data when the corresponding rasterized fragment is “closer” to a viewing point, which results in composite data. A display subsystem uses the resultant composite data to generate an image on a user's display.
摘要:
An approach is provided to allow virtual devices that use a plurality of processors in a multiprocessor systems, such as the BE environment. Using this method, a synergistic processing unit (SPU) can either be dedicated to performing a particular function (i.e., audio, video, etc.) or a single SPU can be programmed to perform several functions on behalf of the other processors in the system. The application, preferably running in one of the primary (PU) processors, issues IOCTL commands through device drivers that correspond to SPUs. The kernel managing the primary processors responds by sending an appropriate message to the SPU that is performing the dedicated function. Using this method, an SPU can be virtualized for swapping multiple tasks or dedicated to performing a particular task.
摘要:
A system and method for terrain rendering using a limited memory footprint is presented. A system and method to perform vertical ray terrain rendering by using a terrain data subset for image point value calculations. Terrain data is segmented into terrain data subsets whereby the terrain data subsets are processed in parallel. A bottom view ray intersects the terrain data to provide a memory footprint starting point. In addition, environmental visibility settings provide a memory footprint ending point. The memory footprint starting point, the memory footprint ending point, and vertical ray adjacent data points define a terrain data subset that corresponds to a particular vertical ray. The terrain data subset includes height and color information which are used for vertical ray coherence terrain rendering.