Abstract:
A method and apparatus for predicting load addresses and identifying new threads of instructions for execution in a multithreaded processor. A load prediction unit scans an instruction window for load instructions. A load prediction table is searched for an entry corresponding to a detected load instruction. If an entry is found in the table, a load address prediction is made for the load instruction and conveyed to the data cache. If the load address misses in the cache, the data is prefetched. Subsequently, if it is determined that the load prediction was incorrect, a miss counter in the corresponding entry in the load prediction table is incremented. If, on a subsequent detection of the load instruction, the miss counter has reached a threshold, the load instruction is predicted to miss. In response to the predicted miss, a new thread of instructions is identified for execution.
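The table behavior described in this abstract can be sketched in software as follows. This is only an illustrative model, not the patented hardware: the names `LoadPredictionTable`, `Entry`, `lookup`, and `resolve` are hypothetical, the threshold value is an assumption, and the "predict the last address seen" policy is assumed because the abstract does not say how the predicted address is formed.

```python
from dataclasses import dataclass

MISS_THRESHOLD = 2  # assumed value; the abstract does not give a threshold


@dataclass
class Entry:
    predicted_address: int
    miss_counter: int = 0


class LoadPredictionTable:
    """Hypothetical software model of the load prediction table."""

    def __init__(self, threshold=MISS_THRESHOLD):
        self.threshold = threshold
        self.entries = {}  # keyed by the load instruction's program counter

    def lookup(self, pc):
        """Return (predicted_address, predicted_to_miss), or None if no entry."""
        entry = self.entries.get(pc)
        if entry is None:
            return None
        # A counter at or above the threshold means the load is predicted to
        # miss, which is the cue to identify a new thread for execution.
        return entry.predicted_address, entry.miss_counter >= self.threshold

    def resolve(self, pc, actual_address):
        """Update the table once the real load address is known."""
        entry = self.entries.get(pc)
        if entry is None:
            self.entries[pc] = Entry(predicted_address=actual_address)
            return
        if entry.predicted_address != actual_address:
            entry.miss_counter += 1  # the prediction was incorrect
        entry.predicted_address = actual_address  # assumed last-address policy
```

In this sketch a caller would call `lookup` when the load is detected in the instruction window and `resolve` once the load's actual address is available.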
Abstract:
A computer system employs a hierarchical ring structure for communication. Computer system elements are configured into modules with ring interface hardware, and the modules are coupled to one or more rings. Bridge modules may be included for transmitting between rings in the hierarchy. The rings are time division multiplexed, and each time slot on a ring carries a frame. According to an address carried within the frame, bridge modules determine whether or not to transmit a frame circulating on a source ring onto a target ring. If the address of the frame indicates a module upon the source ring, the bridge module retransmits the frame on the source ring. Otherwise, the bridge module transmits the frame on the target ring. The bridge module operates in this fashion at any level of the hierarchy. The owner of a time slot on a ring is permitted to release the time slot for use by other modules. To reclaim a time slot, the owner marks the time slot owned. The module using the time slot, upon detecting the owned mark, removes the frame from the time slot and responds with a null frame. If a module detects a frame to which that module is to respond but the module's buffer is full, the module may retransmit the frame upon the source ring. The time slot carrying the frame effectively serves as a queue position. According to one embodiment, rings comprise optical links.
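The bridge module's forwarding rule stated above reduces to a single membership test. The sketch below assumes each ring can be described by the set of module addresses attached to it; the function name `route_frame` and the string return values are hypothetical and chosen only for illustration.

```python
def route_frame(dest_address, source_ring_addresses):
    """Decide which ring a bridge module places a circulating frame on.

    dest_address: address carried within the frame.
    source_ring_addresses: set of module addresses on the source ring
    (an assumed representation; the abstract does not define one).
    """
    if dest_address in source_ring_addresses:
        # The destination is on the source ring: retransmit there.
        return "source"
    # Otherwise hand the frame over to the target ring.
    return "target"


# Example: a source ring holding modules 1, 2, and 3.
local_modules = {1, 2, 3}
decisions = [route_frame(addr, local_modules) for addr in (2, 7)]
```

Because the rule depends only on the frame's address and the local ring's membership, the same test could be applied by a bridge at any level of the hierarchy.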
Abstract:
A system and method are disclosed for transferring data over a dedicated memory transfer bus between high-speed and low-speed memories of a computer system that share a single real memory address space. The dedicated memory transfer bus operates independently from the system bus to avoid any adverse effects on bandwidth and latency of the system bus and to allow virtually any memory hierarchy to be selected. The transfer is controlled by the operating system software upon the execution of instructions issued by the memory management unit. Status information, such as an "invalid" state, is used to direct the transfer.
Abstract:
A processor that includes hardware resources for the operating system that are separate and independent from resources dedicated to user programs. The OS resources preferably include a separate OS arithmetic logic unit (OS/ALU) along with a dedicated instruction buffer, instruction cache and data cache. The OS/ALU is preferably able to control the registers and program address of user processes, and can read a program request register from the user program.
Abstract:
Logic circuitry and a corresponding method for computing an indexed set address utilized by a cache to mitigate the probability of a conflict miss occurring for a given memory access. Implemented at component or system level, the logic circuitry performs pseudo-random indexing of a set address obtained from a memory address during a memory access by a processor unit. This is accomplished by performing operations consistent with modulo operations on the memory address.
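One way to picture how a modulo-based index can reduce conflict misses is to compare a conventional power-of-two set index with an index taken modulo a prime. The abstract does not specify the exact operation, so the prime modulus here is an assumption, as are the cache geometry (64-byte lines, 64 sets) and the function names.

```python
LINE_BYTES = 64   # assumed cache line size
NUM_SETS = 64     # assumed number of sets (power of two)
PRIME_SETS = 61   # largest prime not exceeding NUM_SETS


def conventional_index(addr):
    """Set index taken directly from the low-order block-address bits."""
    return (addr // LINE_BYTES) % NUM_SETS


def prime_modulo_index(addr):
    """Assumed pseudo-random variant: block address modulo a prime."""
    return (addr // LINE_BYTES) % PRIME_SETS


# Four addresses spaced 4096 bytes apart collide under conventional indexing
# but spread across different sets under the prime modulus.
addresses = [i * 4096 for i in range(4)]
conventional = [conventional_index(a) for a in addresses]  # [0, 0, 0, 0]
scattered = [prime_modulo_index(a) for a in addresses]     # [0, 3, 6, 9]
```

The trade-off in this sketch is that three of the 64 sets go unused in exchange for breaking up the regular stride pattern.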
Abstract:
A computer system comprises a plurality of modules and a shift register having a plurality of slots connected in series, wherein each of the plurality of slots is coupled to one of the plurality of modules. In one embodiment, an output of a last slot of the plurality of slots is coupled to an input of an initial slot of the plurality of slots to form a ring. Each slot of the shift register corresponds to a time slot on the ring, and each of the time slots is assigned to one of the modules. At least two of the modules are configured to independently generate frames for transmission on the ring. In another embodiment, at least one of the modules comprises a bridge module coupled to communicate with other bridge modules separate from the plurality of modules.
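The ring formed by feeding the last slot's output back into the first slot can be modeled with a rotating buffer. This is a minimal sketch: the class name `SlotRing` and its methods are hypothetical, and one call to `clock` stands in for one shift of the register.

```python
from collections import deque


class SlotRing:
    """Hypothetical model of a shift register whose slots form a ring."""

    def __init__(self, num_slots):
        # Each position is the slot coupled to one module; None means empty.
        self.slots = deque([None] * num_slots)

    def write(self, position, frame):
        """The module at `position` places a frame into its slot."""
        self.slots[position] = frame

    def read(self, position):
        """The module at `position` inspects the frame currently in its slot."""
        return self.slots[position]

    def clock(self):
        """Shift every frame one slot onward; the last slot feeds the first."""
        self.slots.rotate(1)


ring = SlotRing(4)
ring.write(0, "frame-A")
ring.clock()
seen_by_module_1 = ring.read(1)  # "frame-A" after one shift
```

After as many shifts as there are slots, a frame that no module has removed returns to the slot where it was written.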
Abstract:
A method and apparatus for switching between threads of a program in response to a long-latency event. In one embodiment, the long-latency events are load or store operations that trigger a thread switch if there is a miss in the level 2 cache. In addition to separate groups of registers for multiple threads, a group of program address registers pointing to different threads is provided. A switching mechanism switches between the program address registers in response to the long-latency events.
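The switching mechanism can be illustrated with one program address register per thread and an index selecting the active one. The sketch below is an assumption-laden model: the class `ThreadSwitcher`, the round-robin choice of the next thread, and the 4-byte instruction step are all hypothetical, since the abstract does not state a selection policy or instruction width.

```python
class ThreadSwitcher:
    """Hypothetical model: one program address register per thread."""

    def __init__(self, program_addresses):
        self.program_address_regs = list(program_addresses)
        self.active = 0  # index of the currently running thread

    def advance(self, step=4):
        """Advance the active thread's program address (assumed 4-byte step)."""
        self.program_address_regs[self.active] += step

    def on_memory_access(self, l2_hit):
        """Switch threads on a level 2 cache miss; return the active thread."""
        if not l2_hit:
            # Assumed round-robin policy: move to the next thread's register.
            self.active = (self.active + 1) % len(self.program_address_regs)
        return self.active


switcher = ThreadSwitcher([0x1000, 0x2000])
switcher.advance()                       # thread 0 is now at 0x1004
switcher.on_memory_access(l2_hit=True)   # hit: thread 0 keeps running
switcher.on_memory_access(l2_hit=False)  # miss: switch to thread 1
```

Because each thread keeps its own program address register, thread 0 resumes at 0x1004 when control later switches back to it.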
Abstract:
One embodiment of the present invention provides an inductor with a variable inductance within a semiconductor chip. This inductor includes a primary spiral composed of a conductive material embedded within the semiconductor chip to provide a source of variable inductance. It also includes a control spiral composed of the conductive material vertically displaced from the primary spiral in neighboring layers of the semiconductor chip. This control spiral is magnetically coupled with the primary spiral so that changing a control current through the control spiral induces a change in inductance through the primary spiral. The inductor also includes a controllable current source coupled to the control spiral that is configured to provide the control current. One embodiment of the present invention includes a core surrounding the primary spiral and the control spiral in the semiconductor chip. This core is composed of a core material with a magnetic permeability that facilitates magnetically coupling the control spiral with the primary spiral. In a variation on this embodiment, the core material includes a high frequency ferrite that operates at a frequency above one gigahertz without resistive eddy losses that substantially prevent a magnetic coupling between the control spiral and the primary spiral. In a variation on this embodiment, the high frequency ferrite can include NiZn.