-
Publication No.: US11972230B2
Publication Date: 2024-04-30
Application No.: US16914318
Filing Date: 2020-06-27
Applicant: Intel Corporation
Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas, Binh Pham
CPC Classification: G06F7/78, G06F9/3001, G06F9/3016, G06F17/16
Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.
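The operation described in the abstract above (transpose the first source, multiply by the second source, store the result) can be illustrated in software. The following is a minimal sketch only: the function names and the list-of-rows matrix representation are assumptions made for illustration, and nothing here reflects the instruction encoding or the hardware described in the patent.

```python
def transpose(matrix):
    """Return the transpose of a non-empty matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def transpose_and_multiply(src1, src2):
    """Compute transpose(src1) x src2 (hypothetical helper, software model only).

    src1 is K x M and src2 is K x N, so the result is M x N.
    """
    if not src1 or not src2:
        raise ValueError("source matrices must be non-empty")
    if len(src1) != len(src2):
        raise ValueError("src1 and src2 must have the same number of rows")
    src1_t = transpose(src1)  # M x K
    inner = len(src2)         # K
    cols = len(src2[0])       # N
    return [
        [sum(row[k] * src2[k][j] for k in range(inner)) for j in range(cols)]
        for row in src1_t
    ]


# Example: a 3 x 2 first source and a 3 x 2 second source give a 2 x 2 result.
dest = transpose_and_multiply([[1, 2], [3, 4], [5, 6]], [[1, 0], [0, 1], [1, 1]])
```

The sketch writes the product to a fresh destination, matching the abstract's "store the result"; whether an embodiment also accumulates into the destination is not stated in the abstract.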
-
Publication No.: US11055232B2
Publication Date: 2021-07-06
Application No.: US16370848
Filing Date: 2019-03-29
Applicant: Intel Corporation
Inventors: David Pardo Keppel, Binh Pham
IPC Classification: G06F12/1036, G06F12/1045, G06F12/1009
Abstract: A processor includes a translation lookaside buffer (TLB) to store a TLB entry, wherein the TLB entry comprises a first set of valid bits to identify if the first TLB entry corresponds to a virtual address from a memory access request, wherein the valid bits are set based on a first page size associated with the TLB entry from a first set of different page sizes assigned to a first probe group; and a control circuit to probe the TLB for each page size of the first set of different page sizes assigned to the first probe group in a single probe cycle to determine if the TLB entry corresponds to the virtual address from the memory access request.
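The idea in the abstract above, per-entry valid bits keyed by page size and a probe that covers every page size of a probe group at once, can be modelled in software. This is a minimal sketch under stated assumptions: the probe group contents (4 KiB and 2 MiB), the `TlbEntry` layout, and the function names are all hypothetical, and the inner loop stands in for comparisons that hardware would perform in a single probe cycle.

```python
from dataclasses import dataclass

# Assumed probe group: page sizes probed together (hypothetical choice).
PROBE_GROUP = (4 * 1024, 2 * 1024 * 1024)


@dataclass
class TlbEntry:
    virtual_base: int   # page-aligned virtual base address
    physical_base: int  # page-aligned physical base address
    valid: dict         # page size -> bool; only the entry's own size is set


def make_entry(virtual_base, physical_base, page_size):
    """Build an entry whose valid bits are set from its page size."""
    if page_size not in PROBE_GROUP:
        raise ValueError("page size is not in the probe group")
    if virtual_base % page_size or physical_base % page_size:
        raise ValueError("bases must be aligned to the page size")
    valid = {size: size == page_size for size in PROBE_GROUP}
    return TlbEntry(virtual_base, physical_base, valid)


def probe(entries, vaddr):
    """Return the physical address for vaddr, or None on a miss."""
    for entry in entries:
        # Software loop; models one comparison per page size done in parallel.
        for size in PROBE_GROUP:
            if entry.valid[size] and (vaddr & ~(size - 1)) == entry.virtual_base:
                return entry.physical_base | (vaddr & (size - 1))
    return None


tlb = [
    make_entry(0x200000, 0x40000000, 2 * 1024 * 1024),
    make_entry(0x1000, 0x9000, 4 * 1024),
]
```

The valid bits prevent a 2 MiB entry from matching when the address is masked at 4 KiB granularity, which is what lets one pass cover mixed page sizes in this model.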
-
Publication No.: US10754782B1
Publication Date: 2020-08-25
Application No.: US16370893
Filing Date: 2019-03-30
Applicant: Intel Corporation
IPC Classification: G06F12/00, G06F12/0875, G06F12/0831, G06F9/54, G06F9/30, G06F16/901, G06F12/12
Abstract: Systems, methods, and apparatuses relating to circuitry to accelerate store processing are described. In one embodiment, a processor includes a (e.g., L1) cache, a fill buffer, a store buffer, and a cache controller to allocate a first entry of a plurality of entries in the fill buffer to store a first storage request when the first storage request misses in the cache, send a first request for ownership to another cache corresponding to the first storage request, detect a hit in the cache for a second storage request, update a globally observable buffer to indicate the first entry in the fill buffer for the first storage request is earlier in program order than the second storage request in the store buffer, allocate, before the second storage request is removed from the store buffer, a second entry of the plurality of entries in the fill buffer to store the third storage request when the third storage request misses in the cache, send a second request for ownership to another cache corresponding to the third storage request, and update the globally observable buffer to indicate the second entry in the fill buffer for the third storage request is later in program order than the second storage request in the store buffer.
-
Publication No.: US20240329938A1
Publication Date: 2024-10-03
Application No.: US18607024
Filing Date: 2024-03-15
Applicant: Intel Corporation
Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas, Binh Pham
CPC Classification: G06F7/78, G06F9/3001, G06F9/3016, G06F17/16
Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.
-
Publication No.: US11989135B2
Publication Date: 2024-05-21
Application No.: US16786815
Filing Date: 2020-02-10
Applicant: Intel Corporation
Inventors: Farah E. Fargo, Mitchell Diamond, David Keppel, Samantika S. Sury, Binh Pham, Shobha Vissapragada
IPC Classification: G06F12/10, G06F12/1027
CPC Classification: G06F12/1027, G06F2212/657
Abstract: Examples described herein relate to a computing system supporting custom page sized ranges for an application to map contiguous memory regions instead of many smaller sized pages. An application can request a custom range size. An operating system can allocate a contiguous physical memory region to a virtual address range by specifying custom range sizes that are larger or smaller than the normal general page sizes. Virtual-to-physical address translation can occur using an address range circuitry and translation lookaside buffer in parallel. The address range circuitry can determine if a custom entry is available to use to identify a physical address translation for the virtual address. Physical address translation can be performed by transforming the virtual address in some examples.
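The translation flow in the abstract above, a custom range lookup alongside an ordinary TLB lookup, can be sketched in software. This is an illustration under stated assumptions: the `(virt_start, length, phys_start)` tuple layout, the dictionary TLB, the 4 KiB default page size, and the use of a simple offset as the "transformation" of the virtual address are all hypothetical. The two lookups run one after the other here, whereas the abstract describes them as occurring in parallel.

```python
def translate(vaddr, ranges, tlb, page_size=4096):
    """Translate vaddr using custom ranges first, then a page-based TLB.

    ranges: list of (virt_start, length, phys_start) tuples (assumed layout).
    tlb:    dict mapping virtual page number -> physical frame number.
    Returns the physical address, or None if neither structure hits.
    """
    # Custom range path: any length is allowed, not only standard page sizes.
    for virt_start, length, phys_start in ranges:
        if virt_start <= vaddr < virt_start + length:
            # Translate by transforming the virtual address with a fixed offset.
            return phys_start + (vaddr - virt_start)

    # Conventional path: look up the virtual page number in the TLB.
    vpn, offset = divmod(vaddr, page_size)
    frame = tlb.get(vpn)
    if frame is None:
        return None
    return frame * page_size + offset


# One 12 KiB custom range plus one ordinary 4 KiB page mapping.
custom_ranges = [(0x10000, 0x3000, 0x80000)]
page_tlb = {0x2: 0x7}
```

A single range entry here covers what would otherwise need three 4 KiB page entries, which is the benefit the abstract attributes to custom range sizes.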
-
Publication No.: US20200081718A1
Publication Date: 2020-03-12
Application No.: US16680907
Filing Date: 2019-11-12
Applicant: Intel Corporation
Abstract: In an embodiment, a processor includes a branch prediction circuit and a plurality of processing engines. The branch prediction circuit is to: detect a coherence operation associated with a first memory address; identify a first branch instruction associated with the first memory address; and predict a direction for the identified branch instruction based on the detected coherence operation. Other embodiments are described and claimed.
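The three steps in the abstract above (detect a coherence operation on an address, identify the branch associated with that address, predict its direction) can be modelled as a small lookup structure. This is a rough sketch only: the class, its method names, the explicit `associate` step, and the one-shot override policy are all assumptions for illustration, and the abstract does not say how the association is learned or how long a prediction persists.

```python
class CoherenceAwarePredictor:
    """Toy software model of coherence-driven branch prediction (hypothetical)."""

    def __init__(self):
        # address -> (branch_pc, direction to predict after a coherence operation)
        self.branch_for_address = {}
        # branch_pc -> pending predicted direction (True means taken)
        self.override = {}

    def associate(self, address, branch_pc, taken_after_coherence):
        """Record that branch_pc depends on the value stored at address."""
        self.branch_for_address[address] = (branch_pc, taken_after_coherence)

    def on_coherence_operation(self, address):
        """Called when a coherence operation (e.g. an invalidation) hits address."""
        link = self.branch_for_address.get(address)
        if link is not None:
            branch_pc, direction = link
            self.override[branch_pc] = direction

    def predict(self, branch_pc, default_taken):
        """Use a pending coherence-based prediction once, else the default."""
        return self.override.pop(branch_pc, default_taken)


# Example: a spin-wait loop branch at 0x400 polls a flag at address 0x1000.
predictor = CoherenceAwarePredictor()
predictor.associate(0x1000, 0x400, taken_after_coherence=False)
```

In the spin-wait example, the loop branch is normally predicted taken; a coherence operation on the flag's address suggests another engine wrote the flag, so the model predicts the loop exit for the next execution of that branch.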
-
Publication No.: US10521236B2
Publication Date: 2019-12-31
Application No.: US15940408
Filing Date: 2018-03-29
Applicant: Intel Corporation
Abstract: In an embodiment, a processor includes a branch prediction circuit and a plurality of processing engines. The branch prediction circuit is to: detect a coherence operation associated with a first memory address; identify a first branch instruction associated with the first memory address; and predict a direction for the identified branch instruction based on the detected coherence operation. Other embodiments are described and claimed.
-
Publication No.: US11886884B2
Publication Date: 2024-01-30
Application No.: US16680907
Filing Date: 2019-11-12
Applicant: Intel Corporation
CPC Classification: G06F9/3844, G06F9/3004, G06F9/30058
Abstract: In an embodiment, a processor includes a branch prediction circuit and a plurality of processing engines. The branch prediction circuit is to: detect a coherence operation associated with a first memory address; identify a first branch instruction associated with the first memory address; and predict a direction for the identified branch instruction based on the detected coherence operation. Other embodiments are described and claimed.
-
Publication No.: US20210405974A1
Publication Date: 2021-12-30
Application No.: US16914318
Filing Date: 2020-06-27
Applicant: Intel Corporation
Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark J. Charney, Christopher J. Hughes, Alexander F. Heinecke, Evangelos Georganas, Binh Pham
Abstract: Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.
-
Publication No.: US10860244B2
Publication Date: 2020-12-08
Application No.: US15854357
Filing Date: 2017-12-26
Applicant: Intel Corporation
IPC Classification: G06F3/06, G06F12/0862, G06F12/0871, G06F12/1027, G06F12/0897, G06F12/1045, G06F12/128, G06F12/14, G06F12/123
Abstract: An apparatus is described that includes a memory controller to couple to a multi-level memory characterized by a faster higher level and a slower lower level. The memory controller has early demotion logic circuitry to demote a page from the higher level to the lower level without system software having to instruct the memory controller to demote the page and before the system software promotes another page from the lower level to the higher level.
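The early demotion behaviour in the abstract above can be modelled as a controller that frees space in the fast level on its own initiative, so that a later software promotion finds room. This is a minimal sketch under assumptions the abstract does not state: the least-recently-used victim choice, the free-slot watermark, and all class and method names are hypothetical.

```python
from collections import OrderedDict


class TwoLevelMemory:
    """Toy model of a two-level memory with controller-driven early demotion."""

    def __init__(self, fast_capacity, free_watermark=1):
        if not 0 <= free_watermark <= fast_capacity:
            raise ValueError("watermark must be between 0 and fast_capacity")
        self.fast = OrderedDict()  # pages in the higher level, oldest first
        self.slow = set()          # pages in the lower level
        self.fast_capacity = fast_capacity
        self.free_watermark = free_watermark  # assumed policy parameter

    def _early_demote(self):
        # Controller-initiated: no software request triggers this demotion.
        while self.fast and self.fast_capacity - len(self.fast) < self.free_watermark:
            page, _ = self.fast.popitem(last=False)  # assumed LRU victim
            self.slow.add(page)

    def touch(self, page):
        """Record an access so the page becomes most recently used."""
        if page in self.fast:
            self.fast.move_to_end(page)

    def promote(self, page):
        """Software-initiated promotion into the higher level."""
        if len(self.fast) >= self.fast_capacity:
            raise RuntimeError("no free slot in the higher level")
        self.slow.discard(page)
        self.fast[page] = None
        # Demote ahead of the next promotion, before software asks for space.
        self._early_demote()


memory = TwoLevelMemory(fast_capacity=3)
memory.promote("a")
memory.promote("b")
memory.touch("a")
memory.promote("c")  # fills the fast level, so "b" is demoted early
```

After the last call, the fast level holds "a" and "c" with one slot already free, so the next promotion does not have to wait for a demotion.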