-
Publication No.: US10817425B2
Publication Date: 2020-10-27
Application No.: US14583389
Filing Date: 2014-12-26
Applicant: Intel Corporation
Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran
IPC Classification: G06F12/0842, G06F12/0831, G06F12/0893, G06F12/109, G06F12/0813, G06F9/455
Abstract: Methods and apparatus implementing hardware/software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture systems, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
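The demotion idea in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the class `ToyCacheHierarchy` and its methods are hypothetical names, the L1 and L2 caches of each core are merged into one private set, and no real coherence protocol or machine instruction is modeled. It shows only why a consumer core may find a proactively demoted line in the shared LLC instead of in the producer's private cache.

```python
class ToyCacheHierarchy:
    """Toy model: one private cache set per core plus a shared LLC set.

    All names here are illustrative assumptions, not part of the patent.
    """

    def __init__(self, num_cores):
        self.private = [set() for _ in range(num_cores)]  # L1/L2 merged per core
        self.llc = set()  # shared last-level cache

    def write(self, core, line):
        # A write leaves the line only in the writer's private cache.
        for other, cache in enumerate(self.private):
            if other != core:
                cache.discard(line)
        self.llc.discard(line)
        self.private[core].add(line)

    def demote(self, core, line):
        # Proactively move the line from the core's private cache to the LLC.
        if line in self.private[core]:
            self.private[core].remove(line)
            self.llc.add(line)

    def read(self, core, line):
        # Report where the line was found, then cache it privately.
        if line in self.private[core]:
            source = "private"
        elif line in self.llc:
            source = "llc"
        elif any(line in cache for cache in self.private):
            source = "remote-private"
        else:
            source = "memory"
        self.private[core].add(line)
        return source


hierarchy = ToyCacheHierarchy(num_cores=2)
hierarchy.write(0, 0x40)           # producer writes, no demotion
print(hierarchy.read(1, 0x40))     # consumer must reach the producer's cache
hierarchy.write(0, 0x80)           # producer writes...
hierarchy.demote(0, 0x80)          # ...and demotes the line to the LLC
print(hierarchy.read(1, 0x80))     # consumer finds the line in the LLC
```

In this model the second read is served from the shared level, which stands in for the latency benefit the abstract attributes to demotion in producer-consumer workloads.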
-
Publication No.: US10284470B2
Publication Date: 2019-05-07
Application No.: US14580801
Filing Date: 2014-12-23
Applicant: Intel Corporation
Inventors: Ren Wang, Namakkal N. Venkatesan, Aamer Jaleel, Tsung-Yuan C. Tai, Sameh Gobriel, Christian Maciocco
IPC Classification: H04L12/743, H04L29/06, H04L12/747
Abstract: Technologies for managing network flow lookups of a network device include a network controller and a target device, each communicatively coupled to the network device. The network device includes a cache for a processor of the network device and a main memory. The network device additionally includes a multi-level hash table having a first-level hash table stored in the cache of the network device and a second-level hash table stored in the main memory of the network device. The network device is configured to determine whether to store a network flow hash corresponding to a network flow indicating the target device in the first-level or second-level hash table based on a priority of the network flow provided to the network device by the network controller.
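The priority-based placement described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: `TwoLevelFlowTable`, the `priority_threshold` cutoff, and the use of plain dictionaries to stand in for the cache-resident and memory-resident tables are all assumptions made for the example.

```python
class TwoLevelFlowTable:
    """Toy two-level flow table.

    first_level stands in for the table kept in the processor cache,
    second_level for the table kept in main memory (both hypothetical).
    """

    def __init__(self, priority_threshold):
        self.priority_threshold = priority_threshold  # assumed cutoff
        self.first_level = {}
        self.second_level = {}

    def insert(self, flow_hash, target, priority):
        # High-priority flows go to the first level, the rest to the second.
        if priority >= self.priority_threshold:
            self.second_level.pop(flow_hash, None)
            self.first_level[flow_hash] = target
        else:
            self.first_level.pop(flow_hash, None)
            self.second_level[flow_hash] = target

    def lookup(self, flow_hash):
        # Probe the first level before falling back to the second level.
        if flow_hash in self.first_level:
            return self.first_level[flow_hash], "first-level"
        if flow_hash in self.second_level:
            return self.second_level[flow_hash], "second-level"
        return None, "miss"


table = TwoLevelFlowTable(priority_threshold=5)
table.insert(0xA1, "target-a", priority=9)
table.insert(0xB2, "target-b", priority=1)
print(table.lookup(0xA1))
print(table.lookup(0xB2))
```

In the abstract the priority comes from the network controller; here it is simply passed to `insert` as an integer.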
-
Publication No.: US10268580B2
Publication Date: 2019-04-23
Application No.: US15282483
Filing Date: 2016-09-30
Applicant: Intel Corporation
IPC Classification: G06F9/30, G06F12/128, G06F12/0811
Abstract: Processors and methods implementing a machine instruction to perform cache line demotion on multiple cache lines to enable efficient sharing of cache lines between processor cores. One general aspect includes a processor comprising: a plurality of hardware processor cores, where each of the hardware processor cores to include a first cache. The processor also includes a second cache, communicatively coupled to and shared by the plurality of hardware processor cores. The processor to support a first machine instruction, the first machine instruction to include a vector register operand identifying a vector register which contains a plurality of data elements each used to identify a cache line. An execution of the first machine instruction by one of the plurality of hardware processor cores to cause a plurality of identified cache lines to be demoted, such that the demoted cache lines are moved from the first cache to the second cache.
-
Publication No.: US20180103129A1
Publication Date: 2018-04-12
Application No.: US15677564
Filing Date: 2017-08-15
Applicant: Intel Corporation
IPC Classification: H04L29/06, H04L12/743, H04L12/64
CPC Classification: H04L69/22, H04L12/6418, H04L45/7453
Abstract: Technologies for packet flow classification on a computing device include a hash table including a plurality of hash table buckets in which each hash table bucket maps a plurality of keys to corresponding traffic flows. The computing device performs packet flow classification on received data packets, where the packet flow classification includes a plurality of sequential classification stages and fetch classification operations and non-fetch classification operations are performed in each classification stage. The fetch classification operations include to prefetch a key of a first received data packet based on a set of packet fields of the first received data packet for use during a subsequent classification stage, prefetch a hash table bucket from the hash table based on a key signature of the prefetched key for use during another subsequent classification stage, and prefetch a traffic flow to be applied to the first received data packet based on the prefetched hash table bucket and the prefetched key. The computing device handles processing of received data packets such that a fetch classification operation is performed by the flow classification module on the first received data packet while a non-fetch classification operation is performed by the flow classification module on a second received data packet.
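The staged overlap described in the abstract above can be pictured with a small schedule generator. This is a simplified sketch under assumptions: the stage names in `STAGES` (including the final `apply_flow` stage) and the function `pipeline_schedule` are hypothetical, each packet advances exactly one stage per time step, and the split into fetch and non-fetch operations within a stage is not modeled. The point is only that, at a given step, different packets occupy different stages.

```python
# Hypothetical stage names; the first three mirror the prefetches named in the abstract.
STAGES = ("prefetch_key", "prefetch_bucket", "prefetch_flow", "apply_flow")


def pipeline_schedule(num_packets, stages=STAGES):
    """Return, for each time step, the (packet index, stage name) pairs active then."""
    steps = []
    total_steps = num_packets + len(stages) - 1
    for step in range(total_steps):
        active = []
        for packet in range(num_packets):
            stage_index = step - packet  # packet i enters the pipeline at step i
            if 0 <= stage_index < len(stages):
                active.append((packet, stages[stage_index]))
        steps.append(active)
    return steps


for step, active in enumerate(pipeline_schedule(3)):
    print(step, active)
```

While packet 0 is waiting on its bucket prefetch, packet 1 is already issuing its key prefetch, which is the kind of latency hiding the abstract describes.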
-
Publication No.: US11513957B2
Publication Date: 2022-11-29
Application No.: US17027248
Filing Date: 2020-09-21
Applicant: Intel Corporation
Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran
IPC Classification: G06F12/0842, G06F12/0893, G06F12/109, G06F12/0813, G06F12/0831, G06F9/455
Abstract: Methods and apparatus implementing hardware/software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture systems, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
-
Publication No.: US11418495B2
Publication Date: 2022-08-16
Application No.: US15715569
Filing Date: 2017-09-26
Applicant: INTEL CORPORATION
Inventors: John J. Browne, Chris Macnamara, Namakkal N. Venkatesan, Tomasz Kantecki, Declan W. Doherty
IPC Classification: H04L29/06, H04L9/40, H04L47/10, H04L69/166, H04L47/2441, H04L12/46
Abstract: Techniques and apparatuses for processing data units are described. In one embodiment, for example, an apparatus for networking may include at least one memory, logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to access an encrypted packet having an encrypted portion, determine at least one flow control segment of the encrypted portion, decrypt the at least one flow control segment to generate a partially-decrypted packet comprising a decrypted at least one flow control segment and an encrypted remainder portion, the remainder portion comprising a portion of the encrypted packet that does not include the decrypted at least one flow control segment, access process information in the decrypted at least one flow control segment, and process the partially-decrypted packet according to the process information. Other embodiments are described and claimed.
-
Publication No.: US10455063B2
Publication Date: 2019-10-22
Application No.: US15677564
Filing Date: 2017-08-15
Applicant: Intel Corporation
IPC Classification: H04L29/06, H04L12/743, H04L12/64
Abstract: Technologies for packet flow classification on a computing device include a hash table including a plurality of hash table buckets in which each hash table bucket maps a plurality of keys to corresponding traffic flows. The computing device performs packet flow classification on received data packets, where the packet flow classification includes a plurality of sequential classification stages and fetch classification operations and non-fetch classification operations are performed in each classification stage. The fetch classification operations include to prefetch a key of a first received data packet based on a set of packet fields of the first received data packet for use during a subsequent classification stage, prefetch a hash table bucket from the hash table based on a key signature of the prefetched key for use during another subsequent classification stage, and prefetch a traffic flow to be applied to the first received data packet based on the prefetched hash table bucket and the prefetched key. The computing device handles processing of received data packets such that a fetch classification operation is performed by the flow classification module on the first received data packet while a non-fetch classification operation is performed by the flow classification module on a second received data packet.
-
Publication No.: US10334041B2
Publication Date: 2019-06-25
Application No.: US14949265
Filing Date: 2015-11-23
Applicant: Intel Corporation
Abstract: A network interface device (NID) interfaced with a host machine communicates with a local link of the host machine to obtain transaction-specific data relied upon by the host machine to be delivered to a destination by the NID according to a reliable message delivery protocol. The NID conducts communications over a network in response to obtaining of the transaction-specific data, with the network communications including execution of the reliable message delivery protocol independent of any operability of the host machine.
-
Publication No.: US20170149926A1
Publication Date: 2017-05-25
Application No.: US15426718
Filing Date: 2017-02-07
Applicant: Intel Corporation
Inventors: Ren Wang, Sameh Gobriel, Christian Maciocco, Tsung-Yuan C. Tai, Ben-Zion Friedman, Hang T. Nguyen, Namakkal N. Venkatesan, Michael A. O'Hanlon, Shrikant M. Shah, Sanjeev Jain
IPC Classification: H04L29/08, H04L12/743, H04L12/721, H04L12/24, H04L12/741
CPC Classification: H04L67/2852, H04L41/0893, H04L45/38, H04L45/745, H04L45/7453, H04L49/00
Abstract: Technologies for identifying a cache line of a network packet for eviction from an on-processor cache of a network device communicatively coupled to a network controller. The network device is configured to determine whether a cache line of the cache corresponding to the network packet is to be evicted from the cache based on a determination that the network packet is not needed subsequent to processing the network packet, and provide an indication that the cache line is to be evicted from the cache based on an eviction policy received from the network controller.
-
Publication No.: US20200285578A1
Publication Date: 2020-09-10
Application No.: US16822939
Filing Date: 2020-03-18
Applicant: Intel Corporation
Inventors: Ren Wang, Joseph Nuzman, Samantika S. Sury, Andrew J. Herdrich, Namakkal N. Venkatesan, Anil Vasudevan, Tsung-Yuan C. Tai, Niall D. McDonnell
IPC Classification: G06F12/0831, G06F12/084, G06F12/0811
Abstract: Apparatus, method, and system for implementing a software-transparent hardware predictor for core-to-core data communication optimization are described herein. An embodiment of the apparatus includes a plurality of hardware processor cores each including a private cache; a shared cache that is communicatively coupled to and shared by the plurality of hardware processor cores; and a predictor circuit. The predictor circuit is to track activities relating to a plurality of monitored cache lines in the private cache of a producer hardware processor core (producer core) and to enable a cache line push operation upon determining a target hardware processor core (target core) based on the tracked activities. An execution of the cache line push operation is to cause a plurality of unmonitored cache lines in the private cache of the producer core to be moved to the private cache of the target core.