-
Publication No.: US20210367887A1
Publication date: 2021-11-25
Application No.: US17396553
Filing date: 2021-08-06
Applicant: Intel Corporation
Inventor: Ren Wang , Tsung-Yuan C. Tai , Yipeng Wang , Sameh Gobriel
IPC: H04L12/743 , H04L12/851 , H04L12/741 , H04L29/12
Abstract: Apparatus, methods, and systems for tuple space search-based flow classification using cuckoo hash tables and unmasked packet headers are described herein. A device can communicate with one or more hardware switches. The device can include memory to store hash table entries of a hash table. The device can include processing circuitry to perform a hash lookup in the hash table. The lookup can be based on an unmasked key included in a packet header corresponding to a received data packet. The processing circuitry can retrieve an index pointing to a sub-table, the sub-table including a set of rules for handling the data packet. Other embodiments are also described.
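The lookup described in this abstract can be illustrated with a minimal sketch, assuming a textbook two-table cuckoo hash keyed by raw (unmasked) header bytes. The class name `CuckooTable`, the CRC-based hash functions, and the table sizes are all hypothetical choices made for illustration; the filing itself does not specify them.

```python
import zlib
from typing import Optional


class CuckooTable:
    """Toy two-table cuckoo hash: unmasked header key -> sub-table index."""

    def __init__(self, num_buckets: int = 64, max_kicks: int = 16):
        self.num_buckets = num_buckets
        self.max_kicks = max_kicks
        # One single-slot bucket array per hash function.
        self.slots = [[None] * num_buckets, [None] * num_buckets]

    def _bucket(self, which: int, key: bytes) -> int:
        # Two hash functions derived by salting CRC32 (illustrative choice).
        return zlib.crc32(bytes([which]) + key) % self.num_buckets

    def insert(self, key: bytes, subtable_index: int) -> bool:
        # Overwrite the value if the key is already stored.
        for which in (0, 1):
            b = self._bucket(which, key)
            entry = self.slots[which][b]
            if entry is not None and entry[0] == key:
                self.slots[which][b] = (key, subtable_index)
                return True
        entry = (key, subtable_index)
        which = 0
        for _ in range(self.max_kicks):
            b = self._bucket(which, entry[0])
            # Place the entry, displacing whatever occupied the slot.
            entry, self.slots[which][b] = self.slots[which][b], entry
            if entry is None:
                return True
            which ^= 1  # retry the displaced entry in the other table
        # Gave up: the last displaced entry is dropped in this toy version.
        # A real table would resize or rehash instead.
        return False

    def lookup(self, key: bytes) -> Optional[int]:
        # A key can live in at most two places, so lookup is two probes.
        for which in (0, 1):
            entry = self.slots[which][self._bucket(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None
```

In this sketch the returned integer stands in for the index pointing to a sub-table; searching that sub-table's rules is left out.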
-
Publication No.: US20190052719A1
Publication date: 2019-02-14
Application No.: US15862311
Filing date: 2018-01-04
Applicant: Intel Corporation
Inventor: Yipeng Wang , Ren Wang , Antonio Fischetti , Sameh Gobriel , Tsung-Yuan C. Tai
Abstract: Technologies for flow rule aware exact match cache compression include multiple computing devices in communication over a network. A computing device reads a network packet from a network port and extracts one or more key fields from the packet to generate a lookup key. The key fields are identified by a key field specification of an exact match flow cache. The computing device may dynamically configure the key field specification based on an active flow rule set. The computing device may compress the key field specification to match a union of non-wildcard fields of the active flow rule set. The computing device may expand the key field specification in response to insertion of a new flow rule. The computing device looks up the lookup key in the exact match flow cache and, if a match is found, applies the corresponding action. Other embodiments are described and claimed.
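As a rough illustration of the key field specification described above, the sketch below computes the union of non-wildcard fields over a rule set and builds a lookup key from it. The rule layout (a `match` dict where `None` means wildcard) and the function names `compress_key_spec` and `make_lookup_key` are assumptions for this example, not taken from the filing.

```python
def compress_key_spec(rules):
    """Return the union of non-wildcard match fields across the active rules."""
    fields = set()
    for rule in rules:
        # A value of None is treated as a wildcard in this sketch.
        fields.update(name for name, value in rule["match"].items()
                      if value is not None)
    return tuple(sorted(fields))


def make_lookup_key(packet, key_spec):
    """Extract only the fields named by the key specification."""
    return tuple(packet.get(name) for name in key_spec)


rules = [
    {"match": {"dst_ip": "10.0.0.1", "src_ip": None}, "action": "fwd1"},
    {"match": {"dst_ip": "10.0.0.2", "dst_port": 80}, "action": "fwd2"},
]
spec = compress_key_spec(rules)      # only dst_ip and dst_port are needed
packet = {"src_ip": "1.2.3.4", "dst_ip": "10.0.0.2", "dst_port": 80}
key = make_lookup_key(packet, spec)

# Inserting a rule that matches on a new field expands the specification.
rules.append({"match": {"src_ip": "1.2.3.4"}, "action": "drop"})
expanded_spec = compress_key_spec(rules)
```

A cache built on this would presumably need to invalidate or rebuild its entries whenever the specification changes, since keys extracted under the old specification no longer line up; that step is omitted here.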
-
Publication No.: US09992299B2
Publication date: 2018-06-05
Application No.: US15426718
Filing date: 2017-02-07
Applicant: Intel Corporation
Inventor: Ren Wang , Sameh Gobriel , Christian Maciocco , Tsung-Yuan C. Tai , Ben-Zion Friedman , Hang T. Nguyen , Namakkal N. Venkatesan , Michael A. O'Hanlon , Shrikant M. Shah , Sanjeev Jain
IPC: H04L29/08 , H04L12/24 , H04L12/741 , H04L12/721 , H04L12/743
CPC classification number: H04L67/2852 , H04L41/0893 , H04L45/38 , H04L45/745 , H04L45/7453 , H04L49/00
Abstract: Technologies for identifying a cache line of a network packet for eviction from an on-processor cache of a network device communicatively coupled to a network controller. The network device is configured to determine whether a cache line of the cache corresponding to the network packet is to be evicted from the cache based on a determination that the network packet is not needed subsequent to processing the network packet, and provide an indication that the cache line is to be evicted from the cache based on an eviction policy received from the network controller.
-
Publication No.: US09882814B2
Publication date: 2018-01-30
Application No.: US14496495
Filing date: 2014-09-25
Applicant: Intel Corporation
Inventor: Kannan Babu Ramia , Christian Maciocco , Sameh Gobriel , Ashok Sunder Rajan
IPC: H04L12/28 , H04L12/56 , H04L12/803 , H04L12/715 , H04L12/751 , H04L12/755 , H04L12/741 , H04L12/801
CPC classification number: H04L47/125 , H04L45/02 , H04L45/021 , H04L45/04 , H04L45/64 , H04L45/745 , H04L47/17
Abstract: Technologies for bridging between coarse-grained and fine-grained load balancing include a computing node of a cluster computing device and a network controller. The computing node may add a flow entry to a local flow table based on flow information received from the network controller. The computing node may transmit a multicast network packet including the flow information and next hop information to other computing nodes of the cluster device. The computing node may also add a different flow entry to the local flow table and a next hop entry to a local next hop table based on a multicast network packet received from another computing node of the cluster device. The computing node may locally process a network packet received from a remote computing device or forward the received network packet to another computing node of the cluster device based on the flow entries added to the local flow table.
-
Publication No.: US20250061316A1
Publication date: 2025-02-20
Application No.: US18934700
Filing date: 2024-11-01
Applicant: Intel Corporation
Inventor: Sameh Gobriel , Nilesh Jain , Vui Seng Chua , Juan Pablo Munoz , Gopi Krishna Jha
IPC: G06N3/0495 , G06N3/082
Abstract: Key-value (KV) cache paging schemes can improve memory management for KV caches by storing a KV cache page having key tensors and value tensors for a fixed number of tokens in a fixed-sized block in the KV cache of a worker. To further improve memory management, the schemes can be modified to implement dynamic variable quantization. Quantization level of a KV cache page can be set based on a runtime importance score of the KV cache page. In addition, the quantization level of the KV cache page can be set based on the system load. The end result is a scheme that can achieve a high compression ratio of KV cache pages in the KV cache. Fitting more KV cache pages in the KV cache can lead to higher inference throughput, higher system-level user capacity, and higher end-to-end service availability.
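The idea of setting a page's quantization level from its runtime importance score and the system load might look like the sketch below. The thresholds, the bit widths (16/8/4), and the function name `choose_bits` are invented for illustration; the abstract does not state how scores or load map to levels.

```python
def choose_bits(importance: float, system_load: float) -> int:
    """Pick a bit width for one KV cache page (illustrative thresholds only).

    importance:  runtime importance score of the page, assumed in [0, 1]
    system_load: current load, assumed in [0, 1]
    """
    if importance >= 0.75:
        bits = 16
    elif importance >= 0.4:
        bits = 8
    else:
        bits = 4
    # Under heavy load, compress one level harder, but never below 4 bits.
    if system_load > 0.9:
        bits = max(4, bits // 2)
    return bits
```

Lower bit widths let more pages fit in the same fixed-size blocks, which is the trade-off the abstract points to.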
-
Publication No.: US12197601B2
Publication date: 2025-01-14
Application No.: US17560193
Filing date: 2021-12-22
Applicant: Intel Corporation
Inventor: Ren Wang , Sameh Gobriel , Somnath Paul , Yipeng Wang , Priya Autee , Abhirupa Layek , Shaman Narayana , Edwin Verplanke , Mrittika Ganguli , Jr-Shian Tsai , Anton Sorokin , Suvadeep Banerjee , Abhijit Davare , Desmond Kirkpatrick , Rajesh M. Sankaran , Jaykant B. Timbadiya , Sriram Kabisthalam Muthukumar , Narayan Ranganathan , Nalini Murari , Brinda Ganesh , Nilesh Jain
Abstract: Examples described herein relate to offload circuitry comprising one or more compute engines that are configurable to perform a workload offloaded from a process executed by a processor based on a descriptor particular to the workload. In some examples, the offload circuitry is configurable to perform the workload, among multiple different workloads. In some examples, the multiple different workloads include one or more of: data transformation (DT) for data format conversion, Locality Sensitive Hashing (LSH) for neural network (NN), similarity search, sparse general matrix-matrix multiplication (SpGEMM) acceleration of hash based sparse matrix multiplication, data encode, data decode, or embedding lookup.
-
Publication No.: US10719442B2
Publication date: 2020-07-21
Application No.: US16126907
Filing date: 2018-09-10
Applicant: Intel Corporation
Inventor: Ren Wang , Raanan Sade , Yipeng Wang , Tsung-Yuan Tai , Sameh Gobriel
IPC: G06F12/00 , G06F12/0811 , G06F9/38 , G06F16/18
Abstract: An apparatus and method for prioritizing transactional memory regions. For example, one embodiment of a processor comprises: a plurality of cores to execute threads comprising sequences of instructions, at least some of the instructions specifying a transactional memory region; a cache of each core to store a plurality of cache lines; transactional memory circuitry of each core to manage execution of the transactional memory (TM) regions based on priorities associated with each of the TM regions; and wherein the transactional memory circuitry, upon detecting a conflict between a first TM region having a first priority value and a second TM region having a second priority value, is to determine which of the first TM region or the second TM region is permitted to continue executing and which is to be aborted based, at least in part, on the first and second priority values.
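The conflict rule in this abstract can be modeled in software with a small sketch. It assumes that a larger priority value wins and that ties favor the first region; neither convention is stated in the abstract, and the names `TMRegion` and `resolve_conflict` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TMRegion:
    name: str
    priority: int


def resolve_conflict(first: TMRegion, second: TMRegion):
    """Return (continues, aborted) for two conflicting TM regions.

    Assumptions: larger priority value wins; ties favor `first`.
    """
    if second.priority > first.priority:
        return second, first
    return first, second
```

The abstract says the decision is based "at least in part" on the priority values, so a hardware implementation may weigh other factors that this sketch ignores.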
-
Publication No.: US10462059B2
Publication date: 2019-10-29
Application No.: US15473413
Filing date: 2017-03-29
Applicant: Intel Corporation
Inventor: Byron Marohn , Christian Maciocco , Sameh Gobriel , Ren Wang , Tsung-Yuan C. Tai
IPC: H04L12/819 , H04L12/743 , H04L12/721 , H04L12/741
Abstract: The present disclosure describes a process and apparatus for improving insertions of entries into a hash table. A large number of smaller virtual buckets may be combined together and associated with buckets used for hash table entry lookups and/or entry insertion. On insertion of an entry, hash table entries associated with a hashed-to virtual bucket may be moved between groups of buckets associated with the virtual bucket, to better distribute entries across the available buckets to reduce the number of entries in the largest buckets and the standard deviation of the bucket sizes across the entire hash table.
-
Publication No.: US20180375773A1
Publication date: 2018-12-27
Application No.: US15632592
Filing date: 2017-06-26
Applicant: Intel Corporation
Inventor: Sameh Gobriel , Wei Shen , Tsung-Yuan C. Tai , Ren Wang
IPC: H04L12/743 , H04L12/741 , H04L29/12
Abstract: Technologies for efficient network flow classification include a computing device that receives a network packet that includes a header. The computing device generates a vector Bloom filter (VBF) key as a function of the header and searches multiple VBFs for a VBF that matches the VBF key. Each VBF is associated with a flow sub-table that includes one or more flow rules. Each flow sub-table is associated with a mask length. If a matching VBF is found, the computing device searches the corresponding flow sub-table for a flow rule that matches a masked header of the network packet. If no matching VBF is found or if no matching flow rule is found, the computing device searches all of the flow sub-tables for a flow rule that matches the header. The computing device applies a flow action of a matching flow rule. Other embodiments are described and claimed.
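The filter-then-search flow described above can be sketched as follows. This is a simplification under stated assumptions: it uses one ordinary Bloom filter per sub-table rather than the vector Bloom filter of the filing, matches only on a 32-bit destination address, and masks the address before probing the filter. `Bloom`, `SubTable`, `mask`, and `classify` are hypothetical names.

```python
import hashlib


class Bloom:
    """Plain Bloom filter stored as a Python int bitmap."""

    def __init__(self, bits: int = 256, hashes: int = 3):
        self.bits, self.hashes, self.array = bits, hashes, 0

    def _positions(self, key: bytes):
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:4], "big") % self.bits

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.array |= 1 << p

    def __contains__(self, key: bytes) -> bool:
        return all((self.array >> p) & 1 for p in self._positions(key))


def mask(ip: int, mask_len: int) -> int:
    """Keep the top mask_len bits of a 32-bit address."""
    return ip & ((0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF)


class SubTable:
    """Flow rules that share one mask length, plus a filter over them."""

    def __init__(self, mask_len: int):
        self.mask_len, self.rules, self.bloom = mask_len, {}, Bloom()

    def add_rule(self, ip: int, action: str) -> None:
        masked = mask(ip, self.mask_len)
        self.rules[masked] = action
        self.bloom.add(masked.to_bytes(4, "big"))

    def match(self, ip: int):
        return self.rules.get(mask(ip, self.mask_len))


def classify(subtables, ip: int):
    # Fast path: search only sub-tables whose filter reports a possible hit.
    for st in subtables:
        if mask(ip, st.mask_len).to_bytes(4, "big") in st.bloom:
            action = st.match(ip)
            if action is not None:
                return action
    # Slow path from the abstract: search every sub-table.
    for st in subtables:
        action = st.match(ip)
        if action is not None:
            return action
    return None
```

Because a Bloom filter has no false negatives, the slow path never finds anything new in this simplified version; it is kept only to mirror the fallback step the abstract describes.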
-
Publication No.: US20170149926A1
Publication date: 2017-05-25
Application No.: US15426718
Filing date: 2017-02-07
Applicant: Intel Corporation
Inventor: Ren Wang , Sameh Gobriel , Christian Maciocco , Tsung-Yuan C. Tai , Ben-Zion Friedman , Hang T. Nguyen , Namakkal N. Venkatesan , Michael A. O'Hanlon , Shrikant M. Shah , Sanjeev Jain
IPC: H04L29/08 , H04L12/743 , H04L12/721 , H04L12/24 , H04L12/741
CPC classification number: H04L67/2852 , H04L41/0893 , H04L45/38 , H04L45/745 , H04L45/7453 , H04L49/00
Abstract: Technologies for identifying a cache line of a network packet for eviction from an on-processor cache of a network device communicatively coupled to a network controller. The network device is configured to determine whether a cache line of the cache corresponding to the network packet is to be evicted from the cache based on a determination that the network packet is not needed subsequent to processing the network packet, and provide an indication that the cache line is to be evicted from the cache based on an eviction policy received from the network controller.