-
Publication Number: US11698929B2
Publication Date: 2023-07-11
Application Number: US16207065
Application Date: 2018-11-30
Applicant: Intel Corporation
Inventor: Ren Wang , Andrew J. Herdrich , Tsung-Yuan C. Tai , Yipeng Wang , Raghu Kondapalli , Alexander Bachmutsky , Yifan Yuan
IPC: G06F7/00 , G06F16/901 , G06F16/903 , G06F16/906
CPC classification number: G06F16/9017 , G06F16/906 , G06F16/90335
Abstract: A central processing unit can offload table lookup or tree traversal to an offload engine. The offload engine can provide hardware accelerated operations such as instruction queueing, bit masking, hashing functions, data comparisons, a results queue, and progress tracking. The offload engine can be associated with a last level cache. In the case of a hash table lookup, the offload engine can apply a hashing function to a key to generate a signature, apply a comparator to compare stored signatures against the generated signature, retrieve the key associated with a matching signature, and apply the comparator to compare the lookup key against the retrieved key. A data pointer associated with the key can then be provided in the results queue. Acceleration of operations in tree traversal and tuple search can also occur.
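The lookup flow described in the abstract (hash the key to a short signature, compare signatures, then confirm against the full key before returning a data pointer) can be pictured with a minimal software sketch; the structures, names, and hash constant below are illustrative assumptions, not the patented hardware design.

#include <stdint.h>
#include <stddef.h>

#define BUCKET_SLOTS 8

struct bucket {
    uint16_t sig[BUCKET_SLOTS];   /* short signatures derived from keys */
    uint64_t key[BUCKET_SLOTS];   /* full keys                          */
    void    *data[BUCKET_SLOTS];  /* data pointers returned on a hit    */
};

/* Hypothetical signature function: any hash that yields a short tag works here. */
static uint16_t signature_of(uint64_t key)
{
    return (uint16_t)((key * 0x9E3779B97F4A7C15ULL) >> 48);
}

/* Mirrors the offload flow: cheap signature compare first, full key compare second. */
static void *lookup(const struct bucket *b, uint64_t key)
{
    uint16_t sig = signature_of(key);
    for (int i = 0; i < BUCKET_SLOTS; i++) {
        if (b->sig[i] != sig)      /* comparator pass over signatures */
            continue;
        if (b->key[i] == key)      /* confirm against the retrieved key */
            return b->data[i];     /* data pointer placed in the results queue */
    }
    return NULL;                   /* miss */
}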
-
Publication Number: US20210263779A1
Publication Date: 2021-08-26
Application Number: US17255588
Application Date: 2019-04-16
Applicant: Intel Corporation
Inventor: Mohammad R. Haghighat , Kshitij Doshi , Andrew J. Herdrich , Anup Mohan , Ravishankar R. Iyer , Mingqiu Sun , Krishna Bhuyan , Teck Joo Goh , Mohan J. Kumar , Michael Prinke , Michael Lemay , Leeor Peled , Jr-Shian Tsai , David M. Durham , Jeffrey D. Chamberlain , Vadim A. Sukhomlinov , Eric J. Dahlen , Sara Baghsorkhi , Harshad Sane , Areg Melik-Adamyan , Ravi Sahita , Dmitry Yurievich Babokin , Ian M. Steiner , Alexander Bachmutsky , Anil Rao , Mingwei Zhang , Nilesh K. Jain , Amin Firoozshahian , Baiju V. Patel , Wenyong Huang , Yeluri Raghuram
Abstract: Embodiments of systems, apparatuses and methods provide enhanced function as a service (FaaS) to users, e.g., computer developers and cloud service providers (CSPs). A computing system configured to provide such enhanced FaaS services includes one or more control architectural subsystems, software and orchestration subsystems, network and storage subsystems, and security subsystems. The computing system executes functions in response to events triggered by the users in an execution environment provided by the architectural subsystems, which represent an abstraction of execution management and shield the users from the burden of managing the execution. The software and orchestration subsystems allocate computing resources for the function execution by intelligently spinning up and down containers for function code with decreased instantiation latency and increased execution scalability while maintaining secured execution. Furthermore, the computing system enables customers to pay only when their code gets executed, with granular billing down to millisecond increments.
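The billing model mentioned at the end of the abstract (charge only while function code executes, at millisecond granularity) can be illustrated with a tiny sketch; the rate units and the round-up rule are assumptions for illustration only.

#include <stdint.h>

/* Hypothetical per-invocation billing: charge only for executed time,
 * rounded up to the next millisecond increment. Rates are illustrative. */
static uint64_t bill_microcents(uint64_t exec_ns, uint64_t rate_per_ms)
{
    if (exec_ns == 0)
        return 0;                                 /* no execution, no charge */
    uint64_t ms = (exec_ns + 999999) / 1000000;   /* round up to a whole ms  */
    return ms * rate_per_ms;
}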
-
Publication Number: US20200285578A1
Publication Date: 2020-09-10
Application Number: US16822939
Application Date: 2020-03-18
Applicant: Intel Corporation
Inventor: Ren Wang , Joseph Nuzman , Samantika S. Sury , Andrew J. Herdrich , Namakkal N. Venkatesan , Anil Vasudevan , Tsung-Yuan C. Tai , Niall D. McDonnell
IPC: G06F12/0831 , G06F12/084 , G06F12/0811
Abstract: Apparatus, method, and system for implementing a software-transparent hardware predictor for core-to-core data communication optimization are described herein. An embodiment of the apparatus includes a plurality of hardware processor cores each including a private cache; a shared cache that is communicatively coupled to and shared by the plurality of hardware processor cores; and a predictor circuit. The predictor circuit is to track activities relating to a plurality of monitored cache lines in the private cache of a producer hardware processor core (producer core) and to enable a cache line push operation upon determining a target hardware processor core (target core) based on the tracked activities. An execution of the cache line push operation is to cause a plurality of unmonitored cache lines in the private cache of the producer core to be moved to the private cache of the target core.
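A minimal software model of the predictor's decision, assuming a simple read-share threshold (the actual tracking heuristic is not given in the abstract): remote reads of monitored lines are counted per core, and once one core dominates the observed reads it becomes the push target for the producer's unmonitored lines.

#include <stdint.h>
#include <stdbool.h>

#define MONITORED_LINES 32
#define MAX_CORES       64

/* Hypothetical predictor state: which cores consume the monitored lines. */
struct predictor {
    uint64_t monitored[MONITORED_LINES];  /* addresses of monitored cache lines */
    uint32_t reads_by_core[MAX_CORES];    /* demand reads observed per core     */
};

/* Record a remote read of a monitored line. */
static void observe_read(struct predictor *p, uint64_t addr, unsigned core)
{
    for (int i = 0; i < MONITORED_LINES; i++)
        if (p->monitored[i] == addr)
            p->reads_by_core[core]++;
}

/* Return the target core once it accounts for most observed reads,
 * enabling pushes of the producer's unmonitored lines toward that core. */
static bool pick_target(const struct predictor *p, unsigned *target)
{
    uint32_t total = 0, best = 0;
    unsigned best_core = 0;
    for (unsigned c = 0; c < MAX_CORES; c++) {
        total += p->reads_by_core[c];
        if (p->reads_by_core[c] > best) { best = p->reads_by_core[c]; best_core = c; }
    }
    if (total >= 16 && best * 4 >= total * 3) {   /* at least 75% of reads */
        *target = best_core;
        return true;
    }
    return false;
}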
-
Publication Number: US10649813B2
Publication Date: 2020-05-12
Application Number: US15929005
Application Date: 2018-03-29
Applicant: Intel Corporation
Inventor: Mark A. Schmisseur , Francesc Guim Bernat , Andrew J. Herdrich , Karthik Kumar
Abstract: Technology for a memory pool arbitration apparatus is described. The apparatus can include a memory pool controller (MPC) communicatively coupled between a shared memory pool of disaggregated memory devices and a plurality of compute resources. The MPC can receive a plurality of data requests from the plurality of compute resources. The MPC can assign each compute resource to one of a set of compute resource priorities. The MPC can send memory access commands to the shared memory pool to perform each data request prioritized according to the set of compute resource priorities. The apparatus can include a priority arbitration unit (PAU) communicatively coupled to the MPC. The PAU can arbitrate the plurality of data requests as a function of the corresponding compute resource priorities.
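A minimal sketch of the priority arbitration step, assuming a strict-priority policy and hypothetical structure names; the MPC's actual arbitration function may differ.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical request as seen by the priority arbitration unit (PAU). */
struct mem_request {
    unsigned requester;   /* compute resource id                                 */
    unsigned priority;    /* lower value = higher priority, assigned by the MPC  */
    uint64_t addr;
};

/* Pick the highest-priority pending request; ties go to the earliest entry. */
static const struct mem_request *arbitrate(const struct mem_request *pending, size_t n)
{
    const struct mem_request *winner = NULL;
    for (size_t i = 0; i < n; i++)
        if (!winner || pending[i].priority < winner->priority)
            winner = &pending[i];
    return winner;        /* the MPC turns the winner into a memory access command */
}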
-
Publication Number: US10445271B2
Publication Date: 2019-10-15
Application Number: US14987676
Application Date: 2016-01-04
Applicant: Intel Corporation
Inventor: Ren Wang , Namakkal N. Venkatesan , Debra Bernstein , Edwin Verplanke , Stephen R. Van Doren , An Yan , Andrew Cunningham , David Sonnier , Gage Eads , James T. Clee , Jamison D. Whitesell , Yipeng Wang , Jerry Pirog , Jonathan Kenny , Joseph R. Hasting , Narender Vangati , Stephen Miller , Te K. Ma , William Burroughs , Andrew J. Herdrich , Jr-Shian Tsai , Tsung-Yuan C. Tai , Niall D. McDonnell , Hugh Wilkinson , Bradley A. Burres , Bruce Richardson
IPC: G06F13/37 , G06F12/0811 , G06F13/16 , G06F12/0868 , G06F12/04 , G06F9/38
Abstract: Apparatus and methods are described that implement a hardware queue management device for reducing inter-core data transfer overhead by offloading request management and data coherency tasks from the CPU cores. The apparatus includes multi-core processors, a shared L3 or last-level cache ("LLC"), and a hardware queue management device to receive, store, and process inter-core data transfer requests. The hardware queue management device further comprises a resource management system to control the rate at which the cores may submit requests, to reduce core stalls and dropped requests. Additionally, software instructions are introduced to optimize communication between the cores and the queue management device.
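The rate control performed by the resource management system can be illustrated with a hypothetical credit scheme: a core may only submit while it holds credits, and the device returns credits as it drains that core's requests, so submissions are throttled rather than dropped. The names and the credit policy below are assumptions, not the patented mechanism.

#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 16

/* Hypothetical per-core credit pool bounding the request submission rate. */
static uint32_t credits[NUM_CORES];

static bool try_enqueue(unsigned core /*, request payload */)
{
    if (credits[core] == 0)
        return false;   /* caller backs off instead of stalling or dropping */
    credits[core]--;
    /* ... hand the request to the hardware queue management device ... */
    return true;
}

/* The device returns credits as it processes requests from that core. */
static void credit_return(unsigned core, uint32_t n)
{
    credits[core] += n;
}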
-
Publication Number: US20190004958A1
Publication Date: 2019-01-03
Application Number: US15640060
Application Date: 2017-06-30
Applicant: Intel Corporation
Inventor: Anil Vasudevan , Venkata Krishnan , Andrew J. Herdrich , Ren Wang , Robert G. Blankenship , Vedaraman Geetha , Shrikant M. Shah , Marshall A. Millier , Raanan Sade , Binh Q. Pham , Olivier Serres , Chyi-Chang Miao , Christopher B. Wilkerson
IPC: G06F12/0868 , G06F12/0811 , G06F3/06 , G06F12/0871
Abstract: A method and system for performing data movement operations are described herein. One embodiment of a method includes: storing data for a first memory address in a cache line of a memory of a first processing unit, the cache line associated with a coherency state indicating that the memory has sole ownership of the cache line; decoding an instruction for execution by a second processing unit, the instruction comprising a source data operand specifying the first memory address and a destination operand specifying a memory location in the second processing unit; and responsive to executing the decoded instruction, copying data from the cache line of the memory of the first processing unit as identified by the first memory address, to the memory location of the second processing unit, wherein responsive to the copy, the cache line is to remain in the memory and the coherency state is to remain unchanged.
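A software model of the instruction's semantics, assuming a MESI-style state field purely for illustration: data is copied out of the source cache line while the line stays resident in the first processing unit's cache and its coherency state is left untouched.

#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64

/* Hypothetical model of a cached line with its coherency state. */
struct cache_line {
    uint8_t bytes[LINE_BYTES];
    int     state;                        /* MESI-style state, never downgraded */
};

static void copy_line(const struct cache_line *src, uint8_t *dst)
{
    memcpy(dst, src->bytes, LINE_BYTES);  /* data moves to the destination        */
    /* src->state intentionally unmodified: no ownership transfer, no eviction    */
}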
-
Publication Number: US20190004862A1
Publication Date: 2019-01-03
Application Number: US15637003
Application Date: 2017-06-29
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Kshitij A. Doshi , Andrew J. Herdrich , Edwin Verplanke , Daniel Rivas Barragan
IPC: G06F9/50
Abstract: Technologies for managing quality of service of a platform interconnect include a compute device. The compute device includes one or more processors, one or more resources capable of being utilized by the one or more processors, and a platform interconnect to facilitate communication of messages between the one or more processors and the one or more resources. The compute device is to obtain class of service data for one or more workloads to be executed by the compute device. The class of service data is indicative of a capacity of one or more of the resources to be utilized in the execution of each corresponding workload. The compute device is also to execute the one or more workloads and manage the amount of traffic transmitted through the platform interconnect for each corresponding workload as a function of the class of service data as the one or more workloads are executed.
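One way to picture the class-of-service traffic management is a per-class bandwidth budget on the platform interconnect, refilled each interval, with requests beyond the budget deferred. The budget mechanism and names below are assumptions, not the patent's mechanism.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mapping from a workload's class of service to an interconnect
 * traffic budget; the budget is refilled at the start of each interval. */
struct cos_budget {
    uint64_t bytes_per_interval;   /* capacity granted to this class of service */
    uint64_t bytes_used;           /* consumed so far in the current interval   */
};

static bool admit(struct cos_budget *b, uint64_t bytes)
{
    if (b->bytes_used + bytes > b->bytes_per_interval)
        return false;              /* throttle: defer until the next interval */
    b->bytes_used += bytes;
    return true;
}

static void new_interval(struct cos_budget *b)
{
    b->bytes_used = 0;
}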
-
Publication Number: US20180352311A1
Publication Date: 2018-12-06
Application Number: US15913357
Application Date: 2018-03-06
Applicant: Intel Corporation
Inventor: Andrew J. Herdrich , Patrick L. Connor , Dinesh Kumar , Alexander W. Min , Daniel J. Dahle , Kapil Sood , Jeffrey B. Shaw , Edwin Verplanke , Scott P. Dubal , James Robert Hearn
CPC classification number: H04Q9/02 , H04L41/5009 , H04L41/5019 , H04L43/08 , H04L43/10
Abstract: Devices and techniques for out-of-band platform tuning and configuration are described herein. A device can include a telemetry interface to a telemetry collection system and a network interface to network adapter hardware. The device can receive platform telemetry metrics from the telemetry collection system, and network adapter silicon hardware statistics over the network interface, to gather collected statistics. The device can apply a heuristic algorithm using the collected statistics to determine processing core workloads generated by operation of a plurality of software systems communicatively coupled to the device. The device can provide a reconfiguration message to instruct at least one software system to switch operations to a different processing core, responsive to detecting an overload state on at least one processing core, based on the processing core workloads. Other embodiments are also described.
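A toy version of the heuristic step, with assumed thresholds: merge per-core utilization from the telemetry collection system with drop counters from the network adapter silicon, flag an overloaded core, and pick a lightly loaded core as the target of the reconfiguration message.

#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 32

/* Hypothetical collected statistics per core, merged from platform telemetry
 * and network adapter hardware counters. */
struct core_stats {
    uint32_t util_pct;    /* core utilization from platform telemetry */
    uint64_t rx_drops;    /* packet drops reported by the NIC silicon */
};

/* A simple stand-in for the heuristic: near saturation plus dropped traffic. */
static bool overloaded(const struct core_stats *s)
{
    return s->util_pct > 90 && s->rx_drops > 0;
}

/* Pick the least-utilized core as the reconfiguration target, or -1 if none. */
static int pick_spare(const struct core_stats stats[NUM_CORES])
{
    int best = -1;
    for (int c = 0; c < NUM_CORES; c++)
        if (best < 0 || stats[c].util_pct < stats[best].util_pct)
            best = c;
    return (best >= 0 && stats[best].util_pct < 50) ? best : -1;
}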
-
Publication Number: US20170192887A1
Publication Date: 2017-07-06
Application Number: US15401220
Application Date: 2017-01-09
Applicant: Intel Corporation
Inventor: Andrew J. Herdrich , Edwin Verplanke , Ravishankar Iyer , Christopher C. Gianos , Jeffrey D. Chamberlain , Ronak Singhal , Julius Mandelblat , Bret L. Toll
IPC: G06F12/0804 , G06F12/084 , G06F12/0897 , G06F12/0864 , G06F12/0875 , G06F12/0811 , G06F12/0842
CPC classification number: G06F12/0804 , G06F12/0811 , G06F12/084 , G06F12/0842 , G06F12/0848 , G06F12/0864 , G06F12/0875 , G06F12/0895 , G06F12/0897 , G06F12/123 , G06F12/128 , G06F2212/1004 , G06F2212/1016 , G06F2212/1024 , G06F2212/604
Abstract: Systems and methods for cache allocation with code and data prioritization. An example system may comprise: a cache; a processing core, operatively coupled to the cache; and a cache control logic, responsive to receiving a cache fill request comprising an identifier of a request type and an identifier of a class of service, to identify a subset of the cache corresponding to a capacity bit mask associated with the request type and the class of service.
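The capacity-bit-mask selection can be sketched as a small table indexed by class of service and request type (code fill vs. data fill); the table layout and names are illustrative assumptions.

#include <stdint.h>

#define NUM_COS 16

/* Hypothetical model of code and data prioritization: each class of service
 * holds separate capacity bit masks for code fills and data fills, and a fill
 * may only allocate into cache ways whose bits are set in the selected mask. */
enum req_type { REQ_CODE = 0, REQ_DATA = 1 };

static uint32_t way_mask[NUM_COS][2];   /* [class of service][request type] */

static uint32_t allowed_ways(unsigned cos, enum req_type type)
{
    return way_mask[cos][type];         /* subset of the cache for this fill */
}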
-
Publication Number: US09639372B2
Publication Date: 2017-05-02
Application Number: US13730565
Application Date: 2012-12-28
Applicant: Intel Corporation
Inventor: Paolo Narvaez , Ganapati N. Srinivasa , Eugene Gorbatov , Dheeraj R. Subbareddy , Mishali Naik , Alon Naveh , Abirami Prabhakaran , Eliezer Weissmann , David A. Koufaty , Paul Brett , Scott D. Hahn , Andrew J. Herdrich , Ravishankar Iyer , Nagabhushan Chitlur , Inder M. Sodhi , Gaurav Khanna , Russell J. Fenger
CPC classification number: G06F9/3891
Abstract: A heterogeneous processor architecture is described. For example, a processor according to one embodiment of the invention comprises: a set of large physical processor cores; a set of small physical processor cores having lower performance processing capabilities and lower power usage relative to the large physical processor cores; and virtual-to-physical (V-P) mapping logic to expose the set of large physical processor cores to software through a corresponding set of virtual cores and to hide the set of small physical processor cores from the software.
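A minimal sketch of the virtual-to-physical mapping idea, assuming a 4-large/4-small core layout for illustration: software addresses only the virtual core ids, and the mapping table can retarget a virtual core to a hidden small core without software involvement.

#include <stdbool.h>

#define NUM_VIRTUAL  4
#define NUM_PHYSICAL 8   /* assumed: physical ids 0-3 are large cores, 4-7 are small */

/* Hypothetical virtual-to-physical mapping table: only the virtual ids are
 * visible to software; the small physical cores stay hidden behind it. */
static int v2p[NUM_VIRTUAL] = { 0, 1, 2, 3 };   /* initially the large cores */

static bool remap(int vcore, int pcore)
{
    if (vcore < 0 || vcore >= NUM_VIRTUAL || pcore < 0 || pcore >= NUM_PHYSICAL)
        return false;
    v2p[vcore] = pcore;   /* e.g. retarget to a small core to reduce power */
    return true;
}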