-
Publication No.: US20240345990A1
Publication Date: 2024-10-17
Application No.: US18626775
Application Date: 2024-04-04
Applicant: Intel Corporation
Inventor: Lakshminarayanan Striramassarma , Prasoonkumar Surti , Varghese George , Ben Ashbaugh , Aravindh Anantaraman , Valentin Andrei , Abhishek Appu , Nicolas Galoppo Von Borries , Altug Koker , Mike Macpherson , Subramaniam Maiyuran , Nilay Mistry , Elmoustapha Ould-Ahmed-Vall , Selvakumar Panneer , Vasanth Ranganathan , Joydeep Ray , Ankur Shah , Saurabh Tangri
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06N3/08 , G06T1/20 , G06T1/60 , G06T15/06 , H03M7/46
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Multi-tile memory management for detecting cross-tile access, providing multi-tile inference scaling with multicasting of data via a copy operation, and providing page migration is disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory, and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross-tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when such frequent cross-tile accesses occur.
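A minimal sketch of the kind of detection logic this abstract describes: the memory controller counts cross-tile accesses over a sampling window and raises a message to start the data transfer mechanism once they become frequent. The window size, threshold, and all names below are illustrative assumptions, not taken from the patent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; the patent does not specify concrete thresholds. */
#define WINDOW_ACCESSES   4096   /* accesses observed per sampling window   */
#define CROSS_TILE_LIMIT  1024   /* "frequent" if more than this are remote */

typedef struct {
    uint32_t total_accesses;       /* all accesses seen this window              */
    uint32_t cross_tile_accesses;  /* accesses targeting the other GPU's memory  */
} tile_access_stats;

/* Called by the memory controller for each access it services. Returns true
 * when a message should be sent to initiate the data transfer mechanism
 * (for example, page migration or a multicast copy). */
bool record_access(tile_access_stats *s, bool targets_remote_tile)
{
    s->total_accesses++;
    if (targets_remote_tile)
        s->cross_tile_accesses++;

    if (s->total_accesses < WINDOW_ACCESSES)
        return false;

    bool frequent = s->cross_tile_accesses > CROSS_TILE_LIMIT;
    s->total_accesses = 0;         /* start a new sampling window */
    s->cross_tile_accesses = 0;
    return frequent;
}
```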
-
Publication No.: US20240330203A1
Publication Date: 2024-10-03
Application No.: US18739768
Application Date: 2024-06-11
Applicant: Texas Instruments Incorporated
Inventor: Mujibur Rahman , Timothy David Anderson
IPC: G06F12/1045 , G06F7/24 , G06F7/487 , G06F7/499 , G06F7/53 , G06F7/57 , G06F9/30 , G06F9/32 , G06F9/345 , G06F9/38 , G06F9/48 , G06F11/00 , G06F11/10 , G06F12/0862 , G06F12/0875 , G06F12/0897 , G06F12/1009 , G06F15/78 , G06F17/16 , H03H17/06
CPC classification number: G06F12/1045 , G06F7/24 , G06F7/487 , G06F7/4876 , G06F7/49915 , G06F7/53 , G06F7/57 , G06F9/3001 , G06F9/30014 , G06F9/30021 , G06F9/30032 , G06F9/30036 , G06F9/30065 , G06F9/30072 , G06F9/30098 , G06F9/30112 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/32 , G06F9/345 , G06F9/3802 , G06F9/3818 , G06F9/383 , G06F9/3836 , G06F9/3851 , G06F9/3856 , G06F9/3867 , G06F9/3887 , G06F9/48 , G06F11/00 , G06F11/1048 , G06F12/0862 , G06F12/0875 , G06F12/0897 , G06F12/1009 , G06F17/16 , H03H17/0664 , G06F9/30018 , G06F9/325 , G06F9/381 , G06F9/3822 , G06F11/10 , G06F15/7807 , G06F15/781 , G06F2212/452 , G06F2212/60 , G06F2212/602 , G06F2212/68
Abstract: Devices and methods are provided for performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers. In an example, a device includes a processor that includes a multiply circuit. The multiply circuit is configured to multiply floating point numbers in response to a floating point multiply instruction, and is further configured to determine values of implied bits of mantissas of the floating point numbers, and multiply the mantissas in parallel with the determining operation.
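A rough illustration of the mantissa handling the abstract describes, assuming IEEE-754 binary32 (the patent is not limited to that format). The implied bit is 1 for normal numbers and 0 for zeros and subnormals, and the fraction multiply can start before the implied bits are known; the hardware performs the two steps concurrently, while this sketch simply folds the implied bits into the product afterwards.

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Assumes IEEE-754 binary32; the abstract does not fix a particular format. */
typedef struct {
    uint32_t sign;      /* 1 sign bit                                       */
    uint32_t exponent;  /* 8 biased exponent bits                           */
    uint32_t fraction;  /* 23 stored mantissa bits (implied bit not stored) */
} fp32_fields;

static fp32_fields decode(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (fp32_fields){ bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}

/* The implied (hidden) mantissa bit is 1 for normal numbers and 0 for zeros
 * and subnormals, i.e. exactly when the exponent field is nonzero. */
static uint64_t implied_bit(fp32_fields x) { return x.exponent != 0; }

int main(void)
{
    fp32_fields a = decode(1.5f), b = decode(0.75f);

    /* The 23x23-bit fraction multiply can start immediately ...            */
    uint64_t partial = (uint64_t)a.fraction * b.fraction;

    /* ... while the implied bits are determined "in parallel" (sequential
     * here, concurrent in the hardware the abstract describes).            */
    uint64_t ia = implied_bit(a), ib = implied_bit(b);

    /* Fold the implied bits in to get the full 24x24-bit mantissa product:
     * (ia*2^23 + fa) * (ib*2^23 + fb).                                     */
    uint64_t product = partial
                     + ((ia * b.fraction) << 23)
                     + ((ib * a.fraction) << 23)
                     + ((ia & ib) << 46);

    printf("mantissa product = 0x%llx\n", (unsigned long long)product);
    return 0;
}
```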
-
Publication No.: US20240330196A1
Publication Date: 2024-10-03
Application No.: US18388602
Application Date: 2023-11-10
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Skyler J. SALEH , Samuel NAFFZIGER , Milind S. BHAGAVAT , Rahul AGARWAL
IPC: G06F12/0897 , G06F13/16 , G06F13/40
CPC classification number: G06F12/0897 , G06F13/1668 , G06F13/4027 , G06F2212/1024
Abstract: A chiplet system includes a central processing unit (CPU) communicably coupled to a first GPU chiplet of a GPU chiplet array. The GPU chiplet array includes the first GPU chiplet communicably coupled to the CPU via a bus and a second GPU chiplet communicably coupled to the first GPU chiplet via a passive crosslink. The passive crosslink is a passive interposer die dedicated to inter-chiplet communications and partitions system-on-a-chip (SoC) functionality into smaller functional chiplet groupings.
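A small data-structure sketch of the topology the abstract describes: the CPU reaches one chiplet over a bus, and that chiplet reaches its peers over a passive crosslink used only for inter-chiplet traffic. The types, field names, and routing helper are illustrative assumptions.

```c
#include <stddef.h>

typedef enum { LINK_BUS, LINK_PASSIVE_CROSSLINK } link_kind;

typedef struct gpu_chiplet {
    int                 id;
    struct gpu_chiplet *peers[4];   /* reached via the passive crosslink */
    size_t              peer_count;
} gpu_chiplet;

typedef struct {
    gpu_chiplet *primary;           /* the only chiplet attached to the CPU bus */
} chiplet_system;

/* A CPU-originated request always enters at the primary chiplet; if it is
 * addressed to another chiplet it crosses the passive interposer, which is
 * dedicated to inter-chiplet communication. */
link_kind last_hop_for(const chiplet_system *sys, const gpu_chiplet *target)
{
    return (target == sys->primary) ? LINK_BUS : LINK_PASSIVE_CROSSLINK;
}
```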
-
Publication No.: US12105635B2
Publication Date: 2024-10-01
Application No.: US17384858
Application Date: 2021-07-26
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Timothy David Anderson , Mujibur Rahman , Dheera Balasubramanian Samudrala , Peter Richard Dent , Duc Quang Bui
IPC: G06F9/30 , G06F7/24 , G06F7/487 , G06F7/499 , G06F7/53 , G06F7/57 , G06F9/32 , G06F9/345 , G06F9/38 , G06F9/48 , G06F11/00 , G06F11/10 , G06F12/0862 , G06F12/0875 , G06F12/0897 , G06F12/1009 , G06F12/1045 , G06F17/16 , H03H17/06 , G06F15/78
CPC classification number: G06F12/1045 , G06F7/24 , G06F7/487 , G06F7/4876 , G06F7/49915 , G06F7/53 , G06F7/57 , G06F9/3001 , G06F9/30014 , G06F9/30021 , G06F9/30032 , G06F9/30036 , G06F9/30065 , G06F9/30072 , G06F9/30098 , G06F9/30112 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/32 , G06F9/345 , G06F9/3802 , G06F9/3818 , G06F9/383 , G06F9/3836 , G06F9/3851 , G06F9/3856 , G06F9/3867 , G06F9/3887 , G06F9/48 , G06F11/00 , G06F11/1048 , G06F12/0862 , G06F12/0875 , G06F12/0897 , G06F12/1009 , G06F17/16 , H03H17/0664 , G06F9/30018 , G06F9/325 , G06F9/381 , G06F9/3822 , G06F11/10 , G06F15/7807 , G06F15/781 , G06F2212/452 , G06F2212/60 , G06F2212/602 , G06F2212/68
Abstract: A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.
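A scalar C sketch of the permute semantics spelled out in the abstract: each lane of the control input names the source lane whose value is written to the corresponding lane of the permuted vector. The 8-lane width and the control encoding are assumptions made for illustration.

```c
#include <stdint.h>

#define LANES 8   /* illustrative vector width */

/* control[i] selects the source lane whose value is stored in lane i of the
 * permuted vector, mirroring the per-lane permute control input described
 * in the abstract. */
void vector_permute(const uint32_t src[LANES],
                    const uint8_t  control[LANES],
                    uint32_t       dst[LANES])
{
    for (int i = 0; i < LANES; i++)
        dst[i] = src[control[i] % LANES];
}
```

For instance, a control input of {7, 6, 5, 4, 3, 2, 1, 0} reverses the lanes; real encodings may additionally allow constants or zeroing, which this sketch omits.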
-
Publication No.: US20240319909A1
Publication Date: 2024-09-26
Application No.: US18679895
Application Date: 2024-05-31
Applicant: Lodestar Licensing Group LLC
Inventor: Frank F. Ross
IPC: G06F3/06 , G06F11/20 , G06F12/0868 , G06F12/0897 , G06F13/16 , G11C7/10
CPC classification number: G06F3/0655 , G06F3/0635 , G06F3/0679 , G06F3/0688 , G06F11/201 , G06F12/0868 , G06F12/0897 , G06F13/1668 , G11C7/1075
Abstract: The present disclosure includes apparatuses and methods related to data transfer in memory. An example apparatus can include a first number of memory devices coupled to a host via a first number of ports and a second number of memory devices coupled to the first number of memory devices via a second number of ports, wherein a first number of commands are executed to transfer data between the first number of memory devices and the host via the first number of ports and a second number of commands are executed to transfer data between the first number of memory devices and the second number of memory devices via the second number of ports.
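A sketch of the two-hop command routing the abstract describes, with host-facing first ports and device-to-device second ports. The enum names and the routing rule are illustrative assumptions.

```c
typedef enum { PORT_FIRST, PORT_SECOND } port_group;
typedef enum { DEV_HOST, DEV_FIRST_TIER, DEV_SECOND_TIER } endpoint;

/* Commands between the host and the first number of memory devices use the
 * first number of ports; commands between the first and second number of
 * memory devices use the second number of ports. */
port_group ports_for_transfer(endpoint from, endpoint to)
{
    if (from == DEV_HOST || to == DEV_HOST)
        return PORT_FIRST;
    return PORT_SECOND;
}
```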
-
Publication No.: US12086067B2
Publication Date: 2024-09-10
Application No.: US18141463
Application Date: 2023-04-30
Applicant: SiFive, Inc.
Inventor: Andrew Waterman , Krste Asanovic
IPC: G06F12/0855 , G06F12/0815 , G06F12/0875 , G06F12/0897
CPC classification number: G06F12/0855 , G06F12/0815 , G06F12/0875 , G06F12/0897
Abstract: Systems and methods are disclosed for load-store pipeline selection for vectors. For example, an integrated circuit (e.g., a processor) for executing instructions includes an L1 cache that provides an interface to a memory system; an L2 cache connected to the L1 cache that implements a cache coherency protocol with the L1 cache; a first store unit configured to write data to the memory system via the L1 cache; a second store unit configured to bypass the L1 cache and write data to the memory system via the L2 cache; and store pipeline selection circuitry configured to: identify an address associated with a first beat of a store instruction with a vector argument; select between the first store unit and the second store unit based on the address associated with the first beat of the store instruction; and dispatch the store instruction to the selected store unit.
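A sketch of the dispatch decision described above: examine the address of the first beat of a vector store and pick either the store unit that writes via the L1 or the one that bypasses it. The abstract does not say what the address-based rule is, so the streaming-region check below is purely an assumed example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { STORE_UNIT_L1, STORE_UNIT_L2_BYPASS } store_unit;

/* Assumed criterion: addresses in a configured "streaming" region do not
 * benefit from L1 allocation and go to the unit that writes via the L2.
 * The abstract only states that the choice depends on the address of the
 * first beat, not what the rule is. */
static bool in_streaming_region(uint64_t addr)
{
    const uint64_t base = 0x80000000ULL, size = 0x40000000ULL;  /* assumed */
    return addr >= base && addr < base + size;
}

store_unit select_store_unit(uint64_t first_beat_addr)
{
    return in_streaming_region(first_beat_addr) ? STORE_UNIT_L2_BYPASS
                                                : STORE_UNIT_L1;
}
```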
-
Publication No.: US12079470B2
Publication Date: 2024-09-03
Application No.: US17379345
Application Date: 2021-07-19
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Matthew Pierson
IPC: G06F3/06 , G06F9/30 , G06F9/32 , G06F9/38 , G06F12/0875 , G06F12/0897 , G06F13/14
CPC classification number: G06F3/0604 , G06F3/0656 , G06F3/0659 , G06F3/0683 , G06F9/3004 , G06F9/30047 , G06F9/30076 , G06F9/3016 , G06F9/32 , G06F9/3802 , G06F9/383 , G06F12/0875 , G06F12/0897 , G06F13/14 , G06F2212/1016 , G06F2212/452 , G06F2212/60
Abstract: Disclosed embodiments relate to one or more techniques to control access by a requestor of a computing system to a shared memory resource. In one embodiment, a technique includes determining a number (N) of pending requests to be sent to the memory by the requestor, determining a number (M) of requests that the requestor is limited to sending based on an amount of buffering resources available, and comparing M to N. When N is both greater than zero and less than or equal to M, the requestor sends the N pending requests to the memory. When N is both greater than zero and greater than M, M is compared to a hysteresis value (R) and, when M is less than R, the requestor sends R of the N pending requests to the memory.
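The comparisons in the abstract translate almost directly into code. The sketch below keeps the abstract's names (N pending requests, M buffer-limited maximum, R hysteresis value); the behavior for the one case the abstract leaves unstated (N greater than M with M at least R) is an assumption and is marked as such.

```c
#include <stdio.h>

/* N pending requests, M = buffering-limited maximum, R = hysteresis value,
 * as named in the abstract. Returns how many requests are sent this cycle. */
unsigned requests_to_send(unsigned n, unsigned m, unsigned r)
{
    if (n == 0)
        return 0;
    if (n <= m)          /* 0 < N <= M: send all N pending requests         */
        return n;
    if (m < r)           /* N > M and M < R: send R of the pending requests */
        return r;
    return m;            /* N > M and M >= R: not stated in the abstract;
                            assumed to send up to the buffering limit M.    */
}

int main(void)
{
    printf("%u\n", requests_to_send(10, 16, 4)); /* 10: fits under the limit */
    printf("%u\n", requests_to_send(10,  2, 4)); /*  4: hysteresis case      */
    return 0;
}
```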
-
Publication No.: US12079155B2
Publication Date: 2024-09-03
Application No.: US17428216
Application Date: 2020-03-14
Applicant: Intel Corporation
Inventor: Joydeep Ray , Selvakumar Panneer , Saurabh Tangri , Ben Ashbaugh , Scott Janus , Abhishek Appu , Varghese George , Ravishankar Iyer , Nilesh Jain , Pattabhiraman K , Altug Koker , Mike MacPherson , Josh Mastronarde , Elmoustapha Ould-Ahmed-Vall , Jayakrishna P. S , Eric Samson
IPC: G06F15/78 , G06F7/544 , G06F7/575 , G06F7/58 , G06F9/30 , G06F9/38 , G06F9/50 , G06F12/02 , G06F12/06 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/80 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06N3/08 , G06T15/06
CPC classification number: G06F15/7839 , G06F7/5443 , G06F7/575 , G06F7/588 , G06F9/3001 , G06F9/30014 , G06F9/30036 , G06F9/3004 , G06F9/30043 , G06F9/30047 , G06F9/30065 , G06F9/30079 , G06F9/3887 , G06F9/5011 , G06F9/5077 , G06F12/0215 , G06F12/0238 , G06F12/0246 , G06F12/0607 , G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0862 , G06F12/0866 , G06F12/0871 , G06F12/0875 , G06F12/0882 , G06F12/0888 , G06F12/0891 , G06F12/0893 , G06F12/0895 , G06F12/0897 , G06F12/1009 , G06F12/128 , G06F15/8046 , G06F17/16 , G06F17/18 , G06T1/20 , G06T1/60 , H03M7/46 , G06F9/3802 , G06F9/3818 , G06F9/3867 , G06F2212/1008 , G06F2212/1021 , G06F2212/1044 , G06F2212/302 , G06F2212/401 , G06F2212/455 , G06F2212/60 , G06N3/08 , G06T15/06
Abstract: Embodiments described herein include software, firmware, and hardware that provide techniques to enable deterministic scheduling across multiple general-purpose graphics processing units. One embodiment provides a multi-GPU architecture with uniform latency. One embodiment provides techniques to distribute memory output based on memory chip thermals. One embodiment provides techniques to enable thermally aware workload scheduling. One embodiment provides techniques to enable end-to-end contracts for workload scheduling on multiple GPUs.
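One of the listed embodiments, thermally aware workload scheduling, can be pictured with a deliberately simple policy: dispatch the next workload to the GPU reporting the lowest temperature. The structure, field names, and the policy itself are illustrative assumptions rather than the patent's mechanism.

```c
#include <stddef.h>

typedef struct {
    int   id;
    float temperature_c;   /* reported chip/memory thermal sensor value */
} gpu_info;

/* Pick the coolest GPU for the next workload; a simplistic stand-in for the
 * thermally aware scheduling the abstract mentions. */
int pick_gpu_for_workload(const gpu_info *gpus, size_t count)
{
    size_t best = 0;
    for (size_t i = 1; i < count; i++)
        if (gpus[i].temperature_c < gpus[best].temperature_c)
            best = i;
    return gpus[best].id;
}
```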
-
Publication No.: US12061908B2
Publication Date: 2024-08-13
Application No.: US17472852
Application Date: 2021-09-13
Applicant: TEXAS INSTRUMENTS INCORPORATED
Inventor: Joseph Zbiciak , Timothy Anderson
IPC: G06F9/32 , G06F9/30 , G06F9/345 , G06F9/38 , G06F11/00 , G06F12/02 , G06F12/0875 , G06F12/0897 , G06F13/16 , G06F13/40 , G06F11/10
CPC classification number: G06F9/321 , G06F9/30014 , G06F9/30036 , G06F9/30043 , G06F9/30047 , G06F9/30098 , G06F9/30112 , G06F9/30145 , G06F9/3016 , G06F9/32 , G06F9/345 , G06F9/3802 , G06F9/383 , G06F9/3867 , G06F11/00 , G06F12/0207 , G06F12/0875 , G06F12/0897 , G06F13/1605 , G06F13/4068 , G06F9/3836 , G06F11/10 , G06F2212/452 , G06F2212/60
Abstract: A streaming engine employed in a digital data processor specifies fixed first and second read-only data streams. A corresponding stream address generator produces addresses of the data elements of the two streams. Corresponding stream head registers store the data elements next to be supplied to functional units for use as operands. The two streams share two memory ports. A toggling preference of stream to port ensures fair allocation. The arbiters permit one stream to borrow the other's interface when the other interface is idle. Thus one stream may issue two memory requests, one from each memory port, if the other stream is idle. This spreads the bandwidth demand for each stream across both interfaces, ensuring neither interface becomes a bottleneck.
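A sketch of the port arbitration the abstract describes: each stream has a preferred memory port, the preference toggles for fairness, and an idle stream's interface can be borrowed so the active stream issues two requests in one cycle. Names and structure are illustrative assumptions.

```c
#include <stdbool.h>

/* Two streams (0 and 1) sharing two memory ports (0 and 1). */
typedef struct {
    int preferred_port[2];   /* preferred_port[s] is stream s's current port */
} stream_arbiter;

/* Grant ports to stream `s` for this cycle. Returns how many requests the
 * stream may issue (1 or 2) and writes the granted port numbers into
 * ports_out. The preference toggles on every grant so neither stream keeps
 * the same port indefinitely, and an idle peer's interface is borrowed. */
int grant_ports(stream_arbiter *arb, int s, bool peer_idle, int ports_out[2])
{
    int own  = arb->preferred_port[s];
    int peer = 1 - own;

    ports_out[0] = own;

    /* Toggle both preferences so they stay complementary. */
    arb->preferred_port[s]     = peer;
    arb->preferred_port[1 - s] = own;

    if (peer_idle) {             /* borrow the idle stream's interface */
        ports_out[1] = peer;
        return 2;
    }
    return 1;
}
```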
-
Publication No.: US12038843B1
Publication Date: 2024-07-16
Application No.: US18537927
Application Date: 2023-12-13
Applicant: Next Silicon Ltd
Inventor: Yiftach Gilad , Liron Zur
IPC: G06F12/08 , G06F12/0862 , G06F12/0897
CPC classification number: G06F12/0862 , G06F12/0897 , G06F2212/602 , G06F2212/6024
Abstract: A joint scheduler adapted for dispatching prefetch and demand accesses of data relating to a plurality of instructions loaded in an execution pipeline of processing circuit(s). Each prefetch access comprises checking whether a respective data is cached in a cache entry and each demand access comprises accessing a respective data. The joint scheduler is adapted to, responsive to each hit prefetch access dispatched for a respective data relating to a respective instruction, associate the respective instruction with a valid indication and a pointer to a respective cache entry storing the respective data such that the demand access relating to the respective instruction uses the associated pointer to access the respective data in the cache, and responsive to each missed prefetch access dispatched for a respective data relating to a respective instruction, initiate a read cycle for loading the respective data from next level memory and cache it in the cache.
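A sketch of the bookkeeping the abstract describes: a prefetch that hits attaches a valid indication and a cache-entry pointer to the instruction so the later demand access can use that pointer directly, while a prefetch that misses initiates a read cycle that fills the cache from the next-level memory. The direct-mapped cache, the helper functions, and all names are assumptions made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define CACHE_ENTRIES 256

typedef struct { uint64_t tag; uint8_t data[64]; bool present; } cache_entry;

static cache_entry cache[CACHE_ENTRIES];

/* Minimal direct-mapped stand-ins for the cache lookup and the read cycle
 * from next-level memory; both are assumptions made for this sketch. */
static cache_entry *cache_lookup(uint64_t addr)
{
    cache_entry *e = &cache[(addr / 64) % CACHE_ENTRIES];
    return (e->present && e->tag == addr / 64) ? e : NULL;
}

static cache_entry *fill_from_next_level(uint64_t addr)
{
    cache_entry *e = &cache[(addr / 64) % CACHE_ENTRIES];
    e->tag = addr / 64;
    e->present = true;           /* data[] would be filled by the read cycle */
    return e;
}

typedef struct {
    uint64_t     addr;           /* address the instruction will demand      */
    bool         prefetched;     /* valid indication set by a hit prefetch   */
    cache_entry *entry;          /* pointer to the cache entry with the data */
} pipeline_instr;

/* Prefetch access: check the cache and record the outcome on the instruction. */
void dispatch_prefetch(pipeline_instr *in)
{
    cache_entry *e = cache_lookup(in->addr);
    if (e) {                     /* hit: attach valid indication and pointer */
        in->prefetched = true;
        in->entry = e;
    } else {                     /* miss: initiate the next-level read cycle
                                    so the data is cached before the demand  */
        fill_from_next_level(in->addr);
        in->prefetched = false;
        in->entry = NULL;
    }
}

/* Demand access: reuse the pointer recorded by the prefetch when valid. */
const uint8_t *dispatch_demand(pipeline_instr *in)
{
    if (in->prefetched && in->entry)            /* use the recorded pointer */
        return in->entry->data;
    cache_entry *e = cache_lookup(in->addr);    /* otherwise look up again  */
    return e ? e->data : fill_from_next_level(in->addr)->data;
}
```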