-
Publication No.: US10970078B2
Publication date: 2021-04-06
Application No.: US15946719
Application date: 2018-04-05
Applicant: Apple Inc.
Inventor: Eric Bainville , Tal Uliel , Jeffry E. Gonion , Ali Sazegari , Erik K. Norden
Abstract: In an embodiment, a computation engine may perform computations on input vectors having vector elements of a first precision and data type. The computation engine may convert the vector elements from the first precision to a second precision and may also interleave the vector elements as specified by an instruction issued by the processor to the computation engine. The interleave may be based on a ratio of a result precision and the second precision. An extract instruction may be supported to extract results from the computations and convert and deinterleave the vector elements to provide a compact result in a desired order.
-
Publication No.: US20210064539A1
Publication date: 2021-03-04
Application No.: US16874997
Application date: 2020-05-15
Applicant: Apple Inc.
Inventor: Jeffry E. Gonion , Bernard Joseph Semeria , Michael J. Swift , Pradeep Kanapathipillai , David J. Williamson
IPC: G06F12/1009 , G06F12/1072 , G06F12/0873 , G06F12/14
Abstract: A system and method for efficiently transferring address mappings and data access permissions corresponding to the address mappings. A computing system includes at least one processor and memory for storing a page table. In response to receiving a memory access operation comprising a first address, the address translation unit is configured to identify a data access permission based on a permission index corresponding to the first address, and access data stored in a memory location of the memory identified by a second address in a manner defined by the retrieved data access permission. The address translation unit is configured to access a table to identify the data access permission, and is configured to determine the permission index and the second address based on the first address. A single permission index may correspond to different permissions for different entities within the system.
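A minimal Python sketch of the lookup described in this abstract, assuming a 4 KiB page size and one permission table per entity. The names `PAGE_TABLE`, `PERMISSION_TABLES`, `translate`, and `access`, and the entities "cpu" and "gpu", are hypothetical and are not taken from the patent. It only illustrates how a single permission index could resolve to different permissions for different entities.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size, not stated in the abstract


@dataclass(frozen=True)
class PageTableEntry:
    physical_page: int      # used to form the second (physical) address
    permission_index: int   # index into a permission table


# Hypothetical per-entity permission tables: the same index maps to
# different permissions depending on which entity performs the access.
PERMISSION_TABLES = {
    "cpu": {0: "r", 1: "rw"},
    "gpu": {0: "", 1: "r"},
}

# Hypothetical page table keyed by virtual page number.
PAGE_TABLE = {0x10: PageTableEntry(physical_page=0x80, permission_index=1)}


def translate(entity, virtual_address):
    """Return (physical_address, permission) for the first address."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    entry = PAGE_TABLE[vpn]
    permission = PERMISSION_TABLES[entity][entry.permission_index]
    return entry.physical_page * PAGE_SIZE + offset, permission


def access(entity, virtual_address, kind):
    """Allow the access only if the retrieved permission covers `kind`."""
    physical_address, permission = translate(entity, virtual_address)
    if kind not in permission:
        raise PermissionError(f"{entity} may not '{kind}' {virtual_address:#x}")
    return physical_address
```

With these tables, `access("cpu", 0x10004, "w")` succeeds while `access("gpu", 0x10004, "w")` raises `PermissionError`, even though both go through the same page table entry and the same permission index.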
-
Publication No.: US10831488B1
Publication date: 2020-11-10
Application No.: US16105783
Application date: 2018-08-20
Applicant: Apple Inc.
Inventor: Eric Bainville , Jeffry E. Gonion , Ali Sazegari , Gerard R. Williams, III , Andrew J. Beaumont-Smith
Abstract: In an embodiment, a computation engine may offload work from a processor (e.g. a CPU) and efficiently perform computations such as those used in LSTM and other workloads at high performance. In an embodiment, the computation engine may perform computations on input vectors from input memories in the computation engine, and may accumulate results in an output memory within the computation engine. The input memories may be loaded with initial vector data from memory, incurring the memory latency that may be associated with reading the operands. Compute instructions may be performed on the operands, generating results in an output memory. One or more extract instructions may be supported to move data from the output memory to the input memory, permitting additional computation on the data in the output memory without moving the results to main memory.
-
Publication No.: US20200348934A1
Publication date: 2020-11-05
Application No.: US16928752
Application date: 2020-07-14
Applicant: Apple Inc.
Inventor: Eric Bainville , Jeffry E. Gonion , Ali Sazegari, PhD , Gerard R. Williams, III
Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset with the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.
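The difference between the two modes can be sketched in plain Python. This is only an illustration under assumptions: the function name `multiply_accumulate`, the mode strings, and the row/column orientation of the outer product are hypothetical, and the hardware described in the abstract would perform these multiplications in parallel rather than in loops.

```python
def multiply_accumulate(x, y, acc, mode):
    """Return the accumulator after one multiply-accumulate step.

    mode == "vector": acc is a list,   acc[i]    += x[i] * y[i]
    mode == "matrix": acc is a matrix, acc[i][j] += x[i] * y[j]  (outer product)
    """
    if mode == "vector":
        if not (len(x) == len(y) == len(acc)):
            raise ValueError("vector mode needs equal-length operands")
        return [a + xi * yi for a, xi, yi in zip(acc, x, y)]
    if mode == "matrix":
        if len(acc) != len(x) or any(len(row) != len(y) for row in acc):
            raise ValueError("matrix mode needs a len(x) by len(y) accumulator")
        return [
            [acc[i][j] + x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))
        ]
    raise ValueError(f"unknown mode: {mode}")
```

Calling the function repeatedly with the returned accumulator models accumulation in the result memory: two matrix-mode steps with the same operands double every entry of the outer product.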
-
Publication No.: US20190310855A1
Publication date: 2019-10-10
Application No.: US15946724
Application date: 2018-04-05
Applicant: Apple Inc.
Inventor: Tal Uliel , Eric Bainville , Jeffry E. Gonion , Ali Sazegari
IPC: G06F9/38
Abstract: In an embodiment, a computation engine may perform dot product computations on input vectors. The dot product operation may have a first operand and a second operand, and the dot product may be performed on a subset of the vector elements in the first operand and each of the vector elements in the second operand. The subset of vector elements may be separated in the first operand by a stride that skips one or more elements between each element to which the dot product operation is applied. More particularly, in an embodiment, the input operands of the dot product operation may be a first vector having second vectors as elements, and the stride may select a specified element of each second vector.
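A strided dot product of this kind might look as follows in Python. This is a sketch, assuming the first operand is stored as a flat list of equally sized inner vectors; `strided_dot`, `step`, and `start` are hypothetical names. Here `step` is the distance between selected elements (the number of skipped elements plus one) and `start` picks which element of each inner vector is used.

```python
def strided_dot(first, second, step, start=0):
    """Dot product of every `step`-th element of `first` with all of `second`."""
    if step < 1:
        raise ValueError("step must be at least 1")
    selected = first[start::step]
    if len(selected) != len(second):
        raise ValueError("selected subset and second operand differ in length")
    return sum(a * b for a, b in zip(selected, second))


# Three inner vectors of two elements each, stored back to back.
first_operand = [1, 2, 3, 4, 5, 6]
second_operand = [10, 20, 30]

# Element 0 of each inner vector: 1*10 + 3*20 + 5*30
result_element_0 = strided_dot(first_operand, second_operand, step=2, start=0)
# Element 1 of each inner vector: 2*10 + 4*20 + 6*30
result_element_1 = strided_dot(first_operand, second_operand, step=2, start=1)
```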
-
Publication No.: US10331558B2
Publication date: 2019-06-25
Application No.: US15663115
Application date: 2017-07-28
Applicant: Apple Inc.
Inventor: Ali Sazegari , Charles E. Tucker , Jeffry E. Gonion , Gerard R. Williams, III , Chris Cheng-Chieh Lee
Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing. A compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.
-
Publication No.: US09715386B2
Publication date: 2017-07-25
Application No.: US14688043
Application date: 2015-04-16
Applicant: Apple Inc.
Inventor: Jeffry E. Gonion
IPC: G06F9/30
CPC classification number: G06F9/30036 , G06F9/3004 , G06F9/30072 , G06F9/30076 , G06F9/3838
Abstract: In an embodiment, a processor may implement a conditional stop instruction that includes a first predicate vector identifying the active elements of the instruction, a second predicate vector indicating true and false results for a conditional expression within a loop that is being vectorized, and a source operand specifying which combinations in the true and false results may indicate a dependency. The conditional stop instruction may generate a vector result indicating vector elements that have a dependency on a prior vector element, as well as an identification of which element position the dependency is on. More particularly, dependencies may be detected only on active elements as indicated by the first predicate vector. False dependencies that may occur due to inactive elements may be avoided, which may improve performance and/or provide for correct functional operation.
-
Publication No.: US20170024559A1
Publication date: 2017-01-26
Application No.: US14807609
Application date: 2015-07-23
Applicant: Apple Inc.
Inventor: Gregory D. Hughes , Conrado Blasco , Gerard R. Williams, III , Jacques Anthony Vidrine , Jeffry E. Gonion , Timothy R. Paaske , Tristan F. Schaap
IPC: G06F21/54
CPC classification number: G06F21/54
Abstract: Systems, apparatuses, methods, and computer-readable mediums for preventing return oriented programming (ROP) attacks. A compiler may insert landing pads adjacent to valid return targets in an instruction sequence. When a return instruction is executed, the processor may treat the return as suspicious if the target of the return instruction does not have an adjacent landing pad. Additionally, each landing pad may be encoded with a color, and a colored launch pad may be inserted into the instruction stream next to each return instruction. When a return instruction is executed, the processor may determine if the target of the return has a landing pad with the same color as the launch pad of the return instruction. Return-target pairs with color mismatches may be treated as suspicious and the offending process may be killed.
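The check described in the abstract above (US20170024559A1) can be sketched as a small Python model. The instruction encoding is hypothetical: it assumes the landing pad sits at the return target address and the launch pad immediately precedes the return instruction, which the abstract only describes as "adjacent" and "next to". The names `Instr` and `check_return` are illustrative only.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    op: str                      # e.g. "call", "ret", "landing_pad", "launch_pad"
    color: Optional[int] = None  # only meaningful for pads


def check_return(program, ret_index, target_index):
    """Classify a return from program[ret_index] to program[target_index]."""
    in_range = 0 <= target_index < len(program)
    landing = program[target_index] if in_range else None
    if landing is None or landing.op != "landing_pad":
        return "suspicious: no landing pad"
    launch = program[ret_index - 1] if ret_index > 0 else None
    if launch is None or launch.op != "launch_pad" or launch.color != landing.color:
        return "suspicious: color mismatch"
    return "ok"


program = [
    Instr("call"),
    Instr("landing_pad", color=1),  # valid return target for the call above
    Instr("add"),
    Instr("launch_pad", color=1),   # launch pad paired with the return below
    Instr("ret"),
    Instr("landing_pad", color=2),  # valid target, but for a different color
]
```

In this model a return from index 4 to index 1 passes, a return to index 2 is flagged because there is no landing pad, and a return to index 5 is flagged because the colors differ. How a flagged return is handled (for example, killing the process) is left out.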
-
Publication No.: US09400651B2
Publication date: 2016-07-26
Application No.: US14034670
Application date: 2013-09-24
Applicant: Apple Inc.
Inventor: Jeffry E. Gonion
CPC classification number: G06F9/30036 , G06F9/30018 , G06F9/3836
Abstract: In an embodiment, a processor includes an issue circuit configured to issue instruction operations for execution. The issue circuit may be configured to monitor the source operands of the instruction operations, and to issue instruction operations for which the source operands (including predicate operands, as appropriate) are resolved. Additionally, the issue circuit may be configured to detect a null predicate that indicates that none of the vector elements will be modified by a corresponding instruction operation. The issue circuit may be configured to issue the corresponding instruction operation with the null predicate even if other source operands are not yet resolved.
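The issue rule in the abstract above (US09400651B2) might be modeled like this in Python. It is a sketch under assumptions: a predicate is represented as a list of booleans (or `None` while unresolved), and `InstructionOp`, `is_null_predicate`, and `ready_to_issue` are hypothetical names, not the patent's terminology for its circuits.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InstructionOp:
    # Resolved predicate as one bool per vector element, or None if unresolved.
    predicate: Optional[List[bool]]
    # One flag per non-predicate source operand: True once it is resolved.
    sources_resolved: List[bool] = field(default_factory=list)


def is_null_predicate(predicate):
    """A null predicate is resolved and enables no vector element."""
    return predicate is not None and not any(predicate)


def ready_to_issue(op):
    # With a null predicate no element will be modified, so the operation
    # can issue without waiting for its other source operands.
    if is_null_predicate(op.predicate):
        return True
    # Otherwise every source, including the predicate, must be resolved.
    return op.predicate is not None and all(op.sources_resolved)
```

For example, `ready_to_issue(InstructionOp([False, False], [False]))` is true even though a source operand is still pending, whereas the same operation with predicate `[True, False]` has to wait.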
-
Publication No.: US09367309B2
Publication date: 2016-06-14
Application No.: US14034640
Application date: 2013-09-24
Applicant: Apple Inc.
Inventor: Jeffry E. Gonion
IPC: G06F9/30
CPC classification number: G06F9/30036 , G06F9/30018 , G06F9/30105 , G06F9/3013
Abstract: In an embodiment, a processor includes a register attribute tracker configured to track one or more attributes corresponding to registers. The register attribute tracker may track the attributes associated with the registers when those registers are used as output registers of instructions that explicitly define the attributes and, if the register attribute tracker has a tracked attribute associated with an input register of an instruction that does not explicitly define the attribute, the register attribute tracker may annotate the instruction with an attribute and/or associate an attribute with the output register of the instruction in the register attribute tracker.
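A rough Python model of the tracking behavior in the abstract above (US09367309B2). The class and method names are hypothetical, the attribute is treated as an opaque value (an element size is used in the usage lines purely as an example), and two choices are assumptions not stated in the abstract: the first tracked input register wins, and an untracked result clears any stale entry for the output register.

```python
class RegisterAttributeTracker:
    def __init__(self):
        self._attributes = {}  # register name -> tracked attribute

    def observe(self, dest, sources, explicit_attribute=None):
        """Process one instruction and return the attribute it is annotated with."""
        if explicit_attribute is not None:
            # The instruction defines the attribute explicitly: track it
            # for the output register.
            self._attributes[dest] = explicit_attribute
            return explicit_attribute
        for source in sources:
            if source in self._attributes:
                # Implicit case: inherit from a tracked input register and
                # associate the same attribute with the output register.
                inherited = self._attributes[source]
                self._attributes[dest] = inherited
                return inherited
        # Nothing known about the inputs: drop any stale entry (assumption).
        self._attributes.pop(dest, None)
        return None


tracker = RegisterAttributeTracker()
explicit = tracker.observe("v0", sources=[], explicit_attribute=32)
inherited = tracker.observe("v1", sources=["v0"])
untracked = tracker.observe("v2", sources=["x9"])
```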