Content-addressable memory filtering based on microarchitectural state

    公开(公告)号:US11347514B2

    公开(公告)日:2022-05-31

    申请号:US16277764

    申请日:2019-02-15

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to filtering access to a content-addressable memory (CAM). In some embodiments, a processor monitors for certain microarchitectural states and filters access to the CAM in states where there cannot be a match in the CAM or where matching entries will not be used even if there is a match. In some embodiments, toggle control circuitry prevents toggling of input lines when filtering CAM access, which may reduce dynamic power consumption. In some example embodiments, the CAM is used to access a load queue to validate that out-of-order execution for a set of instructions matches in-order execution, and situations where ordering should be checked are relatively rare.

    Translation Lookaside Buffer Striping for Efficient Invalidation Operations

    公开(公告)号:US20220066947A1

    公开(公告)日:2022-03-03

    申请号:US17008457

    申请日:2020-08-31

    Applicant: Apple Inc.

    Abstract: Systems, apparatuses, and methods for implementing translation lookaside buffer (TLB) striping to enable efficient invalidation operations are described. TLB sizes are growing in width (more features in a given page table entry) and depth (to cover larger memory footprints). A striping scheme is proposed to enable an efficient and high performance method for performing TLB maintenance operations in the face of this growth. Accordingly, a TLB stores first attribute data in a striped manner across a plurality of arrays. The striped manner allows different entries to be searched simultaneously in response to receiving an invalidation request which identifies a particular attribute of a group to be invalidated. Upon receiving an invalidation request, the TLB generates a plurality of indices with an offset between each index and walks through the plurality of arrays by incrementing each index and simultaneously checking the first attribute data in corresponding entries.

    PROCESSING MEMORY ACCESSES WHILE SUPPORTING A ZERO SIZE CACHE IN A CACHE HIERARCHY

    公开(公告)号:US20200320004A1

    公开(公告)日:2020-10-08

    申请号:US16374667

    申请日:2019-04-03

    Applicant: Apple Inc.

    Inventor: Brian R. Mestan

    Abstract: A system and method for efficiently supporting a cache memory hierarchy potentially using a zero size cache in a level of the hierarchy. In various embodiments, logic in a lower-level cache controller or elsewhere receives a miss request from an upper-level cache controller. When the requested data is non-cacheable, the logic sends a snoop request with an address of the memory access operation to the upper-level cache controller to determine whether the requested data is in the upper-level data cache. When the snoop response indicates a miss or the requested data is cacheable, the logic retrieves the requested data from memory. When the snoop response indicates a hit, the logic retrieves the requested data from the upper-level cache. The logic completes servicing the memory access operation while preventing cache storage of the received requested data in a cache at a same level of the cache memory hierarchy as the logic.

    Translation lookaside buffer invalidation by range

    公开(公告)号:US10725928B1

    公开(公告)日:2020-07-28

    申请号:US16243901

    申请日:2019-01-09

    Applicant: Apple Inc.

    Abstract: A system and method for efficiently performing maintenance on a cache. In various embodiments, control logic in a cache controller or elsewhere receives an indication for invalidating a range of virtual-to-physical mappings in a given translation lookaside buffer (TLB). The logic determines a first latency to invalidate entries of the TLB based on a number of addresses in the range and a number of supported page sizes simultaneously stored in the TLB. The logic determines a second latency based on a number of entries in the TLB. If the first latency is greater, then the logic traverses through each TLB entry and invalidates TLB entries storing a virtual address within the range. If the first latency is smaller, then the logic traverses through each address in the range and invalidates TLB entries storing a virtual address within the range.

    Translation Lookaside Buffer Invalidation By Range

    公开(公告)号:US20200218663A1

    公开(公告)日:2020-07-09

    申请号:US16243901

    申请日:2019-01-09

    Applicant: Apple Inc.

    Abstract: A system and method for efficiently performing maintenance on a cache. In various embodiments, control logic in a cache controller or elsewhere receives an indication for invalidating a range of virtual-to-physical mappings in a given translation lookaside buffer (TLB). The logic determines a first latency to invalidate entries of the TLB based on a number of addresses in the range and a number of supported page sizes simultaneously stored in the TLB. The logic determines a second latency based on a number of entries in the TLB. If the first latency is greater, then the logic traverses through each TLB entry and invalidates TLB entries storing a virtual address within the range. If the first latency is smaller, then the logic traverses through each address in the range and invalidates TLB entries storing a virtual address within the range.

    Reducing translation lookaside buffer searches for splintered pages

    公开(公告)号:US12079140B2

    公开(公告)日:2024-09-03

    申请号:US18189982

    申请日:2023-03-24

    Applicant: Apple Inc.

    Abstract: Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know if there are multiple translation entries stored in the TLB for smaller splintered pages of the relatively large page. The TLB tracks whether or not splintered pages for each translation context have been installed. If a TLB invalidate (TLBI) request is received, and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found, the sticky bit can be cleared and the number of full TLBI searches is reduced.

    Cache replacement based on traversal tracking

    公开(公告)号:US11720501B2

    公开(公告)日:2023-08-08

    申请号:US17817748

    申请日:2022-08-05

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to controlling cache replacement. In some embodiments, a computing system performs multiple searches of a data structure, where one or more of the searches traverse multiple links between elements of the data structure. The system may cache, in a traversal cache, traversal information that is usable by searches to skip one or more links traversed by one or more prior searches. The system may store tracking information that indicates a location in the traversal cache at which prior traversal information for a first search is stored. The system may select, based on the tracking information, an entry in the traversal cache for new traversal information generated by the first search. The selection may override a default replacement policy for the traversal cache, e.g., to select the location in the traversal cache to replace the prior traversal information with the new traversal information.

    Atomic operation predictor to predict whether an atomic operation will complete successfully

    公开(公告)号:US12229557B2

    公开(公告)日:2025-02-18

    申请号:US18601640

    申请日:2024-03-11

    Applicant: Apple Inc.

    Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.

Patent Agency Ranking