-
Publication number: US20240429938A1
Publication date: 2024-12-26
Application number: US18755302
Filing date: 2024-06-26
Applicant: Apple Inc.
Inventor: Tyson J. Bergland, Karthik Ramani, Stephan Lachowsky, Justin A. Hensley, Davoud A. Jamshidi
IPC: H03M7/30, H04N19/176, H04N19/182
Abstract: Techniques are disclosed relating to compression of pixel data using different quantization for different regions of a block of pixels being compressed. In some embodiments, compression circuitry is configured to determine, for multiple components included in pixels of the block of pixels being compressed, respective smallest and greatest component values in respective regions of the block of pixels. The compression circuitry may determine, based on the determined smallest and greatest component values, to use a first number of bits to represent delta values relative to a base value for a first component in a first region and a second, different number of bits to represent delta values relative to a base value for a second component in the first region. The compression circuitry may then quantize delta values for the first and second components of pixels in the first region of the block of pixels using the determined first and second numbers of bits. In some embodiments, the compression circuitry determines whether to provide cross-component bit sharing within a region.
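The per-region, per-component bit-width selection described in this abstract can be illustrated with a minimal Python sketch. It is not the patented circuit: taking the region minimum as the base value, capping widths with a `max_bits` budget, and dropping low-order bits by right shift are all assumptions made for illustration, and cross-component bit sharing is not modeled.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, ...]  # one integer per component, e.g. (R, G, B)


def region_params(region: Sequence[Pixel], max_bits: int) -> List[Tuple[int, int, int]]:
    """Return (base, bits, shift) for each component of one region.

    base  : smallest component value in the region (assumed choice of base)
    bits  : width used for the deltas, capped at the hypothetical max_bits
    shift : low-order bits discarded when the range needs more than bits
    """
    params = []
    for values in zip(*region):  # walk the region one component at a time
        lo, hi = min(values), max(values)
        needed = (hi - lo).bit_length()  # bits to hold the largest delta
        bits = min(needed, max_bits)
        params.append((lo, bits, needed - bits))
    return params


def quantize_region(region: Sequence[Pixel], max_bits: int):
    """Quantize every pixel of a region as deltas from the per-component base."""
    params = region_params(region, max_bits)
    deltas = [
        tuple((v - base) >> shift for v, (base, _bits, shift) in zip(px, params))
        for px in region
    ]
    return params, deltas


# Three components with very different ranges in the same region.
region = [(10, 200, 5), (12, 100, 5), (13, 160, 5), (11, 228, 5)]
params, deltas = quantize_region(region, max_bits=4)
print(params)  # one (base, bits, shift) triple per component
print(deltas)  # quantized deltas per pixel
```

In this example the first component needs only 2 bits, the second is capped at 4 bits and loses 4 low-order bits, and the constant third component needs none, so each component in the region ends up with a different width.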
-
Publication number: US12026108B1
Publication date: 2024-07-02
Application number: US18065433
Filing date: 2022-12-13
Applicant: Apple Inc.
Inventor: Karthik Ramani, Mohamed Ismail, Tian You Wang
IPC: G06F13/16, G06F11/34, G06F12/0811, G06F12/084
CPC classification number: G06F13/1689, G06F11/3466, G06F12/0811, G06F12/084
Abstract: Techniques are disclosed relating to controlling performance state of a memory element based on latency information for a processor. In some embodiments, a level of a memory hierarchy is configured to operate at multiple different performance states at different times. Processor circuitry may execute programs that generate requests to access the memory hierarchy. Bandwidth-based control circuitry may generate, based on bandwidth conditions for the processor circuitry, bandwidth performance state signals. Latency-based control circuitry may generate, based on latency information for processor requests to access the memory hierarchy, latency performance state signals. Performance control circuitry may control the performance state of the level of the memory hierarchy based on the bandwidth performance state signals and the latency performance state signals. Disclosed techniques may improve processor performance in certain operating scenarios.
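As a rough illustration of combining the two control paths named in this abstract, the sketch below maps a bandwidth measurement and a latency measurement to requested performance states and lets the higher request win. The threshold values, the function names, and the max-of-both policy are assumptions; the abstract does not say how the signals are combined.

```python
from bisect import bisect_right
from typing import Sequence


def requested_state(value: float, thresholds: Sequence[float]) -> int:
    """Return the number of thresholds that value meets or exceeds.

    State 0 is the lowest performance state; thresholds must be sorted.
    """
    return bisect_right(thresholds, value)


def control_state(
    bandwidth_util: float,
    avg_latency_cycles: float,
    bw_thresholds: Sequence[float] = (0.3, 0.6, 0.85),    # hypothetical values
    lat_thresholds: Sequence[float] = (200, 400, 800),    # hypothetical values
) -> int:
    """Pick a performance state for one level of the memory hierarchy."""
    bw_state = requested_state(bandwidth_util, bw_thresholds)        # bandwidth signal
    lat_state = requested_state(avg_latency_cycles, lat_thresholds)  # latency signal
    # Assumed policy: honor whichever controller asks for more performance.
    return max(bw_state, lat_state)


# Low bandwidth but long request latency still raises the state.
print(control_state(0.1, 900))
```

The latency path matters in exactly this kind of case: a workload that issues few requests looks idle to a bandwidth-only controller even though each request is slow.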
-
Publication number: US20210336632A1
Publication date: 2021-10-28
Application number: US16855540
Filing date: 2020-04-22
Applicant: Apple Inc.
Inventor: Tyson J. Bergland, Karthik Ramani, Stephan Lachowsky, Justin A. Hensley, Davoud A. Jamshidi
IPC: H03M7/30, H04N19/176, H04N19/182
Abstract: Techniques are disclosed relating to compression of pixel data using different quantization for different regions of a block of pixels being compressed. In some embodiments, compression circuitry is configured to determine, for multiple components included in pixels of the block of pixels being compressed, respective smallest and greatest component values in respective regions of the block of pixels. The compression circuitry may determine, based on the determined smallest and greatest component values, to use a first number of bits to represent delta values relative to a base value for a first component in a first region and a second, different number of bits to represent delta values relative to a base value for a second component in the first region. The compression circuitry may then quantize delta values for the first and second components of pixels in the first region of the block of pixels using the determined first and second numbers of bits. In some embodiments, the compression circuitry determines whether to provide cross-component bit sharing within a region.
-
Publication number: US10289565B2
Publication date: 2019-05-14
Application number: US15610008
Filing date: 2017-05-31
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
IPC: G06F12/08, G06F12/12, G06F12/123, G06F12/0808, G06F12/0815, G06F12/0804
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
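The allocate, protect, and drop sequence in this abstract can be modeled in software as a toy cache. The `TemporalCache` class, its method names, and the callback standing in for the interrupt are hypothetical, and the model simplifies by writing back every ordinary eviction as if the line were dirty.

```python
from collections import OrderedDict
from typing import Callable, Optional


class TemporalCache:
    """Toy cache in which lines tagged with a DSID are never evicted."""

    def __init__(self, capacity: int,
                 on_drop_complete: Optional[Callable[[int, int], None]] = None):
        self.capacity = capacity
        self.lines = OrderedDict()   # address -> (data, dsid or None), oldest first
        self.writebacks = []         # addresses written back to lower-level memory
        self.on_drop_complete = on_drop_complete  # stands in for the interrupt

    def allocate(self, address: int, data, dsid: Optional[int] = None) -> None:
        if address not in self.lines and len(self.lines) >= self.capacity:
            self._evict_one()
        self.lines[address] = (data, dsid)
        self.lines.move_to_end(address)

    def _evict_one(self) -> None:
        # Lines with a DSID carry the non-replaceable attribute and are skipped.
        victim = next((a for a, (_, d) in self.lines.items() if d is None), None)
        if victim is None:
            raise RuntimeError("no replaceable line available")
        del self.lines[victim]
        self.writebacks.append(victim)  # ordinary eviction writes the line back

    def drop(self, dsid: int) -> None:
        doomed = [a for a, (_, d) in self.lines.items() if d == dsid]
        for address in doomed:
            del self.lines[address]     # removed without any write-back
        if self.on_drop_complete is not None:
            self.on_drop_complete(dsid, len(doomed))


events = []
cache = TemporalCache(2, on_drop_complete=lambda dsid, n: events.append((dsid, n)))
cache.allocate(0x10, "temporal", dsid=7)
cache.allocate(0x20, "ordinary A")
cache.allocate(0x30, "ordinary B")  # evicts 0x20, not the older temporal line
cache.drop(7)                       # consumer has finished reading the data set
print(cache.writebacks, list(cache.lines), events)
```

Skipping the write-back on `drop` is the point of the scheme: data that is known to be dead after consumption never costs bandwidth to lower-level memory.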
-
Publication number: US20250103501A1
Publication date: 2025-03-27
Application number: US18795416
Filing date: 2024-08-06
Applicant: Apple Inc.
Inventor: Mladen Wilder, Karthik Ramani, Tyson J. Bergland
IPC: G06F12/084, G06F12/0817, G06F12/0837
Abstract: Techniques are disclosed relating to data compression in graphics processors. In some embodiments, first and second graphics processor cores include respective shader processor circuitry configured to execute graphics shader programs. Cache circuitry may be configured to store surface data, including a compressed block of surface data and metadata for the compressed block of surface data. Lock control circuitry may lock metadata for the second graphics processor core for the compressed block of surface data based on an access to the metadata by the first graphics processor core and prevent read accesses to the compressed block by the second graphics processor core until the lock on the metadata is released. This may provide consistency across graphics cores for compressed data.
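A small software model may help picture the locking behavior described in this abstract. The `MetadataLockTable` class and its methods are hypothetical names, and identifying cores and blocks by plain integers is an assumption; the real mechanism is hardware lock control circuitry.

```python
from typing import Dict


class MetadataLockTable:
    """Tracks which core holds the metadata lock for each compressed block."""

    def __init__(self) -> None:
        self._owner: Dict[int, int] = {}  # block id -> id of the locking core

    def access_metadata(self, core: int, block: int) -> bool:
        """First core to touch the metadata takes the lock; others are refused."""
        return self._owner.setdefault(block, core) == core

    def may_read_block(self, core: int, block: int) -> bool:
        """Reads of the compressed block are allowed unless another core holds the lock."""
        owner = self._owner.get(block)
        return owner is None or owner == core

    def release(self, core: int, block: int) -> None:
        """Only the owning core can release its lock."""
        if self._owner.get(block) == core:
            del self._owner[block]


locks = MetadataLockTable()
locks.access_metadata(core=0, block=42)        # core 0 locks the metadata
print(locks.may_read_block(core=1, block=42))  # core 1 is held off
locks.release(core=0, block=42)
print(locks.may_read_block(core=1, block=42))  # core 1 may proceed
```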
-
Publication number: US20190266102A1
Publication date: 2019-08-29
Application number: US16410828
Filing date: 2019-05-13
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
IPC: G06F12/123, G06F12/0815, G06F12/0804, G06F12/0808
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
-
Publication number: US20250104181A1
Publication date: 2025-03-27
Application number: US18795437
Filing date: 2024-08-06
Applicant: Apple Inc.
Inventor: Karthik Ramani, Tyson J. Bergland, Leela Kishore Kothamasu, Hongzhou Zhao, Winnie W. Yeung, Mladen Wilder
IPC: G06T1/60, G06F12/0891, G06T15/00
Abstract: Techniques are disclosed relating to data compression in graphics processors. In some embodiments, cache circuitry is coupled to shader processor circuitry and is configured to store graphics data that includes a compressed block of data associated with a surface and metadata for the compressed block of data. Metadata coherence circuitry may cache the metadata for the compressed block of data, receive an indication of a write command for non-compressed data associated with the surface, wherein the write command identifies the metadata and has a different address than the compressed block of data, and determine, based on the metadata and the indication, to invalidate the compressed block of data in the cache circuitry. This may maintain read/write coherence in a cache that stores both compressed and uncompressed data, in some embodiments.
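The invalidation decision in this abstract can be sketched as follows. This is a simplified model under stated assumptions: the data cache is a plain dictionary keyed by address, each metadata entry simply records the address of its cached compressed block, and `MetadataCoherence` and its method names are hypothetical.

```python
from typing import Dict


class MetadataCoherence:
    """Invalidates a cached compressed block when its surface is written uncompressed."""

    def __init__(self, data_cache: Dict[int, bytes]) -> None:
        self.data_cache = data_cache        # address -> cached data (stand-in for the cache)
        self.metadata: Dict[int, int] = {}  # metadata id -> address of the compressed block

    def track_compressed(self, meta_id: int, block_addr: int) -> None:
        """Cache the metadata for a compressed block held in the data cache."""
        self.metadata[meta_id] = block_addr

    def on_uncompressed_write(self, meta_id: int, write_addr: int) -> bool:
        """Handle the indication of a non-compressed write; return True if a block was invalidated."""
        block_addr = self.metadata.get(meta_id)
        if block_addr is None or block_addr == write_addr:
            return False                    # nothing stale under a different address
        self.data_cache.pop(block_addr, None)  # invalidate the stale compressed copy
        del self.metadata[meta_id]
        return True


cache = {0x1000: b"compressed", 0x2000: b"other"}
coherence = MetadataCoherence(cache)
coherence.track_compressed(meta_id=5, block_addr=0x1000)
# The uncompressed write targets a different address but names the same metadata.
print(coherence.on_uncompressed_write(meta_id=5, write_addr=0x8000))
print(sorted(cache))
```

Address matching alone would miss this case, since the compressed block and the uncompressed write never share an address; the shared metadata identifier is what links them.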
-
Publication number: US20230253979A1
Publication date: 2023-08-10
Application number: US18302513
Filing date: 2023-04-18
Applicant: Apple Inc.
Inventor: Tyson J. Bergland, Karthik Ramani, Stephan Lachowsky, Justin A. Hensley, Davoud A. Jamshidi
IPC: H03M7/30, H04N19/182, H04N19/176
CPC classification number: H03M7/3059, H04N19/182, H04N19/176
Abstract: Techniques are disclosed relating to compression of pixel data using different quantization for different regions of a block of pixels being compressed. In some embodiments, compression circuitry is configured to determine, for multiple components included in pixels of the block of pixels being compressed, respective smallest and greatest component values in respective regions of the block of pixels. The compression circuitry may determine, based on the determined smallest and greatest component values, to use a first number of bits to represent delta values relative to a base value for a first component in a first region and a second, different number of bits to represent delta values relative to a base value for a second component in the first region. The compression circuitry may then quantize delta values for the first and second components of pixels in the first region of the block of pixels using the determined first and second numbers of bits. In some embodiments, the compression circuitry determines whether to provide cross-component bit sharing within a region.
-
Publication number: US11664816B2
Publication date: 2023-05-30
Application number: US16855540
Filing date: 2020-04-22
Applicant: Apple Inc.
Inventor: Tyson J. Bergland, Karthik Ramani, Stephan Lachowsky, Justin A. Hensley, Davoud A. Jamshidi
IPC: H03M7/30, H04N19/182, H04N19/176
CPC classification number: H03M7/3059, H04N19/176, H04N19/182
Abstract: Techniques are disclosed relating to compression of pixel data using different quantization for different regions of a block of pixels being compressed. In some embodiments, compression circuitry is configured to determine, for multiple components included in pixels of the block of pixels being compressed, respective smallest and greatest component values in respective regions of the block of pixels. The compression circuitry may determine, based on the determined smallest and greatest component values, to use a first number of bits to represent delta values relative to a base value for a first component in a first region and a second, different number of bits to represent delta values relative to a base value for a second component in the first region. The compression circuitry may then quantize delta values for the first and second components of pixels in the first region of the block of pixels using the determined first and second numbers of bits. In some embodiments, the compression circuitry determines whether to provide cross-component bit sharing within a region.
-
Publication number: US10970223B2
Publication date: 2021-04-06
Application number: US16410828
Filing date: 2019-05-13
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
IPC: G06F12/08, G06F12/0891, G06F12/0804, G06F12/0808, G06F12/0815, G06F12/123, G06F12/0895, G06F12/126, G06F12/12
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.