-
1.
Publication No.: US20190079772A1
Publication Date: 2019-03-14
Application No.: US15701926
Filing Date: 2017-09-12
Applicant: QUALCOMM Incorporated
Inventor: Anil Krishna, Yongseok Yi, Eric Rotenberg, Vignyan Reddy Kothinti Naresh, Gregory Michael Wright
Abstract: Providing variable interpretation of usefulness indicators for memory tables in processor-based systems is disclosed. In one aspect, a memory system comprises a memory table providing multiple memory table entries, each including a usefulness indicator. A memory controller of the memory system comprises a global polarity indicator representing how the usefulness indicator for each memory table entry is interpreted and updated by the memory controller. If the global polarity indicator is set, the memory controller interprets a value of each usefulness indicator as directly corresponding to the usefulness of the corresponding memory table entry. Conversely, if the global polarity indicator is not set, the polarity is reversed such that the memory controller interprets the usefulness indicator value as inversely corresponding to the usefulness of the corresponding memory table entry. In this manner, the interpretation and updating of usefulness indicators by the memory controller can be varied using the global polarity indicator.
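The interpretation rule in this abstract can be illustrated with a minimal sketch. It assumes a one-bit usefulness indicator per entry; the class name `MemoryTable` and its methods are hypothetical and do not come from the patent.

```python
class MemoryTable:
    """Toy model: one usefulness bit per entry plus one global polarity bit."""

    def __init__(self, num_entries):
        self.polarity_set = True        # global polarity indicator
        self.bits = [0] * num_entries   # usefulness indicators (1 bit each, an assumption)

    def is_useful(self, i):
        # Polarity set: the bit directly corresponds to usefulness.
        # Polarity not set: the bit inversely corresponds to usefulness.
        return bool(self.bits[i]) == self.polarity_set

    def mark_useful(self, i):
        # Updates follow the current polarity as well.
        self.bits[i] = 1 if self.polarity_set else 0

    def mark_not_useful(self, i):
        self.bits[i] = 0 if self.polarity_set else 1

    def flip_polarity(self):
        # Reinterprets every entry at once without touching any stored bit.
        self.polarity_set = not self.polarity_set


table = MemoryTable(4)
table.mark_useful(1)
before = [table.is_useful(i) for i in range(4)]
table.flip_polarity()
after = [table.is_useful(i) for i in range(4)]
```

In this sketch, flipping the polarity inverts the reading of all four entries in one step, while the stored bits stay unchanged.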
-
2.
Publication No.: US10725782B2
Publication Date: 2020-07-28
Application No.: US15701926
Filing Date: 2017-09-12
Applicant: QUALCOMM Incorporated
Inventor: Anil Krishna, Yongseok Yi, Eric Rotenberg, Vignyan Reddy Kothinti Naresh, Gregory Michael Wright
Abstract: Providing variable interpretation of usefulness indicators for memory tables in processor-based systems is disclosed. In one aspect, a memory system comprises a memory table providing multiple memory table entries, each including a usefulness indicator. A memory controller of the memory system comprises a global polarity indicator representing how the usefulness indicator for each memory table entry is interpreted and updated by the memory controller. If the global polarity indicator is set, the memory controller interprets a value of each usefulness indicator as directly corresponding to the usefulness of the corresponding memory table entry. Conversely, if the global polarity indicator is not set, the polarity is reversed such that the memory controller interprets the usefulness indicator value as inversely corresponding to the usefulness of the corresponding memory table entry. In this manner, the interpretation and updating of usefulness indicators by the memory controller can be varied using the global polarity indicator.
-
3.
Publication No.: US11048509B2
Publication Date: 2021-06-29
Application No.: US16000580
Filing Date: 2018-06-05
Applicant: QUALCOMM Incorporated
Inventor: Hadi Parandeh Afshar, Amrit Panda, Eric Rotenberg, Gregory Michael Wright
Abstract: Providing multi-element multi-vector (MEMV) register file access in vector-processor-based devices is disclosed. In this regard, a vector-processor-based device includes a vector processor comprising multiple processing elements (PEs) communicatively coupled via a corresponding plurality of channels to a vector register file comprising a plurality of memory banks. The vector processor provides a direct memory access (DMA) controller that is configured to receive a plurality of vectors that each comprise a plurality of vector elements representing operands for processing a loop iteration. The DMA controller arranges the vectors in the vector register file such that, for each group of vectors to be accessed in parallel, vector elements for each vector are stored consecutively, but corresponding vector elements of consecutive vectors are stored in different memory banks of the vector register file. As a result, multiple elements of multiple vectors may be accessed with a single vector register file access operation.
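One way to picture the bank arrangement described in this abstract is a skewed placement, sketched below. The mapping in `bank_of`, the bank count, and the rule that one access is conflict-free when every requested element sits in a different bank are all assumptions made for illustration; the abstract does not give the actual mapping.

```python
def bank_of(vec, elem, num_banks, elems_per_access):
    """Hypothetical skewed placement.

    Elements of one vector occupy consecutive banks, and each vector
    starts `elems_per_access` banks after the previous one.
    """
    return (vec * elems_per_access + elem) % num_banks


def naive_bank_of(vec, elem, num_banks):
    """Contrast case: every vector starts at bank 0."""
    return elem % num_banks


NUM_BANKS = 8
ELEMS = 4  # elements read per vector in one access (assumed)

# Request elements 0..3 of vector 0 and elements 0..3 of vector 1 together.
request = [(v, e) for v in (0, 1) for e in range(ELEMS)]
skewed = [bank_of(v, e, NUM_BANKS, ELEMS) for v, e in request]
naive = [naive_bank_of(v, e, NUM_BANKS) for v, e in request]

skewed_conflict_free = len(set(skewed)) == len(skewed)
naive_conflict_free = len(set(naive)) == len(naive)
```

Under these assumptions the skewed placement spreads the eight requested elements over eight distinct banks, whereas the naive placement sends corresponding elements of both vectors to the same bank.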
-
4.
Publication No.: US20200065098A1
Publication Date: 2020-02-27
Application No.: US16107136
Filing Date: 2018-08-21
Applicant: QUALCOMM Incorporated
Inventor: Hadi Parandeh Afshar, Eric Rotenberg, Gregory Michael Wright
Abstract: Providing efficient handling of branch divergence in vectorizable loops by vector-processor-based devices is disclosed. In some aspects, a vector-processor-based device provides a plurality of processing elements (PEs) coupled to a scheduler circuit comprising a clock cycle threshold and a mask register comprising a plurality of bits corresponding to a plurality of loop iterations of a vectorizable loop to be executed. The scheduler circuit initiates a first execution interval, during which loop iterations of the vectorizable loop are assigned to PEs for parallel execution. If a loop iteration's execution time exceeds the clock cycle threshold, the scheduler circuit sets a mask register bit corresponding to the loop iteration indicating that the loop iteration is incomplete, and defers its execution. After the first execution interval is complete, the scheduler circuit initiates a second execution interval, during which incomplete loop iterations indicated by the mask register are executed in parallel by the PEs.
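The two-interval scheduling described in this abstract can be sketched as follows. The function `schedule_loop` and the per-iteration cycle counts are hypothetical, and the sketch assumes there are at least as many PEs as loop iterations.

```python
def schedule_loop(iteration_cycles, cycle_threshold):
    """Split loop iterations into two execution intervals.

    iteration_cycles[i] is the (assumed known) execution time of iteration i.
    Returns (first_interval, second_interval, mask).
    """
    mask = [False] * len(iteration_cycles)   # one mask register bit per iteration
    first_interval = []
    for i, cycles in enumerate(iteration_cycles):
        if cycles > cycle_threshold:
            mask[i] = True                   # incomplete: defer this iteration
        else:
            first_interval.append(i)         # finishes within the first interval
    # The second interval runs exactly the iterations flagged in the mask.
    second_interval = [i for i, flagged in enumerate(mask) if flagged]
    return first_interval, second_interval, mask


first, second, mask = schedule_loop([3, 12, 4, 3, 15, 2], cycle_threshold=5)
```

Here iterations 1 and 4 exceed the threshold, so they are masked and run together in the second interval instead of stalling the short iterations.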
-
5.
Publication No.: US20200012618A1
Publication Date: 2020-01-09
Application No.: US16028072
Filing Date: 2018-07-05
Applicant: QUALCOMM Incorporated
Inventor: Hadi Parandeh Afshar, Amrit Panda, Eric Rotenberg, Gregory Michael Wright
Abstract: Providing reconfigurable fusion of processing elements (PEs) in vector-processor-based devices is disclosed. In this regard, a vector-processor-based device provides a vector processor including a plurality of PEs and a decode/control circuit. The decode/control circuit receives an instruction block containing a vectorizable loop comprising a loop body. The decode/control circuit determines how many PEs of the plurality of PEs are required to execute the loop body, and reconfigures the plurality of PEs into one or more fused PEs, each including the determined number of PEs required to execute the loop body. The plurality of PEs, reconfigured into one or more fused PEs, then executes one or more loop iterations of the loop body. Some aspects further include a PE communications link interconnecting the plurality of PEs, to enable communications between PEs of a fused PE and communications of inter-iteration data dependencies between PEs without requiring vector register file access operations.
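A rough sketch of the fusion step follows. The abstract does not say how the required PE count is determined, so the capacity rule used here (`ops_per_pe` operations per PE) and the function name `fuse_pes` are assumptions for illustration only.

```python
import math


def fuse_pes(num_pes, body_ops, ops_per_pe):
    """Group PEs into fused PEs large enough to hold one loop body.

    Assumed rule: a loop body of `body_ops` operations needs
    ceil(body_ops / ops_per_pe) PEs.
    """
    required = math.ceil(body_ops / ops_per_pe)
    if required > num_pes:
        raise ValueError("loop body does not fit in the available PEs")
    # Carve the PE array into as many full groups of `required` PEs as fit.
    fused = [list(range(start, start + required))
             for start in range(0, num_pes - required + 1, required)]
    return required, fused


required, fused = fuse_pes(num_pes=8, body_ops=10, ops_per_pe=4)
```

With eight PEs and a body that needs three of them, the sketch forms two fused PEs (PEs 0 to 2 and PEs 3 to 5), each able to run one loop iteration, and leaves PEs 6 and 7 unused.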
-
6.
Publication No.: US20190384606A1
Publication Date: 2019-12-19
Application No.: US16012347
Filing Date: 2018-06-19
Applicant: QUALCOMM Incorporated
Inventor: Amrit Panda, Eric Rotenberg, Hadi Parandeh Afshar, Gregory Michael Wright
Abstract: Enabling parallel memory accesses by providing explicit affine instructions in vector-processor-based devices is disclosed. In this regard, a vector-processor-based device implementing a block-based dataflow instruction set architecture (ISA) includes a decoder circuit configured to provide an affine instruction that specifies a base parameter indicating a base value B, a stride parameter indicating a stride interval value S, and a count parameter indicating a count value C. The decoder circuit of the vector-processor-based device decodes the affine instruction, and generates an output stream comprising one or more output values, wherein a count of the output values of the output stream equals the count value C. Using an index X where 0≤X
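The abstract above is cut off before it states how each output value is computed. The sketch below therefore assumes the conventional affine form, in which output X equals B + X * S; that formula and the function name `affine_stream` are assumptions, not text from the patent.

```python
def affine_stream(base, stride, count):
    """Generate `count` values starting at `base` and stepping by `stride`.

    Assumed formula: value at index x is base + x * stride.
    """
    return [base + x * stride for x in range(count)]


# Example: addresses of five 4-byte elements starting at address 100.
addresses = affine_stream(base=100, stride=4, count=5)
```

Because every value depends only on its index, all values of such a stream could in principle be produced independently of one another.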