-
Publication number: US20170206035A1
Publication date: 2017-07-20
Application number: US15000667
Filing date: 2016-01-19
Applicant: QUALCOMM Incorporated
Inventor: Tushar Kumar, Aravind Natarajan, Dario Suarez Gracia
CPC classification number: G06F3/0656, G06F3/061, G06F3/0673, G06F9/3834, G06F9/467, G06F9/544, G06F12/1072, G06F2212/65
Abstract: Methods, devices, and non-transitory processor-readable storage media for a computing device to merge concurrent writes from a plurality of processing units to a buffer associated with an application. An embodiment method executed by a processor may include identifying a plurality of concurrent requests to access the buffer that are sparse, disjoint, and write-only, configuring a write-set for each of the plurality of processing units, executing the plurality of concurrent requests to access the buffer using the write-sets, determining whether each of the plurality of concurrent requests to access the buffer is complete, obtaining a buffer index and data via the write-set of each of the plurality of processing units, and writing to the buffer using the received buffer index and data via the write-set of each of the plurality of processing units in response to determining that each of the plurality of concurrent requests to access the buffer is complete.
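The write-set idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `merge_concurrent_writes` and `make_task` are hypothetical, threads stand in for processing units, and a plain dict stands in for each write-set. Each worker records index/data pairs privately, and the shared buffer is only written once every request has completed.

```python
from concurrent.futures import ThreadPoolExecutor


def merge_concurrent_writes(buffer, tasks):
    """Run each task against a private write-set, then merge into buffer.

    Each task is a callable taking a dict (its write-set) and recording
    buffer_index -> data entries there instead of touching the buffer.
    """
    write_sets = [dict() for _ in tasks]
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = [pool.submit(task, ws) for task, ws in zip(tasks, write_sets)]
        for future in futures:
            future.result()  # wait for completion; re-raises task errors
    # Every request is complete, so the write-sets can now be applied.
    for write_set in write_sets:
        for index, data in write_set.items():
            buffer[index] = data
    return buffer


def make_task(indices, value):
    """Hypothetical sparse, write-only task: writes `value` at `indices`."""
    def task(write_set):
        for i in indices:
            write_set[i] = value
    return task


buffer = [0] * 8
# The two tasks touch disjoint indices, as the abstract requires.
merge_concurrent_writes(buffer, [make_task([0, 5], 1), make_task([2, 7], 2)])
```

Because the requests are disjoint, the merge order between write-sets does not affect the final buffer contents.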
-
Publication number: US20170083364A1
Publication date: 2017-03-23
Application number: US14862373
Filing date: 2015-09-23
Applicant: QUALCOMM Incorporated
Inventor: Han Zhao, Dario Suárez Gracia, Tushar Kumar
CPC classification number: G06F9/4818, G06F9/4881, G06F9/4893, G06F9/5083, G06F9/5088, Y02D10/24, Y02D10/32
Abstract: Various embodiments proactively balance workloads between a plurality of processing units of a multi-processor computing device by making work-stealing determinations based on operating state data. An embodiment method includes obtaining static characteristics data associated with each of a victim processor and one or more of a plurality of processing units that are ready to steal work items from the victim processor (work-ready processors), obtaining dynamic characteristics data for each of the processors, calculating priority values for each of the processors based on the obtained data, and transferring a number of work items assigned to the victim processor to a winning work-ready processor based on the calculated priority values. In some embodiments, the method may include acquiring control over a probabilistic lock for a shared data structure and updating the shared data structure to indicate the number of work items transferred to the winning work-ready processor.
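As a rough illustration of priority-based work stealing, the following Python sketch picks a winning work-ready processor and transfers a share of the victim's work items. The abstract does not give a priority formula, so the one below is an assumption, as are the field names (`relative_speed`, `queued_items`, `temperature_c`) and the proportional split. The probabilistic lock on the shared data structure is omitted.

```python
from dataclasses import dataclass


@dataclass
class ProcessorState:
    name: str
    relative_speed: float  # static characteristic (assumed)
    queued_items: int      # dynamic characteristic (assumed)
    temperature_c: float   # dynamic characteristic (assumed)


def priority(p, max_temp_c=90.0):
    """Assumed formula: faster, cooler, less-loaded processors score higher."""
    headroom = max(0.0, 1.0 - p.temperature_c / max_temp_c)
    return p.relative_speed * headroom / (1 + p.queued_items)


def steal(victim_queue, victim, work_ready):
    """Move work items from the victim to the highest-priority stealer."""
    winner = max(work_ready, key=priority)
    total = priority(victim) + priority(winner)
    share = priority(winner) / total if total > 0 else 0.0
    count = int(len(victim_queue) * share)
    stolen = [victim_queue.pop() for _ in range(count)]
    return winner, stolen


queue = list(range(10))
victim = ProcessorState("cpu0", 1.0, len(queue), 45.0)
stealers = [ProcessorState("gpu", 2.0, 0, 45.0),
            ProcessorState("dsp", 1.0, 0, 45.0)]
winner, stolen = steal(queue, victim, stealers)
```

In this example the idle, faster `gpu` wins and takes most of the victim's queue; a different priority formula would change the split.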
-
Publication number: US20160078246A1
Publication date: 2016-03-17
Application number: US14599609
Filing date: 2015-01-19
Applicant: QUALCOMM Incorporated
Inventor: Tushar Kumar, Pablo Montesinos Ortego, Arun Raman
CPC classification number: G06F9/542, G06F9/3009, G06F9/461, G06F21/629, G06F2221/2147
Abstract: A computing device may be configured to generate and execute a task that includes one or more blocking constructs that each encapsulate a blocking activity and a notification handler corresponding to each blocking activity. The computing device may launch the task, execute one or more of the blocking constructs, register the corresponding notification handler for the blocking activity that will be executed next with the runtime system, perform the blocking activity encapsulated by the blocking construct to request information from an external resource, cause the task to enter a blocked state while it waits for a response from the external resource, receive an unblocking notification from an external entity, and invoke the registered notification handler to cause the task to exit the blocked state and/or perform clean up operations to exit/terminate the task gracefully.
-
Publication number: US10360063B2
Publication date: 2019-07-23
Application number: US14862373
Filing date: 2015-09-23
Applicant: QUALCOMM Incorporated
Inventor: Han Zhao, Dario Suárez Gracia, Tushar Kumar
Abstract: Various embodiments proactively balance workloads between a plurality of processing units of a multi-processor computing device by making work-stealing determinations based on operating state data. An embodiment method includes obtaining static characteristics data associated with each of a victim processor and one or more of a plurality of processing units that are ready to steal work items from the victim processor (work-ready processors), obtaining dynamic characteristics data for each of the processors, calculating priority values for each of the processors based on the obtained data, and transferring a number of work items assigned to the victim processor to a winning work-ready processor based on the calculated priority values. In some embodiments, the method may include acquiring control over a probabilistic lock for a shared data structure and updating the shared data structure to indicate the number of work items transferred to the winning work-ready processor.
-
Publication number: US10325390B2
Publication date: 2019-06-18
Application number: US15192051
Filing date: 2016-06-24
Applicant: QUALCOMM Incorporated
Inventor: Tushar Kumar, Wenhao Jia, Arun Raman, Hui Chao, Wenjia Ruan
Abstract: Various embodiments may include methods executed by processors of computing devices for geometry based work execution prioritization. The processor may receive events, such as images. The processor may overlay a boundary shape on the event to identify discard regions of the event lying outside the boundary shape. The processor may identify work regions of the events lying within the working boundary shape. The devices may determine a cancellation likelihood for each of the identified work regions of the events. The processor may assign a trimming weight to each of the identified work regions based on the determined cancellation likelihoods. The processor may then add each of the identified work regions as a work item to an execution work list in an order based on the assigned trimming weights. The work items may be processed in order of trimming weight priority.
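One way to picture geometry-based prioritization is the Python sketch below. Everything specific in it is assumed for illustration: the event is an image split into square tiles, the boundary shape is a circle, the cancellation likelihood grows with distance from the circle's center, the trimming weight equals that likelihood, and lower weights are executed first. The abstract does not fix any of these choices.

```python
import math


def build_work_list(width, height, tile, center, radius):
    """Return tile origins inside the boundary, ordered by trimming weight."""
    cx, cy = center
    weighted = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Use the tile's midpoint to test it against the boundary circle.
            mid_x, mid_y = x + tile / 2, y + tile / 2
            distance = math.hypot(mid_x - cx, mid_y - cy)
            if distance > radius:
                continue  # discard region: lies outside the boundary shape
            likelihood = distance / radius  # assumed cancellation likelihood
            weight = likelihood             # assumed trimming weight
            weighted.append((weight, (x, y)))
    weighted.sort(key=lambda item: item[0])  # lowest weight runs first
    return [region for _, region in weighted]


work_list = build_work_list(width=4, height=4, tile=2,
                            center=(1, 1), radius=2.5)
```

Here the tile at `(2, 2)` falls outside the circle and is discarded, while the tile containing the center is scheduled first.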
-
Publication number: US20180046238A1
Publication date: 2018-02-15
Application number: US15417605
Filing date: 2017-01-27
Applicant: QUALCOMM Incorporated
Inventor: Wenjia Ruan, Han Zhao, Tushar Kumar
CPC classification number: G06F1/329, G06F1/28, G06F1/3228, G06F9/4893, G06F9/5094, Y02D10/22, Y02D10/24
Abstract: Various embodiments provide methods, devices, and non-transitory processor-readable storage media enabling joint goals, such as joint power and performance goals, to be realized on a per heterogeneous processing device basis for heterogeneous parallel computing constructs. Various embodiments may enable assignments of power states for heterogeneous processing devices on a per heterogeneous processing device basis to satisfy an overall goal on the heterogeneous processing construct. Various embodiments may enable dynamic adjustment of power states for heterogeneous processing devices on a per heterogeneous processing device basis.
-
Publication number: US20170286182A1
Publication date: 2017-10-05
Application number: US15085108
Filing date: 2016-03-30
Applicant: QUALCOMM Incorporated
Inventor: Dario Suarez Gracia, Gheorghe Cascaval, Han Zhao, Tushar Kumar, Aravind Natarajan, Arun Raman
IPC: G06F9/52
Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.
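The enhanced outcome described here, where a losing requester learns who won, can be sketched in Python as a try-lock that returns the current owner's identifier. The class name `OwnerReportingLock` and its methods are hypothetical, and a software mutex stands in for whatever hardware or runtime mechanism an actual embodiment would use.

```python
import threading


class OwnerReportingLock:
    """Try-lock that reports the winner's identifier to every requester."""

    def __init__(self):
        self._guard = threading.Lock()  # protects _owner
        self._owner = None

    def try_acquire(self, requester_id):
        """Return (granted, winner_id) for this resource access request."""
        with self._guard:
            if self._owner is None:
                self._owner = requester_id
            return self._owner == requester_id, self._owner

    def release(self, requester_id):
        """Release the resource if the caller is the current owner."""
        with self._guard:
            if self._owner == requester_id:
                self._owner = None


lock = OwnerReportingLock()
first = lock.try_acquire("cpu0")   # first requester is granted access
second = lock.try_acquire("gpu0")  # second requester is told who won
```

Returning the winner's identifier lets the losing element decide what to do next, for example waiting on that specific element rather than retrying blindly.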
-
Publication number: US09733978B2
Publication date: 2017-08-15
Application number: US14837156
Filing date: 2015-08-27
Applicant: QUALCOMM Incorporated
Inventor: Dario Suarez Gracia, Tushar Kumar, Aravind Natarajan, Ravish Hastantram, Gheorghe Calin Cascaval, Han Zhao
IPC: G06F9/46, G06F15/173, G06F1/00, G06F9/48, G06F9/50, G06F12/0806, G06F12/0862, G06F12/12, G06F11/34
CPC classification number: G06F9/48, G06F9/4806, G06F9/4831, G06F9/4837, G06F9/4843, G06F9/4881, G06F9/4893, G06F9/50, G06F9/5005, G06F9/5011, G06F9/5016, G06F9/5022, G06F9/5027, G06F9/5033, G06F9/5038, G06F9/5044, G06F9/505, G06F9/5055, G06F9/5083, G06F9/5094, G06F11/3419, G06F11/3466, G06F12/0806, G06F12/0862, G06F12/12, G06F12/128, G06F2212/1016, G06F2212/62, Y02D10/22, Y02D10/24
Abstract: Various embodiments include methods for data management in a computing device utilizing a plurality of processing units. Embodiment methods may include generating a data transfer heuristic model based on measurements from a plurality of sample data transfers between a plurality of data storage units. The generated data transfer heuristic model may be used to calculate data transfer costs for each of a plurality of tasks. The calculated data transfer costs may be used to schedule execution of the plurality of tasks in an execution order on selected ones of the plurality of processing units. The data transfer heuristic model may be updated based on measurements of data transfers occurring during the executions of the plurality of tasks (e.g., time, power consumption, etc.). Code executing on the processing units may indicate to a runtime when certain data blocks are no longer needed and thus may be evicted and/or pre-fetched for others.
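A data transfer heuristic model of this kind could, under simple assumptions, be a linear fit of transfer time against transfer size. The Python sketch below assumes that form (fixed latency plus per-byte cost), fits it to sample measurements by least squares, and orders tasks by predicted cost with the cheapest first. The model shape, the sample values, and the ordering rule are all illustrative assumptions, not details taken from the patent.

```python
def fit_linear(samples):
    """Least-squares fit of time = latency + per_byte * size.

    `samples` is a list of (size_bytes, measured_time) pairs.
    """
    n = len(samples)
    mean_size = sum(size for size, _ in samples) / n
    mean_time = sum(time for _, time in samples) / n
    variance = sum((size - mean_size) ** 2 for size, _ in samples)
    covariance = sum((size - mean_size) * (time - mean_time)
                     for size, time in samples)
    per_byte = covariance / variance
    latency = mean_time - per_byte * mean_size
    return latency, per_byte


def schedule(task_bytes, latency, per_byte):
    """Order task names by predicted transfer cost, cheapest first."""
    cost = {name: latency + per_byte * size
            for name, size in task_bytes.items()}
    return sorted(cost, key=cost.get), cost


# Made-up sample transfers: (bytes moved, milliseconds taken).
samples = [(1000, 1.5), (2000, 2.5), (4000, 4.5)]
latency, per_byte = fit_linear(samples)
order, cost = schedule({"large": 3000, "small": 500}, latency, per_byte)
```

Refitting with measurements taken during task execution would correspond to the model update step the abstract mentions.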
-
Publication number: US09710315B2
Publication date: 2017-07-18
Application number: US14599609
Filing date: 2015-01-19
Applicant: QUALCOMM Incorporated
Inventor: Tushar Kumar, Pablo Montesinos Ortego, Arun Raman
CPC classification number: G06F9/542, G06F9/3009, G06F9/461, G06F21/629, G06F2221/2147
Abstract: A computing device may be configured to generate and execute a task that includes one or more blocking constructs that each encapsulate a blocking activity and a notification handler corresponding to each blocking activity. The computing device may launch the task, execute one or more of the blocking constructs, register the corresponding notification handler for the blocking activity that will be executed next with the runtime system, perform the blocking activity encapsulated by the blocking construct to request information from an external resource, cause the task to enter a blocked state while it waits for a response from the external resource, receive an unblocking notification from an external entity, and invoke the registered notification handler to cause the task to exit the blocked state and/or perform clean up operations to exit/terminate the task gracefully.
-
Publication number: US20170109214A1
Publication date: 2017-04-20
Application number: US14885226
Filing date: 2015-10-16
Applicant: QUALCOMM Incorporated
Inventor: Arun Raman, Tushar Kumar
CPC classification number: G06F9/52, G06F9/4881, G06F16/245, G06F16/9024
Abstract: Embodiments include computing devices, apparatus, and methods implemented by a computing device for accelerating execution of a plurality of tasks belonging to a common property task graph. The computing device may identify a first successor task dependent upon a bundled task such that an available synchronization mechanism is a common property for the bundled task and the first successor task, and such that the first successor task only depends upon predecessor tasks for which the available synchronization mechanism is a common property. The computing device may add the first successor task to a common property task graph and add the plurality of tasks belonging to the common property task graph to a ready queue. The computing device may recursively identify successor tasks. The synchronization mechanism may include a synchronization mechanism for control logic flow or a synchronization mechanism for data access.
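The recursive identification of successor tasks can be sketched in Python as a graph traversal. The data layout here is assumed: `successors` and `predecessors` are adjacency dicts, and `mechanism` maps each task to the name of its available synchronization mechanism. A successor joins the common property task graph only if it shares the starting task's mechanism and every one of its predecessors does too.

```python
def common_property_graph(start, successors, predecessors, mechanism):
    """Collect tasks, starting from `start`, that share its sync mechanism
    and depend only on predecessors sharing that mechanism."""
    common = mechanism[start]
    graph = {start}
    stack = [start]
    while stack:
        task = stack.pop()
        for succ in successors.get(task, []):
            if succ in graph or mechanism[succ] != common:
                continue
            # Every predecessor must have the mechanism as a common property.
            if all(mechanism[p] == common
                   for p in predecessors.get(succ, [])):
                graph.add(succ)
                stack.append(succ)
    return graph


# Hypothetical task graph: A -> B -> D and A -> C -> D.
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
mechanism = {"A": "gpu_queue", "B": "gpu_queue",
             "C": "cpu_event", "D": "gpu_queue"}
bundle = common_property_graph("A", successors, predecessors, mechanism)
```

Task `D` is left out because one of its predecessors, `C`, uses a different mechanism; the tasks in `bundle` could then be added to a ready queue together.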