Abstract:
Methods, devices, and systems for automatically determining how an application program may be partitioned and offloaded for execution by a general purpose applications processor and an auxiliary processor (e.g., a DSP, GPU, etc.) within a mobile device. The mobile device may determine the portions of the application code that are best suited for execution on the auxiliary processor based on pattern-matching of directed acyclic graphs (DAGS). In particular, the mobile device may identify one or more patterns in the code, particularly in a data flow graph of the code, comparing each identified code pattern to predefined graph patterns known to have a certain benefit when executed on the auxiliary processor (e.g., a DSP). The mobile device may determine the costs and/or benefits of executing the potions of code on the auxiliary processor, and may offload portions that have low costs and/or high benefits related to the auxiliary processor.
Abstract:
The aspects enable a processor to concurrently execute a first serial language code embedding a second serial language code during a page load by a browser. A parser parses the first serial language code until a segment of the embedded second serial language code is encountered. The segment of embedded second serial language code is extracted for execution by an execution engine, which proceeds concurrently with speculative parsing of the first serial language code. Code generated by execution of second serial language code is evaluated to determine if it is well-formed, and partial rollback and re-parsing of the first serial language code is performed if the code is not well-formed. Concurrent parsing of first serial language code and execution of second language code, with partial roll back and reparsing when necessary, continues until the first language code has been parsed and the second serial language code has been executed.
Abstract:
Mobile computing devices may be configured to compile and execute portions of a general purpose software application in an auxiliary processor (e.g., a DSP) of a multiprocessor system by reading and writing information to a shared memory. A first process (P1) on the applications processor may request address negotiation with a second process (P2) on the auxiliary processor, obtain a first address map from a first operating system, and send the first address map to the auxiliary processor. The second process (P2) may receive the first address map, obtain a second address map from a second operating system, identify matching addresses in the first and second address maps, store the matching addresses as common virtual addresses, and send the common virtual addresses back to the applications processor. The first and second processes (i.e., P1 and P2) may each use the common virtual addresses to map physical pages to the memory.
Abstract:
Mobile computing devices may be configured to intelligently select, compile, and execute portions of a general purpose software application in an auxiliary processor (e.g., a DSP) of a multiprocessor system. A processor of the mobile device may be configured to determine whether portions of a software application are suitable for execution in an auxiliary processor, monitor operating conditions of the system, determine a historical context based on the monitoring, and determine whether the portions that were determined to suitable for execution in an auxiliary processor should be compiled for execution in the auxiliary processor based on the historical context. The processor may also be configured to continue monitoring the system, update the historical context information, and determine whether code previously compiled for execution on the auxiliary processor should be invoked or executed in the auxiliary processor based on the updated historical context information.
Abstract:
The various aspects leverage the novel observation that the number of call sites in code is directly correlated with the code's compile time and provide methods implemented by a compiler operating on a computing device (e.g., a smartphone) for performing inline throttling based on a projected number of call sites in the code that would exist after performing inline expansion. The various aspects enable the compiler to improve the performance of the generated code by aggressive inlining while carefully managing increases in compile time, thereby decreasing the power required to compile the code while increasing performance of the computing device. Thus, by inlining enough call sites to reduce the costs of handling calls while accounting for the costs of inlining, the various aspects provide for an effective balance of short compile times and effective code performance.
Abstract:
Mobile computing devices may be configured to compile and execute portions of a general purpose software application in an auxiliary processor (e.g., a DSP) of a multiprocessor system by reading and writing information to a shared memory. A first process (P1) on the applications processor may request address negotiation with a second process (P2) on the auxiliary processor, obtain a first address map from a first operating system, and send the first address map to the auxiliary processor. The second process (P2) may receive the first address map, obtain a second address map from a second operating system, identify matching addresses in the first and second address maps, store the matching addresses as common virtual addresses, and send the common virtual addresses back to the applications processor. The first and second processes (i.e., P1 and P2) may each use the common virtual addresses to map physical pages to the memory.
Abstract:
Methods, devices, and systems for automatically determining how an application program may be partitioned and offloaded for execution by a general purpose applications processor and an auxiliary processor (e.g., a DSP, GPU, etc.) within a mobile device. The mobile device may determine the portions of the application code that are best suited for execution on the auxiliary processor based on pattern-matching of directed acyclic graphs (DAGS). In particular, the mobile device may identify one or more patterns in the code, particularly in a data flow graph of the code, comparing each identified code pattern to predefined graph patterns known to have a certain benefit when executed on the auxiliary processor (e.g., a DSP). The mobile device may determine the costs and/or benefits of executing the portions of code on the auxiliary processor, and may offload portions that have low costs and/or high benefits related to the auxiliary processor.
Abstract:
The various aspects provide a dynamic compilation framework that includes a machine-independent optimization module operating on a computing device and methods for optimizing code with the machine-independent optimization module using a single, combined-forwards-backwards pass of the code. In the various aspects, the machine-independent optimization module may generate a graph of nodes from the IR, optimize nodes in the graph using forwards and backwards optimizations, and propagating the forwards and backwards optimizations to nodes in a bounded subgraph recognized or defined based on the position of the node currently being optimized. In the various aspects, the machine-independent optimization module may optimize the graph by performing forwards and/or backwards optimizations during a single pass through the graph, thereby achieving an effective degree of optimization and shorter overall compile times. Thus, the various aspects may provide a global optimization framework for dynamic compilers that is faster and more efficient than existing solutions.
Abstract:
The various aspects provide a dynamic compilation framework that includes a machine-independent optimization module operating on a computing device and methods for optimizing code with the machine-independent optimization module using a single, combined-forwards-backwards pass of the code. In the various aspects, the machine-independent optimization module may generate a graph of nodes from the IR, optimize nodes in the graph using forwards and backwards optimizations, and propagating the forwards and backwards optimizations to nodes in a bounded subgraph recognized or defined based on the position of the node currently being optimized. In the various aspects, the machine-independent optimization module may optimize the graph by performing forwards and/or backwards optimizations during a single pass through the graph, thereby achieving an effective degree of optimization and shorter overall compile times. Thus, the various aspects may provide a global optimization framework for dynamic compilers that is faster and more efficient than existing solutions.