Abstract:
Methods and apparatus for enabling a peripheral processor to retrieve and load firmware for execution within the constraints of its memory. The peripheral processor is allocated a portion of the host processor's memory, to function as a logical secondary and tertiary memory for memory cache operation. The described embodiments enable the peripheral processor to support much larger and more complex firmware. Additionally, a multi-facetted locking mechanism is described which enables the peripheral processor and the host processor to access the secondary memory, while minimally impacting the other processor.
Abstract:
Methods and apparatus for locking at least a portion of a shared memory resource. In one embodiment, an electronic device configured to lock at least a portion of a shared memory is disclosed. The electronic device includes a host processor, at least one peripheral processor and a physical bus interface configured to couple the host processor to the peripheral processor. The electronic device further includes a software framework that is configured to: attempt to lock a portion of the shared memory; verify that the peripheral processor has not locked the shared memory; when the portion of the shared memory is successfully locked via the verification that the peripheral processor has not locked the portion of the shared memory, execute a critical section of the shared memory; and otherwise attempt to lock the at least the portion of the shared memory at a later time.
Abstract:
Methods and apparatus for an inter-processor communication (IPC) link between two (or more) independently operable processors. In one aspect, the IPC protocol is based on a “shared” memory interface for run-time processing (i.e., the independently operable processors each share (either virtually or physically) a common memory interface). In another aspect, the IPC communication link is configured to support a host driven boot protocol used during a boot sequence to establish a basic communication path between the peripheral and the host processors. Various other embodiments described herein include sleep procedures (as defined separately for the host and peripheral processors), and error handling.
Abstract:
Methods and apparatus for non-sequential packet transfer. Prior art multi-processor devices implement a complete network communications stack at each processor. The disclosed embodiments provide techniques for delivering network layer (L3) and/or transport layer (L4) data payloads in the order of receipt, rather than according to the data link layer (L2) order. The described techniques enable e.g., earlier packet delivery. Such design topologies can operate within a substantially smaller memory footprint compared to prior art solutions. As a related benefit, applications that are unaffected by data link layer corruptions can receive data immediately (rather than waiting for the re-transmission of an unrelated L4 data flow) and thus the overall network latency can be greatly reduced and user experience can be improved.
Abstract:
Methods and apparatus for efficient data transfer within a user space network stack. Unlike prior art monolithic networking stacks, the exemplary networking stack architecture described hereinafter includes various components that span multiple domains (both in-kernel, and non-kernel). For example, unlike traditional “socket” based communication, disclosed embodiments can transfer data directly between the kernel and user space domains. Direct transfer reduces the per-byte and per-packet costs relative to socket based communication. A user space networking stack is disclosed that enables extensible, cross-platform-capable, user space control of the networking protocol stack functionality. The user space networking stack facilitates tighter integration between the protocol layers (including TLS) and the application or daemon. Exemplary systems can support multiple networking protocol stack instances (including an in-kernel traditional network stack).
Abstract:
Methods and apparatus for synchronization of time between independently operable processors. Time synchronization between independently operable processors is complicated by a variety of factors. For example, neither independently operable processor controls the other processor's task scheduling, power, or clocking. In one exemplary embodiment, a processor can initiates a time synchronization process by disabling power state machines and transacting timestamps for a commonly observed event. In one such embodiment, timestamps may be transferred via inter-processor communication (IPC) mechanisms (e.g., transfer descriptors (TDs), and completion descriptors (CDs)). Both processors may thereafter coordinate in time synchronization efforts (e.g., speeding up or slowing down their respective clocks, etc.).
Abstract:
Methods and apparatus for correcting out-of-order data transactions over an inter-processor communication (IPC) link between two (or more) independently operable processors. In one embodiment, a peripheral-side processor receives data from an external device and stores it to memory. The host processor writes data structures (transfer descriptors) describing the received data, regardless of the order the data was received from the external device. The transfer descriptors are written to a memory structure (transfer descriptor ring) in memory shared between the host and peripheral processors. The peripheral reads the transfer descriptors and writes data structures (completion descriptors) to another memory structure (completion descriptor ring). The completion descriptors are written to enable the host processor to retrieve the stored data in the correct order. In optimized variants, a completion descriptor describes groups of transfer descriptors. In some variants, the peripheral processor caches the transfer descriptors to offload them from the transfer descriptor ring.
Abstract:
Methods and apparatus for registering and handling access violations of host memory. In one embodiment, a peripheral processor receives one or more window registers defining an extent of address space accessible from a host processor; responsive to an attempt to access an extent of address space outside of the extent of accessible address space, generates an error message; stores the error message within a violation register; and resumes operation of the peripheral processor upon clearance of the stored error message.
Abstract:
Methods and apparatus for transacting multiple data flows between multiple processors. In one such implementation, multiple data pipes are aggregated over a common transfer data structure. Completion status information corresponding to each data pipe is provided over individual completion data structures. Allocating a common fixed pool of resources for data transfer can be used in a variety of different load balancing and/or prioritization schemes; however, individualized completion status allows for individualized data pipe reclamation. Unlike prior art solutions which dynamically created and pre-allocated memory space for each data pipe individually, the disclosed embodiments can only request resources from a fixed pool. In other words, outstanding requests are queued (rather than immediately serviced with a new memory allocation), thus overall bandwidth remains constrained regardless of the number of data pipes that are opened and/or closed.
Abstract:
Methods and apparatus for isolation of sub-system resources (such as clocks, power, and reset) within independent domains. In one embodiment, each sub-system of a system has one or more dedicated power and clock domains that operate independent of other sub-system operation. For example, in an exemplary mobile device with cellular, WLAN and PAN connectivity, each such sub-system is connected to a common memory mapped bus function, yet can operate independently. The disclosed architecture advantageously both satisfies the power consumption limitations of mobile devices, and concurrently provides the benefits of memory mapped connectivity for high bandwidth applications on such mobile devices.