Abstract:
Systems and processes are disclosed for controlling television user interactions using a virtual assistant. A virtual assistant can interact with a television set-top box to control content shown on a television. Speech input for the virtual assistant can be received from a device with a microphone. User intent can be determined from the speech input, and the virtual assistant can execute tasks according to the user's intent, including causing playback of media on the television. Virtual assistant interactions can be shown on the television in interfaces that expand or contract to occupy a minimal amount of space while conveying desired information. Multiple devices associated with multiple displays can be used to determine user intent from speech input as well as to convey information to users. In some examples, virtual assistant query suggestions can be provided to the user based on media content shown on a display.
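As an illustration only, the speech-to-playback flow described above might be sketched as follows; the SetTopBox class, the determine_intent function, and the keyword-based intent rule are hypothetical stand-ins, not the disclosed implementation.

```python
from dataclasses import dataclass

class SetTopBox:
    """Stand-in for the television set-top box controller."""
    def play(self, title: str) -> None:
        print(f"Playing on television: {title}")

    def show_overlay(self, text: str) -> None:
        # A compact on-screen interface that expands only as needed.
        print(f"Overlay: {text}")

@dataclass
class Intent:
    action: str
    media_title: str | None = None

def determine_intent(transcript: str) -> Intent:
    """Toy intent determination; a real assistant would use a speech/NLU pipeline."""
    if transcript.lower().startswith("play "):
        return Intent("play_media", transcript[5:])
    return Intent("show_info")

def handle_speech(transcript: str, box: SetTopBox) -> None:
    intent = determine_intent(transcript)
    if intent.action == "play_media":
        box.play(intent.media_title)
    else:
        box.show_overlay(transcript)

handle_speech("Play the nature documentary", SetTopBox())
```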
Abstract:
Systems and processes for operating a digital assistant are provided. In one example, a method includes receiving a first speech input from a user. The method further includes identifying context information and determining a user intent based on the first speech input and the context information. The method further includes determining whether the user intent is to perform a task using a searching process or an object managing process. The searching process is configured to search data, and the object managing process is configured to manage objects. The method further includes, in accordance with a determination that the user intent is to perform the task using the searching process, performing the task using the searching process; and in accordance with a determination that the user intent is to perform the task using the object managing process, performing the task using the object managing process.
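A minimal sketch of the intent-routing step, assuming hypothetical classify_intent, search_process, and object_managing_process functions; a real system would derive the intent with a natural-language model rather than the keyword check used here.

```python
def classify_intent(speech: str, context: dict) -> str:
    """Toy classifier: returns 'search' or 'manage'."""
    manage_verbs = ("copy", "move", "delete", "create")
    return "manage" if speech.split()[0].lower() in manage_verbs else "search"

def search_process(query: str, context: dict) -> str:
    return f"searching data for: {query}"

def object_managing_process(command: str, context: dict) -> str:
    return f"managing objects per: {command}"

def handle_request(speech: str, context: dict) -> str:
    if classify_intent(speech, context) == "search":
        return search_process(speech, context)
    return object_managing_process(speech, context)

print(handle_request("find photos from last weekend", {}))
print(handle_request("move the report to the Projects folder", {}))
```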
Abstract:
An electronic device with one or more processors and memory includes a procedure for enabling conversation persistence across two or more instances of a digital assistant. In some embodiments, the device displays a first dialogue in a first instance of a digital assistant user interface. In response to a request to display a user interface different from the digital assistant user interface, the device displays the user interface different from the digital assistant user interface. In response to a request to invoke the digital assistant, the device displays a second instance of the digital assistant user interface, including displaying a second dialogue in the second instance of the digital assistant user interface, where the first dialogue remains available for display in the second instance of the digital assistant user interface.
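One way to picture the persistence behavior, using an assumed AssistantSession class that retains the transcript across user interface instances; this is an illustrative sketch rather than the disclosed mechanism.

```python
class AssistantSession:
    def __init__(self):
        self._transcript: list[str] = []   # persists across UI instances

    def add_turn(self, speaker: str, text: str) -> None:
        self._transcript.append(f"{speaker}: {text}")

    def show_assistant_ui(self) -> list[str]:
        """Called when the assistant UI is (re)opened; prior dialogue stays available."""
        return list(self._transcript)

session = AssistantSession()
session.add_turn("user", "What's the weather?")
session.add_turn("assistant", "Sunny, 72 degrees.")
# ...user switches to a different user interface, then re-invokes the assistant...
second_instance = session.show_assistant_ui()
session.add_turn("user", "And tomorrow?")
print(second_instance)   # the first dialogue remains available for display
```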
Abstract:
Systems and processes for operating an intelligent automated assistant are provided. For example, a first speech input directed to a digital assistant is received from a user. A first response is provided based on the first speech input. A session window is initiated, wherein the session window is associated with a variable speech threshold. A second speech input is received during the session window. In accordance with a determination that the second speech input includes speech directed to the digital assistant, a duration associated with the session window is increased. In accordance with a determination that the variable speech threshold does not exceed a predetermined speech threshold, the session window is ended.
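A rough sketch of the session-window behavior, with assumed names (SessionWindow, on_speech) and assumed durations in seconds; only the duration-extension aspect is modeled here, not the variable speech threshold itself.

```python
import time

class SessionWindow:
    """Toy session window: follow-up speech directed at the assistant extends it."""
    def __init__(self, base_duration: float = 5.0, max_duration: float = 30.0):
        now = time.monotonic()
        self.deadline = now + base_duration
        self.hard_deadline = now + max_duration

    def is_open(self) -> bool:
        return time.monotonic() < self.deadline

    def on_speech(self, directed_at_assistant: bool, extension: float = 5.0) -> None:
        if directed_at_assistant and self.is_open():
            # Increase the duration associated with the session window.
            self.deadline = min(self.deadline + extension, self.hard_deadline)

window = SessionWindow()
window.on_speech(directed_at_assistant=True)    # extends the window
window.on_speech(directed_at_assistant=False)   # does not extend; window will lapse
```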
Abstract:
In one implementation, a method of changing a state of an object is performed at a device including an image sensor, one or more processors, and non-transitory memory. The method includes receiving a vocal command. The method includes obtaining, using the image sensor, an image of a physical environment. The method includes detecting, in the image of the physical environment, an object based on a visual model of the object stored in the non-transitory memory in association with an object identifier of the object. The method includes generating, based on the vocal command and detection of the object, an instruction including the object identifier of the object. The method includes effectuating the instruction to change a state of the object.
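An illustrative sketch, with assumed VisualModel, detect_object, and build_instruction names, of how a vocal command and a detected object identifier might be combined into an instruction; the byte-substring match stands in for real visual-model detection.

```python
from dataclasses import dataclass

@dataclass
class VisualModel:
    object_id: str
    template: bytes          # stored appearance model associated with the identifier

def detect_object(image: bytes, models: list[VisualModel]) -> str | None:
    """Toy detector: returns the identifier of the first model that 'matches'."""
    for model in models:
        if model.template in image:          # stand-in for real visual matching
            return model.object_id
    return None

def build_instruction(command: str, image: bytes, models: list[VisualModel]) -> dict | None:
    object_id = detect_object(image, models)
    if object_id is None:
        return None
    # The instruction carries the object identifier so a controller can change its state.
    return {"object_id": object_id, "command": command}

models = [VisualModel(object_id="lamp-01", template=b"LAMP")]
print(build_instruction("turn on that lamp", b"...LAMP...", models))
```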
Abstract:
Systems and processes are disclosed for controlling television user interactions using a virtual assistant. In an example process, a virtual assistant can interact with a television set-top box to control content shown on a television display. Speech input for the virtual assistant can be received from a device with a microphone. The speech input can comprise a query associated with content shown on the television display. A user intent of the query can be determined based on one or more of the content shown on the television display and a viewing history of media content. A result of the query can be caused to be displayed based on the determined user intent.
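A toy sketch, with assumed field names, of resolving a query against the content currently on the display and the viewing history; a real system would use far richer language understanding than the deictic-word check shown here.

```python
def resolve_query(query: str, on_screen: dict, history: list[dict]) -> dict:
    """Toy resolution: prefer the on-screen program, fall back to viewing history."""
    if "this" in query.lower() or "that" in query.lower():
        subject = on_screen                      # deictic reference -> current content
    else:
        subject = history[-1] if history else on_screen
    return {"query": query, "subject": subject["title"], "display": "result_overlay"}

on_screen = {"title": "Nature Documentary", "channel": 7}
history = [{"title": "Cooking Show"}, {"title": "Nature Documentary"}]
print(resolve_query("Who narrates this?", on_screen, history))
```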
Abstract:
A method includes outputting an alert corresponding to an information item. In some implementations, the alert is a sound. In some implementations, the alert is ambiguous (e.g., the sound indicates several possible information items). The method further includes receiving a speech input after outputting the alert. The method further includes determining whether the speech input includes a request for information about the alert. The method further includes, in response to determining that the speech input includes a request for information about the alert, providing a first speech output including information about the alert.
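As an illustration, the follow-up handling might look roughly like the sketch below; the phrase list and the pending_alert structure are assumptions, not the disclosed method.

```python
def is_question_about_alert(speech: str) -> bool:
    speech = speech.lower()
    return any(phrase in speech for phrase in
               ("what was that", "what's that sound", "what was the alert"))

def respond_to_followup(speech: str, pending_alert: dict) -> str | None:
    if is_question_about_alert(speech):
        # First speech output: describe the information item behind the alert.
        return f"That was a {pending_alert['kind']}: {pending_alert['summary']}"
    return None

pending_alert = {"kind": "calendar reminder", "summary": "Dentist at 3 PM"}
print(respond_to_followup("What was that sound?", pending_alert))
```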
Abstract:
Systems and processes for generating output dialogs for virtual assistants are provided. An output dialog can be generated from multiple output segments that can each include a string of one or more characters or words. The contents of an output segment can be selected from multiple possible outputs based on a predetermined order, conditional logic, or a random selection. The output segments can be concatenated to form the output dialog. In one example, a dialog generation file that defines the possible outputs for each output segment, an ordering of the output segments within the output dialog, and a format for the output dialog can be used to generate the output dialog. The dialog generation file can include any number of functional blocks, each of which can output an output segment, and the blocks can be arranged hierarchically and in a particular order to generate a desired output dialog.
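A minimal sketch of segment selection and concatenation, assuming a hypothetical block schema (mode, options, condition); the actual dialog generation file format is not specified here.

```python
import random

def render_segment(block: dict, context: dict) -> str:
    """Each functional block yields one segment: ordered, conditional, or random."""
    if block["mode"] == "conditional":
        return block["options"][0] if context.get(block["condition"]) else block["options"][1]
    if block["mode"] == "random":
        return random.choice(block["options"])
    return block["options"][0]                   # predetermined order: first option

def generate_dialog(blocks: list[dict], context: dict) -> str:
    return " ".join(render_segment(b, context) for b in blocks)

blocks = [
    {"mode": "random", "options": ["Sure,", "Okay,"]},
    {"mode": "conditional", "condition": "is_morning",
     "options": ["here is your morning briefing.", "here is your update."]},
]
print(generate_dialog(blocks, {"is_morning": True}))
```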
Abstract:
An electronic device receives a first input that corresponds to a request to open a respective application. In response to receiving the first input, in accordance with a determination that the device is being operated in a limited-distraction context, the device provides a limited-distraction user interface for the respective application that displays fewer selectable user interface objects than a non-limited user interface for the respective application; and in accordance with a determination that the device is not being operated in a limited-distraction context, the device provides the non-limited user interface for the respective application.
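A small sketch, with assumed names, of selecting between the two interfaces based on the limited-distraction determination.

```python
def ui_for_app(app: str, limited_distraction_context: bool) -> dict:
    full_ui = {"app": app, "buttons": ["open", "compose", "search", "settings", "archive"]}
    if limited_distraction_context:
        # Fewer selectable objects are offered than in the non-limited interface.
        return {"app": app, "buttons": full_ui["buttons"][:2]}
    return full_ui

print(ui_for_app("mail", limited_distraction_context=True))
print(ui_for_app("mail", limited_distraction_context=False))
```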