Enhanced Image Processing Techniques for Deep Neural Networks

    Publication number: US20200380639A1

    Publication date: 2020-12-03

    Application number: US16794824

    Filing date: 2020-02-19

    Applicant: Apple Inc.

    Abstract: Artistic styles extracted from source images may be applied to target images to generate stylized images and/or video sequences. The extracted artistic styles may be stored as a plurality of layers in one or more neural networks, which neural networks may be further optimized, e.g., via the fusion of various elements of the networks' architectures. The artistic style may be applied to the target images and/or video sequences using various optimization methods, such as the use of a first version of the neural network by a first processing device at a first resolution to generate one or more sets of parameters (e.g., scaling and/or biasing parameters), which parameters may then be mapped for use by a second version of the neural network by a second processing device at a second resolution. Analogous multi-processing device and/or multi-network solutions may also be applied to other complex image processing tasks for increased efficiency.
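The two-resolution idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "scaling and/or biasing parameters" are normalization statistics of a single-channel feature map; the abstract does not specify this. The function names `downsample`, `norm_params`, and `apply_params` are hypothetical, and no actual neural network is involved.

```python
import math

def downsample(image, factor):
    """Keep every `factor`-th pixel in each dimension (nearest-neighbor)."""
    return [row[::factor] for row in image[::factor]]

def norm_params(image, eps=1e-5):
    """Return (scale, bias) such that scale * x + bias has zero mean and
    approximately unit variance over the pixels of `image`."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    scale = 1.0 / math.sqrt(var + eps)
    return scale, -mean * scale

def apply_params(image, scale, bias):
    """Apply an affine scale/bias to every pixel."""
    return [[scale * p + bias for p in row] for row in image]

# Hypothetical single-channel feature map at full resolution (assumption).
full_res = [[float((x + y) % 8) for x in range(16)] for y in range(16)]

# "First processing device": estimate the parameters cheaply at low resolution.
scale, bias = norm_params(downsample(full_res, 4))

# "Second processing device": reuse those parameters at full resolution.
normalized = apply_params(full_res, scale, bias)
```

The point of the split is that the statistics pass touches only 1/16 of the pixels here, while the full-resolution pass is a plain per-pixel affine operation.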

    Multi media computing or entertainment system for responding to user presence and activity

    Publication number: US10444854B2

    Publication date: 2019-10-15

    Application number: US16055994

    Filing date: 2018-08-06

    Applicant: Apple Inc.

    Abstract: Intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image and elements of the scene may be monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, e.g., expressed through hand gesture movements. In some embodiments, such gesture movements may be interpreted based on real-time depth information obtained from, e.g., optical or non-optical type depth sensors.

    Multi media computing or entertainment system for responding to user presence and activity

    Publication number: US10048765B2

    Publication date: 2018-08-14

    Application number: US14865850

    Filing date: 2015-09-25

    Applicant: Apple Inc.

    Abstract: Varying embodiments of intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image and elements of the scene, such as walls, furniture, and humans may be evaluated and monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent or desire as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, for example, expressed through fine hand gesture movements. In some embodiments, such gesture movements may be interpreted based on real-time depth information obtained from, for example, optical or non-optical type depth sensors. The depth information may be interpreted in “slices” (three-dimensional regions of space having a relatively small depth) until one or more candidate hand structures are detected. Once detected, each candidate hand structure may be confirmed or rejected based on its own unique physical properties (e.g., shape, size and continuity to an arm structure). Each confirmed hand structure may be submitted to a depth-aware filtering process before its own unique three-dimensional features are quantified into a high-dimensional feature vector. A two-step classification scheme may be applied to the feature vectors to identify a candidate gesture (step 1), and to reject candidate gestures that do not meet a gesture-specific identification operation (step-2). The identified gesture may be used to initiate some action controlled by a computer system.
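The "slices" step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the depth map is a small 2D list in millimeters, and a slice counts as containing a candidate when it holds at least `min_pixels` pixels. The names `depth_slices` and `first_candidate_slice` are hypothetical. Confirming a candidate by shape, size, and arm continuity, the depth-aware filtering, and the two-step classification are all omitted.

```python
def depth_slices(depth, near, far, thickness):
    """Yield (z_start, mask) for consecutive depth slices from near to far.
    mask[y][x] is True when the pixel lies in [z_start, z_start + thickness)."""
    z = near
    while z < far:
        mask = [[z <= d < z + thickness for d in row] for row in depth]
        yield z, mask
        z += thickness

def first_candidate_slice(depth, near, far, thickness, min_pixels):
    """Return the start depth of the nearest slice holding at least
    `min_pixels` pixels, or None if no slice qualifies."""
    for z, mask in depth_slices(depth, near, far, thickness):
        if sum(sum(row) for row in mask) >= min_pixels:
            return z
    return None

# Hypothetical 4x4 depth map in millimeters: a small near blob in front of a wall.
depth = [
    [2000, 2000, 2000, 2000],
    [2000,  610,  620, 2000],
    [2000,  615,  625, 2000],
    [2000, 2000, 2000, 2000],
]

z = first_candidate_slice(depth, near=500, far=2500, thickness=50, min_pixels=4)
```

Scanning from near to far means the search can stop at the first slice that contains a plausible structure instead of processing the whole depth volume.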

    Three-Dimensional Hand Tracking Using Depth Sequences
    Type: Invention application
    Status: In force

    Publication number: US20160048726A1

    Publication date: 2016-02-18

    Application number: US14706649

    Filing date: 2015-05-07

    Applicant: Apple Inc.

    Abstract: In the field of Human-computer interaction (HCI), i.e., the study of the interfaces between people (i.e., users) and computers, understanding the intentions and desires of how the user wishes to interact with the computer is a very important problem. The ability to understand human gestures, and, in particular, hand gestures, as they relate to HCI, is a very important aspect in understanding the intentions and desires of the user in a wide variety of applications. In this disclosure, a novel system and method for three-dimensional hand tracking using depth sequences is described. Some of the major contributions of the hand tracking system described herein include: 1.) a robust hand detector that is invariant to scene background changes; 2.) a bi-directional tracking algorithm that prevents detected hands from always drifting closer to the front of the scene (i.e., forward along the z-axis of the scene); and 3.) various hand verification heuristics.


    USE OF PIPELINED HIERARCHICAL MOTION ESTIMATOR IN VIDEO CODING
    Type: Invention application
    Status: Under examination (published)

    Publication number: US20150341659A1

    Publication date: 2015-11-26

    Application number: US14696162

    Filing date: 2015-04-24

    Applicant: Apple Inc.

    CPC classification number: H04N19/577 H04N19/105 H04N19/53

    Abstract: A pipelined video coding system may include a motion estimation stage and an encoding stage. The motion estimation stage may operate on an input frame of video data in a first stage of operation and may generate estimates of motion and other statistical analyses. The encoding stage may operate on the input frame of video data in a second stage of operation later than the first stage. The encoding stage may perform predictive coding using coding parameters that are selected, at least in part, from the estimated motion and statistical analysis generated by the motion estimator. Because the motion estimation is performed at a processing stage that precedes the encoding, a greater amount of processing time may be devoted to such processes than in systems that performed both operations in a single processing stage.
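The staging described in this abstract can be sketched as a two-stage schedule. This is a sequential simulation, not a concurrent implementation: it only shows which frame each stage would handle in each time step. `estimate_motion` and `encode` are hypothetical placeholders, and the "motion" value they exchange is fake.

```python
def estimate_motion(frame):
    """Placeholder analysis: returns a fake 'motion' statistic for the frame."""
    return {"motion": len(frame)}

def encode(frame, hints):
    """Placeholder encoder: records which hints were used for the frame."""
    return f"enc({frame},motion={hints['motion']})"

def run_pipeline(frames):
    """Simulate a two-stage pipeline. In each time step, stage 1 analyzes
    frame t while stage 2 encodes frame t-1 using the hints produced one
    step earlier."""
    schedule, output = [], []
    pending = None  # (frame, hints) handed from stage 1 to stage 2
    for frame in list(frames) + [None]:  # one extra step drains the pipeline
        schedule.append({
            "estimate": frame,
            "encode": pending[0] if pending else None,
        })
        if pending:
            output.append(encode(*pending))
        pending = (frame, estimate_motion(frame)) if frame is not None else None
    return schedule, output

schedule, output = run_pipeline(["f0", "f1", "f2"])
```

Because each stage gets a full time step to itself, the analysis stage is not squeezed into the same step as the encoder, which is the benefit the abstract attributes to pipelining.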


    Flexible resolution support for image and video style transfer

    Publication number: US10909657B1

    Publication date: 2021-02-02

    Application number: US16032844

    Filing date: 2018-07-11

    Applicant: Apple Inc.

    Abstract: Artistic styles extracted from one or more source images may be applied to one or more target images, e.g., in the form of stylized images and/or stylized video sequences. The extracted artistic style may be stored as a plurality of layers in a neural network, which neural network may be further optimized, e.g., via the fusion of various elements of the network's architectures. An optimized network architecture may be determined for each processing environment in which the network will be applied. The artistic style may be applied to the obtained images and/or video sequence of images using various optimization methods, such as the use of scalars to control the resolution of the unstylized and stylized images, temporal consistency constraints, as well as the use of dynamically adjustable or selectable versions of Deep Neural Networks (DNN) that are responsive to system performance parameters, such as available processing resources and thermal capacity.
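Selecting a network version in response to system performance parameters might look like the sketch below. Everything specific here is an assumption for illustration: the variant names, the relative `cost` units, and the rule that the compute budget is the available compute scaled by a thermal headroom factor in [0, 1]. The abstract does not describe a particular selection policy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NetworkVariant:
    name: str
    cost: float  # relative compute cost in hypothetical units

# Hypothetical set of selectable network versions.
VARIANTS = [
    NetworkVariant("small", 1.0),
    NetworkVariant("medium", 2.5),
    NetworkVariant("large", 6.0),
]

def select_variant(available_compute, thermal_headroom, variants=VARIANTS):
    """Pick the most expensive variant whose cost fits the current budget.
    Falls back to the cheapest variant when nothing fits."""
    headroom = max(0.0, min(1.0, thermal_headroom))  # clamp to [0, 1]
    budget = available_compute * headroom
    affordable = [v for v in variants if v.cost <= budget]
    if not affordable:
        return min(variants, key=lambda v: v.cost)
    return max(affordable, key=lambda v: v.cost)
```

For example, `select_variant(8.0, 0.5)` yields a budget of 4.0 and therefore picks the "medium" variant, while the same compute with full headroom picks "large".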

    COMPLEXITY-AWARE ENCODING
    Type: Invention application

    Publication number: US20150003515A1

    Publication date: 2015-01-01

    Application number: US14479014

    Filing date: 2014-09-05

    Applicant: Apple Inc.

    CPC classification number: H04N19/154 H04N19/156 H04N19/164

    Abstract: Techniques for encoding data based at least in part upon an awareness of the decoding complexity of the encoded data and the ability of a target decoder to decode the encoded data are disclosed. In some embodiments, a set of data is encoded based at least in part upon a state of a target decoder to which the encoded set of data is to be provided. In some embodiments, a set of data is encoded based at least in part upon the states of multiple decoders to which the encoded set of data is to be provided.
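A minimal sketch of encoding against decoder state, under loudly stated assumptions: each decoder reports a hypothetical state with `max_complexity` and `load` fields, each encoding profile carries a hypothetical `complexity` score, and the encoder picks the most complex profile that every target decoder can still handle. None of these names or rules come from the abstract.

```python
def decode_capacity(state):
    """Hypothetical capacity score derived from a decoder's reported state."""
    return state["max_complexity"] * (1.0 - state["load"])

def choose_profile(profiles, decoder_states):
    """Choose the highest-complexity profile that all target decoders can
    decode. Falls back to the simplest profile when none is feasible."""
    capacity = min(decode_capacity(s) for s in decoder_states)
    feasible = [p for p in profiles if p["complexity"] <= capacity]
    if not feasible:
        return min(profiles, key=lambda p: p["complexity"])
    return max(feasible, key=lambda p: p["complexity"])

# Hypothetical encoding profiles and decoder states.
PROFILES = [
    {"name": "baseline", "complexity": 1},
    {"name": "main", "complexity": 3},
    {"name": "high", "complexity": 5},
]
idle_decoder = {"max_complexity": 6, "load": 0.0}
busy_decoder = {"max_complexity": 4, "load": 0.5}

single = choose_profile(PROFILES, [idle_decoder])
shared = choose_profile(PROFILES, [idle_decoder, busy_decoder])
```

With a single idle decoder the sketch selects "high"; when the same stream must also serve the busy decoder, the weakest decoder sets the limit and "baseline" is selected.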
