SYSTEM AND METHOD FOR CONVERTING IMAGE DATA INTO A NATURAL LANGUAGE DESCRIPTION

    Publication No.: US20200372058A1

    Publication date: 2020-11-26

    Application No.: US16941299

    Filing date: 2020-07-28

    Abstract: For image captioning such as for computer game images or other images, bottom-up attention is combined with top-down attention to provide a multi-level residual attention-based image captioning model. A residual attention mechanism is first applied in the Faster R-CNN network to learn better feature representations for each region by taking spatial information into consideration. In the image captioning network, taking the extracted regional features as input, a second residual attention network is implemented to fuse the regional features attentionally for subsequent caption generation.
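The fusion step in the abstract above can be illustrated with a minimal sketch. The abstract does not give the equations, so everything here is an assumption made for illustration: the dot-product scoring, the softmax weighting, and the residual term (the unweighted mean of the region features added to the attended vector). The names `residual_attention`, `regions`, and `query` are hypothetical and do not come from the patent.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residual_attention(regions, query):
    """Fuse region feature vectors into one vector.

    Assumed form (not taken from the patent): attention weights come from
    dot products with the query, and the residual path adds the plain mean
    of the region features to the attention-weighted sum.
    """
    dim = len(query)
    # One relevance score per region: dot product with the query vector.
    scores = [sum(r[d] * query[d] for d in range(dim)) for r in regions]
    weights = softmax(scores)
    # Attention-weighted sum of region features.
    attended = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    # Residual path: unweighted mean of the region features.
    mean = [sum(r[d] for r in regions) / len(regions) for d in range(dim)]
    fused = [a + m for a, m in zip(attended, mean)]
    return fused, weights


fused, weights = residual_attention([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With a zero query every region receives the same weight, so the attended vector equals the mean and the residual sum simply doubles it.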

    Augmenting virtual reality content with real world content

    Publication No.: US10191541B2

    Publication date: 2019-01-29

    Application No.: US15385794

    Filing date: 2016-12-20

    Inventor: Ruxin Chen

    Abstract: Methods, devices, and computer programs for augmenting a virtual reality scene with real world content are provided. One example method includes an operation for obtaining sensor data from an HMD of a user to determine that a criterion is met to overlay one or more real world objects into the virtual reality scene to provide an augmented virtual reality scene. In certain examples, the criterion corresponds to predetermined indicators suggestive of disorientation of a user when wearing the HMD and being presented with a virtual reality scene. In certain other examples, the one or more real world objects are selected based on their effectiveness at reorienting a disoriented user.

    DEEP REINFORCEMENT LEARNING FRAMEWORK FOR CHARACTERIZING VIDEO CONTENT

    Publication No.: US20210124930A1

    Publication date: 2021-04-29

    Application No.: US17141028

    Filing date: 2021-01-04

    Abstract: Methods and systems for performing sequence level prediction of a video scene are described. Video information in a video scene is represented as a sequence of features depicted in each frame. One or more scene affective labels are provided at the end of the sequence. Each label pertains to the entire sequence of frames of data. An action is taken with an agent controlled by a machine learning algorithm for a current frame of the sequence at a current time step. An output of the action represents affective label prediction for the frame at the current time step. A pool of actions taken up until the current time step including the action taken with the agent is transformed into a predicted affective history for a subsequent time step. A reward is generated on predicted actions up to the current time step by comparing the predicted actions against corresponding annotated scene affective labels.
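The reward described in the abstract above, computed on the predicted actions up to the current time step, can be sketched as follows. The abstract does not state the comparison rule, so the match-fraction reward used here is an assumption for illustration only, and the function names `sequence_reward` and `rewards_over_time` are hypothetical.

```python
def sequence_reward(predicted_labels, scene_label):
    """Reward for the predictions made so far.

    Assumed rule (not taken from the patent): the fraction of per-frame
    predictions that agree with the single scene-level annotated label.
    """
    if not predicted_labels:
        return 0.0
    matches = sum(1 for p in predicted_labels if p == scene_label)
    return matches / len(predicted_labels)


def rewards_over_time(predicted_labels, scene_label):
    """Reward at each time step, using only the actions taken up to that step."""
    return [
        sequence_reward(predicted_labels[: t + 1], scene_label)
        for t in range(len(predicted_labels))
    ]


history = rewards_over_time(["happy", "sad", "happy", "happy"], "happy")
```

Because the label applies to the whole sequence, each step's reward depends on the full pool of actions taken so far rather than on the current frame alone.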

    UAV battery form factor and insertion/ejection methodologies

    Publication No.: US10850838B2

    Publication date: 2020-12-01

    Application No.: US15394473

    Filing date: 2016-12-29

    Abstract: The present disclosure is related to unmanned aerial vehicles or drones that have a capability of quickly swapping batteries. This may be accomplished even as the drone continues to fly. A drone consistent with the present disclosure may drop one battery and pick up another using an attachment mechanism. Attachment mechanisms of the present disclosure may include electro-magnets, mechanical actuators, pins, or hooks. Systems consistent with the present disclosure may also include locations where replacement batteries may be provided to aircraft via actuation devices coupled to a physical location.

    Robot Utility and Interface Device
    Invention application

    Publication No.: US20190099681A1

    Publication date: 2019-04-04

    Application No.: US15721673

    Filing date: 2017-09-29

    Abstract: Methods and systems for providing real world assistance by a robot utility and interface device (RUID) are provided. A method provides for identifying a position of a user in a physical environment and a surface within the physical environment for projecting an interactive interface. The method also provides for moving to a location within the physical environment based on the position of the user and the surface for projecting the interactive interface. Moreover, the method provides for capturing a plurality of images of the interactive interface while the interactive interface is being interacted with by the user and for determining a selection of an input option made by the user.
