-
Publication number: US11219837B2
Publication date: 2022-01-11
Application number: US15721673
Filing date: 2017-09-29
Applicant: Sony Interactive Entertainment Inc.
Inventor: Javier Fernandez Rico , Erik Beran , Michael Taylor , Ruxin Chen
IPC: G06F3/00 , A63F13/90 , B25J5/00 , H04N9/31 , B25J11/00 , B25J15/10 , B25J9/04 , B25J15/04 , A63F13/213 , B25J13/08 , G06F3/01 , G06F3/042 , G06F3/0481 , G06F3/16
Abstract: Methods and systems are provided for real world assistance by a robot utility and interface device (RUID). A method provides for identifying a position of a user in a physical environment and a surface within the physical environment for projecting an interactive interface. The method also provides for moving to a location within the physical environment based on the position of the user and the surface for projecting the interactive interface. Moreover, the method provides for capturing a plurality of images of the interactive interface while the interactive interface is being interacted with by the user and for determining a selection of an input option made by the user.
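Example: the last step of the claimed method, determining which projected input option the user selected, can be pictured as mapping a fingertip position detected in the captured images onto the projected option regions. The Python lines below are a minimal illustrative sketch; the function name, option layout, and coordinates are assumptions, not the patent's implementation.

# Hypothetical illustration: map a detected fingertip position to the
# projected option whose on-surface rectangle contains it.
def select_option(fingertip_xy, option_rects):
    """Return the projected option whose rectangle contains the fingertip,
    or None if the touch landed outside every option."""
    x, y = fingertip_xy
    for name, (left, top, right, bottom) in option_rects.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None

# Example layout: three buttons projected side by side (invented pixel coords).
options = {
    "play":     (0,   0, 100, 50),
    "settings": (110, 0, 210, 50),
    "quit":     (220, 0, 320, 50),
}
print(select_option((150, 25), options))  # -> "settings"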
-
Publication number: US10657701B2
Publication date: 2020-05-19
Application number: US15404074
Filing date: 2017-01-11
Applicant: Sony Interactive Entertainment Inc.
Inventor: Steven Osman , Javier Fernandez Rico , Ruxin Chen
IPC: G06T15/20 , G06F1/16 , A63F13/847 , A63F13/48 , A63F13/795 , A63F13/5255 , A63F13/25 , A63F13/56 , G02B27/01 , G06F3/01 , G06T7/70 , G06F3/03 , G06T13/40 , G06T19/00 , G02B27/00 , A63F13/26 , A63F13/21 , A63F13/86
Abstract: Systems and methods for processing operations for head mounted display (HMD) users to join virtual reality (VR) scenes are provided. A computer-implemented method includes providing a first perspective of a VR scene to a first HMD of a first user and receiving an indication that a second user, using a second HMD, is requesting to join the VR scene provided to the first HMD. The method further includes obtaining real-world position and orientation data of the second HMD relative to the first HMD and then providing, based on said data, a second perspective of the VR scene. The method also provides that the first and second perspectives are each controlled by respective position and orientation changes while viewing the VR scene.
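Example: the core geometric step, deriving the second user's viewpoint from the first user's VR pose plus the measured real-world offset between the two HMDs, might look like the sketch below. The planar rotation, the simple yaw composition, and all variable names are simplifying assumptions for illustration, not the patent's actual computation.

import numpy as np

def second_perspective(first_vr_pos, first_vr_yaw, rel_offset, rel_yaw):
    """first_vr_pos: first user's position in the VR scene (x, y).
    first_vr_yaw: first user's heading in radians.
    rel_offset:   second HMD's real-world offset from the first HMD, in the
                  first HMD's frame (x, y).
    rel_yaw:      second HMD's heading relative to the first HMD."""
    c, s = np.cos(first_vr_yaw), np.sin(first_vr_yaw)
    rot = np.array([[c, -s], [s, c]])           # rotate the offset into the VR frame
    pos = np.asarray(first_vr_pos) + rot @ np.asarray(rel_offset)
    yaw = first_vr_yaw + rel_yaw                # headings simply compose in 2D
    return pos, yaw

pos, yaw = second_perspective((3.0, 1.0), np.pi / 2, (0.8, 0.0), -np.pi / 4)
print(pos, yaw)   # second user's viewpoint inside the shared scene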
-
Publication number: US10376785B2
Publication date: 2019-08-13
Application number: US15199384
Filing date: 2016-06-30
Applicant: Sony Interactive Entertainment Inc.
Inventor: Gustavo Hernandez-Abrego , Xavier Menendez-Pidal , Steven Osman , Ruxin Chen , Rishi Deshpande , Care Michaud-Wideman , Richard Marks , Eric J. Larsen , Xiaodong Mao
IPC: A63F13/428 , G10L17/04 , A63F13/213 , A63F13/217 , G06F3/01 , G10L17/00 , G06F3/00
Abstract: Consumer electronic devices have been developed with enormous information processing capabilities, high quality audio and video outputs, large amounts of memory, and may also include wired and/or wireless networking capabilities. Additionally, relatively unsophisticated and inexpensive sensors, such as microphones, video cameras, GPS or other position sensors, when coupled with devices having these enhanced capabilities, can be used to detect subtle features about users and their environments. A variety of audio, video, simulation and user interface paradigms have been developed to utilize the enhanced capabilities of these devices. These paradigms can be used separately or together in any combination. One paradigm automatically creates user identities using speaker identification. Another paradigm includes a control button with 3-axis pressure sensitivity for use with game controllers and other input devices.
-
Publication number: US20190163977A1
Publication date: 2019-05-30
Application number: US16171018
Filing date: 2018-10-25
Applicant: Sony Interactive Entertainment Inc.
Inventor: Ruxin Chen , Naveen Kumar , Haoqi Li
Abstract: Methods and systems for performing sequence level prediction of a video scene are described. Video information in a video scene is represented as a sequence of features depicting each frame. An environment state for each time step t corresponding to each frame is represented by the video information for time step t and predicted affective information from the previous time step t−1. An action A(t) is taken with an agent controlled by a machine learning algorithm for the frame at step t, wherein an output of the action A(t) represents the affective label prediction for the frame at time step t. A pool of predicted actions is transformed into a predicted affective history at the next time step t+1. The predicted affective history is included as part of the environment state for the next time step t+1. A reward R is generated on predicted actions up to the current time step t by comparing them against corresponding annotated movie scene affective labels.
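Example: the per-frame loop described above (environment state built from frame features plus the predicted affective history, one action per frame, a reward against annotations) can be sketched as follows. The agent is a random stand-in policy, and the label set, features, and reward formula are invented to show the data flow, not the trained model.

import random

LABELS = ["neutral", "tense", "sad", "happy"]           # assumed label set

def predict_scene(frame_features, annotated_labels):
    history = []                                        # predicted affective history
    rewards = []
    for t, feats in enumerate(frame_features):
        state = (feats, list(history))                  # video info + history up to t-1
        action = random.choice(LABELS)                  # stand-in for the learned policy (would consume state)
        history.append(action)                          # pool of actions -> history for t+1
        # Reward on predictions so far, against the per-frame annotations.
        correct = sum(p == a for p, a in zip(history, annotated_labels))
        rewards.append(correct / len(history))
    return history, rewards

frames = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.9]]           # fake per-frame features
labels = ["neutral", "tense", "tense"]
print(predict_scene(frames, labels))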
-
Publication number: US20180095463A1
Publication date: 2018-04-05
Application number: US15394391
Filing date: 2016-12-29
Applicant: Sony Interactive Entertainment Inc.
Inventor: Dennis Dale Castleman , Ruxin Chen , Frank Zhao , Glenn Black
CPC classification number: G11B27/102 , B64C39/024 , B64C2201/127 , B64C2201/141 , B64C2201/146 , G05D1/0022 , G06F3/04847
Abstract: A flight path management system manages flight paths for an unmanned aerial vehicle (UAV). The flight path management system receives a sequence of controller inputs for the UAV and stores the sequence of controller inputs in a memory. The flight path management system accesses the memory and selects a section of the sequence of controller inputs corresponding to a time period. The flight path management system outputs the selected section to a playback device in real time over the length of the time period.
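Example: a minimal sketch of the record/select/replay flow, assuming timestamped controller inputs and using print as a stand-in for the playback device; the input format and pacing logic are illustrative assumptions, not the patent's system.

import time

log = []                                     # (timestamp, input) pairs

def record(t, controller_input):
    log.append((t, controller_input))

def replay_section(start, end, speedup=1.0):
    # Select the section of inputs that falls inside the requested time period.
    section = [(t, c) for t, c in log if start <= t <= end]
    t0 = section[0][0]
    begin = time.monotonic()
    for t, c in section:
        # Wait until this input's offset within the section has elapsed,
        # so the section plays back in real time.
        while time.monotonic() - begin < (t - t0) / speedup:
            time.sleep(0.001)
        print(f"{t:6.2f}s  {c}")             # stand-in for sending to a playback device

record(0.0, "throttle 40%")
record(1.5, "yaw left 10 deg")
record(3.0, "throttle 55%")
replay_section(0.0, 3.0)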
-
Publication number: US11568265B2
Publication date: 2023-01-31
Application number: US15684830
Filing date: 2017-08-23
Applicant: Sony Interactive Entertainment Inc.
Inventor: Michael Taylor , Javier Fernandez-Rico , Sergey Bashkirov , Jaekwon Yoo , Ruxin Chen
Abstract: An autonomous personal companion executing a method including capturing data related to user behavior. Patterns of user behavior are identified in the data and classified using predefined patterns associated with corresponding predefined tags to generate a collected set of one or more tags. The collected set is compared to the sets of predefined tags of a plurality of scenarios, each scenario corresponding to one or more predefined patterns of user behavior and a set of predefined tags. A weight is assigned to each of the sets of predefined tags, wherein each weight defines a corresponding match quality between the collected set of tags and the corresponding set of predefined tags. The sets of predefined tags are sorted by weight in descending order. A matched scenario is selected for the collected set of tags, the matched scenario being associated with the matched set of predefined tags whose weight indicates the highest match quality.
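Example: the matching step (weight each scenario's predefined tag set against the collected tags, sort by weight in descending order, select the best match) might be sketched as below. The Jaccard overlap used as the weight, and the scenario names and tags, are assumptions for illustration only.

def match_scenario(collected_tags, scenarios):
    collected = set(collected_tags)
    weighted = []
    for name, tag_set in scenarios.items():
        tags = set(tag_set)
        weight = len(collected & tags) / len(collected | tags)   # assumed match quality
        weighted.append((weight, name))
    weighted.sort(reverse=True)                  # descending by weight
    return weighted[0][1], weighted              # matched scenario + full ranking

scenarios = {
    "watching_movie": ["sofa", "evening", "tv_on"],
    "gaming":         ["controller", "tv_on", "leaning_forward"],
    "cooking":        ["kitchen", "standing"],
}
print(match_scenario(["tv_on", "controller", "evening"], scenarios))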
-
Publication number: US11386657B2
Publication date: 2022-07-12
Application number: US17141028
Filing date: 2021-01-04
Applicant: Sony Interactive Entertainment Inc.
Inventor: Ruxin Chen , Naveen Kumar , Haoqi Li
Abstract: Methods and systems for performing sequence level prediction of a video scene are described. Video information in a video scene is represented as a sequence of features depicting each frame. One or more scene affective labels are provided at the end of the sequence; each label pertains to the entire sequence of frames of data. An action is taken with an agent controlled by a machine learning algorithm for a current frame of the sequence at a current time step. An output of the action represents the affective label prediction for the frame at the current time step. A pool of actions taken up until the current time step, including the action taken with the agent, is transformed into a predicted affective history for a subsequent time step. A reward is generated on predicted actions up to the current time step by comparing the predicted actions against the corresponding annotated scene affective labels.
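Example: unlike the per-frame variant above, here the annotated affective labels apply to the whole scene, so the pool of per-step predictions is scored against scene-level labels. A toy version with a simple hit-rate reward, which is an assumption rather than the patent's formula:

def reward(predicted_actions, scene_labels):
    """predicted_actions: labels the agent has emitted up to the current step.
    scene_labels: affective labels annotated for the entire scene."""
    if not predicted_actions:
        return 0.0
    hits = sum(a in scene_labels for a in predicted_actions)
    return hits / len(predicted_actions)

print(reward(["tense", "tense", "neutral"], {"tense", "sad"}))  # -> 0.666...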
-
Publication number: US20200175053A1
Publication date: 2020-06-04
Application number: US16206439
Filing date: 2018-11-30
Applicant: Sony Interactive Entertainment Inc.
Inventor: Jian Zheng , Ruxin Chen
IPC: G06F16/383 , G06N5/04 , G06K9/32 , G06F16/583
Abstract: For image captioning such as for computer game images or other images, bottom-up attention is combined with top-down attention to provide a multi-level residual attention-based image captioning model. A residual attention mechanism is first applied in the Faster R-CNN network to learn better feature representations for each region by taking spatial information into consideration. In the image captioning network, taking the extracted regional features as input, a second residual attention network is implemented to fuse the regional features attentionally for subsequent caption generation.
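Example: the attentional fusion of regional features can be pictured as a softmax-weighted sum over per-region vectors with a residual term. The numpy sketch below is self-contained and illustrative only; the random scoring vector and the mean-feature residual are assumptions made to keep the example short, not the model's exact architecture.

import numpy as np

def fuse_regions(region_feats, score_vec):
    """region_feats: (num_regions, dim) features from a detector such as
    Faster R-CNN. score_vec: (dim,) scoring vector (random here)."""
    scores = region_feats @ score_vec                    # one score per region
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                             # softmax attention weights
    attended = weights @ region_feats                    # weighted sum over regions
    return attended + region_feats.mean(axis=0)          # residual: add unattended mean

rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 8))                        # 5 regions, 8-dim features
print(fuse_regions(regions, rng.normal(size=8)).shape)   # (8,)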
-
Publication number: US20190065960A1
Publication date: 2019-02-28
Application number: US15684830
Filing date: 2017-08-23
Applicant: Sony Interactive Entertainment Inc.
Inventor: Michael Taylor , Javier Fernandez-Rico , Sergey Bashkirov , Jaekwon Yoo , Ruxin Chen
IPC: G06N3/08
Abstract: An autonomous personal companion executing a method including capturing data related to user behavior. Patterns of user behavior are identified in the data and classified using predefined patterns associated with corresponding predefined tags to generate a collected set of one or more tags. The collected set is compared to the sets of predefined tags of a plurality of scenarios, each scenario corresponding to one or more predefined patterns of user behavior and a set of predefined tags. A weight is assigned to each of the sets of predefined tags, wherein each weight defines a corresponding match quality between the collected set of tags and the corresponding set of predefined tags. The sets of predefined tags are sorted by weight in descending order. A matched scenario is selected for the collected set of tags, the matched scenario being associated with the matched set of predefined tags whose weight indicates the highest match quality.
-
Publication number: US20190013015A1
Publication date: 2019-01-10
Application number: US15645985
Filing date: 2017-07-10
Applicant: Sony Interactive Entertainment Inc.
Inventor: Xavier Menendez-Pidal , Ruxin Chen
CPC classification number: G10L15/144 , G10L15/02 , G10L15/063 , G10L15/16 , G10L2015/025
Abstract: A method for improved initialization of a speech recognition system comprises mapping a trained Hidden Markov Model based recognition node network (HMM) to a Connectionist Temporal Classification (CTC) based node label scheme. The central state of each frame in the HMM is mapped to a CTC-labeled output node and the non-central states of each frame are mapped to CTC-blank nodes to generate a CTC-labeled HMM; each central state represents a phoneme from human speech detected and extracted by a computing device. Next, the CTC-labeled HMM is trained using a cost function that is not a CTC cost function. Finally, the CTC-labeled HMM is trained using a CTC cost function to produce a CTC node network. The CTC node network may be iteratively trained by repeating the initialization steps.
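Example: the state-to-label mapping step, assuming a 3-state left-to-right HMM per phoneme whose central state carries the phoneme label, might look like the sketch below; the 3-state layout, the blank symbol, and the phone names are assumptions, not the patent's exact scheme.

BLANK = "<blank>"

def hmm_states_to_ctc_labels(state_sequence):
    """state_sequence: list of (phoneme, state_index) pairs, with state_index
    in {0, 1, 2} for an assumed 3-state left-to-right HMM. Returns CTC labels."""
    labels = []
    for phoneme, state_index in state_sequence:
        if state_index == 1:                 # central state -> phoneme label
            labels.append(phoneme)
        else:                                # entry/exit states -> blank
            labels.append(BLANK)
    return labels

alignment = [("k", 0), ("k", 1), ("k", 2), ("ae", 0), ("ae", 1), ("ae", 2)]
print(hmm_states_to_ctc_labels(alignment))
# ['<blank>', 'k', '<blank>', '<blank>', 'ae', '<blank>']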
-