AI-assisted sound effect generation for silent video

    Publication (Announcement) Number: US11381888B2

    Publication (Announcement) Date: 2022-07-05

    Application Number: US16848512

    Application Date: 2020-04-14

    Abstract: Sound effect recommendations for visual input are generated by training machine learning models that learn coarse-grained and fine-grained audio-visual correlations from a reference visual, a positive audio signal, and a negative audio signal. A trained Sound Recommendation Network is configured to output an audio embedding and a visual embedding and to use them to compute a correlation distance between an image frame or video segment and one or more audio segments retrieved from a database. The correlation distances for the audio segments in the database are sorted, and the audio segment or segments with the closest correlation distance are determined. The audio segment with the closest correlation distance is applied to the input image frame or video segment.
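    The retrieval step described in this abstract — compute correlation distances between a visual embedding and the audio embeddings in a database, sort them, and apply the closest audio segment — might look roughly like the sketch below. The function name, the 512-dimensional embeddings, and the use of Euclidean distance as the correlation distance are assumptions for illustration, not details from the patent.

```python
import numpy as np

def rank_audio_segments(visual_emb: np.ndarray, audio_db: np.ndarray) -> np.ndarray:
    """Return database indices ordered from closest to farthest correlation distance."""
    # Correlation distance approximated here as Euclidean distance between embeddings.
    distances = np.linalg.norm(audio_db - visual_emb, axis=1)
    return np.argsort(distances)

# Hypothetical data: one video-segment embedding and 1,000 candidate audio embeddings.
visual_emb = np.random.rand(512)
audio_db = np.random.rand(1000, 512)
ranked = rank_audio_segments(visual_emb, audio_db)
closest = ranked[0]  # index of the audio segment applied to the input video segment
```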

    CLUSTERING AUDIENCE BASED ON EXPRESSIONS CAPTURED FROM DIFFERENT SPECTATORS OF THE AUDIENCE

    Publication (Announcement) Number: US20220168644A1

    Publication (Announcement) Date: 2022-06-02

    Application Number: US17220709

    Application Date: 2021-04-01

    Abstract: Methods and systems for representing emotions of an audience of spectators viewing online gaming of a video game include capturing interaction data from spectators in an audience engaged in watching gameplay of the video game. The captured interaction data is aggregated by clustering the spectators into different groups in accordance with emotions detected from the spectators in the audience. An avatar is generated to represent the emotion of each group, and the expressions of the avatar are dynamically adjusted to match changes in the expressions of the spectators of the respective group. The avatars representing the distinct emotions of the different groups of spectators are presented alongside content of the video game. The size of the avatar for each distinct emotion is influenced by the confidence score associated with the respective group of spectators.
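    A minimal sketch of the grouping and avatar-sizing logic described above, assuming an upstream detector that emits one emotion label and confidence score per spectator; the emotion labels, the mean-confidence definition, and the pixel sizing formula are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical detector output: (spectator_id, emotion_label, confidence in [0, 1]).
detections = [
    ("s1", "excited", 0.92), ("s2", "excited", 0.81),
    ("s3", "bored", 0.67), ("s4", "surprised", 0.88),
]

# Cluster spectators into groups by detected emotion.
groups = defaultdict(list)
for spectator_id, emotion, confidence in detections:
    groups[emotion].append(confidence)

# One avatar per group; its on-screen size scales with the group's mean confidence.
BASE_SIZE_PX = 64
avatars = {
    emotion: {
        "spectators": len(scores),
        "confidence": sum(scores) / len(scores),
        "size_px": int(BASE_SIZE_PX * sum(scores) / len(scores)),
    }
    for emotion, scores in groups.items()
}
print(avatars)
```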

    SELF-SUPERVISED AI-ASSISTED SOUND EFFECT RECOMMENDATION FOR SILENT VIDEO

    Publication (Announcement) Number: US20210319321A1

    Publication (Announcement) Date: 2021-10-14

    Application Number: US16848484

    Application Date: 2020-04-14

    Abstract: Sound effect recommendations for visual input are generated by training machine learning models that learn coarse-grained and fine-grained audio-visual correlations from a reference image, a positive audio signal, and a negative audio signal. A positive audio embedding related to the reference image is generated from the positive audio signal, and a negative audio embedding is generated from the negative audio signal. A machine learning algorithm uses the reference image, the positive audio embedding, and the negative audio embedding as inputs to train a visual-to-audio correlation neural network to output a smaller distance between the positive audio embedding and the reference image than between the negative audio embedding and the reference image.
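    The training objective described here is essentially a triplet objective over a shared audio-visual embedding space. A minimal PyTorch sketch follows; the encoder architecture, margin, batch shapes, and the assumption that audio embeddings are precomputed are all illustrative choices, not the patent's actual network.

```python
import torch
import torch.nn as nn

class VisualEncoder(nn.Module):
    """Maps a reference image into the shared audio-visual embedding space."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, embed_dim),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.net(image)

encoder = VisualEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
triplet_loss = nn.TripletMarginLoss(margin=0.2)  # anchor, positive, negative

# Hypothetical batch: reference images plus precomputed audio embeddings.
images = torch.randn(8, 3, 224, 224)
positive_audio_emb = torch.randn(8, 128)
negative_audio_emb = torch.randn(8, 128)

visual_emb = encoder(images)
loss = triplet_loss(visual_emb, positive_audio_emb, negative_audio_emb)
loss.backward()
optimizer.step()
```

    Minimizing this loss pushes the reference-image embedding closer to the positive audio embedding than to the negative one, which is the distance relationship the abstract describes.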

    Mapping visual tags to sound tags using text similarity

    Publication (Announcement) Number: US11030479B2

    Publication (Announcement) Date: 2021-06-08

    Application Number: US16399640

    Application Date: 2019-04-30

    Abstract: Sound effects (SFX) are registered in a database for efficient search and retrieval. This may be accomplished by classifying SFX and using a machine learning engine to output a first of the classified SFX for a first computer simulation based on learned correlations between video attributes of the first computer simulation and the classified SFX. Subsequently, videos without sound may be processed for object, action, and caption recognition to generate video tags which are semantically matched with SFX tags to associate SFX with the video.
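    The tag-matching step (semantically matching generated video tags against SFX tags) could be sketched as below. A production system would use a learned text-similarity model; the Jaccard word-overlap score and the example tags here are stand-in assumptions.

```python
def text_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (placeholder for a semantic model)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

# Hypothetical tags from video understanding (objects, actions, captions) and from the SFX database.
video_tags = ["car door slam", "dog barking outside"]
sfx_tags = ["door slam", "car engine", "dog bark", "crowd cheering"]

# For each recognized video tag, pick the SFX tag with the highest similarity.
matches = {vt: max(sfx_tags, key=lambda st: text_similarity(vt, st)) for vt in video_tags}
print(matches)  # {'car door slam': 'door slam', 'dog barking outside': 'dog bark'}
```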

    Personalized data driven game training system

    Publication (Announcement) Number: US10828567B2

    Publication (Announcement) Date: 2020-11-10

    Application Number: US16173784

    Application Date: 2018-10-29

    Abstract: A video game console, a video game system, and a computer-implemented method are described. Generally, a video game and video game assistance are adapted to a player. For example, a narrative of the video game is personalized to an experience level of the player. Similarly, assistance in interacting with a particular context of the video game is also personalized. The personalization learns from historical interactions of players with the video game and, optionally, other video games. In an example, a deep learning neural network is implemented to generate knowledge from the historical interactions. The personalization is set according to the knowledge.
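    As a rough illustration of personalization driven by historical interactions, the sketch below fits a small network that maps hypothetical player-history features to an assistance level; the feature names, the target, and the architecture are assumptions, not the patent's model.

```python
import torch
import torch.nn as nn

# Hypothetical per-player features: [hours played, win rate, deaths per level, retries].
history = torch.tensor([[120.0, 0.55, 2.1, 3.0],
                        [  4.0, 0.20, 7.8, 9.0]])
# Target: how much assistance to offer, in [0, 1] (0 = none, 1 = maximum).
assistance = torch.tensor([[0.1], [0.9]])

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for _ in range(200):  # fit on the historical interactions
    optimizer.zero_grad()
    loss = loss_fn(model(history), assistance)
    loss.backward()
    optimizer.step()

# A new player's history maps to a personalized assistance level.
new_player = torch.tensor([[10.0, 0.30, 5.5, 6.0]])
print(float(model(new_player)))
```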

    MAPPING VISUAL TAGS TO SOUND TAGS USING TEXT SIMILARITY

    Publication (Announcement) Number: US20200349387A1

    Publication (Announcement) Date: 2020-11-05

    Application Number: US16399640

    Application Date: 2019-04-30

    Abstract: Sound effects (SFX) are registered in a database for efficient search and retrieval. This may be accomplished by classifying SFX and using a machine learning engine to output a first of the classified SFX for a first computer simulation based on learned correlations between video attributes of the first computer simulation and the classified SFX. Subsequently, videos without sound may be processed for object, action, and caption recognition to generate video tags which are semantically matched with SFX tags to associate SFX with the video.

    PERSONALIZED DATA DRIVEN GAME TRAINING SYSTEM

    Publication (Announcement) Number: US20190060759A1

    Publication (Announcement) Date: 2019-02-28

    Application Number: US16173755

    Application Date: 2018-10-29

    CPC classification number: A63F13/5375 A63F13/422 A63F13/67 A63F13/79

    Abstract: A video game console, a video game system, and a computer-implemented method are described. Generally, a video game and video game assistance are adapted to a player. For example, a narrative of the video game is personalized to an experience level of the player. Similarly, assistance in interacting with a particular context of the video game is also personalized. The personalization learns from historical interactions of players with the video game and, optionally, other video games. In an example, a deep learning neural network is implemented to generate knowledge from the historical interactions. The personalization is set according to the knowledge.

    TRAINING A SOUND EFFECT RECOMMENDATION NETWORK

    Publication (Announcement) Number: US20250165789A1

    Publication (Announcement) Date: 2025-05-22

    Application Number: US19013693

    Application Date: 2025-01-08

    Abstract: A sound effect recommendation network is trained using a machine learning algorithm that takes a reference image, a positive audio embedding, and a negative audio embedding as inputs and trains a visual-to-audio correlation neural network to output a smaller distance between the positive audio embedding and the reference image than between the negative audio embedding and the reference image. The visual-to-audio correlation neural network is trained to identify one or more visual elements in the reference image and map the one or more visual elements to one or more sound categories or subcategories within an audio database.
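    The second stage described here — mapping detected visual elements to sound categories in an audio database — could be sketched as a nearest-prototype lookup in the shared embedding space. The category names, prototype embeddings, and cosine-similarity rule are illustrative assumptions, not details from the patent.

```python
import numpy as np

# Hypothetical category prototypes: e.g., the mean audio embedding per database category.
category_prototypes = {
    "footsteps": np.random.rand(128),
    "gunshot": np.random.rand(128),
    "engine": np.random.rand(128),
}

def map_to_sound_category(visual_element_emb: np.ndarray) -> str:
    """Return the sound category whose prototype is most similar (cosine) to the element."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(category_prototypes,
               key=lambda c: cosine(visual_element_emb, category_prototypes[c]))

# One detected visual element (e.g., a car) embedded by the trained network.
element_emb = np.random.rand(128)
print(map_to_sound_category(element_emb))
```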
