Abstract:
A system trains a machine learning model to generate a high-resolution depth image. During a training phase, the system generates an accurate three-dimensional reconstruction of a training scene, and the machine learning model is iteratively trained to minimize an error between the high-resolution depth image it produces and the depth information in that reconstruction. During a real-time phase, the system applies the trained machine learning model to images captured from a scene of interest to generate a high-resolution depth image with improved accuracy. The resulting depth image can subsequently be used to solve computer vision problems.
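The training phase described above amounts to a supervised regression loop in which the ground truth is depth taken from the accurate three-dimensional reconstruction. The following is a minimal sketch of such a loop; the PyTorch architecture, L1 loss, and synthetic data are illustrative assumptions, not the system's actual implementation.

```python
# Minimal sketch (hypothetical model, loss, and data) of the training-phase loop:
# a model upsamples a low-resolution depth map and is trained against depth
# derived from an accurate 3D reconstruction of the training scene.
import torch
import torch.nn as nn

class DepthUpsampler(nn.Module):
    """Toy stand-in for the machine learning model (assumed architecture)."""
    def __init__(self, scale: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Upsample(scale_factor=scale, mode="bilinear", align_corners=False),
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, low_res_depth):
        return self.net(low_res_depth)

# Random tensors stand in for (low-res depth, reconstruction-derived depth) pairs;
# a real pipeline would render the target depth from the 3D reconstruction.
training_batches = [
    (torch.rand(2, 1, 32, 32), torch.rand(2, 1, 128, 128)) for _ in range(4)
]

model = DepthUpsampler()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.L1Loss()  # error between predicted and reconstruction-derived depth

for low_res_depth, reconstruction_depth in training_batches:
    pred = model(low_res_depth)                  # higher-resolution depth estimate
    loss = loss_fn(pred, reconstruction_depth)   # compare against reconstruction depth
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```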
Abstract:
In one embodiment, a method includes accessing a point cloud comprising a plurality of points, wherein each point corresponds to a location on a surface of an object located in three-dimensional space; determining whether each point in the point cloud is part of a linear structure, a planar structure, or a volumetric structure; identifying a plurality of point clusters, wherein each point cluster comprises one or more points that are located within a grid segment on a two-dimensional grid derived from the three-dimensional space; determining, for each point cluster, whether the point cluster represents a vertical-linear structure or a portion of a vertical-linear structure; identifying one or more point-cluster pairs, wherein each point-cluster pair includes two point clusters corresponding to one or more vertical-linear structures within a threshold distance in the three-dimensional space; and determining, for each point-cluster pair, whether a line-of-sight exists between the two point clusters in the point-cluster pair.
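For illustration only, the grid-based clustering and the vertical-linear test described above might be sketched as follows; the cell size, height and footprint thresholds, and pairing distance are invented placeholders, and the line-of-sight determination is left as a stub.

```python
# Sketch (hypothetical thresholds and helpers) of grid-based clustering and
# vertical-linear structure detection over a point cloud.
import numpy as np
from collections import defaultdict
from itertools import combinations

CELL_SIZE = 0.5       # assumed size of a 2D grid segment (metres)
PAIR_DISTANCE = 5.0   # assumed threshold distance between paired clusters

def grid_clusters(points):
    """Group points into clusters by the 2D grid cell their (x, y) falls in."""
    clusters = defaultdict(list)
    for p in points:
        cell = (int(p[0] // CELL_SIZE), int(p[1] // CELL_SIZE))
        clusters[cell].append(p)
    return {cell: np.array(pts) for cell, pts in clusters.items()}

def is_vertical_linear(cluster, min_height=2.0, max_footprint=0.5):
    """Heuristic: tall vertical extent with a small horizontal footprint."""
    extent = cluster.max(axis=0) - cluster.min(axis=0)
    return extent[2] >= min_height and max(extent[0], extent[1]) <= max_footprint

points = np.random.rand(1000, 3) * [20, 20, 5]   # placeholder point cloud
clusters = grid_clusters(points)
vertical = {c: pts for c, pts in clusters.items() if is_vertical_linear(pts)}

# Pair vertical-linear clusters whose centroids lie within the threshold distance;
# a line-of-sight check between each pair would follow here.
pairs = []
for (ca, pa), (cb, pb) in combinations(vertical.items(), 2):
    if np.linalg.norm(pa.mean(axis=0) - pb.mean(axis=0)) <= PAIR_DISTANCE:
        pairs.append((ca, cb))
```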
Abstract:
The techniques introduced here include a system and method for transcoding multimedia content based on the results of content analysis. The specific transcoding parameters used for transcoding the multimedia content can be determined by utilizing the results of content analysis of that content. One result of the content analysis is the determination of the image type of any images included in the multimedia content. The content analysis uses one or more of several techniques, including analyzing content metadata, examining colors of contiguous pixels in the content, using histogram analysis, using compression-distortion analysis, analyzing image edges, or examining user-provided inputs. Transcoding the multimedia content can include adapting the content to the delivery, display, processing, and storage constraints of user computing devices.
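As a hedged illustration of how content-analysis results could drive the choice of transcoding parameters, the sketch below maps a crude image-type classification and simple device constraints to output settings; the type names, thresholds, and parameter values are assumptions rather than the abstract's actual analysis techniques.

```python
# Illustrative only: choosing transcoding parameters from content-analysis results
# and device constraints. All names and thresholds are hypothetical.
from dataclasses import dataclass

@dataclass
class AnalysisResult:
    image_type: str                  # e.g. "photograph" or "synthetic_graphic"
    has_compression_artifacts: bool

@dataclass
class DeviceConstraints:
    max_width: int                   # display constraint of the user device

def classify_image_type(unique_color_ratio: float, edge_density: float) -> str:
    """Crude stand-in for histogram/edge analysis: synthetic graphics tend to have
    few distinct colors and sharp edges, photographs the opposite."""
    if unique_color_ratio < 0.05 and edge_density > 0.2:
        return "synthetic_graphic"
    return "photograph"

def pick_transcoding_params(analysis: AnalysisResult, device: DeviceConstraints) -> dict:
    """Map analysis results and device constraints to transcoding parameters."""
    if analysis.image_type == "synthetic_graphic":
        params = {"codec": "png", "quality": None}        # lossless suits graphics
    else:
        quality = 70 if analysis.has_compression_artifacts else 85
        params = {"codec": "jpeg", "quality": quality}    # lossy suits photographs
    params["max_width"] = device.max_width                # adapt to the display
    return params

params = pick_transcoding_params(
    AnalysisResult(image_type=classify_image_type(0.02, 0.3),
                   has_compression_artifacts=False),
    DeviceConstraints(max_width=1280),
)
```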
Abstract:
In one embodiment, a method includes detecting one or more objects in an image, generating at least one mask for each of the detected objects, wherein each of the masks is defined by a perimeter, classifying the detected objects, receiving gesture input in relation to the image, determining whether one or more locations associated with the gesture input correlate with any of the masks, and providing feedback regarding the image in response to the gesture input. Each of the masks may include data identifying the corresponding detected object, and the perimeter of each mask may correspond to a perimeter of the corresponding detected object. The perimeter of the corresponding detected object may separate the detected object from one or more portions of the image that are distinct from the detected object.
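A minimal sketch of the gesture-to-mask correlation step, assuming a mask is represented by its perimeter polygon and a gesture by a set of image locations; the data structures and the ray-casting containment test are illustrative choices, not the embodiment's required implementation.

```python
# Sketch (hypothetical data structures) of correlating gesture locations with
# per-object masks: a mask carries its classification label and perimeter polygon,
# and a ray-casting test decides whether a gesture point falls inside it.
from dataclasses import dataclass

@dataclass
class Mask:
    label: str        # classification of the detected object
    perimeter: list   # polygon vertices [(x, y), ...] around the detected object

def point_in_polygon(x: float, y: float, polygon: list) -> bool:
    """Standard ray-casting point-in-polygon test."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def masks_under_gesture(gesture_points, masks):
    """Return masks whose perimeter contains any of the gesture's locations."""
    return [m for m in masks
            if any(point_in_polygon(px, py, m.perimeter) for px, py in gesture_points)]

masks = [Mask("dog", [(10, 10), (60, 10), (60, 50), (10, 50)])]  # placeholder detection
hits = masks_under_gesture([(30, 30)], masks)                    # -> the "dog" mask
```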
Abstract:
In one embodiment, a method may include receiving a first content item. A first embedding of the first content item may be determined and may correspond to a first point in an embedding space. The embedding space may include a plurality of second points corresponding to a plurality of second embeddings of second content items. The embeddings may be determined using a deep-learning model. The points may be located in one or more clusters in the embedding space, each of which is associated with a class of content items. Locations of points within clusters may be based on one or more attributes of the respective corresponding content items. Second content items that are similar to the first content item may be identified based on the locations of the first point and the second points and on the particular clusters in which the second points corresponding to the identified second content items are located.
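The lookup described above can be pictured as a nearest-neighbor search in the embedding space, restricted by cluster membership. The sketch below uses random vectors in place of a real deep-learning model and index; the function names, dimensionality, and cosine-similarity metric are assumptions.

```python
# Illustrative sketch: find second content items similar to a first item by
# comparing embeddings, keeping only points in a target cluster (content class).
import numpy as np

rng = np.random.default_rng(0)

def embed(content_item) -> np.ndarray:
    """Stand-in for the deep-learning model mapping content to the embedding space."""
    return rng.normal(size=64)

# Hypothetical index of second content items: one embedding and one cluster id each.
second_embeddings = rng.normal(size=(1000, 64))
second_clusters = rng.integers(0, 10, size=1000)      # cluster id == content class

def similar_items(first_item, target_cluster: int, k: int = 5):
    """Nearest second points by cosine similarity, restricted to one cluster."""
    q = embed(first_item)
    q = q / np.linalg.norm(q)
    e = second_embeddings / np.linalg.norm(second_embeddings, axis=1, keepdims=True)
    scores = e @ q                                      # cosine similarity per point
    scores[second_clusters != target_cluster] = -np.inf # keep only the target class
    return np.argsort(scores)[::-1][:k]                 # indices of top-k similar items

top = similar_items("first content item", target_cluster=3)
```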