Abstract:
A recording medium storing a program for causing a computer to execute processing including: acquiring, from a first model trained based on training data in which a first object is labeled in an image, a first portion that specifies a region in an image that includes the first object; generating a third model by combining the first portion with a third portion of a second model, the second model including a second portion and the third portion and being trained based on training data in which position information regarding a second object is labeled in an image, the second portion being a portion that specifies a region in an image that includes the second object, and the third portion being a portion that determines a position, in an image, of a specified region; and outputting a detection result of an object by inputting an image to the third model.
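The combination of the first portion and the third portion can be sketched as a simple function composition. This is a minimal illustration under stated assumptions, not the claimed implementation: the dictionary keys `region_portion` and `position_portion`, the function `build_third_model`, and the toy stand-in portions are all hypothetical.

```python
def build_third_model(first_model, second_model):
    """Combine the region portion of one model with the position portion of another."""
    propose_regions = first_model["region_portion"]    # first portion (assumed key)
    locate_region = second_model["position_portion"]   # third portion (assumed key)

    def third_model(image):
        # Regions come from the first model; positions come from the second model.
        return [locate_region(image, region) for region in propose_regions(image)]

    return third_model


# Toy stand-ins: an "image" is a list of rows, and a region is a (row, col) cell.
first_model = {
    "region_portion": lambda image: [
        (r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v > 0
    ]
}
second_model = {
    "region_portion": lambda image: [],  # second portion, unused by the third model
    "position_portion": lambda image, region: {"x": region[1], "y": region[0]},
}

detect = build_third_model(first_model, second_model)
detections = detect([[0, 1], [0, 0]])
```

Here the second model's own region portion is left out of the third model, so position output becomes available for the object the first model was trained on.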
Abstract:
A flying machine frame structural body including: a frame that surrounds a flying machine body including a rotating blade, and to which the flying machine body is fixed; and plural wheels that are rotatably supported by the frame.
Abstract:
A non-transitory computer-readable recording medium storing a machine learning program causing a computer to execute a process including: generating vector information based on a first feature of a target included in an image, a second feature of the target, and conversion parameters; and executing training of a machine learning model and update of the conversion parameters by inputting the image and the vector information to the machine learning model, the machine learning model including a first machine learning model portion configured to identify the first feature and a second machine learning model portion configured to identify the second feature.
Abstract:
A method includes: correcting a vector of a first modal by using a correlation between the vector of the first modal and a vector of a second modal different from the first modal; correcting the vector of the second modal by using the correlation between the vector of the first modal and the vector of the second modal; generating a first vector by using a correlation of two different types of vectors obtained from the corrected vector of the first modal; generating a second vector by using the correlation of the two different types of vectors obtained from the corrected vector of the second modal; generating a third vector in which the first and second vectors are aggregated by using the correlation of the two different types of vectors obtained from a combined vector including a predetermined vector, the generated first and second vectors; and outputting the generated third vector.
Abstract:
A computer-implemented method for generating an image involves receiving input text and layout information, decomposing the text into global and local features, and encoding these features into vectors. The method initializes an initial image representation and uses a trained neural network to predict global noise from the global vector and initial image representation. It also predicts initial local noise for each local feature using their respective vectors and the initial image representation. The final local noise for each feature is determined using the initial local noise and the global noise. The predicted global and final local noises are combined based on the layout information. Finally, the initial image representation is denoised using this combined noise to produce the next image representation, ultimately generating the image.
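The noise-combination and denoising steps can be sketched as follows. The abstract does not say how the final local noise is derived or how the combined noise is applied, so this sketch assumes a weighted blend of local and global noise inside each layout box and a plain subtraction step for denoising. The names `combine_noise`, `denoise_step`, `weight`, and `step`, and the `(top, left, bottom, right)` box format, are hypothetical.

```python
def combine_noise(global_noise, local_noises, boxes, weight=0.5):
    """Blend each local noise with the global noise inside its layout box.

    Assumption: final local noise = weight * local + (1 - weight) * global.
    Outside every box the global noise is used unchanged.
    """
    combined = [row[:] for row in global_noise]
    for noise, (top, left, bottom, right) in zip(local_noises, boxes):
        for y in range(top, bottom):
            for x in range(left, right):
                combined[y][x] = (
                    weight * noise[y][x] + (1.0 - weight) * global_noise[y][x]
                )
    return combined


def denoise_step(image, noise, step=0.1):
    """One simplified denoising update: subtract a scaled noise estimate."""
    return [
        [pixel - step * n for pixel, n in zip(image_row, noise_row)]
        for image_row, noise_row in zip(image, noise)
    ]


global_noise = [[1.0, 1.0], [1.0, 1.0]]
local_noise = [[3.0, 3.0], [3.0, 3.0]]
combined = combine_noise(global_noise, [local_noise], [(0, 0, 1, 1)])
next_image = denoise_step([[0.0, 0.0], [0.0, 0.0]], combined, step=0.5)
```

In a real diffusion sampler the update rule depends on the noise schedule; the fixed `step` here only shows where the combined noise enters the loop.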
Abstract:
A non-transitory computer-readable storage medium storing a determination model generation program that causes at least one computer to execute a process, the process including: generating, based on first training data in which image data and character string data that corresponds to the image data are associated with each other, second training data by replacing one of the image data and the character string data included in the first training data with other data; and generating, by using the first training data and the second training data as input data, a determination model that outputs information indicating which of the first training data and the second training data is training data in which the correspondence between the image data and the character string data is correct.
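The generation of the second training data can be sketched as below. This is a minimal illustration assuming the replacement data is drawn from another pair in the same data set and that each sample carries a correctness label. The function name `make_training_samples` and the label convention (1 for a correct pair, 0 for a replaced pair) are hypothetical.

```python
import random


def make_training_samples(pairs, rng):
    """Return (image, text, label) samples from matched (image, text) pairs.

    For each correct pair (label 1), one mismatched pair (label 0) is made by
    replacing either the image or the text with that of a different pair.
    Assumption: images and texts are distinct across pairs, so a replaced
    pair is never accidentally correct.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two pairs to build a mismatch")
    samples = []
    for i, (image, text) in enumerate(pairs):
        samples.append((image, text, 1))
        j = rng.choice([k for k in range(len(pairs)) if k != i])
        if rng.random() < 0.5:
            samples.append((pairs[j][0], text, 0))   # replace the image data
        else:
            samples.append((image, pairs[j][1], 0))  # replace the character string
    return samples


pairs = [("img_a", "cat"), ("img_b", "dog"), ("img_c", "bird")]
samples = make_training_samples(pairs, random.Random(0))
```

A determination model would then be trained to tell the correct sample of each pair from its replaced counterpart.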
Abstract:
A non-transitory computer-readable recording medium stores therein an information processing program that causes a computer to execute a process including: acquiring images photographed with a first angle of view and a second angle of view wider than the first angle of view; specifying a position and an attitude of a camera that photographed the image photographed with the first angle of view based on the image photographed with the second angle of view; stitching a plurality of images photographed with the first angle of view to generate a panoramic image; correcting, based on the specified position and attitude of the camera, the position on the generated panoramic image of the image photographed with the first angle of view; and mapping the panoramic image to a three-dimensional model by texture mapping based on the corrected position.
Abstract:
An input device that is worn on a portion of a user's body includes: a sensor unit configured to obtain angular velocities and an acceleration in a first coordinate system fixed in the input device; a reference posture specification unit configured to generate a second coordinate system for a reference posture of the user; a rotation matrix calculator configured to calculate a rotation matrix that transforms the angular velocities in the first coordinate system into angular velocities in the second coordinate system using the acceleration in the first coordinate system; a characteristic value calculator configured to calculate characteristic values in the second coordinate system using the angular velocities in the second coordinate system; a command specification unit configured to specify a command according to the characteristic values; and a transmitter configured to transmit the command to a controller.
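One way to obtain such a rotation matrix from an acceleration reading is sketched below. It assumes the device is nearly at rest, so the measured acceleration points along gravity, and that the second coordinate system has its z axis along that direction. The matrix is built with Rodrigues' rotation formula. Gravity alone fixes tilt but not rotation about the vertical axis, so this is only a partial illustration, and the function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(matrix, vec):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(matrix[i][k] * vec[k] for k in range(3)) for i in range(3))


def rotation_from_gravity(accel):
    """Rotation that maps the measured acceleration direction onto (0, 0, 1)."""
    ax, ay, az = accel
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("acceleration must be non-zero")
    a = (ax / norm, ay / norm, az / norm)
    c = a[2]                      # cosine of the angle between a and the z axis
    if c < -1.0 + 1e-12:          # antiparallel: rotate 180 degrees about x
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    v = (a[1], -a[0], 0.0)        # rotation axis a x z (unnormalized)
    vx = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    vx2 = matmul(vx, vx)
    k = 1.0 / (1.0 + c)
    # Rodrigues' formula: R = I + [v]x + [v]x^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + vx[i][j] + k * vx2[i][j]
             for j in range(3)] for i in range(3)]


R = rotation_from_gravity((3.0, 4.0, 12.0))
omega_ref = apply(R, (0.1, -0.2, 0.3))  # angular velocities in the second system
```

Characteristic values, such as the dominant rotation axis of a gesture, could then be computed from `omega_ref` rather than from the raw sensor-frame readings.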
Abstract:
An object detection device includes a processor that executes a procedure. The procedure includes: converting an input image into a first vector such that information related to an area of an object in the image is contained in the first vector; converting input text into a second vector such that information related to an order of appearance in the text of one or more word strings each indicating a detection target object included in the text is contained in the second vector; generating a third vector in which the first vector and the second vector have been reflected in a vector of initial values corresponding to detection target objects; and estimating whether a feature indicated by the third vector corresponds to a detection target object and, if so, at which position in the order of appearance in the text that object appears, and estimating a position of the detection target object in the image.
Abstract:
A computer-implemented outputting method including: generating a correction vector that corrects a vector based on information of a first modal on the basis of correlation between the vector based on the information of the first modal and a vector based on information of a second modal; combining the generated correction vector with the vector based on the information of the first modal; compressing the combined vector based on the information of the first modal according to a predetermined rule; performing normalization processing for the compressed vector based on the information of the first modal; and outputting a vector obtained by the normalization processing.
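The four steps (correct, combine, compress, normalize) can be sketched with plain lists. The abstract leaves each operation open, so this sketch makes its own choices: cosine similarity as the correlation, a similarity-scaled copy of the second-modal vector as the correction vector, element-wise addition as the combination, averaging of adjacent pairs as the predetermined compression rule, and L2 normalization. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity, used here as the assumed correlation measure."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def output_vector(first, second):
    """Correct, combine, compress, and normalize a first-modal vector."""
    if len(first) != len(second) or len(first) % 2 != 0:
        raise ValueError("vectors must have the same even length")
    corr = cosine(first, second)
    correction = [corr * s for s in second]              # correction vector
    combined = [f + c for f, c in zip(first, correction)]
    # Assumed compression rule: average each adjacent pair of elements.
    compressed = [(combined[i] + combined[i + 1]) / 2.0
                  for i in range(0, len(combined), 2)]
    norm = math.sqrt(sum(x * x for x in compressed))
    return [x / norm for x in compressed] if norm else compressed


result = output_vector([1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0])
```

The output has half the input dimension and unit length, which keeps vectors from different inputs on a comparable scale.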