Abstract:
Disclosed are a method and a device for downloading a file. The method includes: upon receiving a download request for a file, acquiring attribute information about the file and determining the length of the file from that attribute information; when the length of the file exceeds a preset value, segmenting the download request into at least two fragment download requests; sending the at least two fragment download requests to at least two data nodes to request download of the corresponding fragments, and obtaining the at least two fragments; and obtaining the file from the at least two downloaded fragments. By segmenting the download request into a plurality of fragment requests, the present invention enables the fragments to be downloaded in parallel, thereby improving the download efficiency of the file.
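The segmentation and parallel retrieval described in this abstract can be illustrated with a minimal sketch. It is not the disclosed implementation: the names `split_into_fragments` and `download_file` are hypothetical, each data node is modeled as an in-memory `bytes` replica of the file, and the fragment count is assumed to equal the number of nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_fragments(file_length, preset_value, fragment_count):
    """Return half-open (start, end) byte ranges that cover the whole file.

    A file no longer than preset_value is kept as a single request.
    """
    if file_length <= preset_value:
        return [(0, file_length)]
    step = -(-file_length // fragment_count)  # ceiling division
    return [(start, min(start + step, file_length))
            for start in range(0, file_length, step)]


def download_file(nodes, file_length, preset_value):
    """Fetch every fragment from a node in parallel and reassemble the file.

    Assumption: each entry of `nodes` is a bytes object standing in for a
    data node that holds a full replica of the file.
    """
    ranges = split_into_fragments(file_length, preset_value, len(nodes))

    def fetch(job):
        index, (start, end) = job
        node = nodes[index % len(nodes)]  # spread fragments across nodes
        return node[start:end]

    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map preserves input order, so fragments join in file order.
        fragments = list(pool.map(fetch, enumerate(ranges)))
    return b"".join(fragments)
```

In a real deployment the `fetch` step would issue a ranged read against a remote node; the in-memory slice is only a stand-in so that the sketch stays self-contained.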
Abstract:
A parallel data processing method based on multiple graphics processing units (GPUs) is provided, including: creating, in a central processing unit (CPU), a plurality of worker threads for respectively controlling a plurality of worker groups, each worker group including a plurality of GPUs; binding each worker thread to a corresponding GPU; loading one batch of training data from a nonvolatile memory to a GPU video memory corresponding to one worker group; transmitting, by peer-to-peer transfer between the plurality of GPUs corresponding to one worker group, the data required for the data processing performed by those GPUs; and controlling the plurality of GPUs to perform data processing in parallel through the worker threads.
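The control flow within one worker group can be sketched on the CPU alone. The following is a simulation under stated assumptions, not GPU code and not the disclosed method: each "device" is a plain Python thread, each device is assumed to hold one slice of a weight vector, and a `queue.Queue` stands in for the peer-to-peer link. The function name `run_worker_group` is hypothetical.

```python
import queue
import threading


def run_worker_group(batch, weights, device_count=2):
    """Simulate one worker group: each device computes partial dot products
    on its slice of the weights, then sends them to device 0."""
    chunk = -(-len(weights) // device_count)  # ceiling division
    peer_channel = queue.Queue()  # stand-in for a peer-to-peer link
    result = []

    def worker(device_id):
        lo = device_id * chunk
        hi = min(lo + chunk, len(weights))
        # Partial result for every sample in the batch, using this slice only.
        partial = [sum(x[i] * weights[i] for i in range(lo, hi)) for x in batch]
        if device_id != 0:
            peer_channel.put(partial)  # "transfer" to the peer device
            return
        totals = partial
        for _ in range(device_count - 1):
            other = peer_channel.get()
            totals = [a + b for a, b in zip(totals, other)]
        result.extend(totals)

    # One worker thread bound to each simulated device.
    threads = [threading.Thread(target=worker, args=(d,))
               for d in range(device_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

For example, `run_worker_group([[1, 2, 3, 4]], [1, 1, 2, 2])` combines the partial sums 3 and 14 from the two simulated devices.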
Abstract:
A parallel data processing method based on multiple graphics processing units (GPUs) is provided, including: creating, in a central processing unit (CPU), a plurality of worker threads for respectively controlling a plurality of worker groups, each worker group including one or more GPUs; binding each worker thread to a corresponding GPU; loading a plurality of batches of training data from a nonvolatile memory to the GPU video memories in the plurality of worker groups; and controlling the plurality of GPUs to perform data processing in parallel through the worker threads. The method can enhance the efficiency of multi-GPU parallel data processing. A parallel data processing apparatus is also provided.
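The distribution of several batches across worker groups can likewise be simulated with threads. This sketch assumes a round-robin assignment of batches to groups and uses the mean of each batch as a placeholder for the real per-batch computation. Both choices are illustrative assumptions, the name `process_batches` is hypothetical, and batches are assumed to be non-empty.

```python
import threading


def process_batches(batches, group_count):
    """Assign batches to worker groups round-robin; one thread per group."""
    results = [None] * len(batches)

    def worker(group_id):
        # Group g handles batches g, g + group_count, g + 2 * group_count, ...
        for index in range(group_id, len(batches), group_count):
            data = batches[index]
            # Placeholder computation standing in for the real processing.
            results[index] = sum(data) / len(data)

    threads = [threading.Thread(target=worker, args=(g,))
               for g in range(group_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each thread writes only to its own result slots, so no lock is needed in this simulation.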
Abstract:
A method and a device for processing data, in the field of data processing, are disclosed. The method includes: sorting samples according to primary keys, wherein each primary key includes a feature serial number and a sample serial number, and wherein the column value corresponding to the primary key is used as a feature value for the sample; acquiring a statistic of each feature in each category by taking the primary key and the feature value as an input key-value pair and calculating with a first algorithm model, and outputting the feature serial number and the statistic as an output key-value pair; and acquiring a contribution value of each feature to the category by performing calculation on the output key-value pair with a second algorithm model, and selecting a feature based on the contribution value. The device includes a sorting module, a first processing module and a second processing module.
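The two-stage key-value flow can be sketched as follows. The abstract does not name the statistic or the contribution measure, so both are assumptions here: the first stage counts nonzero feature values per category, and the second stage scores a feature by the spread of its per-category occurrence rates. The names `feature_statistics` and `select_features`, and the `labels` mapping from sample serial number to category, are hypothetical.

```python
from collections import Counter, defaultdict


def feature_statistics(records, labels):
    """First stage: per-feature count of nonzero values in each category.

    records maps (feature_id, sample_id) -> feature value.
    labels maps sample_id -> category (an assumed input).
    """
    stats = defaultdict(Counter)
    # Sorting by the primary key groups all entries of a feature together.
    for (feature_id, sample_id), value in sorted(records.items()):
        if value != 0:
            stats[feature_id][labels[sample_id]] += 1
    return dict(stats)


def select_features(stats, labels, top_k):
    """Second stage: score each feature and keep the top_k.

    The score (max rate minus min rate across categories) is a stand-in
    for whatever contribution measure the second model computes.
    """
    category_sizes = Counter(labels.values())
    scores = {}
    for feature_id, counts in stats.items():
        rates = [counts[c] / n for c, n in category_sizes.items()]
        scores[feature_id] = max(rates) - min(rates)
    # Highest score first; ties broken by feature serial number.
    return sorted(scores, key=lambda f: (-scores[f], f))[:top_k]
```

A feature that occurs equally often in every category scores 0 under this stand-in measure and is therefore selected last.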