-
1.
Publication No.: US20230334320A1
Publication Date: 2023-10-19
Application No.: US17722003
Filing Date: 2022-04-15
Applicant: Microsoft Technology Licensing, LLC
Inventor: Li ZHANG , Youkow HOMMA , Yujing WANG , Min WU , Mao YANG , Ruofei ZHANG , Ting CAO , Wei SHEN
CPC classification number: G06N3/082 , G06N3/0454 , G06N3/10
Abstract: A neural architecture search (NAS) system generates a machine-trained model that satisfies specified real-time latency objectives by selecting among a collection of layer-wise sparse candidate models. In operation, the NAS system selects a parent model from among the candidate models. The NAS system then identifies a particular layer of the parent model, and then determines how the layer is to be mutated, to yield a child model. The NAS system calculates a reward score for the child model based on its latency and accuracy. The NAS system then uses reinforcement learning to update the trainable logic used to perform the mutating based on the reward score. The NAS system repeats the above process a plurality of times. An online application system can use the machine-trained model eventually produced by the NAS system to deliver real-time responses to user queries.
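The latency/accuracy trade-off behind the reward score can be illustrated with a small sketch. The patent does not publish code, so the function below is a hypothetical reward: a child model that meets the latency budget scores its accuracy unchanged, while one that overshoots is penalized in proportion to the overshoot.

```python
def reward(accuracy: float, latency_ms: float, target_ms: float,
           alpha: float = 1.0) -> float:
    """Score a child model on accuracy, penalized when it misses the
    real-time latency target (alpha controls penalty strength)."""
    if latency_ms <= target_ms:
        return accuracy
    # Penalize proportionally to how far the child overshoots the budget.
    return accuracy - alpha * (latency_ms - target_ms) / target_ms
```

The reinforcement-learning loop would then use this scalar to update the mutation policy; the exact penalty shape is an assumption here.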
-
2.
Publication No.: US20200372103A1
Publication Date: 2020-11-26
Application No.: US16422992
Filing Date: 2019-05-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Qun LI , Changbo HU , Keng-hao CHANG , Ruofei ZHANG
Abstract: Technologies are described herein that relate to identifying supplemental content items that are related to objects captured in images of webpages. A computing system receives an indication that a client computing device has a webpage displayed thereon that includes an image. The image is provided to a first deep neural network (DNN) that is configured to identify a portion of the image that includes an object of a type from amongst a plurality of predefined types. Once the portion of the image is identified, it is provided to a plurality of DNNs, each configured to output a word or phrase that represents a value of a respective attribute of the object. A sequence of words or phrases output by the plurality of DNNs is provided to a search computing system, which identifies a supplemental content item based upon the sequence of words or phrases.
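The two-stage pipeline (one detection DNN, then one DNN per attribute, then a joined query for the search system) can be sketched as plain function composition; `detector` and `attribute_models` are hypothetical stand-ins for the trained networks, not APIs from the patent.

```python
def extract_query(image, detector, attribute_models):
    """Run the detection DNN to get the object crop, run each attribute
    DNN on the crop, and join the predicted words into a search query."""
    crop = detector(image)                 # portion of image with the object
    words = [model(crop) for model in attribute_models]
    return " ".join(w for w in words if w)  # drop empty predictions
```

A usage sketch: with attribute models for color, material, and type, a handbag image would yield a query like "red leather handbag" to send to the search computing system.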
-
3.
Publication No.: US20190114348A1
Publication Date: 2019-04-18
Application No.: US15784057
Filing Date: 2017-10-13
Applicant: Microsoft Technology Licensing, LLC
Inventor: Bin GAO , Ruofei ZHANG , Mu-Chu LEE
Abstract: A computer-implemented technique is described herein for providing a digital content item using a generator component. The generator component corresponds to a sequence-to-sequence neural network that is trained using a generative adversarial network (GAN) system. In one approach, the technique involves: receiving a query from a user computing device over a computer network; generating random information; generating a key term using the generator component based on the query and the random information; selecting at least one content item based on the key term; and sending the content item(s) over the computer network to the user computing device.
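The role of the "random information" is to let the same query map to diverse key terms. A minimal sketch, with `generator` a hypothetical stand-in for the trained sequence-to-sequence network:

```python
import random

def generate_key_term(query: str, generator, seed=None) -> str:
    """Draw random information and condition the generator on the query
    plus that noise, so repeated calls can yield different key terms."""
    rng = random.Random(seed)
    noise = [rng.random() for _ in range(8)]  # the "random information"
    return generator(query, noise)
```

With a fixed seed the call is reproducible; with different seeds the same query can select different content items, which is the point of injecting noise.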
-
4.
Publication No.: US20230334350A1
Publication Date: 2023-10-19
Application No.: US17659318
Filing Date: 2022-04-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Hua LI , Amit SHARMA , Jian JIAO , Ruofei ZHANG
IPC: G06N7/00 , G06N5/04 , G06Q30/02 , G06F16/2458 , G06N20/00
CPC classification number: G06N7/005 , G06N5/04 , G06Q30/0242 , G06F16/2477 , G06N20/00
Abstract: A computing device including a processor configured to receive data indicating, for a query category within a sampled time period, a matching density defined as a number of matches per query. The processor may generate a structural causal model (SCM) of the data within the sampled time period. The SCM may include a plurality of structural equations. Based at least in part on the plurality of structural equations, the processor may estimate a structural equation error value for the matching density. The processor may update a value of a target SCM output variable to a counterfactual updated value. Based at least in part on the SCM, the counterfactual updated value, and the structural equation error value, the processor may compute a predicted matching density when the target SCM output variable has the counterfactual updated value. The processor may output the predicted matching density.
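The counterfactual computation follows the standard abduction-action-prediction recipe for structural causal models. A minimal sketch, assuming a single linear structural equation `density = a * x + b + error` (the patent's actual SCM has several equations):

```python
def counterfactual_density(observed_x, observed_density, new_x, a, b):
    """Predict matching density under a counterfactual input value.
    Abduction: recover the structural equation error from the observed
    data; action + prediction: re-evaluate the equation with the
    counterfactual value and the recovered error."""
    error = observed_density - (a * observed_x + b)  # abduction
    return a * new_x + b + error                     # action + prediction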
-
5.
Publication No.: US20220100676A1
Publication Date: 2022-03-31
Application No.: US17178385
Filing Date: 2021-02-18
Applicant: Microsoft Technology Licensing, LLC
Inventor: Yu YAN , Jiusheng CHEN , Ruofei ZHANG
IPC: G06F12/122 , G06N3/04 , G06F40/40
Abstract: Systems and methods for dynamically modifying a cache associated with a neural network model of a natural language generator are described. In examples, a neural network model employs a beam search algorithm at a decoder when decoding output and generating predicted output candidates. The decoder utilizes caching techniques to improve the speed at which the neural network operates. When an amount of memory utilized by one or more caches of the neural network model is determined to exceed a threshold memory size, a layer-specific portion of a cache associated with a layer of the neural network model is identified. The identified layer-specific portion of the cache can be deleted when the amount of memory utilized by the cache exceeds the threshold memory size. In examples, data in the cache is deduplicated and/or deleted.
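The threshold-triggered, layer-specific eviction can be sketched as follows. This is a hypothetical policy (evict the largest layer caches first); the patent does not specify which layer's portion is chosen.

```python
def evict_layer_caches(caches: dict, threshold_bytes: int) -> dict:
    """Delete layer-specific cache portions until the total cache
    size fits under the threshold. caches maps layer name -> bytes."""
    total = sum(len(v) for v in caches.values())
    # Evict the biggest layer caches first until under budget.
    for layer in sorted(caches, key=lambda k: len(caches[k]), reverse=True):
        if total <= threshold_bytes:
            break
        total -= len(caches[layer])
        del caches[layer]
    return caches
```

In a real decoder the "cache" would hold key/value tensors per layer rather than byte strings; the structure of the check-and-evict loop is the same.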
-
6.
Publication No.: US20220067030A1
Publication Date: 2022-03-03
Application No.: US17093426
Filing Date: 2020-11-09
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jian JIAO , Xiaodong LIU , Ruofei ZHANG , Jianfeng GAO
Abstract: Knowledge graphs can greatly improve the quality of content recommendation systems. There is a broad variety of knowledge graphs in the domain, including clicked user-ad graphs, clicked query-ad graphs, keyword-display URL graphs, etc. A hierarchical Transformer model learns entity embeddings in knowledge graphs. The model consists of two different Transformer blocks: the bottom block generates relation-dependent embeddings for the source entity and its neighbors, and the top block aggregates the outputs from the bottom block to produce the target entity embedding. To balance the information from contextual entities and the source entity itself, a masked entity model (MEM) task is combined with a link prediction task in model training.
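The two-level structure can be shown as function composition; `bottom_block` and `top_block` below are hypothetical stand-ins for the two Transformer blocks, reduced to plain callables to make the data flow visible.

```python
def entity_embedding(source, neighbors, bottom_block, top_block):
    """Bottom block embeds the source entity together with each context
    entity (relation-dependent); top block aggregates those outputs
    into the target entity embedding."""
    contexts = [source] + neighbors
    bottom_out = [bottom_block(source, entity) for entity in contexts]
    return top_block(bottom_out)
```

Including the source itself among the contexts mirrors the abstract's concern with balancing contextual entities against the source entity.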
-
7.
Publication No.: US20210334606A1
Publication Date: 2021-10-28
Application No.: US16861162
Filing Date: 2020-04-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Tianchuan DU , Keng-hao CHANG , Ruofei ZHANG , Paul LIU
Abstract: Neural network-based categorization can be improved by incorporating graph neural networks that operate on a graph representing the taxonomy of the categories into which a given input is to be categorized. The output of a graph neural network, operating on a graph representing the taxonomy of categories, can be combined with the output of a neural network operating upon the input to be categorized, such as through an interaction of multidimensional output data, for example a dot product of output vectors. In such a manner, information conveying the explicit relationships between categories, as defined by the taxonomy, can be incorporated into the categorization. To recapture information, incorporate new information, or reemphasize information, a second neural network can also operate upon the input to be categorized, with the output of such a second neural network being merged with the output of the interaction.
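The "interaction of multidimensional output data" via dot product can be sketched directly: an input encoding is scored against each category's graph-NN embedding. The vectors here are plain lists standing in for the networks' outputs.

```python
def category_scores(input_vec, category_vecs):
    """Combine the input network's output vector with each category's
    graph-NN embedding via a dot product, giving one score per category."""
    return [sum(a * b for a, b in zip(input_vec, c)) for c in category_vecs]
```

The category with the highest score would be the predicted label; the taxonomy's structure enters through the category embeddings produced by the graph neural network.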
-
8.
Publication No.: US20240054326A1
Publication Date: 2024-02-15
Application No.: US18278361
Filing Date: 2021-04-12
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kushal DAVE , Deepak SAINI , Arnav Kumar JAIN , Jian JIAO , Amit Kumar Rambachan SINGH , Ruofei ZHANG , Manik VARMA
IPC: G06N3/0464 , G06N3/096
CPC classification number: G06N3/0464 , G06N3/096
Abstract: Systems and methods are provided for learning classifiers that annotate a document with predicted labels under extreme classification, where there are over a million labels. The learning includes receiving a joint graph including documents and labels as nodes. Multi-dimensional vector representations of a document (i.e., document representations) are generated based on graph convolution of the joint graph. Each document representation varies the extent of its reliance on neighboring nodes to accommodate context. The document representations are feature-transformed using a residual layer. Per-label document representations are generated from the transformed document representations based on neighboring-label attention. A classifier is trained for each of over a million labels based on joint learning using training data and the per-label document representations. The trained classifiers perform highly efficiently compared to classifiers trained using disjoint graphs of documents and labels.
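The idea of varying reliance on neighboring nodes can be sketched as a mixing coefficient in a graph-convolution step. This is a simplified, hypothetical form (mean aggregation, a single scalar `alpha`); the patent's representations are learned, not hand-set.

```python
def doc_representation(own_feats, neighbor_feats, alpha):
    """One graph-convolution step: mix a document's own features with
    the mean of its neighbors' features; alpha sets the extent of
    reliance on neighboring nodes."""
    n = len(neighbor_feats)
    agg = [sum(col) / n for col in zip(*neighbor_feats)]  # neighbor mean
    return [(1 - alpha) * x + alpha * a for x, a in zip(own_feats, agg)]
```

Producing several representations with different `alpha` values gives the family of document representations the abstract describes, each attending more or less to graph context.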
-
9.
Publication No.: US20230385315A1
Publication Date: 2023-11-30
Application No.: US18031789
Filing Date: 2020-10-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jian JIAO , Yeyun GONG , Nan DUAN , Ruofei ZHANG , Ming ZHOU
IPC: G06F16/33 , G06Q30/0251
CPC classification number: G06F16/3338 , G06Q30/0256 , G06Q30/0254
Abstract: Systems and methods are provided for generating a keyword sequence from an input query. A first text sequence corresponding to an input query may be received and encoded into a source sequence representation using an encoder of a machine learning model. A keyword sequence may then be generated from the source sequence representation using a decoder of the machine learning model. The decoder may generate a modified generation score for each of a plurality of prediction tokens, wherein the modified generation score is based on the respective prediction token's generation score and a maximum generation score for a suffix of that prediction token. The decoder may then select a prediction token of the plurality based on the modified generation score, and add the selected prediction token to the previously decoded partial hypothesis provided by the decoder.
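The token-selection rule can be sketched as a lookahead-augmented argmax: each candidate token carries its own generation score plus the best score achievable by any suffix continuation. The two-score representation below is a hypothetical simplification of the decoder's internals.

```python
def pick_token(candidates):
    """candidates maps token -> (gen_score, best_suffix_score).
    Select the token whose own generation score plus the maximum score
    of any suffix continuation is highest."""
    return max(candidates, key=lambda t: candidates[t][0] + candidates[t][1])
```

Without the suffix term this reduces to greedy decoding; adding it lets the decoder prefer a token that scores slightly lower now but leads to a better overall keyword sequence.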
-
10.
Publication No.: US20230267308A1
Publication Date: 2023-08-24
Application No.: US18143430
Filing Date: 2023-05-04
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jian JIAO , Xiaodong LIU , Ruofei ZHANG , Jianfeng GAO
IPC: G06N3/045
CPC classification number: G06N3/045
Abstract: Knowledge graphs can greatly improve the quality of content recommendation systems. There is a broad variety of knowledge graphs in the domain, including clicked user-ad graphs, clicked query-ad graphs, keyword-display URL graphs, etc. A hierarchical Transformer model learns entity embeddings in knowledge graphs. The model consists of two different Transformer blocks: the bottom block generates relation-dependent embeddings for the source entity and its neighbors, and the top block aggregates the outputs from the bottom block to produce the target entity embedding. To balance the information from contextual entities and the source entity itself, a masked entity model (MEM) task is combined with a link prediction task in model training.