-
Publication No.: US11875590B2
Publication date: 2024-01-16
Application No.: US18068519
Filing date: 2022-12-19
Applicant: Microsoft Technology Licensing, LLC
Inventor: Itzik Malkiel, Dvir Ginzburg, Noam Koenigstein, Oren Barkan, Nir Nice
IPC: G06V30/418, G06V10/75, G06F18/2113, G06F18/21
CPC classification number: G06V30/418, G06F18/2113, G06F18/2178, G06V10/751
Abstract: Examples provide a self-supervised language model for document-to-document similarity scoring and for ranking long documents of arbitrary length in the absence of similarity labels. In the first stage of a two-stage hierarchical scoring, a sentence similarity matrix is created for each paragraph in the candidate document. A sentence similarity score is calculated based on the sentence similarity matrix. In the second stage, a paragraph similarity matrix is constructed based on aggregated sentence similarity scores associated with the first candidate document. A total similarity score for the document is calculated based on the normalized paragraph similarity matrix for each candidate document in a collection of documents. The model is trained using a masked language model and intra- and inter-document sampling. The documents are ranked based on the similarity scores for the documents.
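The two-stage scoring described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes sentence embeddings are already available as plain lists of floats, uses cosine similarity, aggregates each matrix by taking the best match per row and averaging, and omits the normalization across candidates. All function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sentence embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def paragraph_score(src_par, cand_par):
    """Stage 1: sentence similarity matrix for one paragraph pair,
    aggregated (assumption: best match per source sentence, then mean)."""
    matrix = [[cosine(s, c) for c in cand_par] for s in src_par]
    return sum(max(row) for row in matrix) / len(matrix)

def document_score(src_doc, cand_doc):
    """Stage 2: paragraph similarity matrix built from stage-1 scores,
    aggregated the same way into one document-level score."""
    matrix = [[paragraph_score(sp, cp) for cp in cand_doc] for sp in src_doc]
    return sum(max(row) for row in matrix) / len(matrix)

def rank(src_doc, candidates):
    """Return candidate names ordered from most to least similar."""
    scores = {name: document_score(src_doc, doc) for name, doc in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Here a document is a list of paragraphs and a paragraph is a list of sentence embeddings, so documents of any length can be compared without truncation.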
-
Publication No.: US11532147B2
Publication date: 2022-12-20
Application No.: US17084468
Filing date: 2020-10-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Oren Barkan, Omri Armstrong, Ori Katz, Noam Koenigstein
Abstract: A diagnostic tool for deep learning similarity models and image classifiers provides valuable insight into neural network decision-making. A disclosed solution generates a saliency map by: receiving a baseline image and a test image; determining, with a convolutional neural network (CNN), a first similarity between the baseline image and the test image; based on at least determining the first similarity, determining, for the test image, a first activation map for at least one CNN layer; based on at least determining the first similarity, determining, for the test image, a first gradient map for the at least one CNN layer; and generating a first saliency map as an element-wise function of the first activation map and the first gradient map. Some examples further determine a region of interest (ROI) in the first saliency map, crop the test image to an area corresponding to the ROI, and determine a refined similarity score.
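As a rough illustration of the last two steps, the sketch below combines activation and gradient maps element-wise and then finds a bounding box for the ROI. The abstract does not name the element-wise function; ReLU of the product, summed over channels, is an assumption here, as are the thresholding rule and the function names. The maps are taken as already computed nested lists indexed as [channel][row][column].

```python
def saliency_map(activations, gradients):
    """Combine per-channel activation and gradient maps element-wise.
    Assumed function: ReLU(activation * gradient), summed over channels."""
    height, width = len(activations[0]), len(activations[0][0])
    sal = [[0.0] * width for _ in range(height)]
    for act, grad in zip(activations, gradients):
        for i in range(height):
            for j in range(width):
                sal[i][j] += max(0.0, act[i][j] * grad[i][j])
    return sal

def region_of_interest(sal, threshold):
    """Bounding box (top, left, bottom, right) of cells at or above threshold,
    or None when no cell qualifies."""
    cells = [(i, j) for i, row in enumerate(sal)
             for j, value in enumerate(row) if value >= threshold]
    if not cells:
        return None
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    return min(rows), min(cols), max(rows), max(cols)
```

The returned box is in saliency-map coordinates; cropping the test image would additionally require scaling the box to the image resolution.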
-
Publication No.: US12223274B2
Publication date: 2025-02-11
Application No.: US17452818
Filing date: 2021-10-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Oren Barkan, Avi Caciularu, Idan Rejwan, Yonathan Weill, Noam Koenigstein, Ori Katz, Itzik Malkiel, Nir Nice
IPC: G06F40/295, G06F16/28, G06N7/01, G06N20/00
Abstract: A relational similarity determination engine receives as input a dataset including a set of entities and co-occurrence data that defines co-occurrence relations for pairs of the entities. The relational similarity determination engine also receives as input side information defining explicit relations between the entities. The relational similarity determination engine jointly models the co-occurrence relations and the explicit relations for the entities to compute a similarity metric for each different pair of entities within the dataset. Based on the computed similarity metrics, the relational similarity determination engine identifies a most similar replacement entity from the dataset for each of the entities within the dataset. For a select entity received as an input, the relational similarity determination engine outputs the identified most similar replacement entity.
-
Publication No.: US11580764B2
Publication date: 2023-02-14
Application No.: US17354333
Filing date: 2021-06-22
Applicant: Microsoft Technology Licensing, LLC
Inventor: Itzik Malkiel, Dvir Ginzburg, Noam Koenigstein, Oren Barkan, Nir Nice
IPC: G06V30/418, G06K9/62, G06V10/75
Abstract: Examples provide a self-supervised language model for document-to-document similarity scoring and for ranking long documents of arbitrary length in the absence of similarity labels. In the first stage of a two-stage hierarchical scoring, a sentence similarity matrix is created for each paragraph in the candidate document. A sentence similarity score is calculated based on the sentence similarity matrix. In the second stage, a paragraph similarity matrix is constructed based on aggregated sentence similarity scores associated with the first candidate document. A total similarity score for the document is calculated based on the normalized paragraph similarity matrix for each candidate document in a collection of documents. The model is trained using a masked language model and intra- and inter-document sampling. The documents are ranked based on the similarity scores for the documents.
-
Publication No.: US20170344635A1
Publication date: 2017-11-30
Application No.: US15169305
Filing date: 2016-05-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Noam Koenigstein, Nir Nice, Shay Ben Elazar, Yehiel Berezin, Oren Barkan, Tal Zaccai, Shimon Shlevich, Nimrod Ben Simhon, Paul Nogues, Gal Lavee
IPC: G06F17/30
CPC classification number: G06F17/30772, G06F17/30053, G06F17/30749
Abstract: A playlist generator that utilizes multiple data sources to rank each track within a set of candidate tracks to enable selection of candidate tracks according to the ranking. Candidate tracks are each scored according to one or more features, such as acoustic similarity and/or similar usage patterns of the candidate track or artist of the candidate track to a current or previously played track or artist. Each feature is weighted according to historical listening patterns surrounding a user-selected playlist seed artist. The weighting may also be further corrected according to historical listening patterns of the particular user. When historical usage data related to a particular seed artist is limited, more generalized historical usage data related to a higher level in a genre hierarchy may be used.
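The weighted feature scoring and the genre-level fallback described here might look roughly like the following sketch. The feature names, the play-count threshold, and both function names are hypothetical, and how the weights are learned from listening history is outside its scope.

```python
def pick_weights(artist_weights, genre_weights, play_count, min_plays=100):
    """Use seed-artist weights unless the artist's listening history is too
    sparse, in which case fall back to weights from the genre level.
    The min_plays threshold is an illustrative assumption."""
    return artist_weights if play_count >= min_plays else genre_weights

def rank_tracks(candidates, weights):
    """candidates maps track name to {feature name: score}.
    Returns track names sorted by weighted feature sum, best first."""
    def total(features):
        return sum(weights.get(name, 0.0) * value for name, value in features.items())
    return sorted(candidates, key=lambda track: total(candidates[track]), reverse=True)
```

A per-user correction, as mentioned in the abstract, could be applied by adjusting the chosen weights before calling `rank_tracks`.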
-
Publication No.: US11769315B2
Publication date: 2023-09-26
Application No.: US18052568
Filing date: 2022-11-03
Applicant: Microsoft Technology Licensing, LLC
Inventor: Oren Barkan, Omri Armstrong, Ori Katz, Noam Koenigstein
IPC: G06V10/46, G06N3/08, G06V10/25, G06F18/22, G06F18/2113
CPC classification number: G06V10/464, G06F18/2113, G06F18/22, G06N3/08, G06V10/25
Abstract: A diagnostic tool for deep learning similarity models and image classifiers provides valuable insight into neural network decision-making. A disclosed solution generates a saliency map by: receiving a baseline image and a test image; determining, with a convolutional neural network (CNN), a first similarity between the baseline image and the test image; based on at least determining the first similarity, determining, for the test image, a first activation map for at least one CNN layer; based on at least determining the first similarity, determining, for the test image, a first gradient map for the at least one CNN layer; and generating a first saliency map as an element-wise function of the first activation map and the first gradient map. Some examples further determine a region of interest (ROI) in the first saliency map, crop the test image to an area corresponding to the ROI, and determine a refined similarity score.
-
Publication No.: US11373095B2
Publication date: 2022-06-28
Application No.: US16725652
Filing date: 2019-12-23
Applicant: Microsoft Technology Licensing, LLC
Inventor: Oren Barkan, Noam Razin, Noam Koenigstein, Roy Hirsch, Nir Nice
Abstract: Machine learning multiple features of an item depicted in images. Upon accessing multiple images that depict the item, a neural network is trained on the plurality of images to generate an embedding vector for each of multiple features of the item. For each of the multiple features of the item depicted in the images, in each iteration of the machine learning, the embedding vector is converted into a probability vector that represents probabilities that the feature has respective values. That probability vector is then compared with a value vector representing the actual value of that feature in the depicted item, and an error between the two vectors is determined. That error is used to adjust parameters of the neural network used to generate the embedding vector, allowing for the next iteration in the generation of the embedding vectors. These iterative changes continue, thereby training the neural network.
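The per-feature error computation in this abstract can be sketched as follows. The abstract does not say how the embedding vector becomes a probability vector or which error measure is used; this sketch assumes a softmax applied directly to the embedding and a cross-entropy error against a one-hot value vector. The function names are hypothetical, and the parameter update itself is left out.

```python
import math

def softmax(logits):
    """Convert a vector of scores into probabilities (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """Error between a probability vector and a one-hot value vector."""
    return -sum(t * math.log(p) for p, t in zip(probs, target) if t > 0)

def feature_errors(embeddings, targets):
    """embeddings and targets both map feature name to a vector.
    Returns one error per feature, to be used for the parameter update."""
    return {name: cross_entropy(softmax(embeddings[name]), targets[name])
            for name in embeddings}
```

In a training loop, the errors for all features would typically be combined into one loss and backpropagated through the network that produced the embeddings.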
-
Publication No.: US10825072B2
Publication date: 2020-11-03
Application No.: US15432595
Filing date: 2017-02-14
Applicant: Microsoft Technology Licensing, LLC
Inventor: Oren Barkan, Yael Brumer, Noam Koenigstein, Ilona Kifer
Abstract: Aspects disclosed herein may utilize neural embedding techniques to model session activity. A dataset may be collected from an online marketplace, such as an app store. The dataset may include one or more user sessions comprising sequential click actions and/or item purchases. Models may be generated to represent session activity and, therefore, may be utilized for contextual recommendations of apps in an online app store. As such, the various aspects disclosed herein may also generate purchase predictions based on click-purchase relations in a sequence. The item similarities and purchase predictions may be used to provide real-time aid to users navigating an online marketplace.
-
Publication No.: US10242098B2
Publication date: 2019-03-26
Application No.: US15169305
Filing date: 2016-05-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Noam Koenigstein, Nir Nice, Shay Ben Elazar, Yehiel Berezin, Oren Barkan, Tal Zaccai, Shimon Shlevich, Nimrod Ben Simhon, Paul Nogues, Gal Lavee
IPC: G06F17/30
Abstract: A playlist generator that utilizes multiple data sources to rank each track within a set of candidate tracks to enable selection of candidate tracks according to the ranking. Candidate tracks are each scored according to one or more features, such as acoustic similarity and/or similar usage patterns of the candidate track or artist of the candidate track to a current or previously played track or artist. Each feature is weighted according to historical listening patterns surrounding a user-selected playlist seed artist. The weighting may also be further corrected according to historical listening patterns of the particular user. When historical usage data related to a particular seed artist is limited, more generalized historical usage data related to a higher level in a genre hierarchy may be used.
-
Publication No.: US12153651B2
Publication date: 2024-11-26
Application No.: US17452961
Filing date: 2021-10-29
Applicant: Microsoft Technology Licensing, LLC
Inventor: Oren Barkan, Omri Armstrong, Amir Hertz, Avi Caciularu, Ori Katz, Itzik Malkiel, Noam Koenigstein, Nir Nice
IPC: G06F18/21, G06F18/213, G06N3/08, G06V10/46
Abstract: A method of generating an aggregate saliency map using a convolutional neural network. Convolutional activation maps of the convolutional neural network model are received into a saliency map generator, the convolutional activation maps being generated by the neural network model while computing the one or more prediction scores based on unlabeled input data. Each convolutional activation map corresponds to one of the multiple encoding layers. The saliency map generator generates a layer-dependent saliency map for each encoding layer of the unlabeled input data, each layer-dependent saliency map being based on a summation of element-wise products of the convolutional activation maps and their corresponding gradients. The layer-dependent saliency maps are combined into the aggregate saliency map indicating the relative contributions of individual components of the unlabeled input data to the one or more prediction scores computed by the convolutional neural network model on the unlabeled input data.
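A minimal sketch of the aggregation step follows. Each layer-dependent map is the sum over channels of the element-wise product of activations and gradients, as the abstract states. How maps of different resolutions are combined is not specified there; nearest-neighbour resizing to a common size followed by addition is an assumption, and the function names are hypothetical. Maps are nested lists indexed as [channel][row][column].

```python
def layer_saliency(activations, gradients):
    """Sum over channels of element-wise activation * gradient products."""
    height, width = len(activations[0]), len(activations[0][0])
    return [[sum(a[i][j] * g[i][j] for a, g in zip(activations, gradients))
             for j in range(width)] for i in range(height)]

def upsample(sal, height, width):
    """Nearest-neighbour resize of a 2-D map to (height, width)."""
    h, w = len(sal), len(sal[0])
    return [[sal[i * h // height][j * w // width] for j in range(width)]
            for i in range(height)]

def aggregate_saliency(layer_maps, height, width):
    """Assumed combination: resize every layer map to a common size and add."""
    out = [[0.0] * width for _ in range(height)]
    for sal in layer_maps:
        resized = upsample(sal, height, width)
        for i in range(height):
            for j in range(width):
                out[i][j] += resized[i][j]
    return out
```

Deeper layers usually have coarser maps, which is why each map is resized before the sum.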