-
Publication No.: US10332123B2
Publication date: 2019-06-25
Application No.: US14837249
Application date: 2015-08-27
Applicant: Oracle International Corporation
Inventor: Jeffrey H. Alexander, Stephen Green
IPC: G06F16/30, G06Q30/00, G06F11/30, G06F9/451, G06F16/332
Abstract: A system performs search and retrieval. The system monitors one or more user interface (“UI”) fields configured to receive text input in a UI. The system determines that the one or more UI fields are being used to enter a textual description, and performs a search on a knowledge base based on document similarity to identify documents that are similar to a portion of the textual description that has already been entered in the one or more UI fields. The system then provides one or more of the documents in a UI field of the UI, and repeats the monitoring, the determining, the performing, and the providing.
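The monitor, determine, search, and provide loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the bag-of-words cosine similarity, the `min_words` heuristic for deciding that a description is being typed, the `on_text_changed` callback name, and the toy knowledge base are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_similar(partial_text, knowledge_base, top_k=2):
    """Return titles of the documents most similar to the text typed so far."""
    query = Counter(tokenize(partial_text))
    scored = []
    for title, body in knowledge_base.items():
        score = cosine_similarity(query, Counter(tokenize(body)))
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


# Hypothetical knowledge base: title -> document body.
KNOWLEDGE_BASE = {
    "Reset a forgotten password": "how to reset your account password when login fails",
    "Printer will not connect": "steps to fix a printer that will not connect to wifi",
    "Update billing details": "change the credit card used for billing on your account",
}


def on_text_changed(field_text, min_words=3):
    """Hypothetical UI callback, invoked each time the monitored field changes.

    Assumption: a field holding at least `min_words` words is treated as a
    textual description; shorter input triggers no search.
    """
    if len(tokenize(field_text)) < min_words:
        return []
    return search_similar(field_text, KNOWLEDGE_BASE)
```

Calling `on_text_changed` again after every keystroke or pause reproduces the "repeats the monitoring, the determining, the performing, and the providing" step; a production system would likely use a stronger similarity model than raw term counts.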
-
Publication No.: US12106050B2
Publication date: 2024-10-01
Application No.: US17589662
Application date: 2022-01-31
Applicant: Oracle International Corporation
Inventor: Swetasudha Panda, Ariel Kobren, Michael Louis Wick, Stephen Green
IPC: G06F40/279, G06N20/00
CPC classification number: G06F40/279, G06N20/00
Abstract: Debiasing pre-trained sentence encoders with probabilistic dropouts may be performed by various systems, services, or applications. A sentence may be received, where the words of the sentence may be provided as tokens to an encoder of a machine learning model. A token-wise correlation using semantic orientation may be computed to determine a bias score for the tokens in the input sentence. A probability of dropout for tokens in the input sentence may be determined from the bias scores. The machine learning model may be trained or tuned based on the probabilities of dropout for the tokens in the input sentence.
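The abstract does not state how bias scores or dropout probabilities are computed, so the following is only a rough sketch under stated assumptions: the bias score is taken as the absolute difference between a token's mean cosine similarity to two attribute word groups, the dropout probability is that score capped at `max_p`, and dropped tokens are replaced by a `[MASK]` placeholder. The toy 2-D embeddings and all function names are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bias_score(vec, group_a, group_b):
    """Assumed semantic-orientation-style score: how much closer the token
    sits to attribute group A than to attribute group B (absolute value)."""
    sim_a = sum(cosine(vec, g) for g in group_a) / len(group_a)
    sim_b = sum(cosine(vec, g) for g in group_b) / len(group_b)
    return abs(sim_a - sim_b)


def dropout_probabilities(tokens, embeddings, group_a, group_b, max_p=0.9):
    """Map each token's bias score to a dropout probability (assumed rule:
    probability equals the score, capped at max_p)."""
    return [
        min(max_p, bias_score(embeddings[tok], group_a, group_b))
        for tok in tokens
    ]


def apply_dropout(tokens, probs, rng, mask="[MASK]"):
    """Replace each token with the mask with its own dropout probability."""
    return [mask if rng.random() < p else tok for tok, p in zip(tokens, probs)]


# Toy 2-D embeddings (hypothetical values chosen for illustration only).
EMBEDDINGS = {"the": [0.5, 0.5], "nurse": [0.9, 0.1]}
GROUP_A = [[1.0, 0.0]]  # stand-in for one attribute word group
GROUP_B = [[0.0, 1.0]]  # stand-in for the contrasting group

tokens = ["the", "nurse"]
probs = dropout_probabilities(tokens, EMBEDDINGS, GROUP_A, GROUP_B)
masked = apply_dropout(tokens, probs, random.Random(0))
```

During training or tuning, the masked sequence would be fed to the encoder in place of the original sentence, so that tokens with high bias scores are seen less often.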
-
Publication No.: US20190114319A1
Publication date: 2019-04-18
Application No.: US15934262
Application date: 2018-03-23
Applicant: Oracle International Corporation
Inventor: Jean-Baptiste Tristan, Michael Wick, Stephen Green
Abstract: Embodiments make novel use of random data structures to facilitate streaming inference for a Latent Dirichlet Allocation (LDA) model. Utilizing random data structures facilitates streaming inference by entirely avoiding the need for pre-computation, which is generally an obstacle to many current “streaming” variants of LDA. Specifically, streaming inference, based on an inference algorithm such as Stochastic Cellular Automata (SCA), Gibbs sampling, and/or Stochastic Expectation Maximization (SEM), is implemented using a count-min sketch to track sufficient statistics for the inference procedure. Use of a count-min sketch avoids the need to know the vocabulary size V a priori. Also, use of a count-min sketch directly enables feature hashing, which addresses the problem of effectively encoding words into indices without the need for pre-computation. Approximate counters are also used within the count-min sketch to avoid bit overflow issues with the counts in the sketch.
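A count-min sketch whose cells are approximate counters can be sketched as follows. This is an illustration, not the patented design: the Morris-style counter (store an exponent `c`, increment it with probability `2**-c`, estimate the count as `2**c - 1`), the SHA-256 row hashing, the `"word|topic"` key format, and the class name are all assumptions. String keys are hashed straight to column indices, which is the feature-hashing property that removes the need for a precomputed vocabulary.

```python
import hashlib
import random


class ApproxCountMinSketch:
    """Count-min sketch whose cells hold Morris-counter exponents."""

    def __init__(self, depth=4, width=1024, seed=0):
        self.depth = depth
        self.width = width
        # Each cell stores a small exponent c rather than a raw count,
        # so it cannot overflow for any realistic stream length.
        self.cells = [[0] * width for _ in range(depth)]
        self.rng = random.Random(seed)

    def _index(self, row, key):
        """Hash a string key to a column, using the row number as a salt."""
        digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def update(self, key):
        """Record one occurrence of key in every row."""
        for row in range(self.depth):
            col = self._index(row, key)
            c = self.cells[row][col]
            # Morris increment: bump the exponent with probability 2^-c.
            if self.rng.random() < 2.0 ** -c:
                self.cells[row][col] = c + 1

    def estimate(self, key):
        """Estimated count: minimum over rows of 2^c - 1."""
        return min(
            2 ** self.cells[row][self._index(row, key)] - 1
            for row in range(self.depth)
        )


# Word-topic counts, a typical LDA sufficient statistic, keyed by strings.
sketch = ApproxCountMinSketch()
before = sketch.estimate("apple|3")
sketch.update("apple|3")
after = sketch.estimate("apple|3")
```

Because the counters are probabilistic, estimates for larger counts are approximate in both directions, unlike a plain count-min sketch, which can only overestimate.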
-
Publication No.: US20240419900A1
Publication date: 2024-12-19
Application No.: US18817147
Application date: 2024-08-27
Applicant: Oracle International Corporation
Inventor: Swetasudha Panda, Ariel Kobren, Michael Louis Wick, Stephen Green
IPC: G06F40/279, G06N20/00
Abstract: Debiasing pre-trained sentence encoders with probabilistic dropouts may be performed by various systems, services, or applications. A sentence may be received, where the words of the sentence may be provided as tokens to an encoder of a machine learning model. A token-wise correlation using semantic orientation may be computed to determine a bias score for the tokens in the input sentence. A probability of dropout for tokens in the input sentence may be determined from the bias scores. The machine learning model may be trained or tuned based on the probabilities of dropout for the tokens in the input sentence.
-
Publication No.: US20220245339A1
Publication date: 2022-08-04
Application No.: US17589662
Application date: 2022-01-31
Applicant: Oracle International Corporation
Inventor: Swetasudha Panda, Ariel Kobren, Michael Louis Wick, Stephen Green
IPC: G06F40/279, G06N20/00
Abstract: Debiasing pre-trained sentence encoders with probabilistic dropouts may be performed by various systems, services, or applications. A sentence may be received, where the words of the sentence may be provided as tokens to an encoder of a machine learning model. A token-wise correlation using semantic orientation may be computed to determine a bias score for the tokens in the input sentence. A probability of dropout for tokens in the input sentence may be determined from the bias scores. The machine learning model may be trained or tuned based on the probabilities of dropout for the tokens in the input sentence.