Abstract:
The present disclosure generally relates to systems and processes for emoji word sense disambiguation. In one example process, a word sequence is received. A word-level feature representation is determined for each word of the word sequence and a global semantic representation for the word sequence is determined. For a first word of the word sequence, an attention coefficient is determined based on a congruence between the word-level feature representation of the first word and the global semantic representation for the word sequence. The word-level feature representation of the first word is adjusted based on the attention coefficient. An emoji likelihood is determined based on the adjusted word-level feature representation of the first word. In accordance with the emoji likelihood satisfying one or more criteria, an emoji character corresponding to the first word is presented for display.
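The steps in this abstract can be illustrated with a minimal sketch. The abstract does not specify how the global representation, the congruence measure, or the likelihood are computed, so everything concrete here is an assumption made for illustration: the global semantic representation is taken as the mean of the word vectors, congruence is cosine similarity rescaled to [0, 1], and the likelihood is a logistic score with hypothetical `weights` and `bias` parameters.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def emoji_likelihood(word_vecs, index, weights, bias=0.0):
    """Return (attention coefficient, emoji likelihood) for one word.

    Assumptions (not from the abstract): mean-pooled global vector,
    cosine-based congruence, logistic output layer.
    """
    dim = len(word_vecs[0])
    # Global semantic representation: mean of all word-level vectors.
    global_vec = [sum(v[d] for v in word_vecs) / len(word_vecs) for d in range(dim)]
    # Attention coefficient: congruence between the word and the global vector.
    attention = 0.5 * (1.0 + cosine(word_vecs[index], global_vec))
    # Adjust the word-level representation by the attention coefficient.
    adjusted = [attention * x for x in word_vecs[index]]
    # Emoji likelihood from the adjusted representation.
    likelihood = 1.0 / (1.0 + math.exp(-(dot(weights, adjusted) + bias)))
    return attention, likelihood


def should_show_emoji(likelihood, threshold=0.5):
    """One possible criterion: likelihood at or above a threshold."""
    return likelihood >= threshold
```

A word whose vector agrees with the rest of the sequence keeps most of its magnitude and can score highly, while a word that conflicts with the overall meaning is scaled toward zero, which pulls its score toward the bias term.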
Abstract:
The present disclosure generally relates to integrated text conversion and prediction. In an example process, a current character input of a first writing system is received. A first current character context in the first writing system is determined based on the current character input and a first previous character context in the first writing system. A second current character context in a second writing system is determined based on the first current character context, a second previous character context in the second writing system, and a character representation in the second writing system. A current word context in the second writing system is determined based on the second current character context, a previous word context in the second writing system, and a word representation in the second writing system. Based on the current word context, a probability distribution over a word inventory in the second writing system is determined.
Abstract:
Systems and processes for multilingual word prediction are provided. In accordance with one example, a method includes, at an electronic device having one or more processors and memory, receiving context information associated with a current word; determining, for each of a plurality of languages, a set of monolingual probabilities based on the context information; determining a set of language weights based on the context information; determining a set of multilingual probabilities based on the respective sets of monolingual probabilities and the set of language weights; and providing a plurality of candidate words based on the set of multilingual probabilities.
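One way to read the combination step is as a weighted mixture of the per-language distributions. The sketch below assumes exactly that; the abstract does not state the combination rule, and the function names, the dictionary layout, and the normalization of the weights are all hypothetical.

```python
def multilingual_probabilities(monolingual, language_weights):
    """Mix per-language word probabilities using language weights.

    monolingual: {language: {word: probability}}
    language_weights: {language: non-negative weight}
    Assumption: a linear mixture with weights normalized to sum to 1.
    """
    total = sum(language_weights.values())
    if total <= 0:
        raise ValueError("language weights must have a positive sum")
    combined = {}
    for lang, probs in monolingual.items():
        weight = language_weights.get(lang, 0.0) / total
        for word, p in probs.items():
            # A word shared by several languages accumulates mass from each.
            combined[word] = combined.get(word, 0.0) + weight * p
    return combined


def top_candidates(combined, k=3):
    """Return the k most probable words, breaking ties alphabetically."""
    return sorted(combined, key=lambda w: (-combined[w], w))[:k]
```

For example, with English weighted 0.75 and French 0.25, a word such as "pain" that appears in both vocabularies receives probability mass from both monolingual models before the candidates are ranked.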
Abstract:
Methods, systems, and computer-readable media are disclosed for providing handwriting input functionality on a user device. A handwriting recognition module is trained to have a repertoire comprising multiple non-overlapping scripts and to recognize tens of thousands of characters using a single handwriting recognition model. The handwriting input module provides real-time, stroke-order and stroke-direction independent handwriting recognition for multi-character handwriting input. In particular, real-time, stroke-order and stroke-direction independent handwriting recognition is provided for multi-character or sentence-level Chinese handwriting input. User interfaces for providing the handwriting input functionality are also disclosed.
Abstract:
Systems and processes for language identification from short strings are provided. In accordance with one example, a method includes, at a first electronic device with one or more processors and memory, receiving user input including an n-gram and determining a similarity between a representation of the n-gram and a representation of a first language. The representation of the first language is based on an occurrence of each of a plurality of n-grams in the first language and an occurrence of each of the plurality of n-grams in a second language. The method further includes determining whether the similarity between the representation of the n-gram and the representation of the first language satisfies a threshold.
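A minimal sketch of the similarity test follows. The abstract leaves the representations and the similarity measure open, so the choices here are assumptions: character bigrams as the n-grams, a language representation that scores each n-gram by its share of occurrences in the first language relative to both languages, and cosine similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """All character n-grams of a string, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def language_representation(counts_first, counts_second):
    """Score each n-gram by its share of occurrences in the first language."""
    rep = {}
    for gram in set(counts_first) | set(counts_second):
        a, b = counts_first.get(gram, 0), counts_second.get(gram, 0)
        if a + b > 0:
            rep[gram] = a / (a + b)
    return rep


def similarity(text, lang_rep, n=2):
    """Cosine similarity between the input's n-gram counts and lang_rep."""
    counts = Counter(char_ngrams(text, n))
    dot = sum(c * lang_rep.get(g, 0.0) for g, c in counts.items())
    norm_in = math.sqrt(sum(c * c for c in counts.values()))
    norm_lang = math.sqrt(sum(v * v for v in lang_rep.values()))
    if norm_in == 0.0 or norm_lang == 0.0:
        return 0.0
    return dot / (norm_in * norm_lang)


def matches_language(text, lang_rep, threshold, n=2):
    """True if the similarity satisfies the threshold."""
    return similarity(text, lang_rep, n) >= threshold
```

Because each n-gram's score depends on its occurrences in both languages, n-grams common to both contribute less than n-grams that are distinctive of the first language, which is what makes the comparison usable on very short inputs.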
Abstract:
A method and system for training user authentication by voice signal are described. In one embodiment, a set of feature vectors is decomposed into speaker-specific recognition units. The speaker-specific recognition units are used to compute speaker-specific distribution values during training. In addition, spectral feature vectors are decomposed into speaker-specific characteristic units, which are compared to the speaker-specific distribution values. If the speaker-specific characteristic units are within a threshold limit of the speaker-specific distribution values, the speech signal is authenticated.