VOCABULARY GENERATION FOR NEURAL MACHINE TRANSLATION

    Publication No.: US20230161977A1

    Publication date: 2023-05-25

    Application No.: US17535365

    Filing date: 2021-11-24

    Abstract: Implementations of the present disclosure relate to methods, devices, and computer program products for generating a destination vocabulary from a source vocabulary. In a method, a group of candidate vocabularies is determined from the source vocabulary based on a corpus, a size of a candidate vocabulary in the group of candidate vocabularies being different from a size of the source vocabulary. A group of marginal scores is obtained for the group of candidate vocabularies, respectively, a marginal score in the group of marginal scores being obtained for the candidate vocabulary based on a corpus entropy of the candidate vocabulary and a size of the candidate vocabulary. The destination vocabulary is selected from the group of candidate vocabularies based on the group of marginal scores. With these implementations, both the corpus entropy and the vocabulary size are considered in the vocabulary generation, and thus a balance may be achieved between them, which may increase the performance of the generated vocabulary.
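The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the one used here is an assumption: the marginal score is taken to be the reduction in corpus entropy relative to the source vocabulary, divided by the change in vocabulary size. The greedy longest-match segmenter and all function names (`segment`, `corpus_entropy`, `marginal_score`, `select_destination`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def segment(word, vocab):
    """Greedy longest-match segmentation; unknown characters become single tokens."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest substring first; a single character always matches.
        for j in range(len(word), i, -1):
            if j == i + 1 or word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
    return tokens


def corpus_entropy(corpus, vocab):
    """Shannon entropy (bits) of the token distribution the vocabulary induces."""
    counts = Counter(tok for word in corpus for tok in segment(word, vocab))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def marginal_score(corpus, source, candidate):
    """Assumed score: entropy reduction per unit of vocabulary-size change."""
    size_delta = len(candidate) - len(source)  # non-zero because the sizes differ
    entropy_delta = corpus_entropy(corpus, candidate) - corpus_entropy(corpus, source)
    return -entropy_delta / size_delta


def select_destination(corpus, source, candidates):
    """Return the candidate vocabulary with the highest marginal score."""
    return max(candidates, key=lambda cand: marginal_score(corpus, source, cand))


# Toy data (hypothetical): a character-level source vocabulary and three
# candidates that each add subword tokens to it.
corpus = ["low", "lower", "lowest", "low"]
source = set("".join(corpus))
candidates = [
    source | {"lo"},
    source | {"lo", "low"},
    source | {"lo", "low", "er", "est"},
]
best = select_destination(corpus, source, candidates)
```

Dividing by the size change is what lets the score trade entropy against vocabulary size: a candidate that lowers entropy only slightly while adding many tokens scores lower than one that achieves a similar reduction with fewer tokens.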

    Vocabulary generation for neural machine translation

    Publication No.: US12112139B2

    Publication date: 2024-10-08

    Application No.: US17535365

    Filing date: 2021-11-24

    Abstract: Implementations of the present disclosure relate to methods, devices, and computer program products for generating a destination vocabulary from a source vocabulary. In a method, a group of candidate vocabularies is determined from the source vocabulary based on a corpus, a size of a candidate vocabulary in the group of candidate vocabularies being different from a size of the source vocabulary. A group of marginal scores is obtained for the group of candidate vocabularies, respectively, a marginal score in the group of marginal scores being obtained for the candidate vocabulary based on a corpus entropy of the candidate vocabulary and a size of the candidate vocabulary. The destination vocabulary is selected from the group of candidate vocabularies based on the group of marginal scores. With these implementations, both the corpus entropy and the vocabulary size are considered in the vocabulary generation, and thus a balance may be achieved between them, which may increase the performance of the generated vocabulary.

    Ontology creating apparatus, method, and program

    Publication No.: US12032617B2

    Publication date: 2024-07-09

    Application No.: US17414314

    Filing date: 2019-12-10

    CPC classification: G06F16/367 G06F40/237

    Abstract: An ontology creation apparatus according to an embodiment includes: a first selection unit that receives an input operation that is performed based on information that represents definitions of candidate classes of an ontology; an acquisition unit that acquires a candidate class of a subject when the selected class is determined as the object, based on information that represents definitions of properties that indicate connection relationships between classes that serve as objects and classes that serve as subjects; a second selection unit that receives an input operation for selecting an instance that belongs to a class of a subject when the selected class is determined as an object, based on the candidate; a relationship setting unit that sets a connection relationship between instances that belong to the selected class and the selected instance; and an output unit that creates and outputs an ontology that indicates the selected class, the instance, and the set connection relationship.
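The flow of units in this abstract can be sketched as a small data model. This is only an illustration under stated assumptions: the abstract does not specify data structures, so `PropertyDef`, `candidate_subject_classes`, `build_ontology`, and the example classes and instances (`Server`, `Rack`, `installedIn`, `srv-01`, `rack-A`) are all hypothetical, and the interactive selection steps are replaced by plain function arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyDef:
    """A property definition linking a subject class to an object class."""
    name: str
    subject_class: str
    object_class: str


def candidate_subject_classes(properties, object_class):
    """Acquisition step: classes that may act as subject for the selected object class."""
    return sorted({p.subject_class for p in properties if p.object_class == object_class})


def build_ontology(selected_class, object_instances, subject_instance, prop):
    """Relationship-setting and output steps: link each instance of the
    selected class to the selected subject instance via the property."""
    relations = [(subject_instance, prop.name, obj) for obj in object_instances]
    return {
        "class": selected_class,
        "instance": subject_instance,
        "relations": relations,
    }


# Hypothetical definitions: a Server is installed in a Rack.
properties = [PropertyDef("installedIn", subject_class="Server", object_class="Rack")]

selected_class = "Rack"  # stands in for the first selection
subjects = candidate_subject_classes(properties, selected_class)
ontology = build_ontology(
    selected_class,
    object_instances=["rack-A"],
    subject_instance="srv-01",  # stands in for the second selection
    prop=properties[0],
)
```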