-
Publication number: US09218335B2
Publication date: 2015-12-22
Application number: US13648645
Filing date: 2012-10-10
Applicant: Verisign, Inc.
Inventor: Ronald Andrew Hoskinson, Lambert Arians, Marc Anderson, Mahendra Jain
CPC classification number: G06F17/275, G06F15/16, G06F17/20, G06F17/2217, G06F17/28, G06F17/30141, H04L61/1511, H04L61/302, H04L61/3035
Abstract: Methods and systems for automated language detection for domain names are disclosed. In some embodiments, a method for detecting a language of an Internationalized Domain Name (IDN) comprises receiving, by an I/O interface, a string of characters for the IDN; receiving training data, including a plurality of multi-gram analyses for a set of languages; analyzing, by a processor, the string of characters based on the training data, wherein the analyzing includes extracting a set of multi-grams from the string of characters and comparing the extracted set of multi-grams with the training data; detecting the language of the IDN based on results of the analyzing. In some embodiments, the method further comprises comparing the detected language of the IDN with a user selected language and using the IDN to generate a domain name, if the comparing indicates that the detected language of the IDN is consistent with the user selected language.
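The steps named in this abstract (extract multi-grams, compare them with per-language training data, pick a language, check it against the user's selection) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names, the 1-to-3 character multi-gram range, the NFC normalization, the relative-frequency scoring, and the tiny `SAMPLES` word lists standing in for real training data. The label is assumed to be in its Unicode form, not its Punycode form.

```python
import unicodedata
from collections import Counter


def extract_ngrams(text, n_min=1, n_max=3):
    """Count every character multi-gram of length n_min..n_max in text."""
    text = unicodedata.normalize("NFC", text.lower())  # assumed preprocessing
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams[text[i:i + n]] += 1
    return grams


def train(samples):
    """Build one relative-frequency multi-gram profile per language."""
    profiles = {}
    for lang, words in samples.items():
        counts = Counter()
        for word in words:
            counts.update(extract_ngrams(word))
        total = sum(counts.values())
        profiles[lang] = {g: c / total for g, c in counts.items()}
    return profiles


def detect_language(label, profiles):
    """Score the label against each profile and return the best language."""
    grams = extract_ngrams(label)
    scores = {
        lang: sum(profile.get(g, 0.0) * c for g, c in grams.items())
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get), scores


def is_consistent(label, selected_language, profiles):
    """True if the detected language matches the user-selected language."""
    detected, _ = detect_language(label, profiles)
    return detected == selected_language


# Hypothetical toy training data; a real system would use large corpora.
SAMPLES = {
    "de": ["straße", "münchen", "grün", "schön"],
    "fr": ["café", "élève", "français", "théâtre"],
}
PROFILES = train(SAMPLES)

if __name__ == "__main__":
    print(detect_language("müller", PROFILES))
    print(is_consistent("müller", "fr", PROFILES))
```

In this sketch a registration flow would proceed to domain name generation only when `is_consistent` returns true. A scoring function based on rank distance or log-likelihood could be swapped in without changing the overall flow.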
-
Publication number: US20140100845A1
Publication date: 2014-04-10
Application number: US13648645
Filing date: 2012-10-10
Applicant: VERISIGN, INC.
Inventor: Ronald Andrew Hoskinson, Lambert Arians, Marc Anderson, Mahendra Jain
IPC: G06F17/20
CPC classification number: G06F17/275, G06F15/16, G06F17/20, G06F17/2217, G06F17/28, G06F17/30141, H04L61/1511, H04L61/302, H04L61/3035
Abstract: Methods and systems for automated language detection for domain names are disclosed. In some embodiments, a method for detecting a language of an Internationalized Domain Name (IDN) comprises receiving, by an I/O interface, a string of characters for the IDN; receiving training data, including a plurality of multi-gram analyses for a set of languages; analyzing, by a processor, the string of characters based on the training data, wherein the analyzing includes extracting a set of multi-grams from the string of characters and comparing the extracted set of multi-grams with the training data; detecting the language of the IDN based on results of the analyzing. In some embodiments, the method further comprises comparing the detected language of the IDN with a user selected language and using the IDN to generate a domain name, if the comparing indicates that the detected language of the IDN is consistent with the user selected language.
-
Publication number: US10140282B2
Publication date: 2018-11-27
Application number: US14242190
Filing date: 2014-04-01
Applicant: VERISIGN, INC.
Inventor: Pallavi Aras, Ronald Andrew Hoskinson
Abstract: A plurality of input string n-grams may be generated by accessing an input string and generating a Universal character set transformation format (UTF) encoded input string from the input string. The UTF encoded input string may be parsed via an n-gram parser to generate a plurality of input string n-grams, where a length of each of the input string n-grams is larger than a lower bound and smaller than an upper bound. The generated plurality of input string n-grams may be provided to determine matches between the input string and a domain.
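The n-gram generation step in this abstract can be sketched in Python as follows. The abstract does not say which UTF encoding is used or whether n-grams are cut by byte or by character, so this sketch assumes UTF-8 and slices by Unicode code point. The function name `input_string_ngrams` and its parameters are hypothetical. The bounds are treated as strict, matching the wording "larger than a lower bound and smaller than an upper bound".

```python
def input_string_ngrams(input_string, lower_bound, upper_bound):
    """Return all n-grams with lower_bound < length < upper_bound.

    Assumptions (not from the source): UTF-8 is the encoding, and
    n-grams are sliced by code point rather than by byte.
    """
    # Generate the UTF-encoded input string; this also rejects input
    # that cannot be represented in UTF-8 (for example lone surrogates).
    utf8_bytes = input_string.encode("utf-8")

    # Decode again so that slicing never splits a multi-byte character.
    chars = utf8_bytes.decode("utf-8")

    ngrams = []
    # Strict bounds: lengths run from lower_bound + 1 to upper_bound - 1.
    for n in range(lower_bound + 1, upper_bound):
        for start in range(len(chars) - n + 1):
            ngrams.append(chars[start:start + n])
    return ngrams


if __name__ == "__main__":
    # "caf\u00e9" is "café" written with a precomposed e-acute.
    print(input_string_ngrams("caf\u00e9", 1, 4))
```

The resulting list could then be passed to whatever matching step compares the input string with candidate domains; that step is outside the scope of this sketch.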
-
Publication number: US09785629B2
Publication date: 2017-10-10
Application number: US14970414
Filing date: 2015-12-15
Applicant: VERISIGN, INC.
Inventor: Ronald Andrew Hoskinson, Lambert Arians, Marc Anderson, Mahendra Jain
CPC classification number: G06F17/275, G06F15/16, G06F17/20, G06F17/2217, G06F17/28, G06F17/30141, H04L61/1511, H04L61/302, H04L61/3035
Abstract: Methods and systems for automated language detection for domain names are disclosed. In some embodiments, a method for detecting a language of an Internationalized Domain Name (IDN) comprises receiving, by an I/O interface, a string of characters for the IDN; receiving training data, including a plurality of multi-gram analyses for a set of languages; analyzing, by a processor, the string of characters based on the training data, wherein the analyzing includes extracting a set of multi-grams from the string of characters and comparing the extracted set of multi-grams with the training data; detecting the language of the IDN based on results of the analyzing. In some embodiments, the method further comprises comparing the detected language of the IDN with a user selected language and using the IDN to generate a domain name, if the comparing indicates that the detected language of the IDN is consistent with the user selected language.
-
Publication number: US20160232154A1
Publication date: 2016-08-11
Application number: US14970414
Filing date: 2015-12-15
Applicant: VERISIGN, INC.
Inventor: Ronald Andrew Hoskinson, Lambert Arians, Marc Anderson, Mahendra Jain
CPC classification number: G06F17/275, G06F15/16, G06F17/20, G06F17/2217, G06F17/28, G06F17/30141, H04L61/1511, H04L61/302, H04L61/3035
Abstract: Methods and systems for automated language detection for domain names are disclosed. In some embodiments, a method for detecting a language of an Internationalized Domain Name (IDN) comprises receiving, by an I/O interface, a string of characters for the IDN; receiving training data, including a plurality of multi-gram analyses for a set of languages; analyzing, by a processor, the string of characters based on the training data, wherein the analyzing includes extracting a set of multi-grams from the string of characters and comparing the extracted set of multi-grams with the training data; detecting the language of the IDN based on results of the analyzing. In some embodiments, the method further comprises comparing the detected language of the IDN with a user selected language and using the IDN to generate a domain name, if the comparing indicates that the detected language of the IDN is consistent with the user selected language.
-
Publication number: US20150278188A1
Publication date: 2015-10-01
Application number: US14242190
Filing date: 2014-04-01
Applicant: VERISIGN, INC.
Inventor: Pallavi Aras, Ronald Andrew Hoskinson
CPC classification number: G06F17/271, G06F17/30876, H04L61/3025
Abstract: A plurality of input string n-grams may be generated by accessing an input string and generating a Universal character set transformation format (UTF) encoded input string from the input string. The UTF encoded input string may be parsed via an n-gram parser to generate a plurality of input string n-grams, where a length of each of the input string n-grams is larger than a lower bound and smaller than an upper bound. The generated plurality of input string n-grams may be provided to determine matches between the input string and a domain.