SCALABLE AND EFFECTIVE DOCUMENT SUMMARIZATION FRAMEWORK

    Publication number: US20170228457A1

    Publication date: 2017-08-10

    Application number: US15019646

    Filing date: 2016-02-09

    Applicant: Yahoo! Inc.

    CPC classification number: G06F17/30719 G06F17/271 G06F17/277 G06F17/279

    Abstract: Systems, methods, and apparatuses are disclosed for adaptively generating a summary of web-based content based on an attribute of a mobile communication device having transmitted a request for the web-based content. By adaptively generating the summary based on an attribute of the mobile communication device such as an amount of visual space available or a number of characters permitted in the interface, a display of the web-based content may be controlled on the mobile communication device in a way that was not previously available. This enables control of displaying web-based content that has been adaptively generated to be displayed on limited display screens based on a learned attribute of the mobile communication device requesting the web-based content.
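
A minimal sketch of the idea of fitting a summary to a device attribute, assuming the attribute is a character limit. The function name `adaptive_summary` and the parameter `max_chars` are hypothetical, and the greedy leading-sentence strategy is an illustrative stand-in, not the method claimed in the application.

```python
import re


def adaptive_summary(text: str, max_chars: int) -> str:
    """Keep as many leading sentences as fit within max_chars (hypothetical device limit)."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    kept, used = [], 0
    for sentence in sentences:
        extra = len(sentence) + (1 if kept else 0)  # +1 for the joining space
        if used + extra > max_chars:
            break
        kept.append(sentence)
        used += extra
    if kept or not sentences or max_chars <= 0:
        return " ".join(kept)
    # No whole sentence fits: hard-truncate the first one and mark the cut.
    return sentences[0][: max_chars - 1].rstrip() + "\u2026"


if __name__ == "__main__":
    article = "Alpha beta. Gamma delta epsilon. Zeta."
    for limit in (5, 20, 32):
        print(limit, repr(adaptive_summary(article, limit)))
```

A device reporting a larger limit receives more sentences; a very small limit falls back to a truncated first sentence.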

    TOPICAL BASED MEDIA CONTENT SUMMARIZATION SYSTEM AND METHOD
    Type: Patent application

    Status: Under examination (published)

    Publication number: US20160299968A1

    Publication date: 2016-10-13

    Application number: US14682654

    Filing date: 2015-04-09

    Applicant: Yahoo! Inc.

    CPC classification number: G06F17/30843 G06F17/30719 G06F17/30722

    Abstract: Disclosed herein is an automated approach for summarizing media content using descriptive information associated with the media content. For example and without limitation, the descriptive information may comprise a title associated with the media content. One or more segments of the media content may be identified to form a media content summary based on each segment's respective similarity to the descriptive information, which respective similarity may be determined using a media content and auxiliary data feature spaces. A shared dictionary of canonical patterns generated using the media content and auxiliary data feature spaces may be used in determining a media content segment's similarity to the descriptive information.
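
A rough sketch of choosing segments by their similarity to the title. It assumes each segment and the title have already been mapped to feature vectors in a common space, and it substitutes plain cosine similarity for the shared dictionary of canonical patterns described in the abstract. The names `cosine` and `select_segments` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_segments(segment_vectors, title_vector, k):
    """Return indices of the k segments most similar to the title, in original order."""
    ranked = sorted(
        range(len(segment_vectors)),
        key=lambda i: cosine(segment_vectors[i], title_vector),
        reverse=True,
    )
    return sorted(ranked[:k])  # restore temporal order for the summary


if __name__ == "__main__":
    segments = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.9, 0.1]]
    title = [1.0, 0.0]
    print(select_segments(segments, title, k=2))
```

Returning the indices in their original order keeps the summary chronological even though the ranking is by similarity.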

    Location-Based Recommendations Using Nearest Neighbors in a Locality Sensitive Hashing (LSH) Index

    Publication number: US20170147575A1

    Publication date: 2017-05-25

    Application number: US14948213

    Filing date: 2015-11-20

    Applicant: Yahoo! Inc.

    Abstract: Software for a website hosting short-text services creates an index of buckets for locality sensitive hashing (LSH). The software stores the index in an in-memory database of key-value pairs. The software creates, on a mobile device, a cache backed by the in-memory database. The software then uses a short text to create a query embedding. The software maps the query embedding to corresponding buckets in the index and determines which of the corresponding buckets are nearest neighbors to the query embedding using a similarity measure. The software displays location types associated with each of the buckets that are nearest neighbors in a view in a graphical user interface (GUI) on the mobile device and receives a user selection as to one of the location types. Then the software displays the entities for the selected location type in a GUI view on the mobile device.
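
The bucket lookup can be sketched with random-hyperplane LSH, which is one common LSH family and is an assumption here; the abstract does not say which hashing scheme is used. A plain Python dict stands in for the in-memory key-value database, cosine similarity stands in for the similarity measure, and all function names and sample vectors are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals; each contributes one bit of the key."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def bucket_key(vec, planes):
    """Hash a vector to a bit string: one bit per hyperplane side."""
    return "".join(
        "1" if sum(p * x for p, x in zip(plane, vec)) >= 0 else "0" for plane in planes
    )


def build_index(items, planes):
    """Map bucket key -> list of item names (a dict stands in for the key-value store)."""
    index = defaultdict(list)
    for name, vec in items.items():
        index[bucket_key(vec, planes)].append(name)
    return dict(index)


def query(index, items, planes, query_vec, top_k):
    """Look up the query's bucket, then rank its members by cosine similarity."""
    candidates = index.get(bucket_key(query_vec, planes), [])
    return sorted(candidates, key=lambda n: cosine(items[n], query_vec), reverse=True)[:top_k]


if __name__ == "__main__":
    items = {"cafe": [1.0, 0.2, 0.0], "gym": [-1.0, 0.5, 0.3]}
    planes = make_hyperplanes(dim=3, n_bits=8)
    index = build_index(items, planes)
    print(query(index, items, planes, [2.0, 0.4, 0.0], top_k=1))
```

Only the vectors in the matching bucket are compared against the query, which is what keeps the lookup cheap relative to scanning every item.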

    Entity disambiguation
    Type: Granted patent

    Publication number: US11907858B2

    Publication date: 2024-02-20

    Application number: US15425978

    Filing date: 2017-02-06

    Applicant: Yahoo!, Inc.

    CPC classification number: G06N5/04 G06F16/36

    Abstract: One or more computing devices, systems, and/or methods for entity disambiguation are provided. For example, a document may be analyzed to identify a first mention and a second mention. One or more techniques may be used to select and link a candidate entity, from a first set of candidate entities, to the first mention and select and link a candidate entity, from a second set of candidate entities, to the second mention.

    ENTITY DISAMBIGUATION
    Type: Patent application

    Publication number: US20180225576A1

    Publication date: 2018-08-09

    Application number: US15425978

    Filing date: 2017-02-06

    Applicant: Yahoo!, Inc.

    CPC classification number: G06N5/04 G06F16/36

    Abstract: One or more computing devices, systems, and/or methods for entity disambiguation are provided. For example, a document may be analyzed to identify a first mention and a second mention. One or more techniques may be used to select and link a candidate entity, from a first set of candidate entities, to the first mention and select and link a candidate entity, from a second set of candidate entities, to the second mention.

    Location-based recommendations using nearest neighbors in a locality sensitive hashing (LSH) index

    Publication number: US10521413B2

    Publication date: 2019-12-31

    Application number: US14948213

    Filing date: 2015-11-20

    Applicant: Yahoo! Inc.

    Abstract: Software for a website hosting short-text services creates an index of buckets for locality sensitive hashing (LSH). The software stores the index in an in-memory database of key-value pairs. The software creates, on a mobile device, a cache backed by the in-memory database. The software then uses a short text to create a query embedding. The software maps the query embedding to corresponding buckets in the index and determines which of the corresponding buckets are nearest neighbors to the query embedding using a similarity measure. The software displays location types associated with each of the buckets that are nearest neighbors in a view in a graphical user interface (GUI) on the mobile device and receives a user selection as to one of the location types. Then the software displays the entities for the selected location type in a GUI view on the mobile device.

    Scalable Multilingual Named-Entity Recognition

    Publication number: US20180203843A1

    Publication date: 2018-07-19

    Application number: US15406586

    Filing date: 2017-01-13

    Applicant: Yahoo! Inc.

    Abstract: Software on a website serves a user of an online content aggregation service a first article that the user views. The software extracts named entities from the first article using a named-entity recognizer. The named-entity recognizer uses a sequence of word embeddings as inputs to a conditional random field (CRF) tool to assign labels to each of the word embeddings. Each of the word embeddings is associated with a word in the first article and is trained using an entire topical article from a corpus of topical articles as a context for the word. The software then creates rankings for articles ingested by the content aggregation service based at least in part on the named entities and serves the user a second article using the rankings.
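
As an illustration of labeling a sequence of word embeddings with a linear-chain CRF, the sketch below runs Viterbi decoding over hand-set scores. It assumes emission scores are a linear function of each embedding; the label set, weights, transition scores, and function names are all hypothetical, and training is omitted. It is not the recognizer described in the application.

```python
def emission_scores(embeddings, label_weights):
    """Score each label at each position as a dot product with a per-label weight vector."""
    return [
        [sum(w * x for w, x in zip(weights, emb)) for weights in label_weights]
        for emb in embeddings
    ]


def viterbi(emissions, transitions):
    """Return the highest-scoring label index sequence.

    emissions[t][y] is the score of label y at position t;
    transitions[p][y] is the score of moving from label p to label y.
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for y in range(n_labels):
            best_prev = max(range(n_labels), key=lambda p: score[p] + transitions[p][y])
            new_score.append(score[best_prev] + transitions[best_prev][y] + emissions[t][y])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    best = max(range(n_labels), key=lambda y: score[y])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return path[::-1]


if __name__ == "__main__":
    labels = ["O", "B-ENT", "I-ENT"]
    # Toy 2-d "embeddings": first axis ~ ordinary word, second axis ~ entity word.
    embeddings = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
    label_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.9]]
    transitions = [
        [0.0, 0.0, -10.0],  # from O: I-ENT may not follow O
        [0.0, -1.0, 0.5],   # from B-ENT: prefer continuing with I-ENT
        [0.0, 0.0, 0.2],    # from I-ENT
    ]
    path = viterbi(emission_scores(embeddings, label_weights), transitions)
    print([labels[i] for i in path])
```

The transition scores are what let the decoder prefer a well-formed B-ENT, I-ENT span over labeling both entity words B-ENT, even though B-ENT has the higher emission score at each position.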

    COMPUTERIZED SYSTEM AND METHOD FOR FORMATTED TRANSCRIPTION OF MULTIMEDIA CONTENT
    Type: Patent application

    Status: Under examination (published)

    Publication number: US20170062010A1

    Publication date: 2017-03-02

    Application number: US14843185

    Filing date: 2015-09-02

    Applicant: YAHOO! INC.

    Abstract: Disclosed are systems and methods for improving interactions with and between computers in content searching, generating, hosting and/or providing systems supported by or configured with personal computing devices, servers and/or platforms. The systems interact to identify and retrieve data within or across platforms, which can be used to improve the quality of data used in processing interactions between or among processors in such systems. The disclosed systems and methods provide systems and methods for automatic creation of a formatted, readable transcript of multimedia content, which is derived, extracted, determined, or otherwise identified from the multimedia content. The formatted, readable transcript can be utilized to increase accuracy and efficiency in search engine optimization, as well as identification of relevant digital content available for communication to a user.
