Noise-robust feature extraction using multi-layer principal component analysis
    1.
    Granted patent
    Status: In force

    Publication No.: US07457749B2

    Publication date: 2008-11-25

    Application No.: US11422862

    Filing date: 2006-06-07

    CPC classification number: G06K9/4647 G06K9/6232 G10L15/02 G10L15/20

    Abstract: Extracting features from signals for use in classification, retrieval, or identification of data represented by those signals uses a “Distortion Discriminant Analysis” (DDA) of a set of training signals to define parameters of a signal feature extractor. The signal feature extractor takes signals having one or more dimensions with a temporal or spatial structure, applies an oriented principal component analysis (OPCA) to limited regions of the signal, aggregates the output of multiple OPCAs that are spatially or temporally adjacent, and applies OPCA to the aggregate. The steps of aggregating adjacent OPCA outputs and applying OPCA to the aggregated values are performed one or more times for extracting low-dimensional noise-robust features from signals, including audio signals, images, video data, or any other time or frequency domain signal. Such extracted features are useful for many tasks, including automatic authentication or identification of particular signals, or particular elements within such signals.
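For orientation, oriented PCA is commonly written as a generalized Rayleigh quotient. The abstract does not state a formula, so the symbols below ($C$, $R$, $\mathbf{n}$, $q$) are assumptions introduced only for illustration, not the patent's own notation.

```latex
% Assumed notation (not from the abstract):
%   C : covariance matrix of the undistorted training signals
%   R : correlation matrix of the distortion (noise) vectors
%   n : a candidate projection direction
q(\mathbf{n}) = \frac{\mathbf{n}^{\top} C\,\mathbf{n}}{\mathbf{n}^{\top} R\,\mathbf{n}},
\qquad
C\,\mathbf{n} = q\,R\,\mathbf{n}
```

Directions that maximize $q$ solve the generalized eigenproblem on the right. When $R$ is the identity this reduces to ordinary PCA. In a layered scheme of the kind the abstract describes, the projections from adjacent regions would be concatenated and the same problem solved again on the concatenated vectors.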


    System and method for speeding up database lookups for multiple synchronized data streams

    Publication No.: US20060106867A1

    Publication date: 2006-05-18

    Application No.: US10980684

    Filing date: 2004-11-02

    Abstract: A “Media Identifier” operates on concurrent media streams to provide large numbers of clients with real-time server-side identification of media objects embedded in streaming media, such as radio, television, or Internet broadcasts. Such media objects may include songs, commercials, jingles, station identifiers, etc. Identification of the media objects is provided to clients by comparing client-generated traces computed from media stream samples to a large database of stored, pre-computed traces (i.e., “fingerprints”) of known identification. Further, given a finite number of media streams and a much larger number of clients, many of the traces sent to the server are likely to be almost identical. Therefore, a searchable dynamic trace cache is used to limit the database queries necessary to identify particular traces. This trace cache caches only one copy of recent traces along with the database search results, either positive or negative. Cache entries are then removed as they age.
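The trace cache described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: traces are modeled as integer tuples, "almost identical" is approximated by a per-element tolerance, and the class name `TraceCache`, its parameters, and the `lookup` callable are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Trace = Tuple[int, ...]


class TraceCache:
    """Keeps one copy of each recent trace together with its lookup result."""

    def __init__(self,
                 lookup: Callable[[Trace], Optional[str]],
                 max_age: float = 30.0,
                 tolerance: int = 1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup        # hypothetical stand-in for the full database search
        self._max_age = max_age      # seconds before a cache entry is dropped
        self._tolerance = tolerance  # assumed notion of "almost identical"
        self._clock = clock
        self._entries: Dict[Trace, Tuple[Optional[str], float]] = {}
        self.db_queries = 0          # counts how often the database was actually hit

    def _evict_old(self, now: float) -> None:
        stale = [t for t, (_, ts) in self._entries.items()
                 if now - ts > self._max_age]
        for t in stale:
            del self._entries[t]

    def _near(self, a: Trace, b: Trace) -> bool:
        return len(a) == len(b) and all(
            abs(x - y) <= self._tolerance for x, y in zip(a, b))

    def identify(self, trace: Trace) -> Optional[str]:
        now = self._clock()
        self._evict_old(now)
        for cached, (result, _) in self._entries.items():
            if self._near(cached, trace):
                return result        # positive or negative result served from cache
        self.db_queries += 1
        result = self._lookup(trace)
        self._entries[trace] = (result, now)  # negative results (None) are cached too
        return result
```

A linear scan over cached traces is used here only for brevity; with many streams a real system would presumably index the cache.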

    System and method for automatically customizing a buffered media stream

    Publication No.: US20060092282A1

    Publication date: 2006-05-04

    Application No.: US10987365

    Filing date: 2004-11-12

    Abstract: A “media stream customizer” customizes buffered media streams by inserting one or more media objects into the stream to maintain an approximate buffer level. Specifically, when media objects such as songs, jingles, advertisements, etc., are deleted from the buffered stream (based on some user specified preferences), the buffer level will decrease. Therefore, over time, as more objects are deleted, the amount of the media stream being buffered continues to decrease, thereby limiting the ability to perform additional deletions from the stream. To address this limitation, the media stream customizer automatically chooses one or more media objects to insert back into the stream, and ensures that the inserted objects are consistent with any surrounding content of the media stream, thereby maintaining an approximate buffer level. In addition, the buffered content can also be stretched using pitch preserving audio stretching techniques to further compensate for deletions from the buffered stream.
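The delete-then-refill idea can be illustrated with a short sketch. Everything here is an assumption made for the example: the `MediaObject` type, the `customize` function, the rule that an inserted object is "consistent" when its kind already occurs in the buffer, and the choice to append insertions at the end rather than place them within the stream.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class MediaObject:
    name: str
    kind: str        # e.g. "song", "ad", "jingle"
    seconds: float


def customize(buffer: List[MediaObject],
              unwanted_kinds: Set[str],
              library: List[MediaObject],
              target_seconds: float) -> List[MediaObject]:
    """Delete unwanted objects, then insert library objects to refill the buffer."""
    kept = [o for o in buffer if o.kind not in unwanted_kinds]
    level = sum(o.seconds for o in kept)
    present = {o.name for o in kept}
    kinds_present = {o.kind for o in kept}

    for cand in library:
        if level >= target_seconds:
            break                                  # buffer level restored
        if cand.name in present or cand.kind in unwanted_kinds:
            continue                               # no repeats, nothing the user rejected
        if kinds_present and cand.kind not in kinds_present:
            continue                               # crude stand-in for a consistency check
        kept.append(cand)
        present.add(cand.name)
        level += cand.seconds
    return kept
```

The target is approximate by design: the loop stops as soon as the level reaches or passes `target_seconds`, so the result may overshoot by up to one object.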

    System and method for inferring similarities between media objects
    4.
    Patent application
    Status: Under examination (published)

    Publication No.: US20060080356A1

    Publication date: 2006-04-13

    Application No.: US10965604

    Filing date: 2004-10-13

    CPC classification number: G06F16/40

    Abstract: A “similarity quantifier” automatically infers similarity between media objects that have no inherent measure of distance between them. For example, a human listener can easily determine that a song like “Solsbury Hill” by Peter Gabriel is more similar to “Everybody Hurts” by R.E.M. than it is to “Highway to Hell” by AC/DC. However, automatic determination of this similarity is typically a more difficult problem. This problem is addressed by using a combination of techniques for inferring similarities between media objects, thereby facilitating media object filing, retrieval, classification, playlist construction, etc. Specifically, a combination of audio fingerprinting and repeat object detection is used for gathering statistics on broadcast media streams. These statistics include each media object's identity and positions within the media stream. Similarities between media objects are then inferred based on the observation that objects appearing closer together in an authored stream are more likely to be similar.
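The closing observation, that objects played close together in an authored stream tend to be similar, can be turned into a simple co-occurrence score. The sketch below is only an illustration: the function name `infer_similarity`, the `window` parameter, and the 1/distance weighting are assumptions, and the streams are taken to be already-identified sequences of object IDs.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def infer_similarity(streams: List[List[str]],
                     window: int = 2) -> Dict[Tuple[str, str], float]:
    """Score object pairs by how closely they appear together in each stream."""
    scores: Dict[Tuple[str, str], float] = defaultdict(float)
    for stream in streams:
        for i, a in enumerate(stream):
            last = min(i + window, len(stream) - 1)
            for j in range(i + 1, last + 1):
                b = stream[j]
                if a == b:
                    continue                  # a repeat of the same object says nothing
                pair = (a, b) if a < b else (b, a)
                scores[pair] += 1.0 / (j - i)  # assumed weighting: nearer counts more
    return dict(scores)
```

With toy streams `[["s1", "s2", "s3"], ["s2", "s1"]]`, the pair `("s1", "s2")` is adjacent in both streams and so scores higher than `("s1", "s3")`, which only co-occurs at distance two.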


    Multi-level search
    5.
    Patent application
    Status: In force

    Publication No.: US20080313147A1

    Publication date: 2008-12-18

    Application No.: US11818088

    Filing date: 2007-06-13

    CPC classification number: G06F17/30657

    Abstract: A computer-implementable method and system for performing a multi-level search. The method includes performing a primary search that involves executing a query submitted by a user, and returning primary search results (a list of documents, for example). The method further includes automatically performing a secondary search. The secondary search involves identifying at least one third-party source of information based on the query, and automatically assessing a semantic interpretation of the query. The secondary search utilizes the identified at least one third-party source of information and the semantic interpretation of the query to derive secondary search results, which are displayed along with the primary search results.
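The two-stage flow can be sketched in a few lines. This is a toy under loud assumptions: the keyword table `INTERPRETATIONS` stands in for whatever semantic interpretation the method actually performs, third-party sources are modeled as plain callables keyed by interpretation, and the primary search is a naive term match.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical keyword-to-interpretation table (illustration only).
INTERPRETATIONS: Dict[str, str] = {
    "weather": "weather_forecast",
    "stock": "stock_quote",
}


def interpret(query: str) -> Optional[str]:
    """Return an assumed semantic interpretation of the query, if any."""
    for word in query.lower().split():
        if word in INTERPRETATIONS:
            return INTERPRETATIONS[word]
    return None


def multi_level_search(
        query: str,
        documents: List[str],
        sources: Dict[str, Callable[[str], List[str]]],
) -> Tuple[List[str], List[str]]:
    """Run the primary search, then an automatic secondary search."""
    terms = query.lower().split()
    primary = [d for d in documents if any(t in d.lower() for t in terms)]

    meaning = interpret(query)
    source = sources.get(meaning) if meaning is not None else None
    secondary = source(query) if source is not None else []
    return primary, secondary   # both result sets are shown to the user together
```

A query with no recognized interpretation simply yields an empty secondary result list, leaving the primary results unchanged.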


    NOISE-ROBUST FEATURE EXTRACTION USING MULTI-LAYER PRINCIPAL COMPONENT ANALYSIS
    6.
    Patent application
    Status: In force

    Publication No.: US20060217968A1

    Publication date: 2006-09-28

    Application No.: US11422862

    Filing date: 2006-06-07

    CPC classification number: G06K9/4647 G06K9/6232 G10L15/02 G10L15/20

    Abstract: Extracting features from signals for use in classification, retrieval, or identification of data represented by those signals uses a “Distortion Discriminant Analysis” (DDA) of a set of training signals to define parameters of a signal feature extractor. The signal feature extractor takes signals having one or more dimensions with a temporal or spatial structure, applies an oriented principal component analysis (OPCA) to limited regions of the signal, aggregates the output of multiple OPCAs that are spatially or temporally adjacent, and applies OPCA to the aggregate. The steps of aggregating adjacent OPCA outputs and applying OPCA to the aggregated values are performed one or more times for extracting low-dimensional noise-robust features from signals, including audio signals, images, video data, or any other time or frequency domain signal. Such extracted features are useful for many tasks, including automatic authentication or identification of particular signals, or particular elements within such signals.


    System and method for speeding up database lookups for multiple synchronized data streams
    7.
    Granted patent
    Status: Lapsed

    Publication No.: US07574451B2

    Publication date: 2009-08-11

    Application No.: US10980684

    Filing date: 2004-11-02

    Abstract: A “Media Identifier” operates on concurrent media streams to provide large numbers of clients with real-time server-side identification of media objects embedded in streaming media, such as radio, television, or Internet broadcasts. Such media objects may include songs, commercials, jingles, station identifiers, etc. Identification of the media objects is provided to clients by comparing client-generated traces computed from media stream samples to a large database of stored, pre-computed traces (i.e., “fingerprints”) of known identification. Further, given a finite number of media streams and a much larger number of clients, many of the traces sent to the server are likely to be almost identical. Therefore, a searchable dynamic trace cache is used to limit the database queries necessary to identify particular traces. This trace cache caches only one copy of recent traces along with the database search results, either positive or negative. Cache entries are then removed as they age.


    Noise-robust feature extraction using multi-layer principal component analysis
    8.
    Granted patent
    Status: Lapsed

    Publication No.: US07082394B2

    Publication date: 2006-07-25

    Application No.: US10180271

    Filing date: 2002-06-25

    CPC classification number: G06K9/4647 G06K9/6232 G10L15/02 G10L15/20

    Abstract: Extracting features from signals for use in classification, retrieval, or identification of data represented by those signals uses a “Distortion Discriminant Analysis” (DDA) of a set of training signals to define parameters of a signal feature extractor. The signal feature extractor takes signals having one or more dimensions with a temporal or spatial structure, applies an oriented principal component analysis (OPCA) to limited regions of the signal, aggregates the output of multiple OPCAs that are spatially or temporally adjacent, and applies OPCA to the aggregate. The steps of aggregating adjacent OPCA outputs and applying OPCA to the aggregated values are performed one or more times for extracting low-dimensional noise-robust features from signals, including audio signals, images, video data, or any other time or frequency domain signal. Such extracted features are useful for many tasks, including automatic authentication or identification of particular signals, or particular elements within such signals.


    Web spam page classification using query-dependent data
    9.
    Granted patent
    Status: In force

    Publication No.: US07853589B2

    Publication date: 2010-12-14

    Application No.: US11742156

    Filing date: 2007-04-30

    CPC classification number: G06F17/3089

    Abstract: A web spam page classifier is described that identifies web spam pages based on features of a search query and web page pair. The features can be extracted from training instances and a training algorithm can be employed to develop the classifier. Pages identified as web spam pages can be demoted and/or removed from a relevancy ranked list.
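A query-dependent feature extractor and the demotion step might look like the sketch below. The two features (`coverage`, `density`), the function names, and the use of a caller-supplied `is_spam` predicate in place of a trained classifier are all assumptions for illustration; the abstract does not specify which features or training algorithm are used.

```python
from typing import Callable, Dict, List


def pair_features(query: str, page_text: str) -> Dict[str, float]:
    """Compute features of a (query, page) pair rather than of the page alone."""
    q_terms = set(query.lower().split())
    words = page_text.lower().split()
    if not q_terms or not words:
        return {"coverage": 0.0, "density": 0.0}
    # Fraction of distinct query terms that occur on the page.
    coverage = sum(1 for t in q_terms if t in words) / len(q_terms)
    # Fraction of page words that are query terms (keyword stuffing signal).
    density = sum(1 for w in words if w in q_terms) / len(words)
    return {"coverage": coverage, "density": density}


def demote_spam(query: str,
                ranked_pages: List[str],
                is_spam: Callable[[Dict[str, float]], bool]) -> List[str]:
    """Move pages flagged as spam to the end, preserving order otherwise."""
    flags = [is_spam(pair_features(query, p)) for p in ranked_pages]
    clean = [p for p, f in zip(ranked_pages, flags) if not f]
    spam = [p for p, f in zip(ranked_pages, flags) if f]
    return clean + spam
```

In practice `is_spam` would be the classifier learned from labeled training instances; a fixed threshold such as `lambda f: f["density"] > 0.5` is enough to exercise the sketch.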


    System and method for automatically customizing a buffered media stream
    10.
    Granted patent
    Status: In force

    Publication No.: US07826708B2

    Publication date: 2010-11-02

    Application No.: US10980683

    Filing date: 2004-11-02

    Abstract: A “media stream customizer” customizes buffered media streams by inserting one or more media objects into the stream to maintain an approximate buffer level. Specifically, when media objects such as songs, jingles, advertisements, etc., are deleted from the buffered stream (based on some user specified preferences), the buffer level will decrease. Therefore, over time, as more objects are deleted, the amount of the media stream being buffered continues to decrease, thereby limiting the ability to perform additional deletions from the stream. To address this limitation, the media stream customizer automatically chooses one or more media objects to insert back into the stream, and ensures that the inserted objects are consistent with any surrounding content of the media stream, thereby maintaining an approximate buffer level. In addition, the buffered content can also be stretched using pitch preserving audio stretching techniques to further compensate for deletions from the buffered stream.

