-
Publication number: US09495591B2
Publication date: 2016-11-15
Application number: US13664295
Filing date: 2012-10-30
Applicant: QUALCOMM Incorporated
Inventor: Erik Visser, Haiyin Wang, Hasib A. Siddiqui, Lae-Hoon Kim
CPC classification number: G06K9/00624, G06K9/00, G06K9/0063, G06K9/3233, G06K9/4628, G06K9/4671, G06K9/6293, G06T7/20, H04R3/00, H04R3/005, H04S7/30, H04S2400/11, H04S2400/15
Abstract: Methods, systems and articles of manufacture for recognizing and locating one or more objects in a scene are disclosed. An image and/or video of the scene are captured. Using audio recorded at the scene, an object search of the captured scene is narrowed down. For example, the direction of arrival (DOA) of a sound can be determined and used to limit the search area in a captured image/video. In another example, keypoint signatures may be selected based on types of sounds identified in the recorded audio. A keypoint signature corresponds to a particular object that the system is configured to recognize. Objects in the scene may then be recognized using a shift invariant feature transform (SIFT) analysis comparing keypoints identified in the captured scene to the selected keypoint signatures.
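The abstract's first example, using a sound's direction of arrival (DOA) to limit the image search area, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the patent: it assumes a pinhole camera whose optical axis coincides with the microphone array's zero-azimuth direction, and the function name `doa_to_column_range` and the parameters `hfov_deg` and `margin_deg` are hypothetical.

```python
import math


def doa_to_column_range(azimuth_deg, image_width, hfov_deg=60.0, margin_deg=10.0):
    """Map a DOA azimuth to a range of pixel columns worth searching.

    Assumptions (not from the source): azimuth 0 is the optical axis,
    positive angles point right, and the camera follows a pinhole model
    with horizontal field of view `hfov_deg`.
    """
    half_fov = math.radians(hfov_deg) / 2.0
    # Focal length in pixels for a pinhole camera of this width and FOV.
    focal_px = (image_width / 2.0) / math.tan(half_fov)

    def column(angle_deg):
        # Clamp the angle to the field of view, then project onto the image.
        angle = max(-half_fov, min(half_fov, math.radians(angle_deg)))
        return image_width / 2.0 + focal_px * math.tan(angle)

    # Widen the window by an uncertainty margin on both sides of the DOA.
    left = int(math.floor(column(azimuth_deg - margin_deg)))
    right = int(math.ceil(column(azimuth_deg + margin_deg)))
    return max(0, left), min(image_width, right)


# Example: a sound arriving straight ahead in a 640-pixel-wide frame.
left, right = doa_to_column_range(0.0, 640)
print(left, right)
```

Keypoint detection would then run only on columns `left` to `right` of the frame. The margin accounts for DOA estimation error; a real system would also need to handle elevation and any offset between the camera and the microphones.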
-
Publication number: US20130272548A1
Publication date: 2013-10-17
Application number: US13664295
Filing date: 2012-10-30
Applicant: QUALCOMM INCORPORATED
Inventor: Erik Visser, Haiyin Wang, Hasib A. Siddiqui, Lae-Hoon Kim
CPC classification number: G06K9/00624, G06K9/00, G06K9/0063, G06K9/3233, G06K9/4628, G06K9/4671, G06K9/6293, G06T7/20, H04R3/00, H04R3/005, H04S7/30, H04S2400/11, H04S2400/15
Abstract: Methods, systems and articles of manufacture for recognizing and locating one or more objects in a scene are disclosed. An image and/or video of the scene are captured. Using audio recorded at the scene, an object search of the captured scene is narrowed down. For example, the direction of arrival (DOA) of a sound can be determined and used to limit the search area in a captured image/video. In another example, keypoint signatures may be selected based on types of sounds identified in the recorded audio. A keypoint signature corresponds to a particular object that the system is configured to recognize. Objects in the scene may then be recognized using a shift invariant feature transform (SIFT) analysis comparing keypoints identified in the captured scene to the selected keypoint signatures.
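The abstract's second example, selecting keypoint signatures based on the types of sounds identified in the audio, might look something like the sketch below. The database layout, the labels, and the function `select_signatures` are hypothetical and chosen only for illustration; the source does not specify how signatures are stored, and the subsequent keypoint matching step is not shown.

```python
def select_signatures(detected_sounds, signature_db):
    """Return only the signatures whose object can produce a detected sound.

    `signature_db` maps an object label to a dict with two keys
    (an assumed layout, not from the source):
      "sounds":    set of sound-type labels the object is associated with
      "keypoints": the stored keypoint signature for that object
    """
    detected = set(detected_sounds)
    return {
        label: entry["keypoints"]
        for label, entry in signature_db.items()
        if entry["sounds"] & detected  # keep objects sharing at least one sound type
    }


# Toy database with placeholder keypoint data.
signature_db = {
    "piano": {"sounds": {"piano_note"}, "keypoints": ["kp_piano"]},
    "guitar": {"sounds": {"guitar_strum"}, "keypoints": ["kp_guitar"]},
    "drum": {"sounds": {"drum_hit"}, "keypoints": ["kp_drum"]},
}

candidates = select_signatures(["guitar_strum", "drum_hit"], signature_db)
print(sorted(candidates))
```

Only the returned candidates would be compared against keypoints found in the captured scene, which shrinks the matching workload when the audio rules out most known objects.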
-