-
Publication number: US20230186939A1
Publication date: 2023-06-15
Application number: US17547644
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
-
Publication number: US12249344B1
Publication date: 2025-03-11
Application number: US17853773
Filing date: 2022-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Christopher Evans, Sumit Garg, Ameya Agaskar, Mohammad Edris Qarghah, Zhengping Jin
IPC: G10L25/51, G10L15/08, G10L15/22, G10L19/018
Abstract: Described herein is a system for encoding audio watermarks with frequency extensions to enable enhanced watermark detection. An extended audio watermark may include an existing audio watermark and a duplicate audio watermark, enabling backwards compatibility with existing watermark detection while also enabling enhanced watermark detection with increased accuracy. For example, embedding the extended audio watermark enables (i) limited devices to perform watermark detection to detect the existing audio watermark, and (ii) improved devices to perform enhanced watermark detection to detect the extended audio watermark. As the extended audio watermark includes redundancy in the form of duplicate audio watermark(s), an accuracy of performing enhanced watermark detection is increased relative to detecting the existing audio watermark alone.
-
Publication number: US12182192B1
Publication date: 2024-12-31
Application number: US17854219
Filing date: 2022-06-30
Applicant: Amazon Technologies, Inc.
Abstract: A system configured to perform content identification using fingerprinting to recognize known media content. The system may generate a reference database including reference fingerprints for each media content item to include in the content identification. In addition, the system may generate a hash table that associates individual frames of the reference fingerprints with identification information for corresponding media content items. When a device is playing media content, the system may perform content identification by generating query fingerprints representing the media content and comparing the query fingerprints to the reference database. For example, the system may match a query fingerprint to a reference fingerprint by identifying which of the reference fingerprints shares the most frames with the query fingerprint using the hash table. In addition, the system may use additional decision criteria to confirm a match, such as fine-grain matching or tracking successive fingerprints over time.
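The hash-table lookup described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the integer frame values, and the `min_shared` threshold are all hypothetical, and the additional decision criteria mentioned above (fine-grain matching, tracking successive fingerprints) are left out.

```python
from collections import Counter, defaultdict


def build_hash_table(reference_fingerprints):
    """Map each fingerprint frame to the content IDs whose reference contains it."""
    table = defaultdict(set)
    for content_id, frames in reference_fingerprints.items():
        for frame in frames:
            table[frame].add(content_id)
    return table


def best_match(query_frames, table, min_shared=2):
    """Return the content ID sharing the most frames with the query, or None.

    min_shared is an assumed cutoff standing in for the abstract's
    "additional decision criteria"; it is not taken from the source.
    """
    votes = Counter()
    for frame in set(query_frames):
        for content_id in table.get(frame, ()):
            votes[content_id] += 1
    if not votes:
        return None
    content_id, shared = votes.most_common(1)[0]
    return content_id if shared >= min_shared else None


# Hypothetical reference database: content ID -> list of frame hashes.
references = {
    "song_a": [0x1A2B, 0x3C4D, 0x5E6F, 0x7081],
    "song_b": [0x1A2B, 0x9999, 0x8888, 0x7777],
}
table = build_hash_table(references)
result = best_match([0x3C4D, 0x5E6F, 0x7081, 0x0001], table)
```

Because each query frame is looked up directly in the table, the cost of a query grows with the number of query frames rather than with the size of the reference database.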
-
Publication number: US20240071408A1
Publication date: 2024-02-29
Application number: US18243804
Filing date: 2023-09-08
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
-
Publication number: US11790932B2
Publication date: 2023-10-17
Application number: US17547644
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Chao Wang, Sumit Garg, Rong Chen, James Garnet Droppo, Chia-Jung Chang
CPC classification number: G10L25/51, G06N3/045, G06N3/08, G10L25/21, G10L25/30, G10L15/08, G10L15/22, G10L2015/088, G10L2015/223
Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.
-
Publication number: US12205601B1
Publication date: 2025-01-21
Application number: US17853183
Filing date: 2022-06-29
Applicant: Amazon Technologies, Inc.
Inventor: David McGuire, Ahmed Abdelal, Sai Kiran Venkata Subramanya Rupanagudi, Sumit Garg, Terrence Yu, Nathaniel White, Siddharth Agrawal, Pavas Kant, Yuxuan Hao, Nagaraj Mahajan, Ameya Agaskar, Aaron Challenner
IPC: G10L19/018, G06F21/62, G06V20/40, G11B27/34, H04R3/00
Abstract: A system configured to perform content recognition using fingerprinting to recognize known media content. A device determines fingerprints based on decoded content data to be sent using a media interface component to an output component. Metadata related to the content/device/fingerprint may also be created. The fingerprints and metadata are sent by the device to a supporting system for orchestration and matching of the fingerprints to known media content.
-
Publication number: US12136428B1
Publication date: 2024-11-05
Application number: US17490271
Filing date: 2021-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Ameya Agaskar, Sumit Garg
IPC: G10L19/018, G10L15/22
Abstract: Described herein is a system for embedding audio watermarks. To improve performance without a user perceiving the audio watermark, a system embeds audio watermarks in audio data using scaling factors that are calculated based on a spectral masking level for each frame of the audio data. The scaling factors may vary over time and correspond to an amplitude of the audio watermark across a series of watermark frames. The system processes the audio data to determine a spectral mask, which represents an amount of energy perceived in a first frequency range that is caused by energy represented in neighboring frequency ranges. By selecting scaling factor values that keep an amplitude of the audio watermark below the threshold indicated by the spectral mask, the system may embed the audio watermark in the first audio data without the audio watermark being audible to the user.
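The per-frame scaling idea in this abstract can be sketched in Python as follows. This is an illustrative assumption, not the patented method: it takes the spectral mask levels as already computed (one amplitude threshold per frame), and the names `scaling_factors` and `embed`, the `margin` parameter, and the cap at unit gain are all hypothetical.

```python
def scaling_factors(mask_levels, watermark_peak, margin=0.9):
    """Compute one gain per frame so the scaled watermark stays under the mask.

    mask_levels: assumed per-frame masking thresholds (amplitude units).
    watermark_peak: peak amplitude of the unscaled watermark.
    margin: assumed safety factor below the threshold (not from the source).
    """
    if watermark_peak <= 0:
        raise ValueError("watermark_peak must be positive")
    # Cap at 1.0 so the watermark is never amplified above its original level.
    return [min(1.0, margin * level / watermark_peak) for level in mask_levels]


def embed(audio_frames, watermark_frames, factors):
    """Add the scaled watermark to the audio, frame by frame."""
    return [
        [a + g * w for a, w in zip(audio, mark)]
        for audio, mark, g in zip(audio_frames, watermark_frames, factors)
    ]


# Hypothetical values: three frames with loud, quiet, and very loud masking.
mask = [0.5, 0.05, 2.0]
factors = scaling_factors(mask, watermark_peak=0.5)
marked = embed(
    [[0.1, 0.2], [0.0, 0.0], [1.0, -1.0]],
    [[0.5, -0.5], [0.5, -0.5], [0.5, -0.5]],
    factors,
)
```

Frames with a low masking threshold receive a small gain, so the watermark is attenuated most where the surrounding audio would hide it least.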
-
Publication number: US11887602B1
Publication date: 2024-01-30
Application number: US17547894
Filing date: 2021-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Brendon Jude Wilson, Henry Michael D Souza, Cindy Angie Hou, Christopher Evans, Sumit Garg, Ravina Chopra
IPC: G10L15/30, G10L19/02, G10L15/22, G06F3/16, G10L19/018
Abstract: Techniques for performing audio-based device location determinations are described. A system may send, to a first device, a command to output audio requesting a location of the first device be determined. A second device may receive the audio and send, to the system, data representing the second device received the audio, where the received data includes spectral energy data representing a spectral energy of the audio as received by the second device. The system may, using the spectral energy data, determine attenuation data representing an attenuation experienced by the audio as it traveled from the first device to the second device. The system may generate, based on the attenuation data, spatial relationship data representing a spatial relationship between the first device and the second device, where the spatial relationship data is usable to determine a device for outputting a response to a subsequently received user input.