-
Publication No.: US20190156849A1
Publication Date: 2019-05-23
Application No.: US16178841
Filing Date: 2018-11-02
Applicant: Polycom, Inc.
Inventor: Jinwei Feng, Peter Chu
Abstract: A videoconference apparatus at a first location detects audio from a location and determines whether the sound should be included in an audio-video stream sent to a second location or excluded as interfering noise. Determining whether to include the audio involves using a face detector to see if there is a face at the source of the sound. If a face is present, the audio data from the location will be transmitted to the second location. If a face is not present, additional motion checks are performed to determine whether the sound corresponds to a person talking (such as a presenter at a meeting) or is instead unwanted noise.
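The decision flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation. The abstract does not say how the motion checks map to a decision, so the sketch assumes that detected motion at the sound source means a person is talking. The names `SoundSource`, `face_detected`, `motion_detected`, and `should_transmit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SoundSource:
    """Hypothetical summary of what the video analysis found at a sound's location."""
    face_detected: bool    # result of the face detector at the source location
    motion_detected: bool  # result of the additional motion checks


def should_transmit(source: SoundSource) -> bool:
    """Return True if audio from this source should be sent to the far end."""
    # A face at the source of the sound: treat it as a talker and transmit.
    if source.face_detected:
        return True
    # No face: fall back to the motion checks.
    # ASSUMPTION: motion implies a person talking (e.g. a presenter facing away);
    # no motion implies unwanted noise. The abstract does not spell this out.
    return source.motion_detected


# Usage: a presenter with their back to the camera vs. a stationary noise source.
presenter = SoundSource(face_detected=False, motion_detected=True)
noise = SoundSource(face_detected=False, motion_detected=False)
print(should_transmit(presenter), should_transmit(noise))
```

Checking the face detector first keeps the common case cheap, and the motion checks run only when no face is found, which matches the order the abstract describes.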
-
Publication No.: US10134414B1
Publication Date: 2018-11-20
Application No.: US15640385
Filing Date: 2017-06-30
Applicant: Polycom, Inc.
Inventor: Jinwei Feng, Peter Chu
IPC: G10L21/02, H04N7/15, G10L21/0216, H04R1/40, H04R3/00, H04N5/232, G10L25/57, G10L21/0208
Abstract: A videoconference apparatus at a first location detects audio from a location and determines whether the sound should be included in an audio-video stream sent to a second location or excluded as interfering noise. Determining whether to include the audio involves using a face detector to see if there is a face at the source of the sound. If a face is present, the audio data from the location will be transmitted to the second location. If a face is not present, additional motion checks are performed to determine whether the sound corresponds to a person talking (such as a presenter at a meeting) or is instead unwanted noise.
-
Publication No.: US10091412B1
Publication Date: 2018-10-02
Application No.: US15640358
Filing Date: 2017-06-30
Applicant: Polycom, Inc.
Inventor: Jinwei Feng, Peter Chu
Abstract: A system ensures that the best available view of a person's face is included in a video stream when the person's face is being captured by multiple cameras at multiple angles at a first endpoint. The system uses one or more microphone arrays to capture direct-reverberant ratio information corresponding to the views, and determines which view most closely matches a view of the person looking directly at the camera, thereby improving the experience for viewers at a second endpoint.
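As a rough illustration of the selection step, the sketch below picks the camera view whose microphone array reports the highest direct-to-reverberant ratio (DRR). It rests on assumptions the abstract does not confirm: that each view has a direct and a reverberant energy estimate, and that a higher DRR means the person is facing that camera more directly. The names `drr_db` and `pick_view` and the sample energies are hypothetical.

```python
import math
from typing import Dict, Tuple


def drr_db(direct_energy: float, reverberant_energy: float) -> float:
    """Direct-to-reverberant ratio in decibels: 10 * log10(E_direct / E_reverb)."""
    return 10.0 * math.log10(direct_energy / reverberant_energy)


def pick_view(energies: Dict[str, Tuple[float, float]]) -> str:
    """Return the view name with the highest DRR.

    `energies` maps a view name to (direct_energy, reverberant_energy).
    ASSUMPTION: the highest DRR marks the view the talker is facing.
    """
    return max(energies, key=lambda view: drr_db(*energies[view]))


# Usage with made-up energy estimates for three camera views.
views = {
    "left": (1.0, 2.0),
    "center": (4.0, 1.0),
    "right": (2.0, 2.0),
}
print(pick_view(views))
```

A production system would presumably smooth these estimates over time to avoid rapid switching between views, but the abstract does not describe that, so it is left out here.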
-