Obviousness — US Patent 8830293B2

US Patent 8830293B2, titled "Video superposition for continuous presence," aims to overcome limitations of prior-art videoconferencing systems. It combines real-time video streams so that all participants remain continuously visible in a more natural, lifelike arrangement: participants are layered front to back (an anterior/posterior arrangement, e.g., stadium seating) while their images stay near life-size.

The independent claims (Claim 1, method; Claim 10, apparatus; Claim 13, logic) share the same core features. An analysis of these claims and the dependent claims suggests that the claimed invention would have been obvious to a person having ordinary skill in the art (POSITA) at the time of the invention (priority date 2009-05-26) in view of certain combinations of prior art.

Independent Claim Analysis (e.g., Claim 1)

Claim 1 recites a method comprising:

Receiving at least first and second real-time video streams, each with a subject image and a background image.
Combining the subject images of corresponding video frames into a combined frame such that the subject image of the first video stream is positioned in an anterior portion and the subject image of the second video stream is positioned in a posterior portion.
The combining specifically includes:
- Scaling the video frames of the first video stream and repositioning them in a first direction.
- Removing the background image from the scaled first video frames to produce "first background separated video frames" for the anterior portion.
- Superimposing these "first background separated video frames" onto the corresponding video frames of the second video stream.
Supplying the combined video stream to a video display.
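
The combining steps recited above can be illustrated with a minimal sketch. It is not taken from the patent or from any cited reference: frames are modeled as small grayscale grids, background removal is assumed to be a simple match against a known background value (chroma-key style), and all function names (scale_nearest, remove_background, superimpose, combine_frames) are hypothetical. Repositioning "in a first direction" is folded into the (dx, dy) offset applied when superimposing.

```python
from typing import List, Optional

Frame = List[List[int]]                # grayscale pixels, row-major
Cutout = List[List[Optional[int]]]     # None marks a removed background pixel


def scale_nearest(frame: Frame, factor: float) -> Frame:
    """Resize a frame by `factor` using nearest-neighbour sampling."""
    h, w = len(frame), len(frame[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [
        [frame[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def remove_background(frame: Frame, bg_value: int) -> Cutout:
    """Assumption: every background pixel equals `bg_value`."""
    return [[None if px == bg_value else px for px in row] for row in frame]


def superimpose(cutout: Cutout, base: Frame, dx: int, dy: int) -> Frame:
    """Paste subject pixels onto a copy of `base`, shifted by (dx, dy)."""
    out = [row[:] for row in base]
    for y, row in enumerate(cutout):
        for x, px in enumerate(row):
            ty, tx = y + dy, x + dx
            inside = 0 <= ty < len(out) and 0 <= tx < len(out[0])
            if px is not None and inside:
                out[ty][tx] = px
    return out


def combine_frames(first: Frame, second: Frame, factor: float,
                   dx: int, dy: int, bg_value: int) -> Frame:
    """Scale the first frame, cut out its subject, lay it over the second."""
    scaled = scale_nearest(first, factor)
    cutout = remove_background(scaled, bg_value)
    return superimpose(cutout, second, dx, dy)


# First stream: subject pixel 9 on background 0. Second stream: uniform 5.
first = [[0, 0], [0, 9]]
second = [[5] * 4 for _ in range(4)]
combined = combine_frames(first, second, factor=2.0, dx=-1, dy=0, bg_value=0)
```

In this toy example the anterior subject (value 9) covers part of the second stream's frame, while the rest of that frame, including its own background, remains visible behind it.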

Prior Art Combinations and Motivation for Obviousness

The motivation for a POSITA would be to improve upon existing "continuous presence" videoconferencing solutions, such as the "Hollywood Squares" feature, which were known to display all participants simultaneously but suffered from reduced image sizes and an unnatural grid-like arrangement. The goal would be to achieve a more realistic, immersive, and natural visual experience, where participants appear to be in a shared space and maintain a "near life-size" appearance.

Combination 1: "Hollywood Squares" + US20090033737A1 (Goose) + US20080095470A1 (Chao)

"Hollywood Squares" (General Prior Art): This widely recognized videoconferencing feature demonstrates the concept of "continuous presence" by simultaneously displaying multiple participants. However, the patent itself acknowledges its drawbacks, specifically the proportional reduction in image size and the unnatural rectangular stacking of participants. This prior art establishes the problem of achieving realistic continuous presence and provides a clear motivation for improvement.
US20090033737A1 (Goose): Titled "Method and System for Video Conferencing in a Virtual Environment," Goose teaches "generating a composite image for display that includes a visual representation of each of the plurality of participants and a virtual environment. The composite image is generated by positioning the visual representation of at least one of the plurality of participants within the virtual environment."
- A POSITA would understand that "positioning a visual representation within a virtual environment" inherently requires isolating the participant's image (subject image) by removing its original background (i.e., background separation).
- The concept of a "virtual environment" naturally implies the ability to place participants at different perceived depths, thereby suggesting an "anterior portion" and a "posterior portion" to create a more realistic, layered scene, directly addressing the positioning aspect of Claim 1.
- While Goose suggests a new virtual environment, a POSITA, motivated to create depth using existing video streams, would find it an obvious design choice to use the full video frame (including its background) of one participant's stream as the "virtual environment" or posterior layer for another participant whose background has been removed.
US20080095470A1 (Chao): Titled "Digital Image Auto-Resizing," Chao teaches "automatically resizing a digital image or portions of a digital image based on a determined region of interest."
- A POSITA, when compositing participant images into a shared scene (as motivated by Goose), would find it obvious to apply scaling (Chao) to adjust the apparent size and perspective of the participant images. This allows for maintaining "near life-size" appearances or creating a sense of depth where anterior subjects might appear larger or closer than posterior subjects.
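
The scaling step discussed above comes down to simple arithmetic once a region of interest is known. The sketch below is an illustration only and does not reproduce Chao's method: it assumes the subject can be located as the bounding box of non-background pixels, and the helper names (subject_bbox, scale_for_target_height) and the target height are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]
BBox = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def subject_bbox(frame: Frame, bg_value: int) -> Optional[BBox]:
    """Bounding box of all pixels that differ from the assumed background."""
    rows = [y for y, row in enumerate(frame)
            if any(px != bg_value for px in row)]
    cols = [x for row in frame
            for x, px in enumerate(row) if px != bg_value]
    if not rows:
        return None                # no subject found in this frame
    return min(rows), min(cols), max(rows), max(cols)


def scale_for_target_height(bbox: BBox, target_h: int) -> float:
    """Factor that makes the subject `target_h` pixels tall on the display."""
    top, _, bottom, _ = bbox
    return target_h / (bottom - top + 1)


frame = [
    [0, 0, 0, 0],
    [0, 8, 8, 0],
    [0, 8, 8, 0],
    [0, 0, 0, 0],
]
bbox = subject_bbox(frame, bg_value=0)
factor = scale_for_target_height(bbox, target_h=6)
```

Choosing a larger target height for the anterior subject than for the posterior subject is one way such a factor could be used to suggest depth.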

Motivation to Combine for Claim 1:
A POSITA seeking a more realistic "continuous presence" experience than "Hollywood Squares" provides would be motivated to combine the background separation and compositing teachings of Goose with the image scaling capabilities of Chao. Goose's idea of "positioning participants within a virtual environment" directly addresses the need to separate subjects from their backgrounds and place them in a shared scene. The specific method of Claim 1 (superimposing a background-separated subject from a first stream onto the full video frames of a second stream) is a logical and straightforward way to create a layered, anterior/posterior depth effect using these known techniques. The "repositioning in a first direction" is a trivial aspect of placing a scaled subject image within a composite frame.

Dependent Claim Analysis

Obviousness of Claim 3

Claim 3, dependent on Claim 1, specifies further combining steps: scaling the second video stream, extending its background image to produce "background extended video frames," and then superimposing the first background-separated frames onto these extended frames.

Combination 2: Combination 1 + General Knowledge of Image Inpainting/Extension
The patent itself states that "extending of the background image... may be performed via a video enhancing technique (e.g., inpainting)", and details how inpainting can be accomplished (e.g., creating a static filler image, or copying and flipping portions of the background). These are presented as known techniques.

Motivation to Combine for Claim 3:
A POSITA following Combination 1 who chose to scale the "posterior" video stream (the second stream) would recognize that the scaling operation can leave gaps or uncovered areas in the combined frame. To fill them, the POSITA would be motivated to apply known video enhancing techniques, such as inpainting or background extension (which the patent itself treats as conventional), to extend the background of the scaled posterior video stream seamlessly. The result is a cohesive image once the foreground subject is superimposed, which corresponds to the steps of Claim 3.
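
One of the filler approaches mentioned above, copying and flipping portions of the background, can be sketched as follows. This is an assumption-laden illustration rather than the patent's implementation: each row of the scaled frame is padded out to a target width with alternately mirrored copies of itself, and the function names (mirror_extend_row, extend_background) and the `side` parameter are hypothetical.

```python
from typing import List

Frame = List[List[int]]


def mirror_extend_row(row: List[int], target_w: int) -> List[int]:
    """Pad `row` on the right with alternately flipped copies of itself."""
    if not row:
        raise ValueError("cannot extend an empty row")
    out = list(row)
    src = list(row)
    while len(out) < target_w:
        src = src[::-1]                        # flip before each new copy
        out.extend(src[: target_w - len(out)])
    return out                                 # unchanged if already wide enough


def extend_background(frame: Frame, target_w: int,
                      side: str = "right") -> Frame:
    """Extend every row of `frame` to `target_w` pixels on the given side."""
    if side == "right":
        return [mirror_extend_row(row, target_w) for row in frame]
    # Extending on the left: flip, extend on the right, flip back.
    return [mirror_extend_row(row[::-1], target_w)[::-1] for row in frame]


scaled_second = [[1, 2, 3], [4, 5, 6]]
extended_right = extend_background(scaled_second, 5)
extended_left = extend_background(scaled_second, 5, side="left")
```

Because each added strip starts with a mirror image of the adjacent edge, the seam between original and filler pixels is continuous, which is the visual point of flipping rather than simply repeating the background.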

Obviousness of Claim 5

Claim 5, dependent on Claim 1, specifies further combining steps: removing backgrounds from both first and second video streams, generating a supplemental background image, and superimposing both background-separated subjects onto this supplemental background.

Combination 3: "Hollywood Squares" + US20090033737A1 (Goose) + US20080095470A1 (Chao) (the same references as Combination 1)
Goose's teaching of "generating a composite image for display that includes a visual representation of each of the plurality of participants and a virtual environment" directly suggests this approach. The "virtual environment" serves as the "supplemental background image."

Motivation to Combine for Claim 5:
A POSITA, motivated to create the most unified and immersive "virtual meeting space" (beyond simply layering one participant over another's background), would directly turn to Goose's teachings. Goose explicitly describes extracting multiple participant representations and compositing them into a shared "virtual environment" (a supplemental background). This provides the means and motivation for removing the background from both participant streams and placing them onto a new, generated background. The scaling (Chao) would again be obviously applied to adjust the size and perspective of the participants within this entirely new composite scene, allowing for placement in anterior, intermediate, and posterior portions to achieve a natural depth effect, aligning with the overall goal of the patent.
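
The Claim 5 arrangement, in which every subject is separated from its own background and placed onto a supplemental background, can be sketched as a back-to-front ("painter's algorithm") composite. The sketch is illustrative only and is not drawn from Goose or the patent: background removal is again assumed to be a match against a known background value, and the name composite_layers and the layer tuple layout are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]
Layer = Tuple[Frame, int, int, int]   # (frame, bg_value, dx, dy)


def composite_layers(background: Frame, layers: List[Layer]) -> Frame:
    """Draw each layer's subject pixels over `background`.

    `layers` must be ordered posterior first, so that later (anterior)
    subjects overwrite earlier ones where they overlap.
    """
    out = [row[:] for row in background]
    for frame, bg_value, dx, dy in layers:
        for y, row in enumerate(frame):
            for x, px in enumerate(row):
                ty, tx = y + dy, x + dx
                inside = 0 <= ty < len(out) and 0 <= tx < len(out[0])
                if px != bg_value and inside:
                    out[ty][tx] = px
    return out


supplemental = [[1] * 4 for _ in range(3)]      # generated background
posterior = [[0, 7, 7], [0, 7, 7]]              # subject 7 on background 0
anterior = [[0, 0], [9, 9]]                     # subject 9 on background 0
scene = composite_layers(
    supplemental,
    [(posterior, 0, 1, 0), (anterior, 0, 1, 0)],
)
```

Adding an intermediate participant would only require inserting another layer between the posterior and anterior entries in the list.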

Obviousness of Other Claims

Claims 2, 9, 14 (Multiple Videoconference Participants): The core problem addressed by both the patent and the prior art ("Hollywood Squares") is displaying multiple videoconference participants for continuous presence. Therefore, using images of multiple participants is inherent and obvious.
Claims 4, 16 (Opposite Direction Repositioning): If scaling and repositioning are applied to multiple subjects to create a layered effect, choosing opposite directions for movement or placement to enhance the perception of depth is a mere design choice well within the skill of a POSITA.
Claims 6, 18 (Same Site Production) & 7, 19 (Different Site Production): Whether video streams originate from the same or different conferencing sites represents common deployment scenarios for videoconferencing systems. Applying the combining techniques to either scenario would be an obvious application.
Claims 8, 20 (Combining at a Different Site/MCU): The patent itself describes the multipoint control apparatus (MCU) 300, which can perform the combining function, and states that components "may be distributed over a wide area network" and are operable with various network protocols. WO2001037559A1 also mentions a "video mixer for arranging received video signals from a number of cameras", implying a centralized processing component. Performing such processing at a centralized or distributed network component (like an MCU) is a conventional architectural decision in videoconferencing systems and would be obvious.

In conclusion, the claimed methods, apparatus, and logic of US8830293B2 would have been obvious to a person having ordinary skill in the art who was motivated to improve upon existing continuous presence videoconferencing. Such a person would have combined the known concept of continuous presence (Hollywood Squares) with background separation and compositing into a virtual environment (Goose) and image scaling (Chao), drawing on general knowledge of image processing techniques such as inpainting.