Patent 6700999
Obviousness
Combinations of prior art that suggest the claimed invention would have been obvious under 35 U.S.C. § 103.
Active provider: Google · gemini-2.5-flash
The obviousness of US patent 6700999 under 35 U.S.C. § 103 can be assessed by examining the differences between the claimed invention and the prior art, and whether a person having ordinary skill in the art (PHOSITA) would have been motivated to combine prior art references to arrive at the claimed invention.
A PHOSITA as of the priority date of US6700999 (June 30, 2000) would be a computer vision engineer or researcher familiar with image processing, pattern recognition, and algorithms for real-time video analysis. This individual would be aware of various techniques for object detection, segmentation, and tracking, and of methods for reducing computational load. The cited prior art, including survey papers and technical articles, reflects this level of skill and common knowledge in the field.
Here are combinations of prior art references that would render the independent claims of US6700999 obvious, along with the motivation for such combinations:
Claims 1, 8, and 27 (Method and Apparatus with Operating Modes)
These claims introduce a first operating mode (tracking every frame) and a standby mode (tracking only a portion of frames), where the standby mode activates after a specified number of consecutive frames are processed without locating any face candidate regions.
Primary References:
- Gary R. Bradski, "Computer Vision Face Tracking for Use in a Perceptual User Interface" (1998): This article describes a real-time face-tracking system using skin-color detection and highlights the challenge of "large amount of data processing required" for face tracking.
- Dorin Comaniciu et al., "Robust Analysis of Feature Spaces: Color Image Segmentation" (1997) and "Mean Shift analysis and applications" (1999): These papers describe techniques for color image segmentation, which is fundamental to identifying face candidate regions.
- Yogesh Raja et al., "Tracking and Segmenting people in varying lighting conditions using colour" (1998): This reference discusses using color models to extract skin regions for tracking.
- Chellappa et al., "Human and Machine Recognition of Faces: A survey" (1995) and Samal and Iyengar, "Automatic recognition and analysis of human faces and facial expressions: A survey" (1992): These survey papers provide an overview of face detection and tracking techniques, including various cues and challenges.
Motivation to Combine: Bradski explicitly points out the significant computational demands of real-time face tracking. A PHOSITA, aware of this known problem, would be strongly motivated to implement strategies to reduce processor load when face tracking is not actively yielding results. It is a common engineering principle to conserve computational resources by reducing the frequency of processing when a target object is absent or not detected. Therefore, designing a system to transition to a "standby mode" that processes only a portion of frames after a period of no face detections, and returning to a "normal mode" upon detection, would be an obvious solution to the known problem of high processor utilization.
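The mode-switching behavior discussed for these claims can be illustrated with a minimal sketch. It is not taken from the patent or from any cited reference; the class name `ModeScheduler`, the parameters `miss_limit` and `standby_stride`, and the helper `run` are hypothetical names chosen for illustration, and the per-frame detection results are simulated as booleans.

```python
from dataclasses import dataclass


@dataclass
class ModeScheduler:
    # Hypothetical parameters, not values from the patent.
    miss_limit: int = 30      # consecutive empty frames before entering standby
    standby_stride: int = 5   # in standby, process one of every N frames
    standby: bool = False
    misses: int = 0
    skipped: int = 0

    def should_process(self) -> bool:
        """Return True if the next frame should be passed to the tracker."""
        if not self.standby:
            return True  # normal mode: every frame is processed
        self.skipped += 1
        if self.skipped >= self.standby_stride:
            self.skipped = 0
            return True
        return False

    def report(self, found_face: bool) -> None:
        """Update the mode after a frame has been processed."""
        if found_face:
            # Any detection returns the system to normal mode.
            self.misses = 0
            self.skipped = 0
            self.standby = False
        elif not self.standby:
            self.misses += 1
            if self.misses >= self.miss_limit:
                self.standby = True
                self.skipped = 0


def run(detections, scheduler):
    """Simulate a sequence; detections[i] says whether frame i contains a face.

    Returns the indices of the frames that were actually processed.
    """
    processed = []
    for i, has_face in enumerate(detections):
        if scheduler.should_process():
            processed.append(i)
            scheduler.report(has_face)
    return processed
```

With `miss_limit=3` and `standby_stride=2`, a sequence of seven empty frames followed by three frames containing a face is processed at frames 0, 1, 2, 4, 6, 8, and 9: the scheduler drops to every second frame after three misses and resumes full-rate processing once a face is found.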
Claims 6, 13, 19, 25, and 32 (Method and Apparatus with Temporal Filtering)
These claims describe temporal filtering of the face candidate mask, specifically by obtaining an average (or median, as described in the patent) of entries from the current mask and corresponding entries from temporally distinct (previous and/or subsequent) masks.
Primary References:
- Gary R. Bradski, "Computer Vision Face Tracking for Use in a Perceptual User Interface" (1998): Mentions "temporal filtering" in the context of improving robustness for face tracking.
- Yogesh Raja et al., "Tracking and Segmenting people in varying lighting conditions using colour" (1998): Refers to "temporal coherence of the tracked regions," acknowledging that object presence is correlated across frames.
- Chellappa et al., "Human and Machine Recognition of Faces: A survey" (1995) and Samal and Iyengar, "Automatic recognition and analysis of human faces and facial expressions: A survey" (1992): These surveys would establish the general understanding of temporal consistency in video processing.
Motivation to Combine: The temporal correlation of faces in a video sequence is a fundamental characteristic. A PHOSITA would be motivated to apply temporal filtering techniques, such as averaging or median filtering across multiple frames, to the face candidate masks. This is a well-known technique in image and video processing for noise reduction, smoothing, and improving the accuracy and stability of object detection by mitigating transient false alarms or missed detections. Applying such filtering would lead to a more robust and reliable face tracking output, which is a predictable benefit.
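As a rough illustration of the averaging and median filtering described above, the sketch below combines corresponding entries of several temporally adjacent binary masks. It is an assumption-laden example rather than the patent's implementation: the function name `temporal_filter`, the list-of-rows mask layout, and the 0.5 decision threshold are all hypothetical.

```python
from statistics import median


def temporal_filter(masks, use_median=False, threshold=0.5):
    """Filter a stack of same-sized binary masks (lists of rows of 0/1).

    Each output entry is 1 when the average (or median) of the corresponding
    entries across all masks exceeds the threshold, and 0 otherwise.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    filtered = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [m[r][c] for m in masks]
            score = median(values) if use_median else sum(values) / len(values)
            out_row.append(1 if score > threshold else 0)
        filtered.append(out_row)
    return filtered
```

For example, given a previous mask `[[0, 1], [0, 0]]`, a current mask `[[1, 1], [0, 0]]`, and a subsequent mask `[[0, 1], [1, 0]]`, both the average and the median variants return `[[0, 1], [0, 0]]`: the entries that appear in only one of the three frames are suppressed as transient, while the entry present in all three is kept.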
Claims 7, 14, 20, 26, and 33 (Filtering Based on Specific Characteristics)
These claims detail filtering the mask to remove false alarms by testing characteristics such as size, position, aspect ratio, intra-region flatness, shape, and contrast between interior and boundary areas of face candidate regions.
Primary References:
- Chellappa et al., "Human and Machine Recognition of Faces: A survey" (1995) and Samal and Iyengar, "Automatic recognition and analysis of human faces and facial expressions: A survey" (1992): These surveys discuss various features and cues used for face detection and verification, including shape, size, and aspect ratio.
- Gary R. Bradski, "Computer Vision Face Tracking for Use in a Perceptual User Interface" (1998): Describes face detection using initial cues and template matching, which implicitly involves feature comparison.
Motivation to Combine: The patent explicitly states that "color matching schemes... are not entirely immune to false alarms." A PHOSITA, facing the known problem of false positives in color-based face detection, would be motivated to employ additional discrimination criteria. Utilizing commonly known facial characteristics (e.g., expected size, aspect ratio, and shape of a human face) and general image processing features (e.g., color uniformity or contrast) to validate or reject face candidates is a standard approach in computer vision for improving detection accuracy and reducing false alarms.
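A simple sketch of this kind of candidate validation is shown below. It covers only three of the listed characteristics (size, position, and aspect ratio); flatness, shape, and contrast tests are omitted. The `Region` type, the function `is_plausible_face`, and every threshold value are hypothetical and are not drawn from the patent or the cited references.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Bounding box of a face candidate region, in pixels."""
    x: int
    y: int
    w: int
    h: int


def is_plausible_face(region, frame_w, frame_h,
                      min_area_frac=0.002, max_area_frac=0.5,
                      min_aspect=0.8, max_aspect=2.0):
    """Return True if the region passes size, aspect-ratio, and position tests.

    All thresholds are illustrative assumptions.
    """
    if region.w <= 0 or region.h <= 0:
        return False
    # Size test: reject regions that are implausibly small or large.
    area_frac = (region.w * region.h) / (frame_w * frame_h)
    if not (min_area_frac <= area_frac <= max_area_frac):
        return False
    # Aspect-ratio test: faces are roughly as tall as, or taller than, wide.
    aspect = region.h / region.w
    if not (min_aspect <= aspect <= max_aspect):
        return False
    # Position test: the region must lie entirely inside the frame.
    return (region.x >= 0 and region.y >= 0
            and region.x + region.w <= frame_w
            and region.y + region.h <= frame_h)
```

On a 352x288 frame, a 60x80 region at (100, 50) passes all three tests, whereas a 300x20 strip fails the aspect-ratio test and a 4x5 speck fails the size test.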
Claim 15 (System with Video Encoder)
This claim describes a system where the face-tracking apparatus provides a mask to a video encoder, which then allocates a disproportionate number of bits to encode regions corresponding to face candidate regions.
Primary References:
- Gary R. Bradski, "Computer Vision Face Tracking for Use in a Perceptual User Interface" (1998): Provides a system for face tracking.
- General knowledge in video encoding: The patent itself mentions MPEG and H.263 standards and the concept of "content-directed video encoding" where "higher priority is given to regions of interest... such as human faces." This indicates that the goal of prioritizing faces in encoding was a known and desirable objective.
Motivation to Combine: Given the established desire in video compression to allocate more bits to perceptually important regions (like human faces) to improve subjective quality, a PHOSITA would be directly motivated to combine the output of a face tracking system (the mask identifying face regions) with a video encoder. This combination provides a practical and obvious way to achieve content-directed bit allocation, a known and desired application in the field.
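The idea of mask-directed bit allocation can be sketched as a weighted split of a frame's bit budget across blocks. This is a toy model under stated assumptions: real encoders conforming to standards such as MPEG or H.263 typically steer bit usage indirectly, for instance through quantizer settings, and the function name `allocate_bits` and the `face_weight` parameter are hypothetical.

```python
def allocate_bits(block_is_face, total_bits, face_weight=4.0):
    """Split a bit budget across blocks, favoring blocks flagged by the mask.

    block_is_face: one boolean per block, True where the mask marks a face.
    face_weight:   hypothetical multiplier applied to face blocks.
    Returns a list with the bit share assigned to each block.
    """
    weights = [face_weight if is_face else 1.0 for is_face in block_is_face]
    total_weight = sum(weights)
    return [total_bits * w / total_weight for w in weights]
```

With four blocks of which one is a face block, a budget of 700 bits, and `face_weight=4.0`, the face block receives 400 bits and each remaining block receives 100, so a quarter of the picture area takes more than half of the budget.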
Claims 21 and 25 (System with Camera Control)
These claims describe a system with a camera, a face-tracking apparatus (with or without temporal filtering and operating modes), and a camera control unit that adjusts camera movement (position/focal length) in response to face candidate regions across frames.
Primary References:
- Gary R. Bradski, "Computer Vision Face Tracking for Use in a Perceptual User Interface" (1998): Teaches face tracking.
- Martin Hunke et al., "Face Locating and Tracking for Human-Computer Interaction" (1995): Deals with face locating and tracking, in contexts that naturally lead to camera control for human-computer interaction.
- US6111517A (Visionics Corporation): Describes a "Continuous video monitoring system" for "detecting and tracking faces," which inherently implies a need to keep faces within the field of view.
- General knowledge of PTZ (Pan-Tilt-Zoom) cameras: The capability to control camera position and focal length was well-known.
Motivation to Combine: In applications such as video conferencing, surveillance, or telecine (as mentioned in US6700999), it is highly desirable to automatically keep human subjects' faces in the camera's view and in focus. A PHOSITA would be motivated to combine existing face tracking technology with known camera control capabilities to create an automated system for this purpose. The method of comparing face locations across successive frames to derive movement vectors for the camera is a logical and straightforward implementation for tracking a moving target. The inclusion of temporal filtering (as in Claim 25), already motivated for robustness, would further enhance the stability and smoothness of camera movements, preventing jerky or erratic behavior, which is an obvious improvement.
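The comparison of face locations across successive frames can be sketched as follows. The example assumes the face position is summarized by a single center point per frame; the function name `pan_tilt_command`, the `gain` factor, and the `dead_zone` used to ignore small jitter are hypothetical choices, not details from the patent or the cited references.

```python
def pan_tilt_command(prev_center, cur_center, frame_size,
                     gain=1.0, dead_zone=0.02):
    """Derive a (pan, tilt) adjustment from face motion between two frames.

    prev_center, cur_center: (x, y) face centers in pixels.
    frame_size:              (width, height) of the frame in pixels.
    Returns the displacement as a fraction of the frame size, scaled by gain;
    components smaller than dead_zone are zeroed to avoid jittery movement.
    """
    dx = (cur_center[0] - prev_center[0]) / frame_size[0]
    dy = (cur_center[1] - prev_center[1]) / frame_size[1]
    pan = gain * dx if abs(dx) > dead_zone else 0.0
    tilt = gain * dy if abs(dy) > dead_zone else 0.0
    return pan, tilt
```

For a 400x200 frame in which the face center moves from (100, 100) to (140, 102), the horizontal displacement is 10% of the frame width and yields a pan command of about 0.1, while the 1% vertical displacement falls inside the dead zone and produces no tilt.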
Claims 27 and 32 (Apparatus with Data Storage Medium)
These claims describe an apparatus comprising a data storage medium having machine-readable code that, when executed, defines the methods of Claims 1 and 6, respectively.
Primary References: The methods of Claim 1 and Claim 6, as discussed above, are themselves obvious.
- General knowledge in computer engineering: By 2000, it was a fundamental and widespread practice to implement computer-executable methods as machine-readable code stored on data storage media (e.g., ROM, flash memory, disks) for execution by processors (e.g., microprocessors, DSPs, ASICs, FPGAs). This is explicitly acknowledged within the US6700999 patent itself (FIGS. 17A and 17B and corresponding description).
Motivation to Combine: The implementation of any computer-based method as software or firmware on a data storage medium is a standard design choice and not inventive in itself. It is the obvious way to make such methods functional on a computing apparatus.
In conclusion, the various elements of US6700999, including the core face tracking methodology, temporal filtering, false alarm reduction, operational modes for processor efficiency, and specific applications like video encoding and camera control, would have been obvious to a PHOSITA by the patent's priority date (June 30, 2000), given the available prior art and the clear motivations to address known problems or achieve desirable system functionalities.
Generated 5/29/2026, 5:58:29 PM