Patent 12136276
Obviousness
Combinations of prior art that suggest the claimed invention would have been obvious under 35 U.S.C. § 103.
Based on an analysis of the patent text for US 12136276 and publicly available prior art, several strong arguments for obviousness under 35 U.S.C. § 103 could be constructed. An obviousness rejection requires showing that a person having ordinary skill in the art would have been motivated to combine teachings from prior art references to arrive at the claimed invention with a reasonable expectation of success.
Person Having Ordinary Skill in the Art (POSITA)
A person having ordinary skill in the art (POSITA) for this patent would be someone with a Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, and 2-3 years of experience in the areas of computer vision, machine learning, and/or automotive systems (specifically Advanced Driver Assistance Systems or ADAS). The POSITA would be familiar with common image processing techniques, the principles of camera geometry (e.g., perspective projection, intrinsic/extrinsic parameters), and the architecture and training of deep neural networks, particularly convolutional neural networks (CNNs), for tasks like feature detection and regression.
Summary of Inventive Concepts in US 12136276
The patent discloses methods and systems for initializing a monocular camera in a vehicle, particularly one that may be retrofitted or adjusted, to determine its extrinsic parameters (e.g., height, viewing angle/pitch, and road plane normal). These parameters are then used for downstream ADAS tasks like lane detection and distance estimation. The core inventive concepts are:
- On-Vehicle, Horizon-Based Initialization: Using an on-vehicle deep learning model to process video frames, predict a horizon line, and then compute the camera's parameters from that horizon line (as detailed in FIG. 3A and 5A).
- On-Vehicle, Direct Parameter Prediction: Using an on-vehicle deep learning model with a specific "head" to directly regress and output the camera's extrinsic parameters from the video frames, bypassing the intermediate step of horizon detection (as detailed in FIG. 3B and 5B).
- Network-Assisted, Human-in-the-Loop Initialization: An on-vehicle device sends video to a remote server, which uses a model to predict the horizon and calculate parameters. These predictions are then sent to a human annotator for review and confirmation or correction before the final parameters are sent back to the vehicle (as detailed in FIG. 2).
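The horizon-based approach (FIG. 3A and 5A) rests on standard pinhole-camera geometry: the horizon's row fixes the camera's pitch, and its tilt fixes the roll. A minimal sketch of that relationship, assuming known intrinsics (the function and parameter names here are illustrative, not taken from the patent):

```python
import math

def pitch_roll_from_horizon(u0, v0, u1, v1, fy, cy):
    """Recover camera pitch and roll from a predicted horizon line.

    (u0, v0) and (u1, v1) are two image points on the horizon, in pixels;
    fy is the vertical focal length and cy the principal-point row.
    Pinhole model: a horizon above the principal point means the camera
    is pitched downward.
    """
    # Roll is the horizon's tilt relative to the image's horizontal axis.
    roll = math.atan2(v1 - v0, u1 - u0)
    # Pitch comes from the horizon's vertical offset at the image center.
    v_mid = 0.5 * (v0 + v1)
    pitch = math.atan2(cy - v_mid, fy)
    return pitch, roll
```

Note that the horizon alone cannot fix the camera's height; height requires an additional metric cue (e.g., known lane width or mounting geometry), which is why the patent treats it as a separate output.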
Obviousness Analysis of Key Embodiments
An obviousness challenge would focus on showing that each of these approaches represents a predictable combination of known elements to solve a known problem.
Combination 1: On-Vehicle Horizon-Based Initialization (Embodiment 1)
This embodiment could be rendered obvious by combining a reference teaching monocular camera calibration for ADAS with a reference teaching the use of CNNs for horizon or lane detection.
Primary Reference (Base System): US Patent 9,785,951 B2 (filed 2015), "On-line camera calibration for vehicle surround view system." This patent teaches a system for calibrating cameras on a vehicle to determine extrinsic parameters like height, pitch, and yaw. It uses feature points (like lane markings) detected in images and performs geometric calculations to find the parameters. This establishes the foundational concept of using image features from a vehicle's camera to perform automatic calibration for ADAS purposes.
Secondary Reference (Enabling Technology): "DeepLanes: End-To-End Lane Position Estimation using Deep Neural Networks" (Garnett et al., 2017). This academic paper, representative of the art, discloses using a deep neural network (a CNN) to robustly detect lane lines in images from a vehicle's perspective. Similarly, US Patent 10,776,716 B2 (filed 2018) teaches using a CNN to detect a horizon line for image analysis. These references show that by 2017-2018, using CNNs to find key road geometry features like lanes and the horizon was a well-established and superior technique compared to older methods.
Motivation to Combine: A POSITA starting with the system in US 9,785,951 would be aware of the challenges in reliably detecting feature points using classical computer vision, especially in varied lighting and weather conditions. The POSITA would be motivated by the well-documented improvements in robustness and accuracy offered by CNNs, as taught by Garnett et al. or US 10,776,716, to replace the feature detection module of the '951 patent with a CNN-based horizon and/or lane detector. The goal—calculating camera parameters—remains the same; the combination is merely the application of a known, better tool (a CNN detector) to a known problem (automated camera calibration). The horizon line is an intrinsically useful feature for determining camera pitch, making this a straightforward and predictable substitution with a high expectation of success.
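Once pitch, horizon row, and height are initialized, downstream tasks such as the distance estimation mentioned in the patent summary follow from the same flat-road geometry. A hedged sketch under a small-pitch, zero-roll, flat-road assumption (not the patent's own formulation):

```python
def ground_distance(v, v_horizon, fy, height_m):
    """Estimate longitudinal distance to a road pixel at image row v.

    Flat-road pinhole model with small pitch and zero roll: a ground
    point at distance Z projects fy * height / Z pixels below the
    horizon row, so Z = fy * height / (v - v_horizon).
    """
    dv = v - v_horizon
    if dv <= 0:
        raise ValueError("pixel at or above the horizon is not on the road")
    return fy * height_m / dv
```

This dependence of every downstream distance on the calibrated height and horizon is precisely why reliable initialization of the extrinsic parameters matters.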
Combination 2: On-Vehicle, Direct Parameter Prediction (Embodiment 2)
This embodiment could be rendered obvious by showing that moving from a two-step process (feature detection, then calculation) to an end-to-end regression model was a well-known design choice in the field of machine learning.
Primary Reference: The combination of US 9,785,951 B2 and a CNN feature detector like US 10,776,716 B2, as established above. This combination teaches the two-step process: (1) use a CNN to find the horizon, (2) compute camera parameters from the horizon.
Secondary Reference (Architectural Principle): "PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization" (Kendall et al., 2015). This influential paper taught the use of a single CNN to directly regress the 6-DOF camera pose (position and orientation) from a single RGB image. This established the principle of framing camera parameter estimation as a direct regression problem for a neural network, eliminating intermediate steps.
Motivation to Combine: A POSITA familiar with the two-step approach (Combination 1) and the broader field of machine learning as exemplified by PoseNet would recognize the potential benefits of an end-to-end model. The motivation would be to create a simpler, potentially more robust system by training a network to learn the features most relevant to the final task (parameter estimation) rather than an intermediate one (horizon detection). The patent's own disclosure of this embodiment (FIG. 3B) as an alternative to the horizon-based method (FIG. 3A) illustrates that this is a recognized design trade-off. It would have been obvious to a POSITA to try framing the problem of finding camera height and pitch as a direct regression task for a CNN, with a reasonable expectation of success based on prior work like PoseNet.
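The architectural difference between the two on-vehicle embodiments reduces to the output head: instead of predicting horizon coordinates, the final layer regresses the extrinsic parameters directly, in the spirit of PoseNet. A toy NumPy sketch, where the feature dimension and parameter ordering are illustrative assumptions:

```python
import numpy as np

def extrinsics_head(features, W, b):
    """Linear regression head mapping pooled CNN features directly to
    [height_m, pitch_rad, roll_rad], skipping horizon detection.

    features: (D,) backbone output; W: (3, D) weights; b: (3,) bias.
    In a real system W and b would be learned end-to-end from labeled drives.
    """
    return W @ features + b

D = 128
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(3, D))
# Bias initialized near a typical mounting height and a level pose.
b = np.array([1.4, 0.0, 0.0])
params = extrinsics_head(np.zeros(D), W, b)
```

The design trade-off the patent itself acknowledges is visible here: the end-to-end head is simpler, but it gives up the interpretable intermediate (the horizon line) that the two-step pipeline exposes for debugging or human review.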
Combination 3: Network-Assisted, Human-in-the-Loop Initialization (Embodiment 3)
This embodiment appears to be an obvious application of standard quality control and data annotation practices to the problem of camera initialization.
Primary Reference: The automated, on-vehicle, or server-side calibration system described above. This system automatically generates camera parameters but may fail or produce low-confidence results in ambiguous scenes (e.g., poor weather, unclear markings).
Secondary Reference: Any of numerous systems for data annotation and verification, such as those used by services like Amazon Mechanical Turk or described in patents like US 10,540,845 B1 (filed 2017) for "Interactive labeling of training data for machine learning." These systems show a standard workflow: an AI model makes a prediction, and if the confidence is low or for quality control, the result is flagged for human review. The human annotator confirms, rejects, or corrects the AI's output.
Motivation to Combine: The motivation here is simple and compelling: ensuring reliability. A POSITA would know that any automated perception system will have failure modes. For a safety-related application like ADAS, ensuring the initial camera calibration is correct is critical. The most straightforward way to handle low-confidence automated results is to escalate them for human review. Combining the automated calibration system with a standard human-in-the-loop verification workflow is a well-known and predictable method for improving system robustness and creating high-quality ground truth data for retraining the model. The description in FIG. 2 of the '276 patent, where an annotator device confirms or rejects the automatically placed horizon line, is a textbook implementation of this known practice.
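The workflow of FIG. 2 amounts to confidence-gated escalation, a pattern any POSITA would recognize. A schematic sketch, where the threshold and field names are assumptions rather than details from the patent:

```python
def route_calibration(params, confidence, threshold=0.9):
    """Accept a calibration automatically or escalate it for human review.

    params: dict of estimated extrinsics (e.g. {"pitch": ..., "height": ...});
    confidence: model score in [0, 1]. High-confidence results go straight
    back to the vehicle; the rest are queued for an annotator to confirm,
    correct, or reject before the final parameters are returned.
    """
    if confidence >= threshold:
        return "accept", params
    return "review", params
```

Reviewed results can also be fed back as ground truth for retraining, which is the standard second benefit of this workflow noted above.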
Conclusion
The claims of US patent 12136276 appear vulnerable to an obviousness challenge under 35 U.S.C. § 103. The core ideas—using a CNN to find a horizon for geometric calibration, training a CNN to directly regress camera parameters, and using a human-in-the-loop process to verify automated outputs—all represent the application of known techniques and design principles from the fields of computer vision and machine learning to the known problem of vehicle camera calibration. A skilled artisan would have been motivated to combine these prior art teachings to achieve a more robust and automated calibration system with a high likelihood of success.
Generated 5/13/2026, 12:15:34 AM