Obviousness — US Patent 9042448

Obviousness Analysis under 35 U.S.C. § 103 for US Patent 9042448

This analysis considers combinations of prior art references that would render the claims of US9042448 obvious to a person having ordinary skill in the art (PHOSITA) at the time of the invention (priority date May 30, 2008). The primary prior art references identified within the patent text itself are "Patent Literature 1" and "MPEG-4 SVC" (Scalable Video Coding) [cite: Patent Literature 1 and MPEG-4 SVC].

Core Inventive Concept of US9042448

US9042448 generally describes a moving picture encoding system that enhances encoding efficiency by leveraging super-resolution processing. Specifically, it involves:

Encoding and decoding a sequence of moving pictures at a standard resolution (e.g., via a "first encoder").
Applying a super-resolution enlargement process to the original standard-resolution input pictures to create "super-resolution enlarged pictures" (higher resolution).
Converting these super-resolution enlarged pictures back to the standard resolution, resulting in "super-resolution enlarged and converted pictures."
Encoding these "super-resolution enlarged and converted pictures" as target pictures in a "second encoder", often using decoded pictures from the first encoder (or super-resolution processed decoded pictures) as reference pictures for inter-layer prediction.
The patent emphasizes capturing "information on frequency components in the spatial direction and the temporal direction that has been potentially contained in the input moving pictures but unable to express to a sufficient degree by the standard resolution" through this super-resolution process to improve encoding efficiency [cite: a moving picture encoding system, 2nd to last definition block, 2nd paragraph of that definition block].
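The data flow in the steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the patent's implementation: the Picture type, the function encode_two_layers, and the four toy stand-ins are hypothetical names introduced here, and the toy "super-resolution" is plain sample repetition, chosen only so that the example runs.

```python
from typing import Callable, List, Tuple

Picture = List[float]  # hypothetical stand-in: one picture as a flat list of samples


def encode_two_layers(
    inputs: List[Picture],
    first_codec: Callable[[Picture], Picture],
    super_resolve: Callable[[List[Picture], int], Picture],
    to_standard: Callable[[Picture, int], Picture],
    second_encode: Callable[[Picture, Picture], List[float]],
) -> List[Tuple[Picture, List[float]]]:
    """Return, per picture, (first-layer decoded picture, second-layer residual)."""
    out = []
    for i, picture in enumerate(inputs):
        decoded = first_codec(picture)                # first encoder plus local decode
        enlarged = super_resolve(inputs, i)           # may look at neighbouring pictures
        target = to_standard(enlarged, len(picture))  # back to the standard resolution
        out.append((decoded, second_encode(target, decoded)))
    return out


# Toy stand-ins (assumptions for illustration only).
def toy_codec(p: Picture) -> Picture:
    return [float(round(x / 4) * 4) for x in p]       # coarse quantiser as a lossy codec


def toy_sr(pictures: List[Picture], i: int) -> Picture:
    return [x for x in pictures[i] for _ in range(2)]  # sample repetition, NOT real super-resolution


def toy_down(p: Picture, n: int) -> Picture:
    step = len(p) // n
    return [sum(p[k * step:(k + 1) * step]) / step for k in range(n)]


def toy_residual(target: Picture, reference: Picture) -> List[float]:
    return [t - r for t, r in zip(target, reference)]  # inter-layer prediction residual


layers = encode_two_layers([[10.0, 13.0, 7.0]], toy_codec, toy_sr, toy_down, toy_residual)
```

In this sketch the second layer codes only the difference between the converted target and the first layer's decoded picture, which is the inter-layer prediction relationship the analysis relies on.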

Prior Art Background

The patent explicitly references "Patent Literature 1 and MPEG-4 SVC" as prior art that "include predictive pictures or predictive blocks created from reference pictures of a layer lower than the layer in which a current encoding is made, for use in combination with target pictures or target blocks of the current layer to make an inter-layer prediction in between, aiming at still enhanced encoding efficiencies making use of high correlations between different spatial resolutions" [cite: Patent Literature 1 and MPEG-4 SVC]. This establishes that hierarchical or scalable video coding, employing inter-layer prediction between different spatial resolutions to enhance encoding efficiency, was known in the art, particularly through standards like MPEG-4 SVC. MPEG-4 SVC is further noted as a "scalable extension of MPEG-4 AVC".

The patent also describes the internal workings of its "first super-resolution enlarger 103" (comprising a positioner, interpolator, estimated picture creator, and repetition determiner) and "first resolution converter 104" (comprising a pixel inserter, filtering processor, and pixel thinner) [cite: the first super-resolution enlarger 103, Part (a) of FIG. 6]. These descriptions suggest, though they do not by themselves establish, that the individual components and processes for super-resolution and resolution conversion (upsampling/downsampling with filtering) were known techniques.
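To make the four named components of the enlarger concrete, the following sketch runs a generic multi-frame reconstruction loop on a one-dimensional signal. It is an assumption-laden illustration, not the patent's algorithm: the function name super_resolve_1d, the observation model (each low-resolution frame is the high-resolution signal sampled at a known integer shift), and the mapping of code steps to "positioner", "interpolator", "estimated picture creator", and "repetition determiner" are all introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def super_resolve_1d(
    frames: Sequence[Tuple[int, List[float]]],
    factor: int = 2,
    tol: float = 1e-6,
    max_iters: int = 20,
) -> List[float]:
    """Reconstruct a signal `factor` times longer than each low-resolution frame.

    frames: (shift, samples) pairs with 0 <= shift < factor. Assumed observation
    model: samples[j] == high_res[factor * j + shift].
    """
    n = len(frames[0][1])
    # Interpolator: initial estimate made by repeating the first frame's samples.
    estimate = [x for x in frames[0][1] for _ in range(factor)]
    for _ in range(max_iters):
        largest_update = 0.0
        for shift, samples in frames:        # positioner output: the (assumed known) shift
            for j in range(n):
                pos = factor * j + shift
                simulated = estimate[pos]    # estimated picture creator: re-observe the estimate
                error = samples[j] - simulated
                estimate[pos] += error       # back-project the observation error
                largest_update = max(largest_update, abs(error))
        if largest_update < tol:             # repetition determiner: stop once updates vanish
            break
    return estimate


# Two low-resolution frames of the same scene, offset by one high-resolution sample.
recovered = super_resolve_1d([(0, [1.0, 3.0, 5.0]), (1, [2.0, 4.0, 6.0])])
```

Neither input frame alone contains the full signal; the detail is recovered only by combining frames whose sampling positions differ, which is the sense in which super-resolution draws on information "in the temporal direction".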

Obviousness Combination and Motivation

A PHOSITA in the field of video coding, seeking to improve the efficiency of scalable video coding systems, would have been motivated to combine the teachings of MPEG-4 SVC (or Patent Literature 1) with known super-resolution and resolution conversion techniques.

Problem Addressed by the Combination:
The patent highlights a problem in existing hierarchical coding schemes: "failures to allot a sufficient code rate would degrade image qualities of predictive pictures or predictive blocks, resulting in reduced encoding efficiencies" [cite: like techniques making use of high correlations in the temporal direction]. In other words, when the lower layer is starved of bits, the predictive pictures derived from it carry less usable detail, so the higher layer (or an enhanced standard-resolution layer) must spend more bits on the prediction residual.

Motivation for a PHOSITA to Combine:
A PHOSITA, aware of the challenge of maximizing encoding efficiency in scalable video coding and familiar with techniques for recovering high-frequency information, would have considered super-resolution as a means to "enrich" the information available to the encoder. Super-resolution techniques are designed to reconstruct a higher-resolution image from multiple lower-resolution images, and can thereby recover detail that is not explicitly present in any single low-resolution frame.

The motivation to combine would specifically arise from the desire to:

Enhance the "target pictures" for an encoding layer (e.g., a standard-resolution enhancement layer): If the goal is to encode a standard-resolution video stream with a higher quality or more detail than typically achievable by simple downsampling from a high-resolution input, applying super-resolution to the original standard-resolution input would be an intuitive step. The super-resolution process would extract additional frequency components (as described in the patent), and then a subsequent resolution conversion would re-conform the signal to the standard resolution, but with "an increased amount of information" [cite: a moving picture encoding system, 2nd to last definition block]. This enhanced standard-resolution signal would provide a richer basis for the second encoder to operate on, thereby improving the efficiency of encoding this enhanced layer.
Provide richer "reference pictures" for inter-layer prediction: In an MPEG-4 SVC-like system where a higher layer predicts from a lower layer, providing a more detailed or "super-resolved" version of the decoded lower-layer picture as a reference could improve the accuracy of the prediction, thus reducing the amount of residual information to be encoded. This is also contemplated by the present patent's "second super-resolution enlarger" and "second resolution converter" used to create "second reference pictures" [cite: the second encoder 107, FIG. 1 and 8].

Specific Combination:
Consider a scalable video encoding system, such as that taught by MPEG-4 SVC (or Patent Literature 1), which includes:

A base layer encoder (analogous to the "first encoder" 102) encoding a standard resolution video stream.
An enhancement layer encoder (analogous to the "second encoder" 107 or "third encoder" 108) encoding an enhanced video stream, potentially at the same or higher resolution, and utilizing inter-layer prediction from the base layer's decoded pictures.

A PHOSITA would modify this system by introducing:

A super-resolution enlarger (employing known techniques like positioning, interpolation, and reconstruction, as described in the patent's "first super-resolution enlarger 103" [cite: Part (a) of FIG. 4]) to process the input standard-resolution video, or the decoded base-layer video, to recover lost high-frequency details.
A resolution converter (employing known techniques like pixel insertion, filtering with a low-pass filter, and pixel thinning, as described in the patent's "first resolution converter 104" [cite: Part (a) of FIG. 6]) to convert the super-resolution enlarged pictures back to the desired target resolution (e.g., the standard resolution for the "super-resolution enlarged and converted pictures," or the high resolution for the "resolution conversion enlarged decoded pictures").
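The three resolution-conversion stages named above (pixel insertion, low-pass filtering, pixel thinning) correspond to textbook rational-factor resampling. The sketch below shows that generic technique on a one-dimensional signal; the function names, the filter taps, and the edge-clamping border rule are assumptions chosen for illustration and are not taken from the patent.

```python
from typing import List


def insert_pixels(x: List[float], up: int) -> List[float]:
    """Pixel insertion: place up - 1 zeros after every sample (zero-stuffing)."""
    out = [0.0] * (len(x) * up)
    out[::up] = x
    return out


def low_pass(x: List[float], taps: List[float]) -> List[float]:
    """Filtering: centred FIR filter; indices past the border are clamped to the edge."""
    half = len(taps) // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for k, t in enumerate(taps):
            j = min(max(i + k - half, 0), n - 1)  # simplistic border handling
            acc += t * x[j]
        out.append(acc)
    return out


def thin_pixels(x: List[float], down: int) -> List[float]:
    """Pixel thinning: keep every down-th sample."""
    return x[::down]


def convert_resolution(x: List[float], up: int, down: int, taps: List[float]) -> List[float]:
    """Resample by the ratio up/down: insert, filter, then thin."""
    return thin_pixels(low_pass(insert_pixels(x, up), taps), down)


# 2:1 reduction of a flat signal with a small smoothing filter (taps sum to 1).
halved = convert_resolution([4.0] * 8, up=1, down=2, taps=[0.25, 0.5, 0.25])
```

Filtering before thinning is what prevents aliasing when the enlarged picture is brought back to the target resolution. When up is greater than 1, the taps would normally be scaled so that they sum to up, and the clamped border used here is only approximate.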

The resulting "super-resolution enlarged and converted pictures" (for encoding target) or "super-resolution enlarged decoded pictures" (for reference) would then be fed into the enhancement layer encoder, providing it with more informative signals for encoding or prediction. The motivation is to improve the very "enhanced encoding efficiencies making use of high correlations between different spatial resolutions" that the prior art (MPEG-4 SVC) sought, by making those correlations more robust and information-rich through super-resolution processing.

Therefore, the combination of MPEG-4 SVC (or Patent Literature 1) with common knowledge of super-resolution techniques and resolution conversion methods would render the core aspects of US9042448 obvious. The combination addresses a known problem in the art (degradation of predictive picture quality and reduced encoding efficiency) by applying known techniques in their expected manner to achieve a predictable improvement in efficiency.