Obviousness — US Patent 6950469

US Patent 6950469, titled "Method for sub-pixel value interpolation," relates to techniques for interpolating sub-pixel values in video encoding and decoding, particularly for digital video [cite: the present invention relates to a method for sub-pixel value interpolation in the encoding and decoding of data. It relates particularly, but not exclusively, to encoding and decoding of digital video.]. The patent aims to improve upon existing interpolation methods, such as those described in Test Model 5 (TML5) and Test Model 6 (TML6) of the H.26L video coding standard, by offering a better balance between computational complexity, memory usage, and image quality.

For an obviousness analysis under 35 U.S.C. § 103, this section identifies a combination of prior art references, and the motivation to combine them, that would have led a person having ordinary skill in the art (PHOSITA) to the claimed invention.

Summary of the Invention (US6950469):
The primary inventive aspect disclosed in US6950469, as described in the "first aspect of the invention," is a method of interpolation for generating sub-pixel values at fractional horizontal and vertical locations, defined as 1/(2^x) where 'x' is a positive integer up to 'N'. The method comprises:

Step (a): Interpolating values for sub-pixels at 1/(2^(N-1)) unit horizontal and 1/(2^(N-1)) unit vertical locations directly using a choice of a first weighted sum of values for sub-pixels residing at 1/(2^(N-1)) unit horizontal and unit vertical locations, and a second weighted sum of values for sub-pixels residing at unit horizontal and 1/(2^(N-1)) unit vertical locations. The first and second weighted sums are calculated according to a subsequent step (b) [cite: a first aspect of the invention there is provided method of interpolation in video coding in which an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, is interpolated to generate values for sub-pixels at fractional horizontal and vertical locations, the fractional horizontal and vertical locations being defined according to 1/2^x, where x is a positive integer having a maximum value N, the method comprising: step (a) when values for sub-pixels at 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical locations are required, interpolating such values directly using a choice of a first weighted sum of values for sub-pixels residing at 1/2^(N-1) unit horizontal and unit vertical locations and a second weighted sum of values for sub-pixels residing at unit horizontal and 1/2^(N-1) unit vertical locations, the first and second weighted sums of values being calculated according to step (a);].
Step (b): Interpolating values for sub-pixels at 1/(2^(N-1)) unit horizontal and unit vertical locations, and unit horizontal and 1/(2^(N-1)) unit vertical locations, by taking the average of a first and a second pixel or sub-pixel [cite: when values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, and unit horizontal and 1/2^(N-1) unit vertical locations are required, they may be interpolated by taking the average of the values of a first pixel or sub-pixel located at a vertical location corresponding to that of the sub-pixel being calculated and unit horizontal location and a second pixel or sub-pixel located at a vertical location corresponding to that of the sub-pixel being calculated and 1/2^(N-1) unit horizontal location.].
Step (c): Interpolating a value for a sub-pixel at a 1/(2^N) unit horizontal and 1/(2^N) unit vertical location by taking a weighted average of a first sub-pixel or pixel and a second sub-pixel or pixel, where these first and second points are located diagonally with respect to the sub-pixel being calculated [cite: c) interpolate a value for a sub-pixel situated at a 1/2^N unit horizontal and 1/2^N unit vertical location by taking a weighted average of the value of a first sub-pixel or pixel situated at a 1/2^(N-m) unit horizontal and 1/2^(N-n) unit vertical location and the value of a second sub-pixel or pixel located at a 1/2^(N-p) unit horizontal and 1/2^(N-q) unit vertical location, variables m, n, p and q taking integer values in the range 1 to N such that the first and second sub-pixels or pixels are located diagonally with respect to the sub-pixel at 1/2^N unit horizontal and 1/2^N vertical location.].
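
The three kinds of arithmetic named in steps (a) to (c), namely a weighted sum, a two-point average, and an average of two diagonally located values, can be illustrated for N = 2 with the minimal sketch below. It is not the patent's implementation: the 6-tap weights (1, -5, 20, 20, -5, 1), the round-to-nearest convention, the 8-bit dynamic range, the choice of the anti-diagonal, and every function and variable name are assumptions introduced here for illustration.

```python
# Illustrative only: the tap weights, rounding rule, dynamic range, and all
# names below are assumptions for this sketch, not text from US6950469.

ASSUMED_TAPS = (1, -5, 20, 20, -5, 1)  # weights sum to 32
ASSUMED_SHIFT = 5                      # log2(32), used to normalise the sum


def weighted_sum(samples, taps=ASSUMED_TAPS):
    """Unnormalised weighted sum of neighbouring samples (integer arithmetic)."""
    if len(samples) != len(taps):
        raise ValueError("need exactly one sample per tap")
    return sum(s * t for s, t in zip(samples, taps))


def normalise(acc, shift=ASSUMED_SHIFT, max_value=255):
    """Round-to-nearest right shift, then clip to the dynamic range [0, max_value]."""
    rounded = (acc + (1 << (shift - 1))) >> shift
    return min(max(rounded, 0), max_value)


def average(a, b):
    """Two-point average with rounding."""
    return (a + b + 1) >> 1


def diagonal_average(grid, row, col):
    """Average the two neighbours on the anti-diagonal through (row, col)."""
    return average(grid[row - 1][col + 1], grid[row + 1][col - 1])


# Half-resolution value midway between the pixels 30 and 40 of a linear ramp.
row_of_pixels = [10, 20, 30, 40, 50, 60]
half_pel = normalise(weighted_sum(row_of_pixels))

# Value between the pixel 30 and the half-resolution value.
quarter_pel = average(row_of_pixels[2], half_pel)

# Value at the centre of a 3x3 neighbourhood, from its two diagonal neighbours.
neighbourhood = [
    [0, 0, 40],
    [0, 0, 0],
    [20, 0, 0],
]
diagonal_pel = diagonal_average(neighbourhood, 1, 1)

print(half_pel, quarter_pel, diagonal_pel)
```

The sketch keeps the weighted sum unnormalised until `normalise` is called, so the same accumulator could feed either a half-resolution output or a further direct calculation. The two averaging helpers need only an addition and a shift, which is the kind of cost difference the analysis below relies on.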

Prior Art References:

The patent explicitly discusses two prior art sub-pixel interpolation methods:

TML5 (Test Model 5): This method involves interpolating 1/4 resolution sub-pixel values dependently on 1/2 resolution sub-pixel values. The 1/2 resolution values must be calculated first, and truncation of these intermediate values leads to reduced precision for the 1/4 resolution sub-pixels. Additionally, TML5 requires storing 1/2 resolution sub-pixel values, leading to increased memory usage [cite: TML5 uses an approach in which interpolation of 1/4 resolution sub-pixel values depends upon the interpolation of 1/2 resolution sub-pixel values. This means that in order to interpolate the values of the 1/4 resolution sub-pixels, the values of the 1/2 resolution sub-pixels from which they are determined must be calculated first., The 1/4 resolution sub-pixel values are less precise than they would be if calculated from values that had not been truncated and clipped., Another disadvantage of TML5 is that it is necessary to store the values of the 1/2 resolution sub-pixels in order to interpolate the 1/4 resolution sub-pixel values. Therefore, excess memory is required to store a result which is not ultimately required.].
TML6 (Test Model 6): TML6 improves upon TML5 by obtaining 1/4 resolution sub-pixel values directly using intermediate values, meaning it avoids deriving them from rounded and clipped 1/2 resolution sub-pixel values. This eliminates the need to calculate and store final 1/2 resolution sub-pixel values, reducing truncation errors and computational complexity. However, TML6 requires high-precision arithmetic, which demands more silicon area in ASICs and more computations in CPUs. Furthermore, its "on-demand" implementation has high memory requirements, particularly for embedded devices [cite: 1/4 resolution sub-pixel values are obtained directly using the intermediate values referred to above and are not derived from rounded and clipped values for 1/2 resolution sub-pixels. Therefore, in obtaining the 1/4 resolution sub-pixel values, it is not necessary to calculate final values for any of the 1/2 resolution sub-pixels., a disadvantage of TML6 is that high precision arithmetic is required both in the encoder and in the decoder. High precision interpolation requires more silicon area in ASICs and requires more computations in some CPUs. Furthermore, implementation of direct interpolation as specified in TML6 in an on-demand fashion has a high memory requirement. This is an important factor, particularly in embedded devices.].
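
The precision trade-off between the two approaches can be shown with a toy calculation. The sketch below is not the TML5 or TML6 algorithm: it assumes a filter gain of 32, a single pixel and a single unnormalised half-resolution accumulator, and hypothetical function names. Clipping is omitted for brevity. It only contrasts deriving a quarter-resolution value from an already rounded half-resolution value with deriving it directly from the unrounded intermediate value.

```python
# Toy comparison; SHIFT, the sample values, and all names are assumptions.

SHIFT = 5  # assumed filter gain of 32 = 2**5


def round_shift(value, shift):
    """Right shift with round-to-nearest."""
    return (value + (1 << (shift - 1))) >> shift


def quarter_from_rounded_half(pixel, half_acc):
    """Dependent path: round the half-resolution value first, then average with truncation."""
    half = round_shift(half_acc, SHIFT)  # intermediate reduced to pixel precision
    return (pixel + half) >> 1           # truncating average


def quarter_direct(pixel, half_acc):
    """Direct path: combine at full intermediate precision, normalise once at the end."""
    return round_shift((pixel << SHIFT) + half_acc, SHIFT + 1)


pixel = 30
half_acc = 1135  # unnormalised half-resolution value, i.e. 1135 / 32 = 35.46875

exact = (pixel + half_acc / 32) / 2                  # 32.734375
dependent = quarter_from_rounded_half(pixel, half_acc)
direct = quarter_direct(pixel, half_acc)

# Width of the value the direct path must carry before normalisation.
direct_acc_bits = ((pixel << SHIFT) + half_acc).bit_length()

print(exact, dependent, direct, direct_acc_bits)
```

In this example the dependent path yields 32 while the direct path yields 33, the value nearest to the exact result of about 32.73. The direct path, however, carries a 12-bit intermediate value for 8-bit pixels, which illustrates why direct interpolation is associated with higher-precision arithmetic.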

Obviousness Analysis under 35 U.S.C. § 103:

A PHOSITA in the field of digital video processing would be motivated to combine the teachings of TML5 and TML6, along with general image processing principles, to address known problems in sub-pixel interpolation.

Combination: TML5 + TML6 + General Interpolation Principles

Motivation:
A PHOSITA striving to optimize sub-pixel interpolation for video coding standards that use motion-compensated prediction, such as H.261, H.263, H.26L, and MPEG-4 [cite: Modern video compression standards such as ITU-T recommendations H.261, H.263(+)(++), H.26L and the Motion Picture Experts Group recommendation MPEG-4 make use of ‘motion compensated temporal prediction’.], would be well aware of the trade-offs presented by TML5 and TML6.

Problem with TML5: Loss of precision due to truncation/clipping of intermediate 1/2 resolution sub-pixels, and increased memory usage due to storing these intermediate values [cite: The 1/4 resolution sub-pixel values are less precise than they would be if calculated from values that had not been truncated and clipped., Another disadvantage of TML5 is that it is necessary to store the values of the 1/2 resolution sub-pixels in order to interpolate the 1/4 resolution sub-pixel values. Therefore, excess memory is required to store a result which is not ultimately required.].
Problem with TML6: While TML6 improved precision by enabling "direct" calculation of 1/4 resolution sub-pixels without relying on final (truncated/clipped) 1/2 resolution values, it incurred the cost of high-precision arithmetic and significant memory requirements for on-demand interpolation [cite: 1/4 resolution sub-pixel values are obtained directly using the intermediate values referred to above and are not derived from rounded and clipped values for 1/2 resolution sub-pixels. Therefore, in obtaining the 1/4 resolution sub-pixel values, it is not necessary to calculate final values for any of the 1/2 resolution sub-pixels., a disadvantage of TML6 is that high precision arithmetic is required both in the encoder and in the decoder. High precision interpolation requires more silicon area in ASICs and requires more computations in some CPUs. Furthermore, implementation of direct interpolation as specified in TML6 in an on-demand fashion has a high memory requirement. This is an important factor, particularly in embedded devices.].

The motivation for a PHOSITA would be to devise an interpolation method that achieves the precision benefits of TML6 (by using "direct" interpolation from intermediate values) while simultaneously reducing the computational complexity and memory footprint associated with TML6's high-precision arithmetic and memory-intensive on-demand calculation.

Why the combination would render the claims obvious:

"Direct" Interpolation (Step a): TML6 explicitly teaches obtaining 1/4 resolution sub-pixel values "directly" from intermediate values, without prior truncation or clipping of 1/2 resolution values [cite: 1/4 resolution sub-pixel values are obtained directly using the intermediate values referred to above and are not derived from rounded and clipped values for 1/2 resolution sub-pixels. Therefore, in obtaining the 1/4 resolution sub-pixel values, it is not necessary to calculate final values for any of the 1/2 resolution sub-pixels.]. This concept of "direct" interpolation, which the present invention also employs for its 1/(2^(N-1)) and 1/(2^N) resolution sub-pixels, is thus taught by TML6 as a solution to TML5's precision problems. The use of "weighted sums" to calculate these intermediate sub-pixels (as implied by the K-tap filter description in the patent for b sub-pixels) is a fundamental and common technique in signal processing and image interpolation, readily apparent to a PHOSITA.
Interpolation of 1/(2^(N-1)) horizontal/unit vertical, etc. (Step b): The patent specifies that these values (e.g., half-resolution sub-pixels like 'b' in FIG. 14a) are interpolated by taking the average of existing pixel or sub-pixel values. Linear averaging is a basic form of interpolation, widely known and applied in image processing to estimate intermediate pixel values. A PHOSITA, seeking to reduce the complexity of TML6's 6-tap filters [cite: sub-pixel values can be obtained directly by applying 6-tap filters in horizontal and vertical directions.] for simpler implementations, would routinely explore simpler averaging methods.
Diagonal Weighted Average for 1/(2^N) sub-pixels (Step c): The patent states that 1/4 resolution sub-pixels (like 'h' in FIG. 14a, representing 1/(2^N) resolution when N=2) are "interpolated diagonally in order to reduce dependency on other 1/4-pixels" [cite: 1/4 resolution sub-pixels h (and sub-pixel i in one embodiment of the invention) are interpolated diagonally in order to reduce dependency on other 1/4-pixels.]. While TML5 and TML6 provide methods for 1/4-pixel interpolation, they do not explicitly teach this specific diagonal averaging technique for these corner sub-pixels. However, given the motivation to reduce the computation and memory requirements of TML6, a PHOSITA would consider various simpler interpolation kernels. Averaging diagonally located pixels or sub-pixels is a conventional heuristic in image processing for estimating values at diagonal fractional positions, aiming to capture spatial correlation efficiently with reduced computational load compared to more complex multi-tap filters (as might be inferred from TML6's "high precision arithmetic"). The patent itself notes that its more complex second embodiment, in which sub-pixel c requires several intermediate values, is less preferred: "The second embodiment has higher complexity, since calculation of sub-pixel c requires calculation of several intermediate values. Therefore the first embodiment is preferred" [cite: The second embodiment has higher complexity, since calculation of sub-pixel c requires calculation of several intermediate values. Therefore the first embodiment is preferred.]. This reflects a preference for computational simplicity that a PHOSITA would share.

Therefore, the claimed method in US6950469 represents an amalgamation of known interpolation principles (weighted sums, linear averaging, diagonal averaging) applied within the context of sub-pixel interpolation for video coding. A PHOSITA, motivated by the desire to overcome the precision limitations of TML5 and to mitigate the computational and memory burdens of TML6, would have found it obvious to combine the "direct" interpolation concept from TML6 with simpler, commonly known interpolation techniques to achieve an improved balance of efficiency and quality.