Patent 10218995

Derivative works

Defensive disclosure: derivative variations of each claim designed to render future incremental improvements obvious or non-novel.

Defensive Disclosure Document for US Patent 10,218,995

This document describes derivative works and technical disclosures based on US Patent 10,218,995, aimed at establishing prior art to render future incremental improvements by competitors obvious or non-novel. The derivations are structured around independent Claims 1 (Moving Picture Encoding System), 15 (Moving Picture Decoding System), and 19 (Moving Picture Reencoding System), applying five distinct axes of variation. The principles detailed for these system claims are equally applicable to their corresponding method (Claims 13, 17, 21) and program (Claims 14, 18, 22) claims.


Derivatives of Independent Claim 1: Moving Picture Encoding System

Claim 1 describes an encoding system including a first encoder (standard resolution encoding/decoding), a first super-resolution enlarger (standard to higher resolution), a first resolution converter (higher to standard resolution), and a second encoder (using super-resolution enlarged and converted pictures as target, decoded pictures from first encoder as reference).
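
The data flow summarized above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the "encoders" are plain quantizers with hypothetical names (`toy_encode`, `toy_decode`), the enlarger is pixel replication plus unsharp masking, and the converter is block averaging. It does not reproduce the patent's codec or any inter-layer prediction scheme; it only shows which signal serves as target and which as reference.

```python
import numpy as np

def box_blur3(img):
    """3x3 mean filter with edge replication."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0

def sr_enlarge(picture, factor=2, amount=0.5):
    """Placeholder enlarger: pixel replication followed by unsharp masking."""
    up = np.kron(picture.astype(np.float64), np.ones((factor, factor)))
    return up + amount * (up - box_blur3(up))

def resolution_convert(picture, factor=2):
    """Placeholder converter: block-average back to the standard resolution."""
    h, w = picture.shape
    return picture.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def toy_encode(target, reference=None, step=8.0):
    """Stand-in encoder: quantize the target, or the target minus a reference."""
    residual = target if reference is None else target - reference
    return np.round(residual / step).astype(np.int32)

def toy_decode(symbols, reference=None, step=8.0):
    """Inverse of toy_encode."""
    recon = symbols.astype(np.float64) * step
    return recon if reference is None else recon + reference

def encode_picture(picture):
    """Two-layer flow: first encoder, SR branch, then second encoder."""
    bits1 = toy_encode(picture, step=16.0)            # first encoder
    decoded = toy_decode(bits1, step=16.0)            # its locally decoded picture
    target = resolution_convert(sr_enlarge(picture))  # SR enlarged and converted
    bits2 = toy_encode(target, reference=decoded, step=4.0)  # second encoder
    return bits1, bits2, target
```

A decoder for this toy would rebuild the base picture from `bits1` and then add the dequantized `bits2` residual to obtain an approximation of the SR enlarged and converted picture.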

Derivative 1.1: FPGA/ASIC-Accelerated Super-Resolution Encoding System with CNN-based Super-Resolution

Enabling Description:
This derivative system implements the super-resolution enlargement and resolution conversion processes using dedicated hardware accelerators, specifically Field-Programmable Gate Arrays (FPGAs) or Application-Specific Integrated Circuits (ASICs), for enhanced computational efficiency and real-time processing capabilities. The first super-resolution enlarger (e.g., 103 in US10218995B2) is realized by a Convolutional Neural Network (CNN) inference engine optimized for FPGA/ASIC deployment. This engine utilizes a pre-trained super-resolution CNN architecture, such as a Super-Resolution Convolutional Neural Network (SRCNN) for direct end-to-end mapping, or an Enhanced Deep Super-Resolution Network (EDSR) for higher quality with increased computational complexity. The CNN operates on the input sequence of moving pictures with a standard resolution to generate super-resolution enlarged pictures by inferring high-frequency details. The first resolution converter (e.g., 104 in US10218995B2) is implemented as a fixed-function hardware block performing bicubic downsampling or a frequency-domain low-pass filter (e.g., using a 2D-FFT and inverse 2D-FFT with frequency cutoff) to accurately convert the CNN-generated high-resolution pictures back to the standard resolution, creating super-resolution enlarged and converted pictures. Both the first encoder (e.g., 102 in US10218995B2) and second encoder (e.g., 107 in US10218995B2) are also implemented as hardware codecs (e.g., H.264/AVC or H.265/HEVC cores) integrated within the same FPGA/ASIC and capable of processing both streams in parallel. The inter-layer prediction data exchange between the two encoders is facilitated by a high-speed on-chip interconnect (e.g., AXI4-Stream links) and shared on-chip memory.

graph TD
    A["Standard Resolution Input Moving Pictures"] --> B{"FPGA/ASIC Super-Resolution System"}
    B -- Subsequence --> C["First Encoder (Hardware Codec)"]
    B -- Subsequence --> D["CNN-based Super-Resolution Engine (FPGA/ASIC)"]
    D -- "Super-Resolution Enlarged Pictures (High Res)" --> E["Resolution Converter (Hardware Filter)"]
    E -- "Encoding Target: Super-Resolution Enlarged and Converted Pictures (Std Res)" --> G["Second Encoder (Hardware Codec)"]
    C -- "Reference: Decoded Pictures (Std Res)" --> G
    C -- First Sequence of Encoded Bits --> H["Multiplexer (Hardware)"]
    G -- Second Sequence of Encoded Bits --> H
    H --> I["Output Encoded Bitstream"]
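
The frequency-domain option named for the resolution converter in Derivative 1.1 (2D-FFT, frequency cutoff, inverse 2D-FFT) can be sketched in software as follows. This is a minimal NumPy model under stated assumptions, not a description of the hardware block: the function name `fft_downsample` is hypothetical, the cutoff is an ideal brick-wall filter, and the picture dimensions are assumed to be divisible by twice the downsampling factor.

```python
import numpy as np

def fft_downsample(picture, factor=2):
    """Downsample by keeping only the central (low-frequency) part of the spectrum."""
    h, w = picture.shape
    assert h % (2 * factor) == 0 and w % (2 * factor) == 0, "assumed divisibility"
    oh, ow = h // factor, w // factor

    # Centre the spectrum so that the low frequencies form a contiguous block.
    spectrum = np.fft.fftshift(np.fft.fft2(picture))

    # Brick-wall cutoff: crop the central oh x ow block, discarding high frequencies.
    top, left = (h - oh) // 2, (w - ow) // 2
    cropped = spectrum[top:top + oh, left:left + ow]

    # Back to the pixel domain at the lower resolution. The imaginary part is
    # numerical noise plus an unpaired Nyquist bin, so only the real part is kept.
    out = np.fft.ifft2(np.fft.ifftshift(cropped)).real

    # fft2 is unnormalized and ifft2 divides by oh*ow, so rescale by factor^2.
    return out / (factor * factor)
```

A brick-wall cutoff can ring near sharp edges; a tapered window over the retained band is a common mitigation, at the cost of some sharpness.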

Derivative 1.2: Ultra-High Frame Rate (UHFR) and High Dynamic Range (HDR) Super-Resolution Encoding for Scientific Imaging

Enabling Description:
This system is specialized for encoding moving picture sequences with extreme operational parameters, specifically Ultra-High Frame Rate (UHFR) video (e.g., 1,000 to 100,000 frames per second) and High Dynamic Range (HDR) content (e.g., 12-bit or 14-bit per color channel), prevalent in scientific imaging applications such as particle physics, combustion analysis, or ballistic studies. The first super-resolution enlarger (e.g., 103) employs a temporal super-resolution algorithm in addition to spatial super-resolution. This involves aggregating information from multiple temporally adjacent observation pictures (frames) to reconstruct a single high-resolution frame, both increasing spatial detail and mitigating residual motion blur in the UHFR acquisition. For HDR content, the super-resolution and resolution conversion stages operate on a perceptually uniform signal encoding (e.g., the PQ or HLG transfer functions, or a log-luminance encoding) to prevent quantization artifacts in high-brightness areas. The first encoder (e.g., 102) and second encoder (e.g., 107) are adapted to handle UHFR/HDR metadata and extended color gamuts, utilizing coding standards like H.265/HEVC (Main 10 for 10-bit content, or a Range Extensions profile such as Main 12 for deeper bit depths) or VVC (H.266) to efficiently compress the increased data volume. The second encoder takes the super-resolution enlarged and converted pictures as its encoding target and the standard-resolution decoded pictures from the first encoder as its reference, so that the super-resolution branch specifically enriches the scene with recovered temporal and spatial texture information crucial for scientific analysis.

graph TD
    A["UHFR/HDR Raw Input (1000+ fps, 12-bit+)"] --> B{"Specialized Input Buffer"}
    B -- Chunked Frames --> C["First Super-Resolution Enlargement (Spatial + Temporal SR, HDR-aware)"]
    C -- "High-Res UHFR/HDR Frames" --> D["First Resolution Converter (HDR-aware Downsampling)"]
    B -- Chunked Frames --> F["First Encoder (H.265/VVC Main 10+)"]
    D -- "Encoding Target: Std-Res UHFR/HDR Frames" --> G["Second Encoder (H.265/VVC Main 10+)"]
    F -- "Reference: Decoded Std-Res UHFR/HDR Frames" --> G
    F -- First Encoded Bitstream --> H["Multiplexer"]
    G -- Second Encoded Bitstream --> H
    H --> I["UHFR/HDR Super-Res Encoded Output"]
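
To make the HDR-aware conversion in Derivative 1.2 concrete, the sketch below averages luminance samples in the PQ signal domain instead of in linear light. The constants are those of the PQ curve (SMPTE ST 2084). The helper `downsample_row_pq` and the one-dimensional box filter are assumptions chosen for brevity, not the converter the derivative prescribes.

```python
# PQ (SMPTE ST 2084) constants
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(nits):
    """Absolute luminance in cd/m^2 (0..10000) to a PQ signal value in 0..1."""
    p = (max(nits, 0.0) / 10000.0) ** M1
    return ((C1 + C2 * p) / (1.0 + C3 * p)) ** M2

def pq_decode(signal):
    """PQ signal value in 0..1 back to absolute luminance in cd/m^2."""
    p = signal ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)

def downsample_row_pq(row_nits, factor=2):
    """Hypothetical helper: box-filter one row of luminance in the PQ domain."""
    coded = [pq_encode(v) for v in row_nits]
    means = [sum(coded[i:i + factor]) / factor
             for i in range(0, len(coded) - factor + 1, factor)]
    return [pq_decode(m) for m in means]
```

Averaging in the PQ domain weights dark and bright samples roughly by perceived difference, so a dark pixel next to a specular highlight is not swamped the way it would be by a linear-light mean. Whether that is desirable depends on the analysis task, since radiometric measurements may instead call for linear-light filtering.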

Derivative 1.3.1: Cross-Domain Application: Real-time Super-Resolution Encoding for Telemedicine Diagnostics

Enabling Description:
This system is configured for real-time moving picture encoding in telemedicine applications, particularly for remote diagnostic procedures involving low-resolution medical video feeds (e.g., endoscopic procedures, ultrasound scans, dermatoscopy). A common challenge in telemedicine is transmitting sufficient diagnostic detail over varying network bandwidths. The first encoder (e.g., 102) encodes the raw, standard-resolution (e.g., 720p) medical video for baseline transmission. Simultaneously, the first super-resolution enlarger (e.g., 103) applies a super-resolution process to the original standard-resolution medical video, specifically optimized for enhancing fine anatomical structures, tissue textures, or lesion boundaries. The super-resolution model might be pre-trained on large datasets of medical imagery. The resulting super-resolution enlarged pictures (e.g., 1080p or 4K) are then fed to the first resolution converter (e.g., 104), which downconverts them back to the standard resolution for efficient processing, preserving the enhanced high-frequency information within the lower resolution. The second encoder (e.g., 107) then encodes these super-resolution enlarged and converted pictures using the standard-resolution decoded pictures from the first encoder as reference. This hierarchical encoding creates a two-layer stream: a base layer (standard resolution, sufficient for general viewing) and an enhancement layer (carrying the super-resolution derived details, crucial for precise diagnosis). This allows a remote clinician to dynamically request or receive the enhanced layer, revealing finer details (e.g., vessel patterns, cell morphology) crucial for accurate real-time diagnosis, even if the original capture resolution was limited.

graph TD
    A["Low-Res Medical Video Input (e.g., Endoscope)"] --> B{"Medical Video Encoding System"}
    B -- Original Stream --> C["First Encoder (Base Layer Encoding)"]
    B -- Original Stream --> D["Medical SR Enlargement (e.g., Texture/Boundary Enhancement)"]
    D -- "SR Enlarged Medical Pictures (High Res)" --> E["Resolution Converter (Std Res Output)"]
    E -- "Encoding Target: SR Enlarged & Converted (Std Res)" --> F["Second Encoder (Enhancement Layer Encoding)"]
    C -- "Reference: Decoded Base Layer (Std Res)" --> F
    C -- Base Layer Bitstream --> G["Multiplexer"]
    F -- Enhancement Layer Bitstream --> G
    G --> H["Multiplexed SR Medical Video Stream"]

Derivative 1.3.2: Cross-Domain Application: Adaptive Super-Resolution Encoding for Geospatial Intelligence (Satellite/Aerial Surveillance)

Enabling Description:
This system applies moving picture encoding principles to geospatial intelligence, specifically for processing and transmitting satellite or aerial surveillance video feeds where bandwidth and storage are critical constraints. Earth observation satellites and drones capture vast amounts of imagery, often at lower resolutions for wide-area coverage, even though specific targets require high detail. The first encoder (e.g., 102) compresses the raw, standard-resolution (e.g., 720p wide-area scan) video feed. Simultaneously, the first super-resolution enlarger (e.g., 103) processes the original input, focusing on specific regions of interest (ROIs) identified by an object detection algorithm (e.g., identifying vehicles, infrastructure). For these ROIs, an advanced super-resolution algorithm, potentially leveraging sparse representation learning or generative adversarial networks (GANs) trained on high-resolution ground truth, reconstructs finer details from the lower-resolution input frames. The first resolution converter (e.g., 104) then downconverts these super-resolved ROIs back to the standard resolution, effectively embedding enhanced target details within a standard-resolution frame. The second encoder (e.g., 107) encodes these super-resolution enhanced ROI frames, using the base-layer decoded frames as reference. This creates a multi-layered stream in which a base layer provides general situational awareness and an enhancement layer (derived via SR) provides actionable intelligence for specific targets with improved detail. Analysts can then zoom into targets without the artifacts of traditional upscaling, which is critical for reconnaissance and anomaly detection.

graph TD
    A["Satellite/Aerial Video Input (Std Res)"] --> B{"Geospatial Encoding System"}
    B -- Original Stream --> C["First Encoder (Base Layer)"]
    B -- Original Stream --> D["Super-Resolution Enlargement (ROI-focused, GAN-based)"]
    D -- "SR Enlarged ROIs (High Res)" --> E["Resolution Converter (Std Res Output)"]
    E -- "Encoding Target: SR Enlarged & Converted (Std Res)" --> G["Second Encoder (Enhancement Layer)"]
    C -- "Reference: Decoded Base Layer (Std Res)" --> G
    C -- Base Layer Bitstream --> H["Multiplexer"]
    G -- Enhancement Layer Bitstream --> H
    H --> I["Multiplexed SR Geospatial Stream"]

Derivative 1.3.3: Cross-Domain Application: Predictive Maintenance via Super-Resolution Encoding of Industrial Sensor Video

Enabling Description:
This system is tailored for industrial predictive maintenance, where video streams from surveillance cameras monitoring machinery are used to detect early signs of wear, defects, or anomalies. Often, these industrial cameras operate at standard or even lower resolutions to minimize storage and transmission costs. The first encoder (e.g., 102) processes the standard-resolution video feed of machinery components (e.g., bearings, gears, conveyor belts) for baseline monitoring. Concurrently, the first super-resolution enlarger (e.g., 103) applies a super-resolution technique to the input video, focusing on enhancing subtle visual cues indicative of potential failures, such as micro-cracks, surface pitting, unusual vibrations, or oil leaks. This SR process may employ optical flow for motion compensation between frames and a robust SR algorithm to reconstruct sharper details. The first resolution converter (e.g., 104) converts the super-resolution enlarged images back to standard resolution, effectively embedding critical high-frequency information about component integrity within a manageable data rate. The second encoder (e.g., 107) encodes these SR-enhanced frames, referencing the base-layer decoded frames. The resulting dual-layer stream allows for continuous, low-bandwidth monitoring with a base layer, while the enhancement layer provides diagnostically superior images for automated anomaly detection algorithms or human inspection, facilitating early intervention and preventing costly equipment failures.

graph TD
    A["Industrial Sensor Video (Std Res)"] --> B{"Predictive Maintenance Encoding System"}
    B -- Original Stream --> C["First Encoder (Baseline Monitoring)"]
    B -- Original Stream --> D["SR Enlargement (Defect/Wear-focused)"]
    D -- "SR Enlarged Features (High Res)" --> E["Resolution Converter (Std Res Output)"]
    E -- "Encoding Target: SR Enlarged & Converted (Std Res)" --> G["Second Encoder (Anomaly Enhancement)"]
    C -- "Reference: Decoded Baseline (Std Res)" --> G
    C -- Baseline Bitstream --> H["Multiplexer"]
    G -- Anomaly Enhancement Bitstream --> H
    H --> I["Multiplexed SR Industrial Stream]"]
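
Derivative 1.3.3 mentions motion compensation between frames ahead of the SR reconstruction. As a simplified stand-in for dense optical flow, the sketch below estimates a single global integer translation per frame by phase correlation and averages the aligned frames. The function names are hypothetical, and the assumption of purely translational, wrap-around motion is made only to keep the example short.

```python
import numpy as np

def estimate_shift(reference, moved):
    """Return integer (dy, dx) such that moved ~ np.roll(reference, (dy, dx), axis=(0, 1))."""
    cross = np.fft.fft2(moved) * np.conj(np.fft.fft2(reference))
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep phase only
    corr = np.fft.ifft2(cross).real             # peak sits at the displacement
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    # Map wrap-around peak positions to signed shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

def align_and_average(frames):
    """Align every frame to the first one and average them (simple temporal fusion)."""
    reference = frames[0]
    acc = np.zeros(reference.shape, dtype=np.float64)
    for frame in frames:
        dy, dx = estimate_shift(reference, frame)
        acc += np.roll(frame, (-dy, -dx), axis=(0, 1))
    return acc / len(frames)
```

Averaging aligned frames mainly suppresses noise. A multi-frame SR method would additionally exploit sub-pixel offsets between frames, which this integer-shift sketch deliberately ignores.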

Derivative 1.4: Integration with Emerging Tech: AI-Optimized, IoT-Controlled Super-Resolution Encoding with Distributed Blockchain Verification

Enabling Description:
This derivative integrates AI, IoT, and blockchain for intelligent and verified moving picture encoding. An IoT sensor network (e.g., environmental sensors, network performance monitors) provides real-time contextual data. An AI-driven optimization module (e.g., a reinforcement learning agent) dynamically adjusts the parameters of the first super-resolution enlarger (e.g., 103) and first resolution converter (e.g., 104), as well as the rate control of the first encoder (e.g., 102) and second encoder (e.g., 107). This AI considers input content characteristics, available network bandwidth (from IoT), and computational resources to optimize the super-resolution model selection (e.g., switching between fast, lower-quality SRCNN and slower, higher-quality EDSR) and encoding bitrates to achieve a target quality-of-experience or minimize latency. For instance, if IoT sensors detect high network congestion, the AI may reduce the SR upscale factor or prioritize temporal over spatial SR. Furthermore, a blockchain verification module generates cryptographic hashes of segments of the first sequence of encoded bits and second sequence of encoded bits, along with associated encoding parameters and AI/IoT configuration metadata. These hashes are recorded on an immutable distributed ledger (blockchain). This ensures the integrity and verifiable provenance of the super-resolution enhanced video stream, guaranteeing that the content and its processing history (including AI decisions) can be audited and trusted across a supply chain or distributed network.

graph TD
    subgraph ENC["AI-Optimized Encoding"]
        A["Standard Res Input"] -- Content Characteristics --> B{"AI-driven Optimization Module"}
        A --> C["First Encoder"]
        A --> D["First SR Enlargement"]
        B -- Parameters --> C
        B -- Parameters --> D
        B -- Parameters --> E["First Resolution Converter"]
        B -- Parameters --> H["Second Encoder"]
        D -- High Res Pictures --> E
        E -- "Encoding Target: Std Res Pictures" --> H
        C -- "Reference: Decoded Pictures" --> H
        C -- Encoded Bits 1 --> I["Multiplexer"]
        H -- Encoded Bits 2 --> I
    end
    subgraph IOT["IoT Control"]
        J["IoT Sensor Network"] -- Real-time Data --> B
    end
    subgraph BC["Blockchain Verification"]
        I -- "Encoded Bitstream Segments + Metadata" --> K["Blockchain Verification Module"]
        K -- Hashes --> L["Distributed Ledger (Blockchain)"]
    end
    I --> M["Output Encoded Bitstream"]
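
The verification module in Derivative 1.4 hashes bitstream segments together with their encoding metadata. A minimal sketch of that step, using only the Python standard library, is shown below. The record layout, the field names inside `metadata`, and the all-zero genesis value are assumptions. Anchoring the resulting digests on an actual distributed ledger is outside the sketch.

```python
import hashlib
import json

GENESIS = "00" * 32  # assumed starting value for the chain

def segment_record(prev_hash, base_bits, enh_bits, metadata):
    """Hash one segment: previous digest, both encoded layers, and the metadata."""
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256()
    digest.update(bytes.fromhex(prev_hash))
    for part in (base_bits, enh_bits, payload):
        # Length prefixes keep the concatenation unambiguous.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()

def build_chain(segments):
    """segments: iterable of (base_bits, enh_bits, metadata) tuples."""
    prev, chain = GENESIS, []
    for base_bits, enh_bits, metadata in segments:
        prev = segment_record(prev, base_bits, enh_bits, metadata)
        chain.append(prev)
    return chain

def verify_chain(segments, chain):
    """Recompute the digests and compare them with the recorded ones."""
    return build_chain(segments) == chain
```

Because each digest includes its predecessor, altering any segment or any recorded parameter invalidates every later entry, which is the property an auditor would rely on.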

Derivative 1.5: The "Inverse" or Failure Mode: Resilient Super-Resolution Encoding with Progressive Degradation for Bandwidth-Constrained Environments

Enabling Description:
This system is designed for resilient operation in highly variable or constrained bandwidth environments, or when computational resources are limited, enabling a graceful degradation of super-resolution and encoding quality rather than outright failure. A resource monitoring unit continuously assesses available network bandwidth, CPU/GPU load, and memory usage. The first super-resolution enlarger (e.g., 103) is configured with multiple super-resolution models or scaling factors, ranging from aggressive high-quality/high-compute models to lightweight, faster models (or even simple interpolation). Similarly, the first resolution converter (e.g., 104) can adjust its filtering characteristics. The system prioritizes regions of interest (ROIs) within the moving pictures (e.g., detected faces, text, or primary subjects) for higher super-resolution fidelity. During bandwidth scarcity or high load, the resource monitoring unit instructs the super-resolution enlarger to switch to a lower computational complexity SR model or reduce the spatial upscale factor. Non-ROI areas may receive minimal or no super-resolution processing. Concurrently, the first encoder (e.g., 102) and second encoder (e.g., 107) dynamically adjust quantization parameters and reference picture structures to reduce bitrate, maintaining critical base-layer information at the expense of enhancement layer quality or overall fidelity. This progressive degradation ensures continuous, albeit adaptively scaled, video delivery with super-resolution benefits prioritized for essential content. In extreme conditions, the first super-resolution enlarger and first resolution converter may be bypassed entirely, and the second encoder operates purely on the original standard-resolution stream using the decoded base layer as a reference, effectively falling back to a non-SR, dual-layer encoding mode.

stateDiagram
    direction LR
    Idle --> Monitoring: Start Encoding
    Monitoring --> Optimal_SR_Encoding: High Resources
    Monitoring --> Degraded_SR_Encoding: Medium Resources / Bandwidth Constraint
    Monitoring --> Baseline_Encoding_Fallback: Low Resources / Severe Bandwidth
    Optimal_SR_Encoding --> Monitoring: Continue
    Degraded_SR_Encoding --> Monitoring: Continue
    Baseline_Encoding_Fallback --> Monitoring: Continue
    
    state Optimal_SR_Encoding {
        High_Quality_SR --> High_Bitrate_Encoding
    }
    state Degraded_SR_Encoding {
        Low_Compute_SR --> Adaptive_Bitrate_Encoding
        Low_Compute_SR --> ROI_Prioritized_SR
    }
    state Baseline_Encoding_Fallback {
        SR_Bypass --> Standard_Dual_Layer_Encoding
    }
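
The three operating states in the diagram above can be expressed as a small selection function. The thresholds, parameter names, and per-mode settings below are hypothetical placeholders. A deployed resource monitoring unit would calibrate them and would likely add hysteresis to avoid rapid mode flapping.

```python
from enum import Enum

class Mode(Enum):
    OPTIMAL_SR = "optimal_sr"
    DEGRADED_SR = "degraded_sr"
    BASELINE_FALLBACK = "baseline_fallback"

def select_mode(bandwidth_mbps, compute_load,
                high_bw=20.0, low_bw=5.0, mid_load=0.6, high_load=0.9):
    """Pick an operating mode from measured bandwidth and load (0..1).

    All thresholds are illustrative assumptions, not values from the patent.
    """
    if bandwidth_mbps < low_bw or compute_load > high_load:
        return Mode.BASELINE_FALLBACK   # bypass SR enlarger and converter
    if bandwidth_mbps < high_bw or compute_load > mid_load:
        return Mode.DEGRADED_SR         # lighter model, ROI-prioritized SR
    return Mode.OPTIMAL_SR

def sr_settings(mode):
    """Illustrative per-mode configuration for the SR branch."""
    return {
        Mode.OPTIMAL_SR: {"sr_model": "high_quality", "upscale": 4, "roi_only": False},
        Mode.DEGRADED_SR: {"sr_model": "lightweight", "upscale": 2, "roi_only": True},
        Mode.BASELINE_FALLBACK: {"sr_model": None, "upscale": 1, "roi_only": False},
    }[mode]
```

For example, `select_mode(8.0, 0.3)` falls into the degraded mode because the bandwidth sits between the two assumed thresholds, and `sr_settings` then selects the lightweight model with ROI-only processing.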

Derivatives of Independent Claim 15: Moving Picture Decoding System

Claim 15 describes a decoding system including a demultiplexer, a first decoder (standard resolution), a first super-resolution enlarger (standard decoded to higher resolution), and a first resolution converter (higher to standard resolution).

Derivative 2.1: GPU-Accelerated Real-time Super-Resolution Decoding for Immersive Displays

Enabling Description:
This derivative system is designed for high-performance, real-time decoding of super-resolution enhanced video streams for immersive display technologies (e.g., large-format displays, virtual reality headsets). The demultiplexer (e.g., as in US10218995B2) separates the incoming multiplexed bitstream into standard-resolution encoded bits and associated enhancement layer bits. The first decoder (e.g., as in US10218995B2) decodes the standard-resolution layer. The first super-resolution enlarger (e.g., as in US10218995B2) and first resolution converter (e.g., as in US10218995B2) functionalities are offloaded entirely to a high-performance Graphics Processing Unit (GPU). The GPU leverages massively parallel processing (e.g., CUDA or OpenCL kernels) to execute advanced super-resolution algorithms (e.g., deep learning-based upscalers like ESRGAN or SwinIR for superior visual quality) on the decoded standard-resolution pictures. This allows for real-time reconstruction of super-resolution enlarged decoded pictures at the high fidelity and resolution (e.g., 4K or 8K) required for immersive viewing. These pictures are then passed to the GPU-accelerated resolution converter for final processing, if needed, before rendering. The entire pipeline from bitstream parsing to pixel rendering is optimized for minimum latency using GPU direct memory access (DMA) and shared memory architectures.

graph TD
    A["Input Encoded Bitstream"] --> B["Demultiplexer"]
    B -- Std Res Encoded Bits --> C["First Decoder (CPU/Dedicated HW)"]
    subgraph GPU["GPU Processing Unit"]
        D1["GPU-accelerated First Super-Resolution Enlargement (e.g., ESRGAN)"] -- "Super-Resolution Enlarged Decoded Pictures (High Res)" --> D2["GPU-accelerated First Resolution Converter"]
    end
    C -- "Std Res Decoded Pictures (Input to SR)" --> D1
    D2 -- "Super-Resolution Decoded Pictures via Render Buffer" --> E["Immersive Display"]
    style GPU fill:#f9f,stroke:#333,stroke-width:2px

Derivative 2.2: Decoding of Volumetric Super-Resolution Video Streams for 3D Visualization

Enabling Description:
This derivative extends the moving picture decoding system to handle volumetric video data, such as those used in light field displays, holographic projections, or medical volumetric rendering. Instead of 2D moving pictures, the input bitstream contains encoded representations of 3D light fields or voxel grids, where standard resolution refers to a baseline volumetric sampling density. The demultiplexer (e.g., as in US10218995B2) parses the volumetric bitstream, separating base-layer volumetric data from super-resolution enhancement data. The first decoder (e.g., as in US10218995B2) reconstructs the standard resolution (e.g., 128x128x128 voxel) volumetric data. The first super-resolution enlarger (e.g., as in US10218995B2) is adapted to perform 3D volumetric super-resolution. This involves using 3D convolutional neural networks or multi-plane image (MPI) synthesis techniques to infer higher-resolution volumetric details (e.g., 256x256x256 voxel or higher density light field samples) from the decoded standard resolution volumetric data. This creates super-resolution enlarged decoded volumetric pictures. The first resolution converter (e.g., as in US10218995B2) then downsamples the high-resolution volumetric data to a desired output standard resolution (which may still be higher than the input base layer for specific 3D displays, or a virtual 2D slice). The system is crucial for enabling high-fidelity interactive 3D visualization from compressed volumetric streams, for example, in surgical planning or architectural walkthroughs.

graph TD
    A[Encoded Volumetric Bitstream] --> B[Volumetric Demultiplexer]
    B -- Std Res Volumetric Data --> C[First Volumetric Decoder]
    C -- Std Res Decoded Volume --> D[First 3D Super-Resolution Enlargement]
    D -- High Res Decoded Volume --> E[First Volumetric Resolution Converter]
    E -- Std Res Super-Resolution Decoded Volume --> F[3D Display/Renderer]

Derivative 2.3: Cross-Domain Application: Low-Latency Super-Resolution Decoding for Immersive Augmented Reality

Enabling Description:
This moving picture decoding system is optimized for low-latency operation in immersive Augmented Reality (AR) applications, particularly when AR content is rendered remotely and streamed to a head-mounted display (HMD). To minimize bandwidth while maintaining perceived quality, a remote server might encode lower-resolution video frames corresponding to the user's field of view (FOV). The demultiplexer (e.g., as in US10218995B2) receives this stream. The first decoder (e.g., as in US10218995B2) decodes the baseline standard resolution AR frames. Crucially, the first super-resolution enlarger (e.g., as in US10218995B2) employs a lightweight, high-speed super-resolution algorithm (e.g., a shallow CNN or optimized bicubic upscaler with sharpening) specifically tailored to run on the HMD's onboard processor or a mobile companion device. This SR process dynamically prioritizes regions within the user's foveal (central vision) area, applying higher quality super-resolution to those pixels, while peripheral areas receive minimal or no SR, or a computationally cheaper variant. This "foveated super-resolution" reduces overall computational load and latency. The first resolution converter (e.g., as in US10218995B2) then scales these selectively super-resolved images to the native resolution of the HMD. This allows for perceived high-resolution AR experiences from bandwidth-efficient streams, maintaining interactivity and minimizing motion-to-photon latency, which is critical for user comfort and immersion.

graph TD
    A["Encoded AR Stream (Low Res, Foveated)"] --> B["Demultiplexer"]
    B -- Encoded Std Res AR Frames --> C["First Decoder (on HMD)"]
    C -- Std Res Decoded AR Frames --> D["Foveated Super-Resolution Enlargement (on HMD)"]
    D -- "High Res AR Frames (Foveated SR)" --> E["Resolution Converter (on HMD)"]
    E -- HMD Native Res AR Frames --> F["AR Head-Mounted Display"]
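
The foveated super-resolution described in Derivative 2.3 can be sketched as follows: a cheap upscale is applied everywhere, and a costlier enhancement runs only on the bounding box around the gaze point. Pixel replication and unsharp masking stand in for the cheap and high-quality paths respectively, and the function name, gaze format, and radius parameter are assumptions.

```python
import numpy as np

def box_blur3(img):
    """3x3 mean filter with edge replication."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0

def foveated_enlarge(frame, gaze_yx, radius, factor=2):
    """Upscale `frame`; enhance only a disc of `radius` input pixels around `gaze_yx`."""
    # Cheap path for the whole frame: pixel replication.
    out = np.kron(frame.astype(np.float64), np.ones((factor, factor)))

    # Foveal disc expressed in output coordinates.
    gy, gx, r = gaze_yx[0] * factor, gaze_yx[1] * factor, radius * factor
    y0, y1 = max(gy - r, 0), min(gy + r + 1, out.shape[0])
    x0, x1 = max(gx - r, 0), min(gx + r + 1, out.shape[1])

    # Costlier path (placeholder: unsharp masking) on the bounding box only.
    crop = out[y0:y1, x0:x1]
    enhanced = crop + 0.5 * (crop - box_blur3(crop))

    # Keep the enhanced pixels inside the disc, the cheap ones elsewhere.
    yy, xx = np.mgrid[y0:y1, x0:x1]
    inside = (yy - gy) ** 2 + (xx - gx) ** 2 <= r * r
    out[y0:y1, x0:x1] = np.where(inside, enhanced, crop)
    return out
```

Restricting the expensive step to the bounding box is what yields the compute saving. In a real HMD pipeline the gaze point would come from an eye tracker and the enhancement would be the shallow CNN the text mentions.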

Derivative 2.4: Integration with Emerging Tech: Edge-AI Enhanced Super-Resolution Decoding for Smart Devices

Enabling Description:
This moving picture decoding system is implemented on smart devices (e.g., smartphones, smart cameras, IoT-enabled screens) leveraging on-device Edge-AI capabilities. The demultiplexer (e.g., as in US10218995B2) and first decoder (e.g., as in US10218995B2) function as usual, decoding standard-resolution video streams. The first super-resolution enlarger (e.g., as in US10218995B2) and first resolution converter (e.g., as in US10218995B2) are realized by highly optimized, quantized deep learning models (e.g., an SR network built on a MobileNetV2-style backbone) running on specialized neural network accelerators such as Google's Edge TPU. These models perform super-resolution inference directly on the edge device, using the decoded standard resolution pictures as input. This Edge-AI approach reduces reliance on cloud processing, minimizes data transfer, and significantly lowers latency, making super-resolution enhancement suitable for applications where instant visual feedback is required (e.g., smart home security cameras upscaling live feeds, mobile video playback enhancing low-resolution content). The models are designed for low power consumption and high inference speed on resource-constrained hardware, and the system may switch SR models dynamically based on battery level or computational load.

graph TD
    A["Input Encoded Bitstream"] --> B["Demultiplexer (Smart Device)"]
    B -- Std Res Encoded Bits --> C["First Decoder (Smart Device CPU)"]
    subgraph EDGE["Edge-AI Super-Resolution Module (Smart Device NPU/GPU)"]
        D1["Optimized SR-CNN Inference"] --> D2["Quantized Resolution Converter"]
    end
    C -- "Std Res Decoded Pictures (Input to SR)" --> D1
    D2 -- "Super-Resolution Decoded Pictures" --> E["Smart Device Display/Output"]
    style EDGE fill:#f9f,stroke:#333,stroke-width:2px
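
Derivative 2.4 relies on quantized models. As a rough illustration of what quantization means for a layer's weights, the sketch below applies symmetric per-tensor int8 quantization to a weight array. This is one common post-training scheme chosen as an assumption. It is not taken from the patent, and accelerator toolchains generally use their own calibration and per-channel variants.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization: float weights -> (int8 values, scale)."""
    scale = float(max(np.max(np.abs(weights)), 1e-12)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original float weights."""
    return q.astype(np.float64) * scale
```

The reconstruction error per weight is bounded by half the scale, so tensors with a few large outliers lose more precision on their small values. That is one reason per-channel scales are often preferred for convolution kernels.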

Derivative 2.5: The "Inverse" or Failure Mode: Bandwidth-Adaptive Super-Resolution Decoding with Content-Aware Skipping

Enabling Description:
This moving picture decoding system is designed for robust operation under adverse network conditions, prioritizing critical visual information when bandwidth is insufficient or processing power is overloaded. A network/resource monitor continuously assesses available bandwidth and decoding device capabilities. The demultiplexer (e.g., as in US10218995B2) intelligently drops or selectively decodes parts of the enhancement layer bitstream based on predefined content-aware metrics (e.g., motion vectors indicating activity, saliency maps highlighting important regions). The first decoder (e.g., as in US10218995B2) decodes the base layer. If network bandwidth drops significantly or processing becomes constrained, the first super-resolution enlarger (e.g., as in US10218995B2) switches to a lower-quality, faster super-resolution mode (e.g., bilinear upscaling instead of deep learning SR) or applies SR only to detected regions of interest. For non-critical pictures or regions, the system may skip the super-resolution enlargement process entirely, or even downscale the decoded base layer to conserve resources. The first resolution converter (e.g., as in US10218995B2) similarly adapts its conversion parameters. The goal is to maintain a continuous, albeit adaptively degraded, visual experience rather than buffering or freezing. For example, during video conferencing, the system would prioritize SR for speaker faces while using minimal processing for background elements, ensuring crucial visual communication persists through network fluctuations.

stateDiagram
    direction LR
    Idle --> Start_Decoding: Input Bitstream
    Start_Decoding --> Monitoring_Resources: Initialize

    Monitoring_Resources --> Full_SR_Decode: High Bandwidth/Resources
    Monitoring_Resources --> Adaptive_SR_Decode: Moderate Bandwidth/Resources
    Monitoring_Resources --> Base_Layer_Only_Decode: Low Bandwidth/Resources

    state Full_SR_Decode {
        Demux_Full --> First_Decode_Full
        First_Decode_Full --> Full_SR_Enlarge
        Full_SR_Enlarge --> Full_Res_Convert
        Full_Res_Convert --> Render_Full
    }

    state Adaptive_SR_Decode {
        Demux_Partial_Enhancement --> First_Decode_Adaptive
        First_Decode_Adaptive --> Selective_SR_Enlarge
        Selective_SR_Enlarge --> Adaptive_Res_Convert
        Adaptive_Res_Convert --> Render_Adaptive
    }

    state Base_Layer_Only_Decode {
        Demux_Base_Only --> First_Decode_Base
        First_Decode_Base --> Render_Base
    }

    Full_SR_Decode --> Monitoring_Resources: Continue
    Adaptive_SR_Decode --> Monitoring_Resources: Continue
    Base_Layer_Only_Decode --> Monitoring_Resources: Continue
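
The content-aware dropping performed by the demultiplexer in Derivative 2.5 can be sketched as a greedy selection: enhancement-layer units are ranked by saliency per byte and kept until a byte budget is exhausted. The unit tuple layout, the saliency scores, and the budget are hypothetical inputs. How saliency is computed (motion vectors, saliency maps) is left outside the sketch.

```python
def select_enhancement_units(units, budget_bytes):
    """Choose which enhancement-layer units to decode under a byte budget.

    units: list of (unit_id, size_bytes, saliency) with size_bytes > 0.
    Returns the kept unit ids in sorted order. The base layer is assumed to be
    always decoded and is therefore not part of `units`.
    """
    ranked = sorted(units, key=lambda u: u[2] / u[1], reverse=True)
    kept, used = [], 0
    for unit_id, size_bytes, _saliency in ranked:
        if used + size_bytes <= budget_bytes:
            kept.append(unit_id)
            used += size_bytes
    return sorted(kept)

# Video-conference example: the speaker's face outranks the background.
units = [("face", 400, 0.9), ("background", 600, 0.1), ("text", 300, 0.6)]
kept = select_enhancement_units(units, budget_bytes=800)  # keeps "face" and "text"
```

A greedy ranking is not optimal in the knapsack sense, but it is cheap enough to rerun for every segment as the bandwidth estimate changes.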

Derivatives of Independent Claim 19: Moving Picture Reencoding System

Claim 19 describes a reencoding system including a demultiplexer, a decoder (standard resolution), a first super-resolution enlarger (standard decoded to higher resolution), and a first resolution converter (higher to standard resolution). It is "adapted to input thus encoded moving pictures to decode and reencode," implying a reencoder stage using the processed data.

Derivative 3.1: Heterogeneous Compute Reencoding System with DNN-based Transcoding

Enabling Description:
This derivative moving picture reencoding system utilizes a heterogeneous computing architecture, distributing the workload across various processing units for optimal performance during transcoding and format conversion. The demultiplexer (e.g., as in US10218995B2) and initial decoder (e.g., as in US10218995B2) might run on specialized hardware decoder blocks (e.g., ASIC decoder) for efficient stream parsing and standard-resolution frame extraction. The first super-resolution enlarger (e.g., as in US10218995B2) is implemented on a high-performance GPU, employing deep neural networks (DNNs) specifically trained for super-resolution and potentially denoising/deblocking. The first resolution converter (e.g., as in US10218995B2) also leverages GPU shaders for high-quality downsampling and filtering. The subsequent reencoder stage (not explicitly detailed in the provided claim summary but implicit in "reencoding system") integrates a DNN-based transcoder on an AI accelerator (e.g., Tensor Processing Unit - TPU or dedicated NPU). This DNN transcoder performs content-aware rate control, perceptual quality optimization, and adaptive parameter selection for the new encoding process. It can dynamically choose optimal encoding presets (e.g., for H.264, H.265, AV1) based on the input content and target distribution platform, ensuring maximum quality for a given bitrate constraint, leveraging the super-resolution enhanced input to generate perceptually superior reencoded output.

graph TD
    A["Input Encoded Bitstream"] --> B["Demultiplexer (ASIC)"]
    B -- Encoded Bits --> C["Decoder (ASIC)"]
    C -- Decoded Std Res Pictures --> D1
    subgraph D["Heterogeneous Compute Cluster"]
        D1["GPU-accelerated First Super-Resolution Enlargement (DNN)"] --> D2["GPU-accelerated First Resolution Converter"]
        D2 -- Processed Pictures --> D3["AI Accelerator (TPU/NPU) - DNN Transcoder"]
    end
    D3 -- Reencoded Bitstream --> E["Output Reencoded Bitstream"]
    style D fill:#f9f,stroke:#333,stroke-width:2px
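
The content-aware parameter selection attributed to the DNN transcoder can be illustrated with a much simpler stand-in. The sketch below is not a DNN: it assumes frames are small 2D lists of luma values, measures crude spatial and temporal activity, and maps the result to a target bitrate. The function names, the divisor 64, and the base bitrate are hypothetical choices made only for illustration.

```python
from statistics import pstdev


def spatial_activity(frame):
    """Std-dev of neighbouring-pixel luma differences (crude detail measure)."""
    diffs = []
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if x + 1 < len(row):
                diffs.append(row[x + 1] - value)
            if y + 1 < len(frame):
                diffs.append(frame[y + 1][x] - value)
    return pstdev(diffs) if diffs else 0.0


def temporal_activity(prev, cur):
    """Std-dev of co-located luma differences between two frames."""
    diffs = [c - p for rp, rc in zip(prev, cur) for p, c in zip(rp, rc)]
    return pstdev(diffs) if diffs else 0.0


def choose_bitrate_kbps(frames, base_kbps=2000):
    """Hypothetical rule: scale the base bitrate by a 0..1 complexity score."""
    sa = max(spatial_activity(f) for f in frames)
    ta = max((temporal_activity(a, b) for a, b in zip(frames, frames[1:])),
             default=0.0)
    score = min(1.0, (sa + ta) / 64.0)  # 64 is an assumed normalisation
    return round(base_kbps * (1.0 + score))


flat = [[[128] * 4 for _ in range(4)] for _ in range(2)]
checker = [[(255 if (x + y) % 2 else 0) for x in range(4)] for y in range(4)]
inverse = [[255 - v for v in row] for row in checker]
print(choose_bitrate_kbps(flat), choose_bitrate_kbps([checker, inverse]))
```

A learned model would replace the hand-written score, but the interface (frames in, encoding parameters out) would stay the same.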

Derivative 3.2: Multi-View Super-Resolution Reencoding for 360-degree Video Streaming

Enabling Description:
This moving picture reencoding system is designed for processing and optimizing 360-degree or multi-view video content, common in virtual reality (VR) and immersive media. Input consists of one or more synchronized video streams (e.g., a single equirectangular projection or multiple fisheye camera feeds). The demultiplexer (e.g., as in US10218995B2) and decoder (e.g., as in US10218995B2) handle each individual stream at standard resolution. The first super-resolution enlarger (e.g., as in US10218995B2) is adapted for multi-view super-resolution. It applies a spatio-temporal super-resolution algorithm that not only enhances individual frames but also exploits redundancy and coherence across adjacent views and neighboring frames in time to reconstruct a higher-resolution, consistent 360-degree panorama. This may involve projecting frames onto a sphere in 3D space, performing SR there, and then re-projecting to the 2D layout. The first resolution converter (e.g., as in US10218995B2) then converts this super-resolved 360-degree representation back to a standard resolution (e.g., an equirectangular format of higher quality than the original input). The subsequent reencoder (e.g., an HEVC encoder that signals omnidirectional projection metadata) compresses this super-resolved 360-degree content into a format suitable for adaptive streaming (e.g., view-dependent tiling). This derivative improves the perceived quality of 360-degree video, especially in zoomed-in areas, which are typically low-resolution because a wide field of view is spread over a limited number of pixels.

graph TD
    A["Multi-View Encoded Bitstream (360 Video)"] --> B["Demultiplexer (Per View)"]
    B -- Individual View Streams --> C["Decoder Array (Per View, Std Res)"]
    C -- Decoded Std Res Views --> D1
    subgraph D["Multi-View Super-Resolution Processor"]
        D1["Spatio-Temporal Multi-View SR Enlargement"] --> D2["360 Resolution Converter"]
    end
    D2 -- Super-Resolved 360 Video --> E["360 Video Reencoder"]
    E --> F["Reencoded 360 Bitstream"]
    style D fill:#f9f,stroke:#333,stroke-width:2px
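
The projection into 3D space and back can be sketched for the equirectangular case. The axis convention below (y up, z forward at the image centre, pixel centres at half-integer offsets) is an assumption; real 360-degree pipelines differ in these details, and the function names are hypothetical.

```python
import math


def equirect_to_vector(u, v, width, height):
    """Map pixel (u, v) of an equirectangular image to a unit vector."""
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi    # -pi .. pi
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi   # pi/2 .. -pi/2
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))


def vector_to_equirect(vec, width, height):
    """Inverse mapping: unit vector back to fractional pixel coordinates."""
    x, y, z = vec
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y)))  # clamp against rounding error
    u = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return (u, v)


# Round trip for one pixel of a 512x256 panorama.
u, v = vector_to_equirect(equirect_to_vector(100, 50, 512, 256), 512, 256)
print(round(u, 6), round(v, 6))
```

Working on unit vectors lets overlapping views from different cameras be compared in a common coordinate frame before super-resolution is applied.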

Derivative 3.3: Cross-Domain Application: Preservation Reencoding with Perceptual Super-Resolution for Legacy Media Archiving

Enabling Description:
This moving picture reencoding system is tailored for digital archiving and preservation of legacy media, such as analog film, VHS tapes, or early digital video formats, which often suffer from low resolution, noise, and degradation. The system aims to perceptually enhance and digitize these assets into modern, robust formats. The demultiplexer (if applicable for digital inputs) and decoder (e.g., as in US10218995B2) process the source material (e.g., converted analog-to-digital or legacy digital files). The first super-resolution enlarger (e.g., as in US10218995B2) employs advanced perceptual super-resolution algorithms (e.g., GAN-based SR models trained on high-quality archival data) combined with sophisticated denoising, deinterlacing, and color restoration techniques. This process reconstructs super-resolution enlarged decoded pictures that appear to have significantly higher detail and clarity than the original, removing artifacts and inferring lost information. The first resolution converter (e.g., as in US10218995B2) then scales these enhanced pictures to a standard resolution suitable for modern digital archives (e.g., 1080p or 4K archive master), ensuring frequency components within this target resolution are optimally represented. The subsequent reencoder (e.g., using lossless or high-bitrate visually lossless codecs like JPEG 2000, ProRes, or FFV1) archives this perceptually enhanced content, future-proofing legacy media by capturing its "best possible" visual state.

graph TD
    A["Legacy Media Input (Analog/Low-Res Digital)"] --> B["Digitizer/Demultiplexer/Decoder"]
    B -- Decoded Std Res Pictures --> C1
    subgraph C["Perceptual Enhancement & Reencoding"]
        C1["Advanced SR Enlargement (GAN-based, Denoising, Color Restore)"] --> C2["Resolution Converter (Archive Target Res)"]
        C2 -- Perceptually Enhanced Pictures --> C3["Archival Reencoder (Lossless/Visually Lossless)"]
    end
    C3 --> D["Archival Master Output (High Res, Modern Codec)"]
    style C fill:#f9f,stroke:#333,stroke-width:2px
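
One practical detail of the resolution converter for legacy sources is that the archive target must preserve the display aspect ratio, because older formats often use non-square pixels. A minimal sketch follows, assuming a 1920x1080 archive frame and a caller-supplied pixel aspect ratio; the function name and the 16:15 ratio used for the PAL example are illustrative assumptions.

```python
from fractions import Fraction


def fit_to_archive(src_w, src_h, pixel_aspect=Fraction(1, 1),
                   target_w=1920, target_h=1080):
    """Largest even-sized picture that fits the target and keeps the
    source display aspect ratio (storage ratio times pixel aspect)."""
    display_aspect = Fraction(src_w, src_h) * pixel_aspect
    if display_aspect >= Fraction(target_w, target_h):
        width = Fraction(target_w)
        height = width / display_aspect
    else:
        height = Fraction(target_h)
        width = height * display_aspect

    def even(value):
        # Most codecs with 4:2:0 chroma need even dimensions.
        return max(2, 2 * round(value / 2))

    return even(width), even(height)


# 720x576 PAL material with an assumed 16:15 pixel aspect displays as 4:3.
print(fit_to_archive(720, 576, Fraction(16, 15)))
```

The remaining area of the target frame would be pillarboxed or left to container-level aspect signalling, depending on archive policy.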

Derivative 3.4: Integration with Emerging Tech: Cloud-Native Serverless Super-Resolution Reencoding with Microservices Architecture

Enabling Description:
This moving picture reencoding system is implemented as a cloud-native, serverless application utilizing a microservices architecture, providing on-demand scalability and cost efficiency for video processing. Each functional block of the reencoding system – demultiplexing, decoding, super-resolution enlargement, resolution conversion, and reencoding – is deployed as an independent serverless function (e.g., AWS Lambda, Google Cloud Functions, Azure Functions) or a containerized microservice (e.g., Kubernetes pods). Input encoded bits are ingested into an object storage bucket (e.g., S3). An event trigger invokes the demultiplexing microservice, which then passes intermediate data to other microservices via message queues (e.g., Kafka, SQS) or temporary storage. The first super-resolution enlarger (e.g., as in US10218995B2) and first resolution converter (e.g., as in US10218995B2) microservices execute on transient, automatically scaled compute instances, potentially using specialized cloud GPUs for SR inference. The reencoder microservice (e.g., FFmpeg container) dynamically adjusts its resources based on workload. This architecture allows for massive parallel processing of multiple reencoding jobs, automatic scaling to meet demand spikes, and pay-per-execution cost models, making it ideal for large-scale content libraries requiring various output formats and resolutions.

graph TD
    A["Input Encoded Bitstream (Object Storage)"] --> B(Event Trigger)
    B --> C["Demultiplexer Microservice"]
    C --> D["Message Queue/Storage"]
    D --> E["Decoder Microservice"]
    E --> F["Message Queue/Storage"]
    F --> G["SR Enlargement Microservice (Cloud GPU)"]
    G --> H["Message Queue/Storage"]
    H --> I["Resolution Converter Microservice"]
    I --> J["Message Queue/Storage"]
    J --> K["Reencoder Microservice"]
    K --> L["Output Reencoded Bitstream (Object Storage)"]
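
The hand-off between stages through message queues can be sketched in a single process. The example below uses Python's standard `queue` and `threading` modules as a stand-in for a managed queue service; the stage handlers are placeholders that only record which stage touched a job, and all names are hypothetical.

```python
import queue
import threading

STAGE_NAMES = ["demultiplex", "decode", "sr_enlarge",
               "resolution_convert", "reencode"]


def make_handler(name):
    """Placeholder handler: a real service would transform media here."""
    def handler(job):
        return {**job, "trace": job["trace"] + [name]}
    return handler


def run_stage(handler, inbox, outbox):
    """Consume jobs until the None sentinel arrives, then pass it on."""
    while True:
        job = inbox.get()
        if job is None:
            outbox.put(None)
            return
        outbox.put(handler(job))


def build_pipeline(names=STAGE_NAMES):
    """Wire one worker thread per stage, linked by FIFO queues."""
    queues = [queue.Queue() for _ in range(len(names) + 1)]
    for i, name in enumerate(names):
        threading.Thread(target=run_stage,
                         args=(make_handler(name), queues[i], queues[i + 1]),
                         daemon=True).start()
    return queues[0], queues[-1]


def process(job_keys):
    """Submit object keys and collect the finished jobs in order."""
    inbox, outbox = build_pipeline()
    for key in job_keys:
        inbox.put({"key": key, "trace": []})
    inbox.put(None)
    results = []
    while (job := outbox.get(timeout=5)) is not None:
        results.append(job)
    return results


print(process(["in/clip-a.mp4"])[0]["trace"])
```

In a cloud deployment each stage would scale independently, and the queues would carry references to objects in storage rather than the media itself.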

Derivative 3.5: The "Inverse" or Failure Mode: Disaster Recovery Reencoding for Corrupted Streams with AI-Assisted Reconstruction

Enabling Description:
This moving picture reencoding system is specifically designed to salvage and re-encode partially corrupted or severely degraded video streams, common in disaster recovery scenarios (e.g., damaged storage media, interrupted transmissions). The primary goal is to produce a viewable, albeit potentially imperfect, reencoded stream. The demultiplexer (e.g., as in US10218995B2) and decoder (e.g., as in US10218995B2) are equipped with robust error concealment and resilience mechanisms, attempting to decode as much valid data as possible even from corrupted segments. If blocks or frames are entirely missing or severely damaged, the first super-resolution enlarger (e.g., as in US10218995B2) integrates AI-assisted reconstruction modules. These modules (e.g., inpainting GANs, temporal prediction networks) utilize surrounding valid frames or spatial context to infer and reconstruct missing pixels or entire regions, generating super-resolution enlarged decoded pictures where corrupted data has been "filled in" or hallucinated to maintain visual coherence. The first resolution converter (e.g., as in US10218995B2) then smooths and downconverts these reconstructed images. The subsequent reencoder (e.g., a standard video codec) then re-encodes this "repaired" stream, prioritizing continuity and watchability. Metadata can be embedded in the reencoded stream to indicate areas that underwent AI-assisted reconstruction. This system ensures that even severely compromised video evidence or critical event recordings can be recovered and made accessible, rather than being discarded due to corruption.

stateDiagram-v2
    direction LR
    Idle --> Start_Reencoding: Input Corrupted Stream
    Start_Reencoding --> Demultiplex_Corrupted: Process
    Demultiplex_Corrupted --> Decode_With_Error_Concealment: Process
    Decode_With_Error_Concealment --> Check_Corruption_Level: Evaluate

    Check_Corruption_Level --> AI_Reconstruction: High Corruption
    Check_Corruption_Level --> Direct_SR_Enlargement: Low Corruption

    state AI_Reconstruction {
        Inpainting_GAN --> Temporal_Prediction
        Temporal_Prediction --> SR_Enlarge_Reconstructed
    }

    state Direct_SR_Enlargement {
        Standard_SR_Algorithm
    }

    AI_Reconstruction --> Resolution_Conversion: Process
    Direct_SR_Enlargement --> Resolution_Conversion: Process

    Resolution_Conversion --> Reencode_Repaired_Stream: Process
    Reencode_Repaired_Stream --> Output_Recovered_Stream: Done
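
The idea of filling in lost frames and recording which ones were synthesized can be shown with a deliberately simple substitute for the AI modules. The sketch below uses linear interpolation between the nearest valid frames instead of an inpainting GAN or temporal prediction network, and returns the indices that a reencoder could embed as reconstruction metadata. Frames are assumed to be 2D lists of luma values, with `None` marking a lost frame.

```python
def conceal_missing(frames):
    """Return (repaired_frames, reconstructed_indices).

    Lost frames (None) are replaced by a blend of the nearest valid
    neighbours, or by a copy of the only available neighbour.
    """
    valid = [i for i, frame in enumerate(frames) if frame is not None]
    if not valid:
        raise ValueError("no decodable frame to reconstruct from")

    repaired, reconstructed = [], []
    for i, frame in enumerate(frames):
        if frame is not None:
            repaired.append(frame)
            continue
        prev = max((j for j in valid if j < i), default=None)
        nxt = min((j for j in valid if j > i), default=None)
        if prev is None:
            filled = [row[:] for row in frames[nxt]]
        elif nxt is None:
            filled = [row[:] for row in frames[prev]]
        else:
            w = (i - prev) / (nxt - prev)  # 0 at prev, 1 at nxt
            filled = [[(1 - w) * a + w * b for a, b in zip(ra, rb)]
                      for ra, rb in zip(frames[prev], frames[nxt])]
        repaired.append(filled)
        reconstructed.append(i)
    return repaired, reconstructed


frames = [[[0, 0]], None, [[10, 20]]]
print(conceal_missing(frames))
```

Keeping the list of reconstructed indices separate from the pixel data makes it straightforward to flag synthesized content downstream, which matters when the recording may be used as evidence.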

Combination Prior Art Scenarios with Open Standards and Open-Source Codecs

The core invention of US Patent 10,218,995 involves enhancing video encoding efficiency by integrating super-resolution processes into a hierarchical encoding framework. This concept can be combined with published video coding standards (H.264/AVC, H.265/HEVC, MPEG-DASH) and with royalty-free codecs that have open-source implementations (VP9, AV1) to describe systems that use both the patent's techniques and widely adopted technologies.

Combination Prior Art 1: H.264/AVC with Super-Resolution Enhancement Layers

Scenario: A moving picture encoding system, as described in independent Claim 1 of US10218995B2, is implemented where the first encoder (e.g., 102) and second encoder (e.g., 107) are compliant with the H.264/AVC (Advanced Video Coding) standard, specifically utilizing its Scalable Video Coding (SVC) extension or a multi-layer coding approach. The first encoder generates a base layer H.264/AVC bitstream at a standard resolution. The first super-resolution enlarger (e.g., 103) and first resolution converter (e.g., 104) process the input to create a super-resolution enhanced standard-resolution signal. The second encoder then encodes this enhanced signal as an H.264/AVC enhancement layer, using the decoded base layer from the first encoder as an inter-layer reference. The multiplexer (e.g., 109) combines these two H.264/AVC layers into a single SVC-compliant bitstream. This combination demonstrates the application of the patent's super-resolution methodology within a widely used, standardized scalable video coding framework, making future similar hierarchical SR enhancements obvious.
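
The relationship between the second encoder's target (the super-resolution enlarged and converted picture) and its inter-layer reference (the decoded base-layer picture) can be illustrated with a toy residual coder. This is only a sketch: actual SVC enhancement-layer coding operates on blocks with transforms and entropy coding, and the function names and the uniform quantization step here are assumptions.

```python
def encode_enhancement(target, reference, step=1):
    """Quantized difference between the SR-enhanced target picture and
    the decoded base-layer picture used as reference."""
    return [[round((t - r) / step) for t, r in zip(t_row, r_row)]
            for t_row, r_row in zip(target, reference)]


def decode_enhancement(levels, reference, step=1):
    """Rebuild the enhanced picture from the reference plus residual."""
    return [[r + level * step for level, r in zip(l_row, r_row)]
            for l_row, r_row in zip(levels, reference)]


base_decoded = [[99, 100], [98, 110]]   # output of the first encoder's decoder
sr_target = [[100, 104], [98, 120]]     # SR enlarged, then converted back

levels = encode_enhancement(sr_target, base_decoded)
print(levels)
print(decode_enhancement(levels, base_decoded) == sr_target)
```

Because the enhancement layer carries only the difference from the base layer, a decoder that ignores it still obtains a valid standard-resolution picture.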

Combination Prior Art 2: H.265/HEVC with AI-Driven Super-Resolution for Adaptive Streaming (MPEG-DASH)

Scenario: A moving picture encoding system, as described in independent Claim 1 of US10218995B2, is configured to produce multiple renditions for MPEG-DASH (Dynamic Adaptive Streaming over HTTP). The first encoder (e.g., 102) generates a base layer video stream encoded with H.265/HEVC (High Efficiency Video Coding) at a standard resolution. The first super-resolution enlarger (e.g., 103) employs a pre-trained deep learning model (e.g., a lightweight SRCNN, or an EDSR where higher quality justifies the extra computation) to create super-resolution enlarged pictures, which are then processed by the first resolution converter (e.g., 104) back to a standard resolution. The second encoder (e.g., 107) then encodes this super-resolution enhanced stream, also using H.265/HEVC, leveraging inter-layer prediction from the base layer. A multiplexer (e.g., 109) packages these H.265/HEVC streams into segments described by an MPEG-DASH Media Presentation Description (MPD) manifest, offering both the original H.265/HEVC stream and the super-resolution enhanced H.265/HEVC stream as adaptive bitrate representations. This allows clients to stream adaptively, choosing the super-resolution enhanced version if bandwidth permits, demonstrating the patent's utility in modern adaptive streaming workflows.
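
A manifest that offers the base stream and the enhanced stream as two representations can be sketched with the standard library. The element and attribute names follow the MPEG-DASH MPD schema; the file names, bitrates, on-demand profile, and the HEVC codec string are illustrative assumptions, not values from the patent.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"

# Hypothetical renditions: same resolution, different quality and bitrate.
REPRESENTATIONS = [
    {"id": "base", "url": "base.mp4", "bandwidth": 2500000},
    {"id": "sr-enhanced", "url": "sr_enhanced.mp4", "bandwidth": 4500000},
]


def build_mpd(representations, width=1280, height=720, duration="PT60S"):
    """Return a minimal static MPD with one video AdaptationSet."""
    mpd = ET.Element("MPD", {
        "xmlns": DASH_NS,
        "type": "static",
        "profiles": "urn:mpeg:dash:profile:isoff-on-demand:2011",
        "mediaPresentationDuration": duration,
        "minBufferTime": "PT2S",
    })
    period = ET.SubElement(mpd, "Period")
    adaptation = ET.SubElement(period, "AdaptationSet", {
        "mimeType": "video/mp4",
        "segmentAlignment": "true",
    })
    for rep in representations:
        node = ET.SubElement(adaptation, "Representation", {
            "id": rep["id"],
            "codecs": "hvc1.1.6.L93.B0",  # example HEVC Main profile string
            "width": str(width),
            "height": str(height),
            "bandwidth": str(rep["bandwidth"]),
        })
        ET.SubElement(node, "BaseURL").text = rep["url"]
    return ET.tostring(mpd, encoding="unicode")


print(build_mpd(REPRESENTATIONS))
```

A client's adaptation logic would then pick between the two representations by comparing measured throughput against each `bandwidth` attribute.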

Combination Prior Art 3: VP9/AV1 Codecs with Open-Source Super-Resolution Libraries

Scenario: A moving picture reencoding system, as described in independent Claim 19 of US10218995B2, is used to transcode existing video content into the royalty-free VP9 or AV1 (AOMedia Video 1) formats using their open-source encoders. The demultiplexer (e.g., as in US10218995B2) and decoder (e.g., as in US10218995B2) process an input video stream (e.g., H.264). The first super-resolution enlarger (e.g., as in US10218995B2) integrates an open-source super-resolution library (e.g., OpenCV's dnn_superres contrib module, which loads pre-trained EDSR, ESPCN, FSRCNN, or LapSRN models, or FFmpeg's sr filter, which supports SRCNN and ESPCN models). This module takes the standard resolution decoded pictures and applies super-resolution to generate super-resolution enlarged decoded pictures. The first resolution converter (e.g., as in US10218995B2), also implemented using open-source image processing tools (e.g., FFmpeg's scale filter), converts these back to a target standard resolution. The reencoder (e.g., libvpx for VP9 or libaom for AV1) then encodes these super-resolution enhanced pictures into a new VP9 or AV1 bitstream. This scenario shows how the super-resolution concept from the patent can be integrated with prevalent open-source codecs to produce higher-quality outputs for web-based video delivery.
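
The resolution conversion and reencoding steps of this scenario map onto an FFmpeg invocation. The helper below only builds the argument list; it assumes the input already contains the super-resolution enlarged pictures (the SR step itself is not shown), and the default resolution and CRF value are placeholders rather than recommendations.

```python
def reencode_command(src, dst, codec="av1", width=1280, height=720, crf=30):
    """Build an ffmpeg argument list that downscales and reencodes."""
    encoders = {"vp9": "libvpx-vp9", "av1": "libaom-av1"}
    if codec not in encoders:
        raise ValueError(f"unsupported codec: {codec}")
    return [
        "ffmpeg", "-i", src,
        # Resolution converter: FFmpeg's scale filter with bicubic resampling.
        "-vf", f"scale={width}:{height}:flags=bicubic",
        # Reencoder: constant-quality mode (CRF with the bitrate cap disabled).
        "-c:v", encoders[codec], "-crf", str(crf), "-b:v", "0",
        dst,
    ]


print(" ".join(reencode_command("sr_enlarged.mkv", "out.webm", codec="vp9")))
```

The list can be passed directly to `subprocess.run` on a machine where FFmpeg is built with the corresponding encoder library.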

Generated 5/17/2026, 12:50:02 PM