Patent 9203972
Derivative works
Defensive disclosure: derivative variations of each claim designed to render future incremental improvements obvious or non-novel.
Defensive Disclosure and Prior Art Generation for Technology Disclosed in US 9,203,972
Publication Date: May 8, 2026
Subject: Derivatives, Expansions, and Novel Applications of Sub-Band Audio Processing Methodologies for Prior Art Purposes.
Reference Patent: US 9,203,972 B2, "Efficient audio signal processing in the sub-band regime"
This document serves as a defensive publication to disclose technical variations and applications of the methods described in US 9,203,972 (hereafter '972'). The intent is to place these concepts into the public domain, thereby establishing them as prior art against future patent applications that might claim these incremental or derivative innovations.
Derivative Set 1: Based on Core Method of Excision, Processing, and Reconstruction
1.1. Derivative (Integration with Emerging Tech): AI-Driven Dynamic Sub-Band Excision
Enabling Description: The static excision of sub-bands (e.g., every odd-indexed band) described in '972 is improved by integrating a machine learning model for dynamic, content-aware excision. A lightweight convolutional neural network (CNN) or recurrent neural network (RNN) is trained on a large corpus of labeled audio signals (e.g., the LibriSpeech dataset augmented with noise from the NOISEX-92 corpus). The model predicts the perceptual significance and signal-to-noise ratio (SNR) of each sub-band in real time for short audio frames. Based on the model's output, the system's processor dynamically selects a variable number and distribution of sub-bands for excision in each frame. For frames identified as high-SNR speech, fewer sub-bands are excised; for frames identified as background noise or silence, a higher percentage of sub-bands (e.g., >75%) are excised to maximize computational savings. The reconstruction module then applies an interpolation method appropriate to the excision mask provided by the AI (e.g., linear interpolation for low-excision frames, spectral modeling for high-excision frames).
Mermaid.js Diagram:
graph TD
    A["Audio Input y(n)"] --> B{Analysis Filter Bank}
    B --> C["Sub-Band Signals y_sb(n)"]
    C --> D[AI Perceptual Analyzer]
    D -- Excision Mask --> E{Dynamic Excision Filter}
    C --> E
    E -- Remaining Sub-Bands --> F[Noise/Echo Processing]
    F -- Enhanced Sub-Bands --> G{Reconstruction Processor}
    D -- Reconstruction Hint --> G
    G -- Reconstructed Sub-Bands --> H{Synthesis Filter Bank}
    H --> I["Enhanced Audio Output s(n)"]
    subgraph AI Module
        D
    end
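The dynamic excision decision can be sketched in a few lines of NumPy. This is a minimal stand-in for the trained model: the "AI Perceptual Analyzer" is replaced by a per-band SNR ranking, and all names, shapes, and the `keep_fraction` parameter are illustrative assumptions rather than values from '972.

```python
import numpy as np

def excision_mask(subbands, noise_floor, keep_fraction=0.5):
    """Toy stand-in for the AI perceptual analyzer: rank sub-bands by
    estimated SNR and excise the weakest ones.

    subbands    : (num_bands, frame_len) array of sub-band samples
    noise_floor : (num_bands,) per-band noise power estimate
    Returns a boolean mask, True = keep the band."""
    band_power = np.mean(np.abs(subbands) ** 2, axis=1)
    snr = band_power / np.maximum(noise_floor, 1e-12)
    n_keep = max(1, int(round(keep_fraction * len(snr))))
    keep_idx = np.argsort(snr)[-n_keep:]   # highest-SNR bands survive
    mask = np.zeros(len(snr), dtype=bool)
    mask[keep_idx] = True
    return mask

# In the full system a frame classifier would raise keep_fraction for
# high-SNR speech frames and lower it for noise/silence frames; here it
# is hard-coded to emulate an aggressive noise-frame setting.
rng = np.random.default_rng(0)
frame = rng.normal(size=(8, 64))
frame[2] *= 10.0                           # one perceptually dominant band
mask = excision_mask(frame, noise_floor=np.ones(8), keep_fraction=0.25)
```

The reconstruction module would then receive `mask` as the "Excision Mask" edge in the diagram and choose its interpolation strategy accordingly.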
1.2. Derivative (Operational Parameter Expansion): Cryogenic and Ultrasonic Signal Processing
Enabling Description: The method is adapted for operation on signals outside the human-audible spectrum and in extreme temperature environments. For ultrasonic applications, such as non-destructive testing of materials or medical imaging, the analysis filter bank is designed to decompose signals in the 1 MHz - 20 MHz range into hundreds of sub-bands. The high sampling rates (e.g., >50 Msps) make computational efficiency critical. The '972 method is applied to excise redundant sub-bands from the ultrasonic reflection signal before processing for material flaw detection or tissue characterization. For cryogenic applications, such as processing signals from superconducting quantum interference devices (SQUIDs), the processing algorithm is implemented on specialized digital signal processors (DSPs) capable of operating at temperatures below 77 Kelvin. The noise models in the processing stage are adapted to account for Johnson-Nyquist noise at these low temperatures, and the reconstruction algorithms are optimized for the specific statistical properties of quantum signals.
Mermaid.js Diagram:
sequenceDiagram
    participant Transducer as Ultrasonic Transducer (10 MHz)
    participant ADC as High-Speed ADC (50 Msps)
    participant FPGA
    participant Processor
    Transducer->>ADC: Analog Ultrasonic Signal
    ADC->>FPGA: Digital Signal y(n)
    FPGA->>FPGA: Analysis Filter Bank (e.g., Polyphase FFT)
    FPGA->>FPGA: Excision of 60% of Sub-Bands
    FPGA-->>Processor: Remaining Sub-Bands
    Processor->>Processor: Flaw Detection Algorithm (Processing)
    Processor->>Processor: Sub-Band Reconstruction (Interpolation)
    Processor-->>FPGA: Enhanced Full-Band Signal
    FPGA-->>Transducer: Processed Output/Display Data
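The excision step in the FPGA can be approximated on a host CPU as follows. This sketch assumes the 50 Msps / 10 MHz figures from the diagram and an illustrative 2 MHz retained half-bandwidth; a real implementation would use a polyphase filter bank rather than a single-frame FFT.

```python
import numpy as np

FS = 50e6     # ADC sample rate, per the diagram
F0 = 10e6     # transducer centre frequency
BAND = 2e6    # half-bandwidth to retain (assumed, illustrative)

def excise_ultrasonic(trace):
    """Zero all FFT bins outside F0 +/- BAND, emulating excision of
    sub-bands that carry no reflection energy."""
    spec = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(len(trace), d=1.0 / FS)
    keep = np.abs(freqs - F0) <= BAND
    spec[~keep] = 0.0
    return np.fft.irfft(spec, n=len(trace)), keep

# A narrowband reflection pulse: Gaussian-windowed 10 MHz tone.
t = np.arange(1024) / FS
pulse = np.sin(2 * np.pi * F0 * t) * np.exp(-((t - 5e-6) ** 2) / (0.5e-6) ** 2)
cleaned, keep = excise_ultrasonic(pulse)
```

Because the reflection energy is concentrated around 10 MHz, well over 60% of the sub-bands can be excised with negligible loss, matching the savings claimed in the diagram.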
1.3. Derivative (Cross-Domain Application): Hyperspectral Image Processing for Agriculture
Enabling Description: The 'excise-process-reconstruct' methodology is applied to the processing of hyperspectral imaging data for precision agriculture. A hyperspectral sensor on a drone or satellite captures image data across hundreds of narrow spectral bands. This results in a massive data cube. To enable real-time, on-board analysis, the '972 concept is adapted.
- Excision: For a given pixel, the vector of spectral bands is treated as a signal. Based on known spectral signatures of healthy vs. stressed vegetation, a subset of non-critical spectral bands is excised. For example, bands known to be irrelevant for detecting nitrogen deficiency are removed.
- Processing: The remaining spectral bands are processed using algorithms to calculate vegetation indices (like NDVI) or to detect signs of disease or water stress.
- Reconstruction: For archival or further analysis, the excised spectral bands are reconstructed using spectral interpolation based on the processed bands. This allows for a significant reduction in data transmission bandwidth from the drone to the ground station.
Mermaid.js Diagram:
graph TD
    A[Hyperspectral Data Cube] --> B{Per-Pixel Spectral Vector}
    B --> C{Band Excision Module}
    C -- Irrelevant Bands Removed --> D[Remaining Bands]
    D --> E{On-Board Processor}
    subgraph "Processing"
        E -- Performs --> F[Vegetation Index Calculation]
        E -- Performs --> G[Disease Signature Analysis]
    end
    F & G -- Processed Bands --> H{Band Reconstruction}
    H --> I[Compressed Data for Transmission]
    I --> J[Ground Station]
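The per-pixel excise-process-reconstruct loop can be sketched as below. The wavelength grid, the band-relevance mask, and the red/NIR windows are assumed illustrative values, not calibration data for any particular sensor; NDVI is the standard (NIR - Red)/(NIR + Red) index.

```python
import numpy as np

# Illustrative 100-band sensor from 400-1000 nm (assumed values).
wavelengths = np.linspace(400, 1000, 100)             # nm
relevant = (wavelengths > 620) & (wavelengths < 900)  # red + NIR region only

def process_pixel(spectrum):
    """Excise irrelevant bands, compute NDVI from what remains, then
    linearly interpolate the excised bands back for archival."""
    kept = np.where(relevant, spectrum, np.nan)
    red = np.nanmean(np.where((wavelengths > 620) & (wavelengths < 700), kept, np.nan))
    nir = np.nanmean(np.where((wavelengths > 780) & (wavelengths < 900), kept, np.nan))
    ndvi = (nir - red) / (nir + red)
    # Reconstruction: fill excised bands by interpolating over kept ones
    idx = np.arange(len(spectrum))
    recon = np.interp(idx, idx[relevant], spectrum[relevant])
    return ndvi, recon

# Healthy vegetation: low red reflectance, high NIR reflectance.
spectrum = np.where(wavelengths > 720, 0.5, 0.05)
ndvi, recon = process_pixel(spectrum)
```

Only the `relevant` bands would be transmitted from the drone; `recon` shows how the ground station could approximate the full spectrum for archival.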
1.4. Derivative (The "Inverse" / Failure Mode): Graceful Audio Degradation Mode
Enabling Description: A system-on-chip (SoC) implementing the '972 method is designed with a low-power, "graceful degradation" mode. A power management unit (PMU) monitors the system's battery level. When the battery drops below a predefined threshold (e.g., 20%), the PMU signals the audio processor to enter this mode. In this mode:
- The number of excised sub-bands is dramatically increased from 50% to 80-90%.
- The computationally intensive echo cancellation and noise reduction algorithms in the processing stage are replaced with simpler spectral subtraction or a fixed gain model.
- The reconstruction algorithm is switched from a complex interpolation method to a simple zero-order hold or linear averaging, which has a minimal computational footprint.
This results in a noticeable but controlled reduction in audio quality, preserving basic intelligibility while extending battery life. The system also implements a fail-safe: if the output of a real-time artifact detector (measuring spectral discontinuity) exceeds a threshold, the reconstruction step is bypassed entirely and only the processed sub-bands are output, producing a band-limited but stable signal.
Mermaid.js Diagram:
stateDiagram-v2
    [*] --> Normal_Mode: Power On
    Normal_Mode: Excision 50%
    Normal_Mode: Processing NLMS Filter
    Normal_Mode: Reconstruction Cubic Interpolation
    Normal_Mode --> Low_Power_Mode: Battery < 20%
    Low_Power_Mode: Excision 85%
    Low_Power_Mode: Processing Spectral Subtraction
    Low_Power_Mode: Reconstruction Linear Average
    Low_Power_Mode --> Fail_Safe_Mode: Artifacts > Threshold
    Low_Power_Mode --> Normal_Mode: Battery Charging
    Fail_Safe_Mode: Reconstruction Bypassed
    Fail_Safe_Mode: Output is Band-Limited
    Fail_Safe_Mode --> Low_Power_Mode: Artifacts < Threshold
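The mode-selection logic of the state diagram reduces to a small decision function. The thresholds and profile contents follow the text; the names, the artifact score scale, and its 0.3 threshold are illustrative assumptions.

```python
# Minimal sketch of the PMU-driven mode selection. Profile contents
# mirror the state diagram; all identifiers are illustrative.
PROFILES = {
    "normal":    {"excision": 0.50, "processing": "nlms",
                  "reconstruction": "cubic_interpolation"},
    "low_power": {"excision": 0.85, "processing": "spectral_subtraction",
                  "reconstruction": "linear_average"},
    "fail_safe": {"excision": 0.85, "processing": "spectral_subtraction",
                  "reconstruction": None},  # reconstruction bypassed
}

def select_mode(battery_pct, artifact_score, artifact_threshold=0.3):
    """Pick the operating profile from battery level and the output of
    the real-time artifact detector (assumed 0..1 score)."""
    if battery_pct >= 20:
        return "normal"
    if artifact_score > artifact_threshold:
        return "fail_safe"
    return "low_power"

mode = select_mode(battery_pct=15, artifact_score=0.1)
```

A real PMU would add hysteresis around the 20% threshold to avoid mode flapping while charging; that is omitted here for brevity.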
1.5. Derivative (Cross-Domain Application): Seismic Data Compression and Analysis
Enabling Description: The method is applied to the processing of seismic data for oil and gas exploration. A seismic survey generates terabytes of time-series data from thousands of geophones. The '972 method is used for efficient compression and pre-processing.
- Excision: Each seismic trace (time-series signal) is transformed into the frequency domain (sub-bands). Based on the geological region, frequency bands known to contain primarily surface-wave noise or irrelevant high-frequency components are excised.
- Processing: The remaining sub-bands, containing valuable reflection data, are processed for noise attenuation and migration (a process to reposition reflection data to its correct subsurface location).
- Reconstruction: Before final interpretation by a geophysicist, the excised frequency bands are reconstructed. This reconstruction can be guided by a geological model of the subsurface, using model-based interpolation to fill in the missing bands in a geologically plausible way. This reduces storage and processing requirements in the data center.
Mermaid.js Diagram:
flowchart LR
    A[Geophone Array] --> B(Raw Seismic Traces)
    B --> C{Frequency Transform}
    C --> D[Sub-Band Representation]
    D --> E{Excision}
    E -- Surface Noise Bands Removed --> F[Remaining Reflection Bands]
    F --> G["Migration & Processing"]
    G --> H{Model-Based Reconstruction}
    H -- Uses --> I[Geological Model]
    H --> J(Processed Seismic Volume for Interpretation)
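The per-trace excision step can be sketched as a frequency-domain mask. The 500 Hz sample rate and the 12 Hz ground-roll cutoff are illustrative assumptions; in practice the cutoff would be tuned per geological region, as the text notes.

```python
import numpy as np

FS = 500.0               # geophone sample rate in Hz (assumed)
GROUND_ROLL_MAX = 12.0   # surface-wave cutoff in Hz (assumed, region-specific)

def excise_surface_noise(trace):
    """Remove low-frequency surface-wave bands from one seismic trace,
    keeping the reflection band for migration and processing."""
    spec = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(len(trace), d=1.0 / FS)
    spec[freqs < GROUND_ROLL_MAX] = 0.0
    return np.fft.irfft(spec, n=len(trace))

# Synthetic trace: 6 Hz ground roll masking a weak 40 Hz reflection.
t = np.arange(2000) / FS
ground_roll = np.sin(2 * np.pi * 6.0 * t)
reflection = 0.2 * np.sin(2 * np.pi * 40.0 * t)
cleaned = excise_surface_noise(ground_roll + reflection)
```

The model-based reconstruction stage would later re-synthesize the excised low bands from the geological model; this sketch covers only the excision half of the pipeline.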
Combination Prior Art Scenarios
This section discloses the combination of the core '972 technology with established open-source standards to create novel, yet obvious, implementations.
2.1. Combination with WebRTC (Web Real-Time Communication)
- Title: Computationally-Efficient Audio Processing Pipeline for WebRTC using Dynamic Sub-Band Excision.
- Enabling Description: The standard audio processing pipeline in WebRTC, which includes Acoustic Echo Cancellation (AEC), Noise Suppression (NS), and Automatic Gain Control (AGC), is modified to incorporate the '972 method. A new processing block, the "Sub-Band Efficiency Manager," is inserted after the audio capture and before the AEC module. This manager performs an STFT on the input audio, excises a set of sub-bands (e.g., 50% of the bands above 4 kHz, where speech energy is lower), and passes only the remaining sub-bands to the computationally intensive AEC and NS modules. The reference signal (playback audio) is processed identically. After processing, the enhanced sub-bands are fed to a reconstruction module which interpolates the missing bands before the signal is encoded by the Opus codec and transmitted. This method is exposed via a new RTCRtpSender constraint, enableSubBandProcessing, allowing developers to enable this CPU-saving feature on resource-constrained devices like mobile phones and IoT endpoints.
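The "Sub-Band Efficiency Manager" can be approximated for a single 10 ms frame as below. WebRTC's internal pipeline is C++; this NumPy sketch only demonstrates the excise-and-interpolate idea on one FFT frame, with the 48 kHz rate, 4 kHz cutoff, and every-other-bin excision pattern taken from the text and the neighbour-averaging reconstruction an illustrative assumption.

```python
import numpy as np

FS = 48000        # typical WebRTC capture rate
FRAME = 480       # 10 ms frame
CUTOFF_HZ = 4000  # excision applies only above this frequency

def efficiency_manager(frame):
    """Excise every other FFT bin above 4 kHz, then interpolate the
    excised bins back from their neighbours after (here: identity)
    processing. Returns the reconstructed frame and the excision mask."""
    spec = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(FRAME, d=1.0 / FS)
    excise = (freqs > CUTOFF_HZ) & (np.arange(len(spec)) % 2 == 1)
    kept = spec.copy()
    kept[excise] = 0.0
    # ... AEC / NS would run here on ~50% fewer high-frequency bands ...
    recon = kept.copy()
    idx = np.where(excise)[0]
    recon[idx] = 0.5 * (kept[idx - 1] + kept[np.minimum(idx + 1, len(spec) - 1)])
    return np.fft.irfft(recon, n=FRAME), excise

# A 1 kHz tone lies below the cutoff, so it passes through unchanged.
t = np.arange(FRAME) / FS
tone = np.sin(2 * np.pi * 1000.0 * t)
out, excised = efficiency_manager(tone)
```

The reference (playback) path would run through the same function so that the AEC sees identically excised spectra on both inputs.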
2.2. Combination with the Opus Interactive Audio Codec (IETF RFC 6716)
- Title: Pre-Conditioning for Low-Bitrate Opus Encoding via Perceptually-Tuned Sub-Band Excision and Reconstruction.
- Enabling Description: The Opus codec's pre-encoding stage is enhanced with a module based on the '972 method. Opus uses a hybrid SILK and CELT architecture. The disclosed method operates as a pre-conditioner for the CELT portion, which handles higher frequencies. Before encoding a frame, the pre-conditioner transforms the audio into the frequency domain. It then uses a psychoacoustic model to identify and excise frequency bands that are likely to be masked or quantized to zero by the Opus encoder at the target bitrate. The remaining bands are processed for noise reduction. The excised bands are then reconstructed using a simplified generative model that introduces "comfort noise" rather than attempting perfect interpolation. This pre-conditioned signal, with reduced noise and complexity in perceptually unimportant regions, allows the Opus encoder to achieve higher quality at very low bitrates, as fewer bits are wasted on encoding noise.
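A toy version of the psychoacoustic excision decision is sketched below. The mapping from target bitrate to masking margin, the 30 dB base margin, and the comfort-noise level are all illustrative assumptions standing in for a real psychoacoustic model; nothing here reflects Opus internals.

```python
import numpy as np

def precondition(frame_spec, target_bitrate_kbps, rng):
    """Toy pre-conditioner: bins more than a bitrate-dependent margin
    below the frame peak are assumed to quantise to zero at the target
    bitrate; they are excised and refilled with low-level comfort noise
    instead of being interpolated exactly."""
    mag = np.abs(frame_spec)
    # Assumed mapping: lower bitrates widen the excision margin.
    margin_db = 30.0 + (64.0 - min(target_bitrate_kbps, 64.0)) / 2.0
    thresh = mag.max() * 10.0 ** (-margin_db / 20.0)
    excise = mag < thresh
    out = frame_spec.copy()
    out[excise] = thresh * 0.1 * rng.standard_normal(int(excise.sum()))
    return out, excise

rng = np.random.default_rng(1)
spec = np.full(16, 1e-4, dtype=complex)
spec[3] = 1.0                         # one dominant spectral component
out, excise = precondition(spec, target_bitrate_kbps=16, rng=rng)
```

The pre-conditioned `out` would then be handed to the unmodified Opus encoder, which spends its bit budget on the surviving components rather than on noise.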
2.3. Combination with the Kaldi Open-Source Speech Recognition Toolkit
- Title: Efficient Front-End Processing for Robust Speech Recognition using Sub-Band Excision.
- Enabling Description: The feature extraction front-end of the Kaldi speech recognition toolkit is modified to improve robustness in noisy environments and reduce computational load. In a standard Kaldi recipe, the audio is converted to features like Mel-Frequency Cepstral Coefficients (MFCCs). In the disclosed method, a pre-processing step based on '972 is inserted before feature extraction. The input audio is divided into sub-bands. A noise estimator identifies the SNR of each band. Sub-bands with an SNR below a dynamic threshold (e.g., -5 dB) are excised. The remaining "clean" sub-bands are processed for dereverberation. The excised, noisy sub-bands are not reconstructed. Instead, the feature extraction process is modified to compute MFCCs only from the remaining clean sub-bands, effectively ignoring the noisy parts of the spectrum on a frame-by-frame basis. This creates a smaller, more robust feature vector that improves the accuracy of the backend acoustic model in noisy conditions.
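The SNR-gated band selection that precedes feature extraction can be sketched as follows. This is not Kaldi code: it illustrates only the excision decision on per-band energies, with the -5 dB threshold taken from the text and the example energies invented. The cepstral transform (DCT) would then run on the shorter vector of surviving log-energies.

```python
import numpy as np

SNR_THRESH_DB = -5.0  # dynamic threshold from the text (here fixed)

def robust_log_energies(band_energy, noise_energy):
    """Keep only mel bands whose estimated SNR meets the threshold and
    return the log-energies of the surviving bands plus the keep mask.
    Excised noisy bands are ignored, not reconstructed."""
    snr_db = 10.0 * np.log10(band_energy / np.maximum(noise_energy, 1e-12))
    keep = snr_db >= SNR_THRESH_DB
    return np.log(band_energy[keep]), keep

# Five mel bands: two are swamped by noise and should be excised.
band_e = np.array([1.0, 0.5, 0.01, 2.0, 0.02])
noise_e = np.array([0.1, 0.1, 0.10, 0.1, 0.10])
logs, keep = robust_log_energies(band_e, noise_e)
```

Because the feature dimension now varies per frame, the backend acoustic model would need either a fixed-size masked representation or training on the reduced feature vectors, as the description implies.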