Patent 8320575
Derivative works
Defensive disclosure: derivative variations of each claim designed to render future incremental improvements obvious or non-novel.
Defensive Disclosure and Prior Art Derivations for US 8,320,575
Publication Date: May 8, 2026
Subject: Derivatives and extensions of methods for efficient audio signal processing in the sub-band regime. This document is intended to enter the public domain and serve as prior art for future inventions in this field.
Derivatives of Core Method (Claim 1)
The core method involves dividing an audio signal into sub-bands, excising a subset for computational efficiency, processing the remaining subset, reconstructing the excised subset, and synthesizing a final enhanced signal. The following are derivative works based on this core concept.
Axis 1: Component & Algorithm Substitution
Derivative 1.1: Wavelet Packet Decomposition for Non-Uniform Sub-Bands
- Enabling Description: Instead of using uniform-bandwidth filter banks like DFT or FFT, the initial "dividing" step is performed using a Wavelet Packet Decomposition (WPD). This allows for a non-uniform division of the frequency spectrum, providing higher frequency resolution at lower frequencies and higher temporal resolution at higher frequencies, which better matches human auditory perception. A 5-level WPD with a Daubechies 8 (db8) mother wavelet is used to decompose the signal. The "excising" step then selectively prunes entire branches of the wavelet packet tree based on a perceptual entropy calculation, discarding sub-bands with minimal contribution to speech intelligibility. Reconstruction of the pruned branches is achieved using polyphase interpolation from adjacent parent and child nodes in the tree before the inverse WPD synthesis.
- Mermaid.js Diagram:
```mermaid
graph TD
    A["Audio Input y(n)"] --> B{Wavelet Packet Decomposition}
    B --> C1[Sub-band 1]
    B --> C2[Sub-band 2]
    B --> Cx[...]
    B --> CN[Sub-band N]
    subgraph Excision
        C1 --> D1{Process}
        C2 --> E2[Excise]
        Cx --> Dx{Process}
        CN --> EN[Excise]
    end
    subgraph "Processing & Reconstruction"
        D1 --> F1[Enhanced Sub-band 1]
        Dx --> Fx[Enhanced Sub-band ...]
        F1 & Fx --> G{Reconstruct Excised Bands}
        G --> H2[Reconstructed Sub-band 2]
        G --> HN[Reconstructed Sub-band N]
    end
    F1 & Fx & H2 & HN --> I{Inverse Wavelet Packet Synthesis}
    I --> J["Enhanced Output s(n)"]
```
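The decompose-prune-resynthesize skeleton can be made concrete with a minimal, single-level Haar filter bank in pure Python, standing in for the 5-level db8 WPD; the plain energy threshold below is an illustrative substitute for the perceptual-entropy criterion, and all names are hypothetical.

```python
import math

def haar_analysis(x):
    # One level of a Haar filter bank: low-pass (approximation) and
    # high-pass (detail) sub-bands, each at half the input rate.
    a = [(x[2*i] + x[2*i+1]) / math.sqrt(2) for i in range(len(x) // 2)]
    d = [(x[2*i] - x[2*i+1]) / math.sqrt(2) for i in range(len(x) // 2)]
    return a, d

def haar_synthesis(a, d):
    # Inverse of haar_analysis; perfect reconstruction when no branch
    # has been pruned.
    x = []
    for ai, di in zip(a, d):
        x.append((ai + di) / math.sqrt(2))
        x.append((ai - di) / math.sqrt(2))
    return x

def prune_branch(band, threshold):
    # Stand-in for perceptual-entropy pruning: excise (zero) a branch
    # whose total energy falls below the threshold.
    return [0.0] * len(band) if sum(c * c for c in band) < threshold else band

signal = [1.0, 1.0, 2.0, 2.0, 0.01, -0.01, 0.0, 0.0]
approx, detail = haar_analysis(signal)
detail = prune_branch(detail, threshold=1e-3)   # near-silent detail branch excised
restored = haar_synthesis(approx, detail)
```

In the full derivative, the tree is five levels deep and pruned branches are re-interpolated from neighbouring nodes rather than zero-filled; the sketch shows only the skeleton of the method.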
Derivative 1.2: Neural Network-Based Spectral Reconstruction
- Enabling Description: After excising a subset of sub-bands (e.g., every odd-indexed frequency bin from an STFT), the remaining sub-bands are processed for noise reduction. The reconstruction of the excised bands is performed by a Generative Adversarial Network (GAN). The generator network is a lightweight convolutional neural network (CNN) that takes the processed, sparse sub-band vector as input and outputs a fully dense, reconstructed vector. The discriminator network is trained to distinguish between original, complete sub-band vectors and the generator's reconstructed vectors. This system learns the statistical relationships between frequency bands, allowing it to generate highly accurate reconstructions that preserve harmonic structures, significantly outperforming linear or spline interpolation. The model is trained on a large corpus of speech data, such as the LibriSpeech dataset.
- Mermaid.js Diagram:
```mermaid
sequenceDiagram
    participant A as Audio Input
    participant B as STFT
    participant C as Exciser
    participant D as Processor
    participant G as GAN Reconstructor
    participant S as iSTFT
    A->>B: Process Frame
    B->>C: Full Spectrum
    C->>D: Remaining Bands (Even)
    C-->>G: Remaining Bands (Even)
    D->>G: Enhanced Bands (Even)
    G->>S: Reconstructed Full Spectrum
    S->>A: Enhanced Audio Frame
```
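The GAN itself is too large to reproduce here; the sketch below shows the odd-bin excision pattern and the linear-interpolation baseline the generator is trained to outperform. Function names and the neighbour-averaging rule are illustrative.

```python
def excise_odd_bins(spectrum):
    # Keep even-indexed bins; mark odd-indexed bins as excised (None).
    return [v if i % 2 == 0 else None for i, v in enumerate(spectrum)]

def linear_reconstruct(sparse):
    # The baseline a trained generator is meant to beat: fill each
    # excised bin with the mean of its surviving neighbours.
    out = list(sparse)
    for i, v in enumerate(out):
        if v is None:
            left = out[i - 1] if i > 0 else out[i + 1]
            right = out[i + 1] if i + 1 < len(out) else out[i - 1]
            out[i] = (left + right) / 2.0
    return out

magnitudes = [1.0, 2.0, 3.0, 4.0, 5.0]     # toy per-bin magnitudes
sparse = excise_odd_bins(magnitudes)       # [1.0, None, 3.0, None, 5.0]
dense = linear_reconstruct(sparse)         # [1.0, 2.0, 3.0, 4.0, 5.0]
```

A trained generator would replace `linear_reconstruct`, mapping the sparse vector to a dense one that preserves harmonic structure rather than averaging neighbours.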
Axis 2: Operational Parameter Expansion
Derivative 2.1: Massively Parallel Processing for Volumetric Acoustic Imaging
- Enabling Description: The method is scaled to process data from a large-aperture spherical microphone array comprising 4096 individual MEMS microphones for real-time 3D acoustic imaging. The input is not a single audio signal but a 4096-channel data stream. The sub-band division is performed on a GPU using a batched CUDA-based FFT kernel. The excision is applied across both frequency and spatial domains; sub-bands corresponding to spatial sectors with energy below a dynamic threshold are excised to focus computation on active sound sources. Processing involves beamforming and dereverberation on the remaining sub-bands. Reconstruction uses a 4D spatio-spectral interpolation model. The system processes audio at 96kHz to resolve ultrasonic frequencies for detailed source localization, requiring a throughput in the teraflop range.
- Mermaid.js Diagram:
```mermaid
graph LR
    subgraph "Input Stage"
        M1[Mic 1]
        M2[Mic 2]
        Mx[Mic ...]
        M4096[Mic 4096]
    end
    subgraph "GPU Processing Core"
        A[Batched FFT]
        B[Spatio-Spectral Excision]
        C[Beamforming/Denoising]
        D[4D Interpolation/Reconstruction]
        E["Batched iFFT & Synthesis"]
    end
    M1 -- Channel 1 --> A
    M4096 -- Channel 4096 --> A
    A --> B --> C --> D --> E
    E --> F[3D Acoustic Image Stream]
```
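A scaled-down NumPy illustration of the batched transform and energy-threshold excision: 16 channels stand in for the 4096 microphones, `np.fft.rfft`'s vectorisation stands in for the CUDA kernel, and the 20% threshold is an assumed value.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 16, 256      # scaled-down stand-in for 4096 mics
t = np.arange(n_samples)
# Every channel hears one active source (bin 10) plus a weak noise floor.
frames = np.sin(2 * np.pi * 10 * t / n_samples)[None, :] \
         + 0.01 * rng.standard_normal((n_channels, n_samples))

# Batched FFT over all channels at once (np.fft.rfft vectorises along
# the last axis, the analogue of the batched CUDA kernel).
spectra = np.fft.rfft(frames, axis=1)

# Excision: drop sub-bands whose energy, summed over the array, falls
# below a dynamic threshold (here 20% of the mean band energy).
band_energy = np.sum(np.abs(spectra) ** 2, axis=0)
mask = band_energy >= 0.2 * band_energy.mean()
spectra[:, ~mask] = 0.0

focused = np.fft.irfft(spectra, n=n_samples, axis=1)
```

In the full system the mask spans spatial sectors as well as frequency bins; only the spectral dimension is shown here.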
Derivative 2.2: Application in Infrasonic Geopolitical Monitoring
- Enabling Description: The method is applied to signals from the International Monitoring System (IMS), which uses infrasound arrays to detect nuclear detonations. These signals are extremely low frequency (0.01 Hz to 10 Hz) and recorded over continental distances. The raw signal from an array is divided into narrow sub-bands using a high-resolution FFT (e.g., 2^24 points). Sub-bands known to contain persistent microbarom noise from oceanic wave interactions are excised. The remaining bands are processed using adaptive filters to enhance transient events. Reconstruction is critical to avoid artifacts that could be misinterpreted as treaty violations. The processed and reconstructed signals are synthesized to provide a clear signal for event analysis and source triangulation.
- Mermaid.js Diagram:
```mermaid
stateDiagram-v2
    [*] --> Receiving: IMS Array Data (0.01-10 Hz)
    Receiving --> SubBandDivision: Long-window FFT
    SubBandDivision --> Excision: Remove Microbarom Bands
    Excision --> Processing: Enhance Transients
    Processing --> Reconstruction: Interpolate Excised Bands
    Reconstruction --> Synthesis: Inverse FFT
    Synthesis --> Analysis: Event Detection & Triangulation
    Analysis --> [*]
```
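The microbarom excision reduces to an FFT-domain mask. The NumPy sketch below is a toy: the sample rate, FFT length (far smaller than 2^24 points), the 0.1-0.5 Hz microbarom band, and the synthetic trace are all illustrative assumptions.

```python
import numpy as np

fs = 20.0                 # Hz; enough to cover the 0.01-10 Hz band
n = 4096                  # small stand-in for the 2^24-point FFT
t = np.arange(n) / fs
# Synthetic array trace: a microbarom-like 0.2 Hz tone plus a transient.
x = np.sin(2 * np.pi * 0.2 * t)
x[2000:2010] += 5.0

spectrum = np.fft.rfft(x)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Excise the persistent microbarom band (roughly 0.1-0.5 Hz).
microbarom = (freqs >= 0.1) & (freqs <= 0.5)
spectrum[microbarom] = 0.0

cleaned = np.fft.irfft(spectrum, n=n)
```

In the full derivative the masked band is interpolated from neighbouring bins rather than left at zero, to avoid the reconstruction artifacts noted above.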
Axis 3: Cross-Domain Application
Derivative 3.1: Accelerated Magnetic Resonance Imaging (MRI) Scans
- Enabling Description: The method is applied to the raw k-space data acquired during an MRI scan. The k-space data, which is the 2D or 3D Fourier transform of the image, is treated as a signal. It is divided into sub-bands (regions of k-space). The "excising" step corresponds to a compressed sensing acquisition pattern, where peripheral, high-frequency k-space regions are deliberately under-sampled or skipped, drastically reducing scan time. The remaining acquired sub-bands (central k-space) are processed to correct for motion artifacts. The non-acquired, "excised" sub-bands are then reconstructed using an iterative algorithm (e.g., Total Variation minimization or a deep learning model) that leverages the sparsity of the underlying medical image. The full k-space is then synthesized via an inverse Fourier transform to produce the final diagnostic image.
- Mermaid.js Diagram:
```mermaid
flowchart TD
    A[MRI RF Coil Signal] --> B{k-Space Acquisition}
    B -- Full Sampling --> C[Standard k-Space Data]
    B -- Compressed Sensing --> D["Sub-Sampled k-Space (Remaining Bands)"]
    D --> E{Motion Correction}
    subgraph "Image Reconstruction"
        E --> F{"Reconstruct Missing k-Space (Excised Bands)"}
        F --> G[Full Reconstructed k-Space]
    end
    G --> H{Inverse Fourier Transform}
    H --> I[High-Resolution MR Image]
```
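A toy NumPy version of acquisition-as-excision: a centred sampling mask over the k-space of a synthetic square phantom, followed by the zero-filled inverse transform from which an iterative (e.g., total-variation) reconstruction would start. The keep fraction, mask geometry, and phantom are illustrative.

```python
import numpy as np

def undersample_kspace(kspace, keep_fraction=0.3):
    # Keep the fully sampled central region of k-space and excise the
    # periphery, mimicking a simple compressed-sensing acquisition.
    h, w = kspace.shape
    mask = np.zeros((h, w), dtype=bool)
    ch, cw = int(h * keep_fraction) // 2, int(w * keep_fraction) // 2
    mask[h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw] = True
    return kspace * mask, mask

# Toy phantom and its k-space.
image = np.zeros((64, 64))
image[24:40, 24:40] = 1.0
kspace = np.fft.fftshift(np.fft.fft2(image))

sampled, mask = undersample_kspace(kspace)
# Zero-filled inverse transform: the usual starting point for iterative
# reconstruction of the excised (unacquired) k-space regions.
zero_filled = np.abs(np.fft.ifft2(np.fft.ifftshift(sampled)))
```

Practical compressed-sensing acquisitions use variable-density random masks rather than a plain central block; the block mask keeps the sketch minimal.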
Derivative 3.2: High-Frequency Trading Data Compression and Analysis
- Enabling Description: A time-series of stock market order book data is treated as a signal. The signal is divided into sub-bands using a Short-Time Fourier Transform (STFT) to create a spectrogram of market activity. The high-frequency sub-bands, representing algorithmic noise trading and fleeting quote changes, are excised to reduce the dataset size and computational load for trend analysis. The remaining lower-frequency sub-bands, representing more significant market movements, are processed using a Long Short-Term Memory (LSTM) network to predict price trends. The excised bands are then statistically reconstructed to provide a complete, but denoised, data stream for back-testing and risk modeling before being synthesized back into a time-series.
- Mermaid.js Diagram:
```mermaid
sequenceDiagram
    participant T as Tick Data Stream
    participant W as Windowing & STFT
    participant E as High-Frequency Exciser
    participant L as LSTM Trend Predictor
    participant R as Statistical Reconstructor
    participant S as Synthesis
    T->>W: Ingest Market Data
    W->>E: Market Spectrogram
    E->>L: Low-Frequency Bands
    L-->>L: Predict Trend
    E->>R: Low-Frequency Bands
    R->>S: Denoised Full Spectrogram
    S->>T: Filtered Data for Trading Algo
```
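The band split feeding the trend model can be sketched with a minimal NumPy STFT over a synthetic random-walk price series; the window sizes and half-spectrum cutoff are assumed values, and the LSTM itself is omitted.

```python
import numpy as np

def stft(x, frame=32, hop=16):
    # Minimal STFT: Hann-windowed, half-overlapped frames.
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)          # shape: (frames, bins)

# Synthetic random-walk "price" series standing in for order-book data.
prices = np.cumsum(np.random.default_rng(1).standard_normal(512))
spec = stft(prices)

# Excise the upper half of the bins (fleeting quote noise); keep the
# lower-frequency bands for the trend model.
cutoff = spec.shape[1] // 2
low_bands = spec[:, :cutoff]    # would feed the LSTM trend predictor
excised = spec[:, cutoff:]      # to be statistically reconstructed later
```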
Axis 4: Integration with Emerging Tech
Derivative 4.1: AI-Driven Adaptive Excision for Voice Assistants
- Enabling Description: The method is integrated into the audio front-end of a smart speaker. A reinforcement learning (RL) agent dynamically controls the excision process. The agent's state is defined by real-time acoustic parameters: Signal-to-Noise Ratio (SNR), presence of a keyword (e.g., "Alexa"), and the number of active speakers. The agent's action is to select an excision pattern (e.g., excise 25%, 50%, or 75% of sub-bands) and a reconstruction algorithm (linear, spline, or neural). The reward function is R = w1 * CPU_savings - w2 * WER, where WER is the Word Error Rate reported by the downstream speech recognizer: CPU savings are rewarded and recognition errors are penalized. This allows the device to conserve maximum power during idle listening but instantly allocate full processing fidelity when a keyword is detected, optimizing the trade-off between power consumption and recognition accuracy.
- Mermaid.js Diagram:
```mermaid
graph TD
    A[Mic Audio] --> B{Feature Extraction}
    B --> C["State: SNR, Keyword, etc."]
    C --> D["RL Agent (Policy Network)"]
    D -- "Action: Excision %" --> E{Sub-band Excision}
    A --> F{Divide into Sub-bands}
    F --> E
    E --> G{Enhancement Processing}
    G --> H{Reconstruction}
    D -- "Action: Recon. Algo" --> H
    H --> I{Synthesis}
    I --> J{ASR Engine}
    J -- Word Error Rate --> K{Reward Calculation}
    C --> K
    K -- Reward --> D
```
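The reward computation itself is a one-liner; the weights and example state values below are illustrative assumptions.

```python
def reward(cpu_savings, wer, w1=1.0, w2=4.0):
    # R = w1 * CPU_savings - w2 * WER: saved cycles are rewarded and
    # recognition errors are penalised. Weights are illustrative.
    return w1 * cpu_savings - w2 * wer

# Aggressive excision saves CPU but raises WER; the agent trades these off.
idle   = reward(cpu_savings=0.75, wer=0.30)   # heavy excision while idle
active = reward(cpu_savings=0.10, wer=0.02)   # full fidelity after keyword
```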
Derivative 4.2: IoT-Edge Data Reduction with Blockchain Anchoring
- Enabling Description: In an industrial predictive maintenance system, a vibration sensor on a machine generates a continuous data stream. An edge computing device applies the '575 method. It divides the vibration signal into sub-bands. A pre-trained anomaly detection model identifies which sub-bands are exhibiting nominal behavior; these are excised. The remaining sub-bands, which may contain early fault signatures, are processed and transmitted to a cloud server. To ensure data integrity, a cryptographic hash of the remaining sub-band data, along with a bitmask representing the excision pattern, is computed and anchored to a private blockchain. This creates an immutable, auditable record of the sensor data, preventing tampering while reducing data transmission and storage costs by over 90%. The full signal can be reconstructed in the cloud for detailed analysis if a fault is confirmed.
- Mermaid.js Diagram:
```mermaid
flowchart LR
    A[Vibration Sensor] --> B
    subgraph B [Edge Device]
        B1[Sub-band Division] --> B2{Anomaly Detection}
        B2 -- Nominal --> B3[Excise]
        B2 -- Anomaly --> B4[Process]
        B4 --> B5{Data Hashing}
        B3 & B4 --> B6[Assemble Payload]
    end
    B5 --> C((Blockchain Anchor))
    B6 --> D((Cloud Storage))
    D --> E{"Reconstruction & Analysis"}
```
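The hash-and-anchor step can be sketched with the standard library; the bitmask packing and payload layout below are illustrative assumptions, and the blockchain write itself is out of scope.

```python
import hashlib

def anchor_digest(subband_data, excision_mask):
    # Hash the retained sub-band data together with the excision bitmask;
    # the resulting digest is what would be anchored on the blockchain.
    bitmask = bytes(
        sum(int(bit) << (7 - j) for j, bit in enumerate(excision_mask[i:i + 8]))
        for i in range(0, len(excision_mask), 8)
    )
    h = hashlib.sha256()
    h.update(bitmask)
    for chunk in subband_data:
        h.update(chunk)
    return h.hexdigest()

mask = [False, True, False, True]        # True = band excised as nominal
retained = [b"\x01\x02", b"\x03\x04"]    # data for the kept sub-bands
digest = anchor_digest(retained, mask)   # 64-character hex string
```

Because both the data and the excision pattern are hashed, neither the retained bands nor the record of what was excised can be altered without invalidating the anchor.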
Axis 5: The "Inverse" or Failure Mode
Derivative 5.1: Graceful Audio Degradation for Low-Power/High-CPU Load
- Enabling Description: The method is implemented in a mobile telecommunication device's baseband processor. A system monitor continuously tracks CPU load and battery state of charge. When the battery drops below 20% or CPU load exceeds 95%, the system triggers a "low-power" audio mode. In this mode, the excision ratio is increased from 50% (every other sub-band) to 75% (keep one, excise three). The processing block (e.g., echo canceller) reduces its filter tap length by half. The reconstruction algorithm switches from a computationally expensive spline interpolator to a simple zero-order hold. This results in audibly lower fidelity (more "robotic" sound) but prevents the call from dropping or audio from breaking up entirely, ensuring mission-critical functionality under adverse conditions.
- Mermaid.js Diagram:
```mermaid
stateDiagram-v2
    state "Normal Mode" as Normal {
        [*] --> Processing
        Processing: Excision=50%, Full-tap filter, Spline recon.
        Processing --> Processing: CPU < 95% AND Battery > 20%
    }
    state "Low Power Mode" as LowPower {
        [*] --> Degraded
        Degraded: Excision=75%, Half-tap filter, Zero-order recon.
        Degraded --> Degraded: CPU > 95% OR Battery < 20%
    }
    Normal --> LowPower: CPU > 95% OR Battery < 20%
    LowPower --> Normal: CPU < 95% AND Battery > 20%
```
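The mode switch reduces to a threshold rule over the two states in the diagram; a minimal sketch follows, with the thresholds as stated above and an illustrative parameter encoding.

```python
def select_mode(cpu_load, battery):
    # Thresholds from the text: low-power mode when CPU load exceeds 95%
    # or battery drops below 20%; otherwise normal mode.
    if cpu_load > 0.95 or battery < 0.20:
        return {"mode": "low-power", "excision": 0.75,
                "filter_taps": "half", "reconstruction": "zero-order-hold"}
    return {"mode": "normal", "excision": 0.50,
            "filter_taps": "full", "reconstruction": "spline"}

select_mode(0.50, 0.80)   # normal: 50% excision, spline reconstruction
select_mode(0.99, 0.80)   # low-power: 75% excision, zero-order hold
```

A production implementation would add hysteresis around the thresholds to avoid rapid mode oscillation near the boundary.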
Combination Prior Art Scenarios
Combination with WebRTC Standard: The core method is implemented as a new module within the WebRTC audio processing pipeline. The `RTCPeerConnection` API is extended with a `degradationPreference` attribute. When a developer sets this to `"maintain-framerate"`, the browser, upon detecting network congestion via RTCP receiver reports, triggers the excision/reconstruction process on the audio stream before it is fed to the Opus encoder. The excision pattern is signaled to the remote peer via a custom RTP header extension, allowing the receiver to perform a more informed reconstruction and thus maintain a smooth, uninterrupted conversation at the cost of a temporary fidelity reduction.

Combination with GStreamer Open-Source Framework: A GStreamer plugin named `subexcise` is created. It acts as an audio filter element that can be dynamically inserted into any GStreamer pipeline. The element exposes properties such as `excision-ratio` (a float from 0.0 to 1.0) and `mode` (e.g., 'odd-even', 'random', 'perceptual'). A user could construct pipelines for adaptive streaming, e.g. `rtspsrc ! rtph264depay ! avdec_h264 ! videoconvert ! autovideosink` for the video branch and `pulsesrc ! audioconvert ! subexcise excision-ratio=0.5 ! opusenc ! rtpopuspay ! udpsink` for the audio branch. The `excision-ratio` could be controlled in real time by an application monitoring system resources.

Combination with SOFA (Spatially Oriented Format for Acoustics) Standard: The method is used to create a lossy, compressed variant of the SOFA file format, tentatively named `SFC` (SOFA-Compressed). A utility, `sofa2sfc`, takes a standard SOFA file containing high-resolution Head-Related Transfer Functions (HRTFs). For each HRTF, it performs a sub-band analysis, excises perceptually masked frequency bands, and stores only the remaining bands along with metadata for the reconstruction algorithm. This reduces the file size of complex spatial audio scenes by 70-80%, enabling their use in web-based and mobile applications where bandwidth and storage are at a premium. The reconstruction to a full SOFA-compliant structure happens at load time.