Patent 11798576
Derivative works
Defensive disclosure: derivative variations of each claim designed to render future incremental improvements obvious or non-novel.
Defensive Disclosure: Derivative Works and Obvious Implementations of SNR-Range-Based Adaptive Gain Control
Publication Date: April 26, 2026
Reference Patent: US 11798576 B2 ("Methods and apparatus for adaptive gain control in a communication system")
Technical Field: Digital Signal Processing, Acoustics, Communications Systems, Embedded Systems.
Abstract: This document discloses a series of derivative methods, systems, and applications stemming from the core teachings of US patent 11,798,576. The disclosures herein are intended to enter the public domain as prior art. These disclosures describe obvious and logical extensions, substitutions, and new applications of the core invention, which details an adaptive gain control system based on maintaining a Signal-to-Noise Ratio (SNR) within a predefined range at a listener's position. A person having ordinary skill in the art of digital signal processing would find these variations to be straightforward extensions of the original concept.
Section 1: Component and Algorithmic Substitution
1.1. Gain Control using Wavelet Transform Domain
Enabling Description: The transformation of the signal into the frequency domain as specified in the '576 patent is performed using a Fast Fourier Transform (FFT). An obvious alternative is to use a Discrete Wavelet Transform (DWT) or a Stationary Wavelet Transform (SWT). The DWT provides superior time-frequency localization for transient signals. In this implementation, the microphone signal is decomposed into multiple wavelet sub-bands. Noise and speech energy are estimated independently in each sub-band. The SNR is then calculated for each sub-band, and a sub-band-specific gain is computed to bring the SNR into a target range. The final signal is reconstructed via an Inverse DWT (IDWT). This method offers improved handling of non-stationary noise, such as clicks or claps, by isolating them in specific wavelet coefficients.
Diagram:
flowchart TD
    A[Microphone Signal] --> B{Discrete Wavelet Transform}
    B --> C1[Sub-band 1]
    B --> C2[Sub-band 2]
    B --> CN[Sub-band N]
    C1 --> D1{Estimate Speech/Noise}
    C2 --> D2{Estimate Speech/Noise}
    CN --> DN{Estimate Speech/Noise}
    D1 --> E1{Calculate SNR_1}
    D2 --> E2{Calculate SNR_2}
    DN --> EN{Calculate SNR_N}
    E1 --> F1{Compute Gain_1}
    E2 --> F2{Compute Gain_2}
    EN --> FN{Compute Gain_N}
    F1 --> G1{Apply Gain_1}
    F2 --> G2{Apply Gain_2}
    FN --> GN{Apply Gain_N}
    G1 & G2 & GN --> H{Inverse Wavelet Transform}
    H --> I[Output Signal]
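The per-band loop described above can be sketched in Python. The one-level Haar transform, the [6, 18] dB target range, and the simple gain law below are illustrative assumptions, not values taken from the '576 patent; in this framing, the computed gain raises the reproduced sub-band relative to the ambient noise at the listener's position.

```python
import numpy as np

SNR_MIN_DB, SNR_MAX_DB = 6.0, 18.0   # illustrative target range, not from the patent

def haar_dwt(x):
    """One-level Haar DWT: split a frame into approximation and detail sub-bands."""
    x = x[: len(x) // 2 * 2]                  # truncate to even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_idwt(a, d):
    """Exact inverse of haar_dwt."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def band_gain(speech_energy, noise_energy):
    """Linear gain for one sub-band: boost when the band's SNR at the
    listener falls below SNR_MIN_DB, attenuate above SNR_MAX_DB."""
    snr_db = 10.0 * np.log10(speech_energy / noise_energy)
    if snr_db < SNR_MIN_DB:
        return 10.0 ** ((SNR_MIN_DB - snr_db) / 20.0)
    if snr_db > SNR_MAX_DB:
        return 10.0 ** ((SNR_MAX_DB - snr_db) / 20.0)
    return 1.0
```

A deeper decomposition (more levels, or a different wavelet family) slots in by replacing the two Haar helpers; the gain law per band is unchanged.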
1.2. Neuromorphic Processor Implementation
Enabling Description: Instead of a conventional DSP or CPU, the gain control algorithm is implemented on a neuromorphic processor utilizing Spiking Neural Networks (SNNs). The input audio is converted into a stream of spikes using a delta modulator or similar analog-to-spike converter. An SNN, trained to recognize temporal patterns of speech and noise, performs the energy estimation. A separate small SNN implements the gain control loop, where the "actual gain" and "target gain" are represented by the firing rates of specific neuron populations. The gain increment is determined by excitatory and inhibitory connections between these populations. This approach provides extremely low-latency and low-power operation, suitable for always-on battery-powered devices.
Diagram:
sequenceDiagram
    participant A as Audio Input
    participant B as Spike Encoder
    participant C as Speech/Noise SNN
    participant D as Gain Control SNN
    participant E as Spike Decoder
    participant F as Audio Output
    A->>B: Analog Audio Signal
    B->>C: Spike Train
    C->>D: Speech/Noise Firing Rates
    D->>D: Compare Actual vs Target Gain Firing Rates
    D->>E: Modulated Spike Train
    E->>F: Reconstructed Analog Signal
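Before committing to SNN hardware, the rate-coded control loop can be approximated in ordinary Python. The gain-to-rate mapping and the coupling constants below are hypothetical stand-ins for trained synaptic weights, shown only to make the excitatory/inhibitory update rule concrete.

```python
def gain_update(actual_rate_hz, target_rate_hz, gain_db, k_exc=0.05, k_inh=0.05):
    """Rate-coded control step: the 'target gain' population out-firing
    the 'actual gain' population drives gain up via excitatory
    connections; the reverse drives it down via inhibitory ones.
    k_exc and k_inh are hypothetical coupling strengths."""
    error = target_rate_hz - actual_rate_hz
    if error > 0:
        return gain_db + k_exc * error   # excitatory drive raises gain
    return gain_db + k_inh * error       # inhibitory drive lowers gain
```

Iterating this step against any monotone gain-to-firing-rate mapping converges to the target firing rate, mirroring the D->>D comparison in the diagram.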
1.3. Non-Acoustic Sensor Fusion for Speech Estimation
Enabling Description: The speech level estimation is augmented with data from non-acoustic sensors to achieve a more robust estimation in high-noise environments. A piezoelectric throat microphone or a bone conduction sensor is used in conjunction with a standard acoustic microphone. Since the throat/bone sensor is largely immune to ambient acoustic noise, its signal provides a clean reference for speech energy and voice activity. The system calculates the speech level primarily from this reference sensor, while the ambient noise level is calculated from the standard microphone during periods of silence (as indicated by the reference sensor). This de-couples the speech and noise estimation, leading to a much more accurate SNR calculation.
Diagram:
graph LR
    subgraph Sensors
        A[Acoustic Mic]
        B[Throat Mic]
    end
    subgraph Processing
        C{Noise Estimator}
        D{Speech Estimator}
        E{SNR Calculation}
        F[Gain Control Module]
    end
    A --> C
    B --> D
    C --> E
    D --> E
    E --> F
    F --> G[Output]
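A minimal frame-based sketch of the fusion logic follows; the VAD threshold and the smoothing factor alpha are illustrative assumptions.

```python
import numpy as np

def fused_snr_db(acoustic_frame, throat_frame, noise_floor,
                 vad_threshold=1e-4, alpha=0.95):
    """One frame of sensor-fusion SNR estimation: speech energy comes from
    the noise-immune throat mic, and the ambient noise floor is updated
    from the acoustic mic only while the throat mic reports silence.
    Returns the SNR in dB and the updated noise floor."""
    speech_energy = float(np.mean(throat_frame ** 2))
    if speech_energy < vad_threshold:           # reference sensor says: silence
        acoustic_energy = float(np.mean(acoustic_frame ** 2))
        noise_floor = alpha * noise_floor + (1.0 - alpha) * acoustic_energy
    snr_db = 10.0 * np.log10((speech_energy + 1e-12) / (noise_floor + 1e-12))
    return snr_db, noise_floor
```

Because the noise floor is frozen whenever the throat mic detects speech, the two estimates never contaminate each other, which is the decoupling the description relies on.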
Section 2: Operational Parameter Expansion
2.1. Adaptive Gain for Ultrasonic Industrial Monitoring
Enabling Description: The adaptive gain control method is applied to the ultrasonic frequency range (e.g., 20 kHz - 100 kHz) for predictive maintenance of industrial machinery. An ultrasonic microphone array monitors a machine, such as a high-pressure hydraulic system. The system is trained to identify the acoustic signature of a potential failure mode, such as bearing wear or a high-pressure leak, as the signal of interest ("speech"), against the broadband sound of normal operation ("noise"). The adaptive gain system ensures that the faint, early-stage failure signatures are amplified to meet a target SNR, making them detectable by an analysis system, while ignoring loud, broadband operational noise. The [SNRmin, SNRmax] range is set to a high-sensitivity level to detect incipient failures.
Diagram:
stateDiagram-v2
    [*] --> Idle
    Idle --> Monitoring: System Active
    Monitoring --> Monitoring: Healthy Signature (Low Gain)
    Monitoring --> AnomalyDetected: Failure Signature SNR < SNRmin
    AnomalyDetected --> Monitoring: Signature Lost
    AnomalyDetected: Apply High Gain to meet Target SNR
    AnomalyDetected --> Alert: Persists > T seconds
    Alert --> [*]
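The persistence condition in the state machine reduces to a small counter; the 20 dB threshold and the 50-frame persistence window below are illustrative choices, not values from the disclosure.

```python
def monitor(snr_stream_db, snr_min_db=20.0, persist_frames=50):
    """Counter form of the state machine above: stay in Monitoring while
    the signature is healthy; alert only when a below-SNRmin failure
    signature persists for persist_frames consecutive frames."""
    below = 0
    for snr_db in snr_stream_db:
        below = below + 1 if snr_db < snr_min_db else 0   # reset on a healthy frame
        if below >= persist_frames:
            return "alert"
    return "monitoring"
```

The reset-on-healthy-frame behavior implements the "Signature Lost" transition, so transient dips never accumulate into a false alert.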
2.2. Gain Control for Deep-Space Optical Communication
Enabling Description: The invention is applied to the gain control of a photodiode amplifier in a deep-space laser communication system. The "speech" is the modulated laser signal from the transmitter, and the "noise" is stray light from stars, solar radiation, and detector shot noise. The system continuously estimates the power of the desired signal and the power of the noise. It adjusts the transimpedance gain of the amplifier to keep the electronic SNR of the resulting signal within an optimal range for the demodulator and error-correction decoder. The [SNRmin, SNRmax] range is dynamically adjusted based on the expected bit error rate (BER) for the current communication protocol.
Diagram:
flowchart LR
    A[Incoming Photons] --> B(Photodiode)
    B --> C{"Transimpedance Amplifier (TIA)"}
    C --> D[Output Signal]
    D --> E{Signal/Noise Estimator}
    E --> F{SNR Calculator}
    F --> G{Gain Control Logic}
    G --> C
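The BER-driven range adjustment can be sketched by inverting an assumed BPSK-style error model, BER = 0.5·erfc(√SNR); the true mapping depends on the modulation and coding in use, and the 3 dB headroom is an illustrative choice.

```python
import math

def required_snr_db(target_ber):
    """Bisect BER(SNR) = 0.5 * erfc(sqrt(SNR)) (an assumed BPSK-style
    model) for the electrical SNR, in dB, that just meets the target BER."""
    lo, hi = 1e-6, 100.0              # linear-SNR search bracket
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.erfc(math.sqrt(mid)) > target_ber:
            lo = mid                  # too many errors: need more SNR
        else:
            hi = mid
    return 10.0 * math.log10(hi)

def snr_range_for(target_ber, margin_db=3.0):
    """[SNRmin, SNRmax] with the minimum set by the decoder's BER need
    and a fixed (illustrative) headroom above it."""
    snr_min = required_snr_db(target_ber)
    return snr_min, snr_min + margin_db
```

Swapping in the protocol's actual BER curve changes only the expression inside the bisection loop.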
Section 3: Cross-Domain Applications
3.1. Aerospace: Hypersonic Vehicle Cockpit Communications
Enabling Description: In a hypersonic flight environment (> Mach 5), intense, non-stationary noise is generated by atmospheric friction and ionized plasma sheathing around the aircraft. This system is integrated into the pilot's helmet communication system. It uses an array of internal and external microphones to estimate pilot speech and the extreme background noise. The gain control algorithm adapts on a sub-millisecond timescale, adjusting the sidetone and intercom gain to maintain a consistent SNR. The psychoacoustic model is modified to account for the altered perception of sound under high-G forces and cognitive load, with the [SNRmin, SNRmax] range being widened to prevent over-correction during rapid vibrational transients.
Diagram:
classDiagram
    class CockpitComms {
        +pilotMicSignal
        +ambientNoiseSignal
        +gForceData
        -dsp
        +processAudio()
    }
    class DSP {
        -snrTargetRange
        -psychoacousticModel
        +estimateNoise()
        +estimateSpeech()
        +calculateAdaptiveGain()
        +adjustSnrRange(gForce)
    }
    CockpitComms "1" *-- "1" DSP : contains
3.2. AgTech: Livestock Distress Vocalization Monitoring
Enabling Description: An array of microphones is deployed in a large-scale pig farrowing house. The system is trained to recognize the specific acoustic signature of a piglet in distress (e.g., being crushed by the sow) as "speech." All other sounds (other piglets, sow grunts, ventilation fans) are treated as "noise." The adaptive gain control system processes the feed from each microphone. When a distress call is detected, the gain for that channel is increased to ensure the SNR of the call is high enough to trigger an alert system. The system's VAD (renamed Distress Activity Detection) is critical. During periods of no distress calls, the gain is attenuated to avoid amplifying the constant cacophony of the barn.
Diagram:
sequenceDiagram
    participant M as Microphone Array
    participant S as Signal Processor
    participant A as Alerting System
    loop Continuous Monitoring
        M->>S: Audio from Zone 4
        S->>S: Detect Distress Signature (SNR < SNRmin)
        S->>S: Increase gain for Zone 4 channel
        S->>A: Trigger Alert: Distress in Zone 4
    end
3.3. Medical: Surgical Theater Command & Control
Enabling Description: The system is integrated into the master communication console of a robotic surgery theater. Directional microphones are focused on the primary surgeon. The system identifies the surgeon's voice as "speech" and the sounds of life support equipment, alarms, and other team members' conversations as "noise." The gain on the surgeon's channel is adaptively controlled to ensure their commands to the team and to the voice-controlled surgical robot are always clear and intelligible, maintaining a high SNR at the listeners' earpieces and the robot's speech recognition input. This reduces cognitive load on the surgeon, who does not need to consciously speak louder to be heard over intermittent noise sources like suction devices.
Diagram:
graph TD
    A[Surgeon's Voice] --> B{Directional Mic}
    C[OR Equipment Noise] --> B
    B --> D[DSP]
    D --> E{Speech/Noise Separation}
    E -- Speech --> F{SNR Calculation}
    E -- Noise --> F
    F --> G{Adaptive Gain Control}
    G --> H[Team Earpieces]
    G --> I[Surgical Robot ASR]
Section 4: Integration with Emerging Technologies
4.1. AI-Driven Reinforcement Learning for SNR Range Optimization
Enabling Description: The fixed or manually configured [SNRmin, SNRmax] range is replaced by a dynamic range controlled by a reinforcement learning (RL) agent. The agent's "state" is the current acoustic environment (noise level, noise type, speaker identity). Its "action" is to adjust the SNRmin and SNRmax values. The "reward" is a function of the output speech intelligibility (measured by a companion speech-to-text model's confidence score) and a penalty for excessive gain or rapid fluctuations. Over time, the RL agent learns the optimal SNR target range for thousands of different acoustic contexts, personalizing the system for specific users and environments without manual tuning.
Diagram:
flowchart TD
    subgraph RL_Agent
        A[Observe State: Noise, Speaker] --> B{Select Action: Set SNRmin, SNRmax}
        B --> C{Apply to Gain Control}
        D[Calculate Reward: ASR Confidence] --> E{Update Policy}
        C --> D
        E --> B
    end
    subgraph Gain_Control_System
        F[Audio In] --> G{SNR Calculation}
        G --> H[Gain Adjustment]
        C --> H
        H --> I[Audio Out]
    end
    I --> D
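A bandit-style simplification of the loop can be sketched with tabular values and epsilon-greedy selection; the candidate ranges, learning rate, and reward shaping below are illustrative, and a full RL agent would also bootstrap from the next state rather than treat each decision independently.

```python
import random

ACTIONS = [(3.0, 12.0), (6.0, 15.0), (9.0, 18.0)]   # candidate (SNRmin, SNRmax) pairs, dB

def choose(q, state, eps=0.1):
    """Epsilon-greedy selection over the candidate SNR ranges."""
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))

def update(q, state, action, reward, lr=0.2):
    """Bandit-style value update; the reward would be, e.g., ASR
    confidence minus a penalty for excessive gain or rapid fluctuation."""
    key = (state, action)
    q[key] = q.get(key, 0.0) + lr * (reward - q.get(key, 0.0))
```

After enough (state, action, reward) triples, the greedy choice settles on the range with the best observed reward for each acoustic context, which is the personalization effect the description claims.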
4.2. IoT-Contextualized Preemptive Gain Adjustment
Enabling Description: In a smart factory setting, the adaptive gain control system for worker communication headsets is connected to the factory's IoT network. IoT sensors on machinery broadcast their operational state (e.g., idle, spinning up, active, emergency stop). The gain control module subscribes to these messages. When a large stamping press broadcasts a "stamping_cycle_imminent" message, the communication headsets of all nearby workers preemptively increase their target SNR range before the noise event occurs. This eliminates the small delay inherent in a purely reactive system and prevents even a momentary loss of communication clarity.
Diagram:
sequenceDiagram
    participant IoT as IoT Sensor (Press)
    participant MQTT as MQTT Broker
    participant GCM as Gain Control Module
    participant H as Headset Audio
    IoT->>MQTT: Publish topic 'factory/press/state' payload 'imminent'
    MQTT-->>GCM: Receive Message
    GCM->>GCM: Preemptively raise SNR_target
    GCM->>H: Apply new gain curve
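Transport aside, the subscriber logic is a small state change. In this sketch the MQTT client is abstracted away to a plain callback, and the topic, payload strings, and dB values are illustrative assumptions.

```python
class GainControlModule:
    """Preemptive subscriber sketch; a real deployment would register
    on_message as, e.g., an MQTT client's message callback."""

    def __init__(self, snr_target_db=9.0, boosted_db=15.0):
        self.snr_target_db = snr_target_db
        self._normal_db = snr_target_db
        self._boosted_db = boosted_db

    def on_message(self, topic, payload):
        """Raise the target SNR before the press noise arrives; relax it
        again when the machine reports idle."""
        if topic != "factory/press/state":
            return
        if payload == "stamping_cycle_imminent":
            self.snr_target_db = self._boosted_db
        elif payload == "idle":
            self.snr_target_db = self._normal_db
```

Because the target moves before the acoustic event, the gain loop starts from the right operating point instead of chasing the noise after it begins.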
Section 5: Inverse and Failsafe Modes
5.1. Graceful Degradation to Time-Domain Energy Control
Enabling Description: The system includes a watchdog timer that monitors the processing load and execution time of the frequency-domain gain control algorithm. If the processing latency exceeds a critical threshold (e.g., due to high CPU load from other tasks) or a critical module fails, the system enters a "failsafe" mode. In this mode, it bypasses the FFT, SNR estimation, and complex gain logic. Instead, it falls back to a simple, low-computation time-domain RMS energy calculation. It applies gain based on a simple energy threshold, providing a rudimentary but stable form of gain control that guarantees system stability and prevents audio dropouts or loud artifacts.
Diagram:
stateDiagram-v2
    state "Full SNR-based Control" as FullMode
    state "Time-Domain Energy Control" as FailsafeMode
    [*] --> FullMode
    FullMode --> FailsafeMode: CPU Load > 95% OR Module Failure
    FailsafeMode --> FullMode: System Reset OR Load < 70%
    FailsafeMode: Bypasses FFT and SNR logic. Uses RMS energy for gain.
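The failsafe path is deliberately trivial; a sketch with assumed constants (target RMS, gain cap, and a 5 ms latency budget, none of which come from the disclosure):

```python
import numpy as np

def failsafe_gain(frame, target_rms=0.1, max_gain=4.0):
    """Time-domain fallback: one RMS measurement and a bounded
    multiplicative gain, with no FFT and no SNR estimation."""
    rms = float(np.sqrt(np.mean(frame ** 2))) + 1e-12
    return min(max_gain, target_rms / rms)

def process(frame, latency_ms, full_chain, latency_budget_ms=5.0):
    """Watchdog dispatch: run the full SNR-based chain while it meets the
    latency budget, otherwise degrade gracefully to RMS control."""
    if latency_ms > latency_budget_ms:
        return frame * failsafe_gain(frame)
    return full_chain(frame)
```

The gain cap is what prevents the loud artifacts the description mentions: even a near-silent frame is never amplified beyond max_gain.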
Section 6: Combination with Open-Source Standards
6.1. Combination with WebRTC Standard
- Enabling Description: The method is embodied as a WebAssembly (WASM) module for high-performance execution in a web browser. The module exposes a JavaScript API that interfaces with the Web Audio API. A developer can insert this module into a MediaStream processing graph, replacing the browser's native autoGainControl. This brings the high-fidelity SNR-range-based control to any web-based communication application, providing superior performance over the standard AGC in noisy environments like coffee shops or co-working spaces.
6.2. Combination with SOFA (Spatially Oriented Format for Acoustics)
- Enabling Description: The system uses an open-source SOFA file library to personalize the audio experience. The user provides a personalized Head-Related Transfer Function (HRTF) stored in the SOFA format. The gain control system uses this data to calculate the estimated SNR not at the microphone's position, but at the user's eardrums, after the sound has been filtered by their head, shoulders, and pinnae. This allows for a more perceptually accurate gain adjustment, particularly in multi-loudspeaker systems where spatial audio cues are important. The target SNR can be set differently for each ear if needed.
6.3. Combination with Kaldi Speech Recognition Toolkit
- Enabling Description: The system is configured in a closed loop with the Kaldi open-source speech recognition toolkit. The gain-adjusted audio output is continuously fed to a Kaldi ASR process. The ASR engine's output includes not just the transcribed text, but also a confidence score for the recognition. This confidence score is used as a real-time feedback metric to the gain control module. If the confidence score drops below a set threshold (e.g., 0.85), the SNRmin target is automatically nudged upwards by a small delta (e.g., +1 dB), thereby optimizing the audio not just for human listening but for maximum machine intelligibility.
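The confidence-feedback rule reduces to a one-line update; the 0.85 threshold and +1 dB step follow the text, while the 20 dB ceiling is an added assumption to keep the loop bounded.

```python
def nudge_snr_min(snr_min_db, asr_confidence, threshold=0.85,
                  step_db=1.0, ceiling_db=20.0):
    """Per-utterance feedback: when recognizer confidence drops below the
    threshold, raise the SNRmin target by step_db, capped at ceiling_db
    so repeated low-confidence utterances cannot drive the gain to
    runaway levels."""
    if asr_confidence < threshold:
        return min(ceiling_db, snr_min_db + step_db)
    return snr_min_db
```

Applied once per recognized utterance, this steadily trades loudness for machine intelligibility only when the recognizer is actually struggling.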