
Defensive Disclosure: Derivative Architectures and Methods for Heterogeneous Multi-Core Processors

Publication Date: May 13, 2026
Reference Patent: US 8,549,339 B2

This document discloses novel variations, applications, and integrations of the core technologies described in US Patent 8,549,339. The purpose of this disclosure is to place these derivative concepts into the public domain, thereby rendering them obvious or non-novel for the purposes of future patent prosecution.


Axis 1: Material & Component Substitution

Derivative 1.1: Silicon-Photonics Interface Block

  • Enabling Description: The electrical interface block (120, 200, 300) between processor core sets is replaced with a silicon-photonics interface. Each processor stripe (e.g., 112, 114) terminates its electrical signaling at an on-chip optical modulator (e.g., a Mach-Zehnder modulator). Data is transmitted via an on-chip silicon waveguide to an optical receiver (e.g., a germanium photodetector) integrated into the adjacent stripe. This substitution completely decouples the voltage domains, making electrical level shifters (202) unnecessary. Clock domain synchronization is achieved by embedding the source clock into the transmitted optical data (e.g., using a Manchester encoding scheme) and recovering it at the receiver using a clock-data recovery (CDR) circuit, which replaces the purely digital synchronizer (302). Power for the optical components is supplied by the respective voltage domain of each stripe. A minimal Python sketch of the clock-embedding scheme follows the diagram.

  • Mermaid Diagram:

    graph TD
        subgraph Stripe_A [Stripe A @ Vdd_A, Clock_A]
            CoreA1(Core A1)
        end
        subgraph Interface_Block [Optical Interface Block]
            E_O_A{Electrical-to-Optical<br>Modulator}
            O_E_B{Optical-to-Electrical<br>Photodetector/CDR}
        end
        subgraph Stripe_B [Stripe B @ Vdd_B, Clock_B]
            CoreB1(Core B1)
        end
        CoreA1 --> E_O_A
        E_O_A -- Optical Signal in Waveguide --> O_E_B
        O_E_B --> CoreB1
    
        style Stripe_A fill:#f9f,stroke:#333,stroke-width:2px
        style Stripe_B fill:#ccf,stroke:#333,stroke-width:2px
    
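To make the clock-embedding scheme concrete, here is a minimal, self-contained Python sketch of Manchester coding and decoding; the function names and the IEEE 802.3 edge convention are illustrative assumptions, not details taken from the patent.

    # Manchester coding: the guaranteed mid-bit transition carries the source
    # clock, which the receiver's CDR recovers. Hypothetical sketch.

    def manchester_encode(bits):
        """0 -> high-to-low, 1 -> low-to-high (IEEE 802.3 convention)."""
        symbols = []
        for b in bits:
            symbols += [0, 1] if b else [1, 0]
        return symbols

    def manchester_decode(symbols):
        """The mid-bit edge direction encodes the data; the edge itself is
        what the CDR circuit locks to."""
        assert len(symbols) % 2 == 0
        bits = []
        for first, second in zip(symbols[0::2], symbols[1::2]):
            assert first != second, "missing mid-bit transition: loss of lock"
            bits.append(1 if (first, second) == (0, 1) else 0)
        return bits

    data = [1, 0, 1, 1, 0]
    assert manchester_decode(manchester_encode(data)) == data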

Derivative 1.2: GaN/SiC High-Power Core Integration

  • Enabling Description: The multi-core processor is fabricated on a hybrid substrate. The first set of processor cores (e.g., high-performance cores) is implemented using Gallium Nitride (GaN) High-Electron-Mobility Transistors (HEMTs) to enable operation at extremely high frequencies (>10 GHz) and higher supply voltages (e.g., 5V). The second set of processor cores (e.g., efficiency cores) is implemented in standard low-power silicon CMOS (e.g., at 0.8V). The interface block must therefore bridge a large voltage differential. The level shifters (202) are implemented as a cascade of diode-based clamps and Schmitt triggers robust enough to translate the 5V GaN logic levels down to the 0.8V CMOS logic levels without latch-up or damage. A behavioral Python sketch of the cascade follows the diagram.

  • Mermaid Diagram:

    sequenceDiagram
        participant GaN_Core as GaN Core Set<br>(5V Logic)
        participant Interface as Interface Block
        participant CMOS_Core as CMOS Core Set<br>(0.8V Logic)
    
        GaN_Core->>+Interface: Transmit High-Voltage Signal (5V)
        Interface->>Interface: Cascade Level Shifter (5V -> 2.5V -> 0.8V)
        Interface->>+CMOS_Core: Deliver Low-Voltage Signal (0.8V)
        CMOS_Core-->>-Interface: Acknowledge
        Interface-->>-GaN_Core: Acknowledge
    
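As a behavioral (not circuit-level) illustration of the cascaded translation, here is a short Python sketch; the intermediate 2.5V stage and all threshold values are illustrative assumptions.

    # Each stage clamps the signal into its own rail and applies Schmitt-
    # trigger hysteresis, so noise on the 5V GaN signal cannot cause
    # spurious toggles in the 0.8V CMOS domain. Hypothetical model.

    class SchmittStage:
        def __init__(self, v_rail, v_high, v_low):
            self.v_rail = v_rail      # output swings 0 .. v_rail
            self.v_high = v_high      # rising threshold
            self.v_low = v_low        # falling threshold (v_low < v_high)
            self.state = 0

        def step(self, v_in):
            if self.state == 0 and v_in >= self.v_high:
                self.state = 1
            elif self.state == 1 and v_in <= self.v_low:
                self.state = 0
            return self.state * self.v_rail

    # 5V GaN domain -> 2.5V intermediate stage -> 0.8V CMOS domain.
    cascade = [SchmittStage(2.5, 3.5, 1.5), SchmittStage(0.8, 1.75, 0.75)]

    def translate(v_gan):
        v = v_gan
        for stage in cascade:
            v = stage.step(v)
        return v

    # A noisy 5V level settles to a clean 0.8V output.
    for sample in [0.0, 4.8, 4.6, 5.0]:
        out = translate(sample)
    assert out == 0.8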

Derivative 1.3: MEMS Resonator Clock Source

  • Enabling Description: The per-stripe Phase-Locked Loops (PLLs) are replaced by Micro-Electro-Mechanical System (MEMS) resonators. Each stripe (112, 114) has its own tunable MEMS resonator coupled with a driver circuit. A frequency change request (as in Claim 15) is serviced by applying a new DC bias voltage to the MEMS structure, which changes its resonant frequency. The "lock acquisition" step (408) is replaced by a "frequency stabilization" check, in which the output frequency from the MEMS driver is compared against a reference using a digital frequency counter until the drift is below a specified tolerance (e.g., 10 ppm). This component substitution offers a higher Q-factor and lower phase noise than traditional LC-tank PLLs. A Python sketch of the stabilization loop follows the diagram.

  • Mermaid Diagram:

    stateDiagram-v2
        [*] --> Stable_FreqA
        Stable_FreqA --> Changing_Freq: Receive Freq Change Request
        Changing_Freq --> Stable_FreqB: Bias voltage applied, resonator settles
        state Changing_Freq {
            direction LR
            [*] --> Applying_Bias
            Applying_Bias --> Measuring_Drift: New DC Bias applied to MEMS
            Measuring_Drift --> Measuring_Drift: Drift > 10 ppm
            Measuring_Drift --> Freq_Stable: Drift <= 10 ppm
            Freq_Stable --> [*]
        }
        Stable_FreqB --> [*]
    
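The stabilization check lends itself to a simple polling loop. The following Python sketch shows the ppm computation and the exit condition; the resonator model and settling constant are illustrative assumptions.

    # After a new DC bias is applied, poll a digital frequency counter until
    # the drift is at or below 10 ppm (the replacement for lock step 408).

    def drift_ppm(measured_hz, target_hz):
        return abs(measured_hz - target_hz) / target_hz * 1e6

    def stabilize(read_counter_hz, target_hz, tol_ppm=10.0, max_polls=10_000):
        """Returns the number of counter reads needed, or raises if the
        resonator never settles (the failure mode handled in Axis 5)."""
        for polls in range(1, max_polls + 1):
            if drift_ppm(read_counter_hz(), target_hz) <= tol_ppm:
                return polls
        raise TimeoutError("MEMS resonator failed to settle")

    # Toy resonator that settles exponentially toward the new target.
    def make_resonator(start_hz, target_hz, settle=0.9):
        f = [start_hz]
        def read():
            f[0] = target_hz + (f[0] - target_hz) * settle
            return f[0]
        return read

    polls = stabilize(make_resonator(1.00e9, 1.20e9), target_hz=1.20e9)
    print(f"stable after {polls} counter reads")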

Axis 2: Operational Parameter Expansion

Derivative 2.1: Wafer-Scale Implementation

  • Enabling Description: The technology is applied to a Wafer-Scale Engine (WSE). The "stripes" are entire silicon reticles, each containing hundreds of cores. Each reticle operates at an independently regulated supply voltage and is clocked by a unique, wafer-region-specific clock tree. The "interface block" is the physical boundary logic between adjacent reticles, handling communication across the scribe lines. The method of idling communication (Claim 15) is critical, as an entire reticle can be powered down or have its frequency changed for yield management (to disable a faulty reticle) or to create a specialized high-performance zone on one part of the wafer. The transition processing routine (308) is managed by a dedicated controller located at the center of the wafer. A Python sketch of the controller's yield-management action follows the diagram.

  • Mermaid Diagram:

    graph TD
        subgraph Wafer
            R1(Reticle 1<br>Vdd_A, Clock_A)
            R2(Reticle 2<br>Vdd_B, Clock_B)
            R3(Reticle 3<br>Vdd_C, Clock_C)
            R4(Reticle 4<br>Vdd_D, Clock_D)
            R1 <-->|Interface Logic| R2
            R1 <-->|Interface Logic| R3
            R2 <-->|Interface Logic| R4
            R3 <-->|Interface Logic| R4
        end
        WaferControl(Wafer-Scale Controller) -.-> R1
        WaferControl -.-> R2
        WaferControl -.-> R3
        WaferControl -.-> R4
    
        style R1 fill:#d5f,stroke:#333
        style R2 fill:#d5f,stroke:#333
        style R3 fill:#d5f,stroke:#333
        style R4 fill:#d5f,stroke:#333
    
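The controller's yield-management action reduces to graph surgery on the reticle mesh, with each interface idled before its reticle is power-gated. A Python sketch follows; the data structures and method names are assumptions.

    # Disable a faulty reticle: idle every interface to its neighbours first
    # (the Claim 15 idle step), then remove it from the routing graph.

    from collections import defaultdict

    class WaferController:
        def __init__(self):
            self.links = defaultdict(set)    # reticle id -> neighbour ids

        def connect(self, a, b):
            self.links[a].add(b)
            self.links[b].add(a)

        def idle_interface(self, a, b):
            print(f"interface {a}<->{b}: idled")

        def disable_reticle(self, r):
            for neighbour in list(self.links[r]):
                self.idle_interface(r, neighbour)   # quiesce before power-down
                self.links[neighbour].discard(r)
            self.links.pop(r, None)
            print(f"reticle {r}: power gated, removed from routing graph")

    wafer = WaferController()
    for a, b in [(1, 2), (1, 3), (2, 4), (3, 4)]:
        wafer.connect(a, b)
    wafer.disable_reticle(3)    # e.g. a reticle that failed burn-in test
    assert 3 not in wafer.links[1] and 3 not in wafer.links[4]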

Derivative 2.2: Cryogenic Quantum Controller Application

  • Enabling Description: The processor is designed to control a qubit array in a dilution refrigerator. A first set of cores operates at the 4K stage of the cryostat, handling high-level quantum algorithm compilation and scheduling. A second set of cores operates at the 50mK stage, directly adjacent to the qubits, and is responsible for generating low-level microwave control pulses. The two sets operate at vastly different voltages and clock speeds due to the thermal and material property differences. The interface block uses superconducting single-flux quantum (SFQ) logic to transfer data with minimal heat dissipation. The method of Claim 15 is used when recalibrating the qubits, which requires changing the microwave pulse frequency, forcing the 50mK cores to change their clock, idle the SFQ interface, and re-stabilize before continuing the experiment. A Python sketch of the recalibration sequence follows the diagram.

  • Mermaid Diagram:

    graph LR
        subgraph 4K_Stage [4K Stage]
            ControlCores(Control Core Set<br>Vdd_High, Clock_High)
        end
        subgraph 50mK_Stage [50mK Stage]
            PulseGenCores(Pulse Generation Cores<br>Vdd_Low, Clock_Low)
        end
        subgraph SFQ_Interface [Superconducting Interface]
            InterfaceLogic(SFQ Logic)
        end
        Qubits((Qubit Array))
    
        ControlCores --> InterfaceLogic
        InterfaceLogic --> PulseGenCores
        PulseGenCores --> Qubits
    
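The ordering constraint is the important detail: the SFQ link must be idle before the 50mK cores retune, so no flux pulses are in flight during the clock change. A minimal Python sketch, with hypothetical class and method names:

    class SfqLink:
        def idle(self):   print("SFQ link: idled (no flux pulses in flight)")
        def resume(self): print("SFQ link: resumed")

    class PulseCores:
        def retune(self, ghz):
            print(f"50mK cores: clock retuned for {ghz} GHz drive pulses")
            return True     # stand-in for the hardware 'clock stable' flag

    def recalibrate_qubits(link, cores, new_pulse_ghz):
        link.idle()                         # Claim 15: idle communications
        stable = cores.retune(new_pulse_ghz)
        if stable:
            link.resume()                   # Claim 15: resume communications
        return stable

    recalibrate_qubits(SfqLink(), PulseCores(), new_pulse_ghz=5.1)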

Axis 3: Cross-Domain Application

Derivative 3.1: Aerospace - Hypersonic Vehicle Flight Control

  • Enabling Description: In a flight control computer for a hypersonic scramjet vehicle, the processor has two core sets. The "Atmospheric Flight" set runs at high voltage and clock frequency, processing thousands of sensor inputs per second from control surfaces. The "Guidance & Navigation" set runs at a lower voltage and frequency, handling GPS data and communication with ground control. The interface block is a radiation-hardened, fault-tolerant data bus. When the vehicle reaches a certain altitude and speed, the scramjet ignites, requiring the flight control system to enter a new operational mode. This triggers a frequency change request for the Atmospheric Flight cores. Communication with the Guidance cores is idled (Claim 15) for a few microseconds to ensure no corrupt data is passed during the transition, preventing catastrophic control failure. A Python sketch of the time-boxed transition follows the diagram.

  • Mermaid Diagram:

    flowchart TD
        A[Start Flight] --> B{Altitude < 100k ft?};
        B -- Yes --> C[Run Atmospheric Cores @ High Power];
        C --> D{Process Sensor Data};
        D --> E[Interface with Guidance Cores];
        E --> B;
        B -- No --> F[Transition to Hypersonic Mode];
        F --> G[Request Freq Change for Atmo Cores];
        G --> H[Idle Interface to Guidance Cores];
        H --> I{Atmo Core PLL Locked?};
        I -- No --> I;
        I -- Yes --> J[Resume Interface];
        J --> K[Run Atmo Cores @ New Freq];
        K --> End;
    
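Because the control loop has a hard deadline, the idle window is time-boxed. The Python sketch below shows the guard; the budget figure and the injected clock are illustrative assumptions.

    # If the PLL does not lock inside the control-loop budget, the transition
    # aborts and the previous mode is held rather than missing a deadline.

    def change_mode(request_pll, pll_locked, now, budget_s=50e-6):
        t0 = now()
        request_pll("hypersonic")       # idle comms, start the freq change
        while not pll_locked():
            if now() - t0 > budget_s:
                return False            # abort: hold previous mode, flag fault
        return True                     # resume comms at the new frequency

    # Deterministic demo: a fake 10-us-per-poll clock and a PLL that locks
    # on the third poll.
    ticks = iter(range(100))
    polls = [False, False, True]
    ok = change_mode(lambda mode: None, lambda: polls.pop(0),
                     now=lambda: next(ticks) * 10e-6)
    assert ok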

Derivative 3.2: AgTech - Autonomous Vineyard Robot

  • Enabling Description: An autonomous robot for monitoring grapevines uses a processor with a high-power core set for real-time stereoscopic computer vision (detecting disease, counting grapes) and motor control. A second, ultra-low-power core set handles long-range LoRaWAN communication and passive sensor monitoring (soil moisture, temperature). When the robot navigates from a sunlit row to a shaded row, the computer vision algorithm's required performance changes. The vision core set's clock frequency and voltage are reduced to save power. During this transition, the interface to the LoRaWAN core set is idled to prevent garbled telemetry data from being transmitted. The robot pauses for the milliseconds required for the PLL to re-lock before resuming its path. A Python sketch of the telemetry buffering follows the diagram.

  • Mermaid Diagram:

    sequenceDiagram
        participant Vision as Vision Core Set (High Power)
        participant LoRa as LoRa Core Set (Low Power)
        participant Routine as Transition Routine
    
        loop Sunlight Navigation
            Vision->>Vision: Process High-Def Images
        end
        Note over Vision, LoRa: Robot enters shaded row
        Routine->>Vision: Request Clock Freq Decrease
        Routine->>Vision: IDLE communication link
        Routine->>LoRa: IDLE communication link
        par Vision PLL retune
            Vision->>Vision: Change PLL Frequency
        and Telemetry buffering
            LoRa->>LoRa: Buffer outgoing telemetry
        end
        Routine->>Vision: Wait for PLL lock signal
        Routine->>Vision: RESUME communication link
        Routine->>LoRa: RESUME communication link
        loop Shade Navigation
            Vision->>Vision: Process Low-Light Images
        end
    
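The buffering behavior during the idle window can be sketched in a few lines of Python; the class and field names are assumptions.

    # While the vision cores' PLL re-locks, outgoing LoRaWAN frames are queued
    # instead of transmitted, so no garbled packets leave the robot.

    from collections import deque

    class TelemetryLink:
        def __init__(self):
            self.idle = False
            self.pending = deque()

        def send(self, frame):
            if self.idle:
                self.pending.append(frame)     # buffer during the transition
            else:
                print(f"tx: {frame}")

        def resume(self):
            self.idle = False
            while self.pending:                # flush in arrival order
                print(f"tx (flushed): {self.pending.popleft()}")

    link = TelemetryLink()
    link.send("soil_moisture=41%")
    link.idle = True                           # vision cores begin V/F change
    link.send("row=12 shade=True")
    link.resume()                              # PLL locked: flush and continue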

Axis 4: Integration with Emerging Tech

Derivative 4.1: AI-Driven Predictive Power Management

  • Enabling Description: The clock control block (110) and power control block (108) are managed by an on-chip Reinforcement Learning (RL) agent. The RL agent's state is defined by performance counters, thermal sensors, and instruction pipeline statistics from all core sets. Its action space is the set of possible voltage/frequency pairs for each stripe. The reward encourages energy efficiency (performance per watt) while penalizing excursions beyond the thermal design power (TDP) budget. The RL agent learns the workload patterns and predictively initiates frequency changes before they are demanded by the software. It also learns the precise settling time of each PLL, allowing it to minimize the communication idle time (Claim 15) to the physical minimum required for that specific transition. A toy Q-learning sketch in Python follows the diagram.

  • Mermaid Diagram:

    graph TD
        subgraph OnChip_Fabric [On-Chip Fabric]
            CoreSet_A(Core Set A) -- Telemetry --> RL_Agent;
            CoreSet_B(Core Set B) -- Telemetry --> RL_Agent;
            RL_Agent(RL Agent);
            PowerControl(Power Control Block)
            ClockControl(Clock Control Block)
            RL_Agent -- Action: V/F Pairs --> PowerControl;
            RL_Agent -- Action: V/F Pairs --> ClockControl;
            PowerControl -- Vdd_A/Vdd_B --> CoreSet_A & CoreSet_B;
            ClockControl -- Clock_A/Clock_B --> CoreSet_A & CoreSet_B;
        end
        style RL_Agent fill:#f96,stroke:#333
    
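A toy tabular sketch in Python conveys the shape of the learning problem; the discretization, reward shaping, and plant model are all illustrative assumptions (the update shown is a one-step, bandit-style simplification of full RL).

    import random

    ACTIONS = [(0.8, 1.0), (1.0, 2.0), (1.2, 3.0)]   # (Vdd volts, GHz) pairs
    TDP_W = 15.0
    q = {}    # (state, action index) -> learned value

    def power_w(v, f_ghz):                # crude C*V^2*f-style plant model
        return 4.0 * v * v * f_ghz

    def reward(util, v, f_ghz):           # perf/watt, with a TDP penalty
        perf = util * f_ghz
        p = power_w(v, f_ghz)
        return perf / p - (1.0 if p > TDP_W else 0.0)

    def step(state, alpha=0.1, eps=0.1):  # epsilon-greedy action selection
        a = (random.randrange(len(ACTIONS)) if random.random() < eps
             else max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0)))
        v, f = ACTIONS[a]
        old = q.get((state, a), 0.0)
        q[(state, a)] = old + alpha * (reward(state, v, f) - old)

    random.seed(0)
    for _ in range(2000):
        step(state=random.choice([0.2, 0.9]))   # low/high utilization states
    best = max(range(len(ACTIONS)), key=lambda i: q.get((0.2, i), 0.0))
    print("preferred V/F pair at low utilization:", ACTIONS[best])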

Derivative 4.2: IoT Sensor-Triggered Proactive Idling

  • Enabling Description: The multi-core processor is integrated with a dense array of on-die IoT sensors, including voltage-droop monitors, thermal diodes, and current sensors. These sensors feed a real-time monitoring unit. If the monitor detects a voltage droop on Stripe A that exceeds a critical threshold, it preemptively triggers the "idle communication" step (404) between Stripe A and its neighbors before a clock or data error can occur. It then instructs the power control block to increase Vdd_A or, if that fails, instructs a task scheduler to migrate the workload off Stripe A. Communication is only resumed after the sensor network confirms the supply voltage has stabilized. This turns the reactive method of Claim 15 into a proactive, fault-avoidance mechanism. A Python sketch of the droop monitor follows the diagram.

  • Mermaid Diagram:

    stateDiagram-v2
        state "Normal Operation" as Normal
        Proactive_Idle : Idles interface, attempts Vdd correction
        [*] --> Normal
        Normal --> Proactive_Idle: Droop Sensor > Threshold
        Proactive_Idle --> Normal: Voltage Stabilized
        Proactive_Idle --> Fail_State: Voltage Fails to Stabilize
    
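A Python sketch of the monitor loop follows; the rail values, threshold, and callback names are illustrative assumptions.

    # Idle Stripe A's interfaces *before* a timing error can occur, attempt a
    # Vdd correction, and resume only once samples are back at nominal.

    NOMINAL_V, DROOP_V = 0.80, 0.72     # assumed rail and critical threshold

    def monitor(samples, idle, boost_vdd, resume):
        idled = False
        for v in samples:
            if not idled and v < DROOP_V:
                idle()                  # step 404, triggered pre-emptively
                boost_vdd()
                idled = True
            elif idled and v >= NOMINAL_V:
                resume()                # sensors confirm the rail is stable
                idled = False
        return idled                    # True would mean: escalate and migrate

    trace = [0.80, 0.79, 0.71, 0.74, 0.80, 0.80]
    still_idle = monitor(trace,
                         idle=lambda: print("stripe A interfaces: idled"),
                         boost_vdd=lambda: print("power block: raising Vdd_A"),
                         resume=lambda: print("stripe A interfaces: resumed"))
    assert not still_idle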

Axis 5: The "Inverse" or Failure Mode

Derivative 5.1: Graceful Degradation via Asynchronous Link

  • Enabling Description: The interface block (300) is enhanced with a parallel, low-bandwidth, asynchronous serial link (e.g., a 2-wire I2C-like bus) in addition to the primary high-speed synchronous bus. When a clock frequency change is requested, the transition processing routine (308) does not completely "idle" communications. Instead, it disables the high-speed bus and enables the asynchronous link. This allows low-bandwidth but critical information such as heartbeats, status flags, or emergency commands to continue flowing between core sets during the PLL re-locking period. Once the new clock is stable, the asynchronous link is disabled and the high-speed bus is re-enabled, ensuring zero downtime for critical state awareness. A Python sketch of the dual-path interface follows the diagram.

  • Mermaid Diagram:

    sequenceDiagram
        autonumber
        participant A as Core A
        participant B as Core B
        participant IF as Interface Block
    
        A->>IF: High-speed Data
        IF->>B: High-speed Data
        Note over A,B: Freq Change Request
        IF->>IF: Disable High-Speed Bus
        IF->>IF: Enable Async Low-Speed Bus
        A->>IF: Low-speed Heartbeat
        IF->>B: Low-speed Heartbeat
        Note over A,B: PLL Re-locks
        IF->>IF: Disable Async Low-Speed Bus
        IF->>IF: Enable High-Speed Bus
        A->>IF: High-speed Data
        IF->>B: High-speed Data
    
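The routing rule is simple: critical message classes fall back to the slow path while the high-speed bus is down; bulk traffic waits. A Python sketch with hypothetical names:

    class DualPathInterface:
        CRITICAL = {"heartbeat", "status", "emergency_stop"}

        def __init__(self):
            self.hs_bus_up = True

        def begin_transition(self):     # frequency change requested
            self.hs_bus_up = False      # async low-speed link stays up

        def end_transition(self):       # PLL has re-locked
            self.hs_bus_up = True

        def send(self, kind, payload):
            if self.hs_bus_up:
                return f"high-speed bus: {kind}={payload}"
            if kind in self.CRITICAL:
                return f"async 2-wire link: {kind}={payload}"
            return None                 # bulk data waits for the new clock

    bus = DualPathInterface()
    bus.begin_transition()
    assert bus.send("heartbeat", 1).startswith("async")
    assert bus.send("dma_block", "...") is None
    bus.end_transition()
    assert bus.send("dma_block", "...").startswith("high-speed")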

Derivative 5.2: Failsafe Permanent Idle Mode

  • Enabling Description: A hardware watchdog timer is associated with each PLL. If a PLL fails to acquire a lock within a programmable time window after a frequency change command (as determined in step 408), the watchdog triggers a "permanent idle" state. The interface block tri-states all its outputs connected to the faulty core set's stripe. The transition routine then logs the error and signals the hypervisor or operating system to permanently de-schedule any tasks for that hardware region, effectively and safely removing the faulty stripe from the system's available resources without causing a system crash. A Python sketch of the watchdog path follows the diagram.

  • Mermaid Diagram:

    flowchart TD
        Start(Freq Change Req) --> Idle(Idle Comms);
        Idle --> Change(Change PLL Freq);
        Change --> StartTimer(Start Watchdog Timer);
        StartTimer --> CheckLock{PLL Locked?};
        CheckLock -- Yes --> Resume(Resume Comms);
        Resume --> End(End);
        CheckLock -- No --> CheckTimer{Timer Expired?};
        CheckTimer -- No --> CheckLock;
        CheckTimer -- Yes --> FailSafe(Enter Permanent Idle);
        FailSafe --> Log(Log Error & Notify OS);
        Log --> End;
    
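The watchdog path can be sketched as a bounded re-lock loop in Python; the lock window and callback names are illustrative assumptions.

    import time

    def watchdog_relock(pll_locked, tri_state_outputs, notify_os,
                        window_s=0.001):
        deadline = time.monotonic() + window_s
        while time.monotonic() < deadline:
            if pll_locked():
                return "resumed"            # normal path: resume comms
        tri_state_outputs()                 # isolate the faulty stripe
        notify_os("stripe offline: PLL lock timeout")
        return "permanent_idle"             # stripe removed from resources

    # A faulty PLL that never locks: the watchdog fires after ~1 ms.
    state = watchdog_relock(pll_locked=lambda: False,
                            tri_state_outputs=lambda: print("outputs: hi-Z"),
                            notify_os=lambda msg: print("OS:", msg))
    assert state == "permanent_idle"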

Combination Prior Art Scenarios

  1. With RISC-V TileLink: A multi-core processor based on the RISC-V ISA is architected with two distinct core clusters: a high-performance cluster of "BOOM" (Berkeley Out-of-Order Machine) cores and an efficiency cluster of "Rocket" cores. These clusters operate in separate voltage and frequency domains. They are interconnected using the open-standard TileLink cache-coherent bus. The "interface block" of Claim 1 is implemented as a TileLink-to-TileLink bridge that incorporates the necessary level-shifting and clock-domain-crossing (CDC) logic. The method of Claim 15 is executed by a dedicated management core when the operating system requests a performance state change for the BOOM cluster. The TileLink protocol's channel handshakes are used to "idle" the link by stalling new requests until a hardware signal confirms the BOOM cluster's new clock is stable.

  2. With ARM AMBA AXI: The invention is applied to an ARM big.LITTLE system. The "first set of processor cores" is an ARM Cortex-A7x cluster (the "big" cores) and the "second set" is an ARM Cortex-A5x cluster (the "LITTLE" cores). These are in distinct voltage/frequency domains. The "interface block" is an AMBA cache-coherent interconnect (e.g., an ACE/AXI-based fabric such as ARM's CoreLink CCI) that bridges the two clusters. This fabric inherently contains the synchronizers (CDC logic) and logic to support the different power domains. A frequency change, managed by a Power Policy Unit (PPU), uses the AXI protocol's ability to "quiesce" traffic on specific channels. This quiescence corresponds to "idling communications" (Claim 15), and is maintained until the PPU receives a "PLL lock" confirmation from the target cluster's clock controller, after which it de-asserts the quiesce signal to "resume communications".

  3. With OpenPOWER CAPI/OpenCAPI: The technology is embodied in a system using the Open Coherent Accelerator Processor Interface (OpenCAPI) standard. The "first set of cores" is a standard IBM POWER CPU core complex. The "second set of cores" is a field-programmable gate array (FPGA) acting as a hardware accelerator. The POWER CPU and the FPGA have completely independent power and clocking. The "interface block" is the physical layer (PHY) and link layer of the OpenCAPI interface itself, which is designed to handle communication between such disparate domains. The method of Claim 15 is used when the FPGA is partially reconfigured or its internal clock is changed to optimize for a new algorithm. The OpenCAPI link layer protocol is commanded to enter a low-power "retrain" state (the equivalent of "idling"), the FPGA PLL re-locks, and then the link is retrained and brought back to full operation ("resuming communications").
