Patent US20180240021A1
Derivative works
Defensive disclosure: derivative variations of each claim designed to render future incremental improvements obvious or non-novel.
Defensive Disclosure and Prior Art Generation
Document ID: D-9384-A1
Publication Date: 2026-05-01
Title: Systems and Methods for Predictive Classification Modeling Across Multiple Technical Domains
Subject Matter: This document discloses derivative inventions, alternative embodiments, and cross-domain applications of the core methodology described in US Patent 11,087,221 B2 (based on application US20180240021A1), thereby placing them in the public domain. The core methodology involves: (1) receiving parameters for a target system, (2) forming a classification model from simulation data, (3) generating a probabilistic estimate of the target system's performance, and (4) using the estimate to guide a decision.
Derivative Embodiment 1: Component Substitution with Advanced Neural Architectures
Enabling Description: The classification model described in the reference patent can be substantially improved by replacing the Multilayer Perceptron (MLP), Bayesian, and DTW models with a Transformer-based architecture. Time-series data from the reservoir simulator (e.g., pressure, flow rate, water cut at discrete time steps) is treated as a sequence, analogous to words in a sentence. Each time-step's vector of parameters (e.g., [pressure, temp, saturation]) is converted into a high-dimensional embedding vector. Positional encodings are added to these embeddings to retain temporal information. The entire sequence of embedded vectors is then processed by a multi-head self-attention mechanism within a Transformer encoder stack. This allows the model to learn complex, non-linear dependencies between distant time steps in the simulation, providing a more accurate classification of the well's long-term performance (e.g., "Good," "Bad," "Requires Intervention"). The final classification is produced by a linear layer and a softmax function applied to the output of the [CLS] token embedding from the Transformer. This approach is superior for identifying subtle patterns in long-duration simulations that simpler models would miss.
sequenceDiagram
participant Sim as Reservoir Simulator
participant Emb as Tokenizer & Embedder
participant Trans as Transformer Encoder
participant Classifier as Output Layer
Sim->>Emb: Generate Time-Series Data (Vectors)
Emb->>Emb: Convert each vector to an embedding
Emb->>Emb: Add Positional Encodings
Emb->>Trans: Pass sequence of encoded vectors
Trans->>Trans: Apply Multi-Head Self-Attention
Trans->>Classifier: Pass final hidden state of [CLS] token
Classifier->>User: Output Probabilistic Classification (Good/Bad)
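The pipeline in the sequence diagram can be sketched end-to-end. The following is a minimal single-head, single-layer NumPy sketch with randomly initialised (untrained) weights; all weight matrices, dimensions, and the helper names are illustrative assumptions, not values from the reference patent, and a real embodiment would use a trained multi-head, multi-layer Transformer encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def positional_encoding(seq_len, d):
    # Sinusoidal encodings, so the model retains temporal ordering.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def classify_simulation(series, n_classes=3, d=16):
    """series: (T, F) array of per-time-step simulator outputs
    (e.g. pressure, temperature, saturation)."""
    T, F = series.shape
    W_embed = rng.normal(size=(F, d)) * 0.1      # hypothetical learned embedding
    cls_tok = rng.normal(size=(1, d)) * 0.1      # learnable [CLS] vector
    x = np.vstack([cls_tok, series @ W_embed])   # prepend [CLS] to the sequence
    x = x + positional_encoding(T + 1, d)
    # One head of self-attention (a full encoder stacks several).
    Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d), axis=-1)
    h = attn @ v
    W_out = rng.normal(size=(d, n_classes))
    return softmax(h[0] @ W_out)                 # classify from the [CLS] position

probs = classify_simulation(rng.normal(size=(50, 3)))  # 50 time steps, 3 params
```

The three output probabilities correspond to the "Good" / "Bad" / "Requires Intervention" classes named in the description.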
Derivative Embodiment 2: Operational Parameter Expansion to Cryogenic Fluid Sequestration
Enabling Description: The methodology is applied to the geological sequestration of cryogenically stored fluids, such as liquid nitrogen or liquefied captured carbon dioxide, in subterranean salt caverns or depleted gas reservoirs. The operational parameters are expanded to include extreme low temperatures (-150°C to -50°C) and high pressures (200-300 bar). The reservoir simulation models are adapted to include multiphase fluid dynamics under cryogenic conditions, incorporating the Joule-Thomson effect and phase-change boundaries within the reservoir rock. The classification model is trained on simulation data to predict the long-term stability of the sequestration site. It classifies potential injection scenarios as "Stable Sequestration," "High Risk of Caprock Fracture," or "Potential for Uncontrolled Phase Transition." Input parameters include injection temperature, pressure curves, cavern geometry, and rock thermal conductivity. The probabilistic estimate guides the operational plan for the safe and permanent storage of industrial gases.
graph TD
A[Define Injection Scenario] -- Temp, Pressure, Duration --> B(Geomechanical & Thermal Simulation);
B -- "Simulation Output (Stress fields, Temp gradients)" --> C{Train Classification Model};
C -- Trained Model --> D[Input Target Scenario];
D --> E{Generate Probabilistic Estimate};
E -- 95% --> F[Class: Stable Sequestration];
E -- 4% --> G[Class: High Risk of Caprock Fracture];
E -- 1% --> H[Class: Uncontrolled Phase Transition];
F --> I[Decision: Proceed with Injection];
G --> J[Decision: Redesign or Abort];
H --> J;
style F fill:#9f9,stroke:#333,stroke-width:2px
style G fill:#f99,stroke:#333,stroke-width:2px
style H fill:#f99,stroke:#333,stroke-width:2px
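The final decision step reduces to a thresholding rule over the class probabilities. The sketch below is illustrative only: the 0.90 stability floor and 0.05 combined-hazard ceiling are assumed values, not thresholds taken from the reference patent.

```python
def sequestration_decision(probs, min_stable=0.90, max_hazard=0.05):
    """probs: dict mapping class name -> probability from the classifier.
    Returns the operational decision for the injection scenario."""
    hazard = (probs["High Risk of Caprock Fracture"]
              + probs["Potential for Uncontrolled Phase Transition"])
    if probs["Stable Sequestration"] >= min_stable and hazard <= max_hazard:
        return "Proceed with Injection"
    return "Redesign or Abort"
```

With the 95% / 4% / 1% estimate shown in the diagram, this rule returns "Proceed with Injection".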
Derivative Embodiment 3: Cross-Domain Application in Aerospace Materials Science
Enabling Description: The core method is adapted for predicting the fatigue life and failure probability of novel composite aerospace components (e.g., turbine blades, fuselage panels) under operational stress. A high-fidelity Finite Element Analysis (FEA) simulation is used in place of the reservoir simulator. The FEA model simulates decades of operational cycles (thermal, vibrational, aerodynamic loading). Thousands of simulations are run with varying material compositions (e.g., carbon fiber ply angles, resin matrix composition) and micro-fracture initial conditions. The output data (stress, strain, delamination progression) is used to train a 3D Convolutional Neural Network (3D-CNN), which acts as the classification model. For a proposed new component design, the 3D-CNN provides a probabilistic estimate of its membership in one of three classes: "Exceeds 100,000-cycle lifespan," "Fails between 50,000 and 100,000 cycles," or "Catastrophic failure before 50,000 cycles." This classification directly informs the go/no-go decision for manufacturing and physical testing, drastically reducing development costs.
flowchart LR
subgraph Simulation Phase
A[Define Material Parameters & Load Cases] --> B(Run FEA Simulations);
B --> C["Generate 4D Dataset (x,y,z,time)"];
end
subgraph Training Phase
C --> D(Train 3D-CNN Classifier);
end
subgraph Prediction Phase
E[Propose New Component Design] --> F(Input Design into 3D-CNN);
F --> G{Probabilistic Classification};
G --> H["Class 1: >100k cycles"];
G --> I["Class 2: 50k-100k cycles"];
G --> J["Class 3: <50k cycles"];
end
subgraph Decision
H --> K(Decision: Certify for Production);
I --> L(Decision: Redesign & Re-evaluate);
J --> M(Decision: Reject Design);
end
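The core operation of the 3D-CNN is a 3D convolution over the simulated field. A minimal NumPy sketch with one convolutional layer, random (untrained) kernels, and an assumed pooling-plus-softmax head; layer count, kernel sizes, and initialisation are illustrative assumptions:

```python
import numpy as np

def conv3d(volume, kernel):
    """Valid-mode 3D convolution (cross-correlation, as in most DL frameworks).
    volume: (X, Y, Z) scalar field, e.g. a stress snapshot; kernel: (k, k, k)."""
    k = kernel.shape[0]
    X, Y, Z = volume.shape
    out = np.zeros((X - k + 1, Y - k + 1, Z - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for l in range(out.shape[2]):
                out[i, j, l] = np.sum(volume[i:i+k, j:j+k, l:l+k] * kernel)
    return out

def classify_component(field, kernels, W_out):
    # One conv layer + ReLU + global average pooling + softmax head.
    feats = np.array([np.maximum(conv3d(field, kern), 0).mean()
                      for kern in kernels])
    logits = feats @ W_out
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(1)
probs = classify_component(rng.normal(size=(8, 8, 8)),            # toy stress field
                           [rng.normal(size=(3, 3, 3)) for _ in range(4)],
                           rng.normal(size=(4, 3)))               # 4 features -> 3 classes
```

A production model would stack several such layers and convolve over the full 4D (x, y, z, time) dataset; the loops here trade speed for readability.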
Derivative Embodiment 4: Cross-Domain Application in Algorithmic Trading
Enabling Description: The methodology is applied to classify the future performance of high-frequency trading (HFT) algorithms. In this context, the "reservoir simulator" is a market back-testing engine that simulates the algorithm's performance against years of historical tick-level market data. "Well parameters" are the algorithm's hyperparameters (e.g., lookback windows, risk thresholds, order sizes). The simulation results (profit/loss curves, Sharpe ratio, max drawdown) form the training data for a Long Short-Term Memory (LSTM) network, which serves as the classification model. Given a new set of hyperparameters for a target algorithm, the LSTM model produces a probabilistic estimate of its performance classification over the next quarter: "Alpha-Generating," "Market-Neutral," or "Capital-Depleting." This classification guides the decision of whether to deploy the algorithm with real capital in live markets.
stateDiagram-v2
[*] --> Backtesting
Backtesting: Run thousands of hyperparameter combinations on historical data
Backtesting --> Training: Generate performance curves
Training: Train LSTM model on performance curves
Training --> Classification
Classification: Input new algorithm's hyperparameters
Classification --> P_Alpha: P=0.6
Classification --> P_Neutral: P=0.3
Classification --> P_Loss: P=0.1
state "Decision" as D {
P_Alpha --> Deploy: Activate algorithm in live market
P_Neutral --> Review: Tweak parameters, re-classify
P_Loss --> Archive: Reject algorithm
}
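The LSTM classification stage can be sketched with a hand-rolled LSTM cell run over a performance curve. Weights below are randomly initialised for illustration (a real embodiment would train on the back-tested curves); the gate packing and dimensions are standard but the function names and sizes are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. x: input (d_in,); h, c: hidden and cell state (d,).
    W: (4d, d_in), U: (4d, d), b: (4d,) pack the input, forget, output
    gates and the candidate update."""
    d = h.shape[0]
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:d]), sigmoid(z[d:2*d]), sigmoid(z[2*d:3*d])
    g = np.tanh(z[3*d:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def classify_curve(pnl_curve, d=8, n_classes=3, seed=0):
    """Run an (untrained) LSTM over a scalar P&L curve and emit class
    probabilities for Alpha-Generating / Market-Neutral / Capital-Depleting."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(4*d, 1)) * 0.1
    U = rng.normal(size=(4*d, d)) * 0.1
    b = np.zeros(4*d)
    h = c = np.zeros(d)
    for v in pnl_curve:
        h, c = lstm_step(np.array([v]), h, c, W, U, b)
    logits = h @ rng.normal(size=(d, n_classes))
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = classify_curve(np.sin(np.linspace(0.0, 6.0, 40)))  # toy performance curve
```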
Derivative Embodiment 5: Cross-Domain Application in Agricultural Technology (AgTech)
Enabling Description: This application predicts crop yield and classifies the success of a given planting strategy. A biophysical crop growth simulator (e.g., DSSAT, APSIM) replaces the reservoir simulator. The input parameters include seed genetics, soil composition, fertilizer/irrigation schedules, and long-range weather forecasts. Thousands of simulation runs generate a dataset of potential growth outcomes over a full season. This data is used to train a Gaussian Process Classifier. For a farmer's proposed planting strategy for the upcoming season, the system generates a probabilistic estimate classifying the likely outcome as "High Yield (>90th percentile)," "Average Yield," or "Crop Failure/Low Yield (<25th percentile)." This allows for the optimization of resource allocation and the purchase of appropriate crop insurance before a single seed is planted.
graph TD
A[Input: Seed, Soil, Weather, Strategy] --> B(Crop Growth Simulation);
B -- Thousands of runs --> C(Generate Training Dataset);
C --> D(Train Gaussian Process Classifier);
E[Farmer's Proposed Plan for Season] --> F(Input to Trained Classifier);
F --> G{Probabilistic Outcome Classification};
G -- "Prob > 0.7" --> H[High Yield];
G -- "Prob > 0.2" --> I[Average Yield];
G -- "Prob < 0.1" --> J[Crop Failure];
Derivative Embodiment 6: Integration with Emerging Tech (AI, IoT, Blockchain)
Enabling Description: The patented system is integrated into a closed-loop "Digital Twin" of the reservoir, creating a dynamic, self-optimizing system.
- AI-driven Optimization: A Genetic Algorithm (GA) or Reinforcement Learning (RL) agent is used to propose the initial well configuration and location parameters. Its goal is to maximize the probability of a "Good" classification from the model, intelligently exploring the parameter space instead of relying on human engineers.
- IoT Integration: Real-time data from downhole Distributed Temperature Sensing (DTS) and Distributed Acoustic Sensing (DAS) fiber-optic cables is streamed to the system. This data provides an instantaneous, high-resolution view of fluid flow and reservoir dynamics. The classification model is continuously re-trained or fine-tuned with this live data, allowing it to adapt to changing reservoir conditions.
- Blockchain for Verification: Every simulation run, model training event, probabilistic estimate, and subsequent operational decision (e.g., "Drill Well at X, Y, Z") is recorded as a transaction on a private, permissioned blockchain (e.g., Hyperledger Fabric). This creates an immutable, auditable, and cryptographically secure log of the entire decision-making process, which can be shared with regulators, partners, or insurers to verify compliance and operational integrity.
classDiagram
class GeneticAlgorithm {
+proposeWellParameters()
}
class ReservoirSimulator {
+runSimulation(params)
}
class ClassificationModel {
-model
+train(data)
+predict(params)
}
class IoTSensorStream {
+getRealTimeData()
}
class BlockchainLedger {
+recordDecision(decisionData)
}
GeneticAlgorithm --> ReservoirSimulator : "Submits parameters"
ReservoirSimulator --> ClassificationModel : "Provides training data"
IoTSensorStream --> ClassificationModel : "Provides fine-tuning data"
ClassificationModel --> GeneticAlgorithm : "Returns fitness score"
ClassificationModel --> BlockchainLedger : "Logs prediction & decision"
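At its core, the blockchain verification component is an append-only hash chain over decision records. A minimal stdlib-only sketch (a stand-in for a real permissioned ledger such as Hyperledger Fabric, whose APIs differ substantially); class and field names are illustrative.

```python
import hashlib
import json

class DecisionLedger:
    """Append-only, hash-chained log of simulation/decision events."""
    def __init__(self):
        self.blocks = []

    def record(self, decision_data):
        """Append a record; each block commits to the previous block's hash."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = {"index": len(self.blocks),
                "data": decision_data,
                "prev_hash": prev_hash}
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.blocks.append(body)
        return body["hash"]

    def verify(self):
        """Recompute every hash; any tampering breaks the chain."""
        for i, blk in enumerate(self.blocks):
            expected_prev = self.blocks[i - 1]["hash"] if i else "0" * 64
            body = {k: blk[k] for k in ("index", "data", "prev_hash")}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if blk["prev_hash"] != expected_prev or blk["hash"] != digest:
                return False
        return True
```

Auditors (regulators, partners, insurers) can then re-verify the full decision history without trusting the operator's database.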
Derivative Embodiment 7: The "Inverse" or Failure-Mode Classifier
Enabling Description: A parallel version of the classification model is trained specifically to identify scenarios leading to undesirable or catastrophic outcomes. Instead of classifying for production ("Good"/"Bad"), this "Hazard Classifier" is trained on simulation data that is specifically labeled for failure modes. The labels include "Wellbore Instability," "Casing Collapse," "Uncontrolled Gas Kick," and "Premature Water Breakthrough." The input parameters (well configuration, drilling plan, geological properties) are processed by a Support Vector Machine (SVM) or a deep neural network trained with a focal loss function to handle the rarity of failure events. The system outputs a probabilistic estimate of the risk for each specific hazard. An operational decision to drill is only considered acceptable if the primary performance classifier predicts "Good" AND the Hazard Classifier predicts a probability below a critical safety threshold (e.g., <0.1%) for all catastrophic failure modes.
flowchart TD
A[Input: Proposed Well Plan] --> B{Performance Classifier};
A --> C{Hazard Classifier};
B --> B_Good["P(Good) > 0.8"];
B --> B_Bad["P(Good) <= 0.8"];
C --> C_Safe["P(Hazard) < 0.001"];
C --> C_Unsafe["P(Hazard) >= 0.001"];
subgraph Decision Logic
direction LR
B_Good & C_Safe --> Z[ACCEPT];
B_Good & C_Unsafe --> W[REJECT - Unsafe];
B_Bad & C_Safe --> V[REJECT - Poor Performance];
B_Bad & C_Unsafe --> W;
end
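The AND-gated acceptance rule can be stated compactly. Thresholds follow the enabling description (P(Good) > 0.8 for performance, < 0.001 per catastrophic hazard); the function name and hazard keys are illustrative assumptions.

```python
def well_plan_decision(p_good, hazard_probs,
                       perf_threshold=0.8, hazard_threshold=0.001):
    """p_good: performance classifier's P(Good) for the proposed well plan.
    hazard_probs: dict mapping each failure mode (e.g. 'gas_kick',
    'casing_collapse') to the Hazard Classifier's probability.
    Safety is checked first: any hazard over threshold rejects the plan."""
    if any(p >= hazard_threshold for p in hazard_probs.values()):
        return "REJECT - Unsafe"
    if p_good <= perf_threshold:
        return "REJECT - Poor Performance"
    return "ACCEPT"
```

Note the asymmetry: a plan that is both low-performing and unsafe is rejected as unsafe, since the safety veto dominates.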
Combination Prior Art Scenarios
Combination with TensorFlow/PyTorch and Open-Source Simulators: The entire method is implemented using exclusively open-source components. The reservoir simulation is performed using the Open Porous Media (OPM) Initiative's open-source simulator. The resulting data is processed and used to train a classification model (e.g., a Recurrent Neural Network or a Gradient Boosted Tree model like XGBoost) built, trained, and deployed using the Python-based TensorFlow or PyTorch libraries. This combination demonstrates that the entire patented workflow can be achieved without proprietary software, making the specific combination of these well-known tools for this purpose obvious to a person skilled in the art.
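The gradient-boosted-tree component of this open-source stack reduces to boosting decision stumps against the gradient of a logistic loss. A simplified pure-NumPy sketch of binary boosting (omitting XGBoost's second-order terms, regularisation, and deeper trees); the toy data and round counts are illustrative.

```python
import numpy as np

def fit_stump(X, r):
    """Least-squares decision stump fit to residuals r:
    returns (feature, threshold, left value, right value)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] <= t
            if mask.all() or not mask.any():
                continue
            lv, rv = r[mask].mean(), r[~mask].mean()
            err = ((r - np.where(mask, lv, rv)) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]

def gboost_fit(X, y, n_rounds=20, lr=0.3):
    """Binary gradient boosting with logistic loss: each stump fits the
    negative gradient (y - p) of the current model."""
    F = np.zeros(len(y))
    stumps = []
    for _ in range(n_rounds):
        p = 1.0 / (1.0 + np.exp(-F))
        j, t, lv, rv = fit_stump(X, y - p)
        stumps.append((j, t, lv, rv))
        F += lr * np.where(X[:, j] <= t, lv, rv)
    return stumps

def gboost_predict(stumps, X, lr=0.3):
    F = np.zeros(len(X))
    for j, t, lv, rv in stumps:
        F += lr * np.where(X[:, j] <= t, lv, rv)
    return 1.0 / (1.0 + np.exp(-F))

# Toy simulation features with labels 0 = "Bad" well, 1 = "Good" well.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])
preds = gboost_predict(gboost_fit(X, y), X)
```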
Combination with Kubernetes and OPC UA: The system is architected as a cloud-native microservices application managed by Kubernetes. The reservoir simulator is a scalable service, the model training is another, and the prediction/inference is a lightweight API endpoint. Real-world operational data for model re-training is ingested from field equipment (pumps, chokes, sensors) using the OPC UA industrial communication standard (IEC 62541). This data is fed into a Kafka data stream, processed, and used to trigger automated model re-training jobs within the Kubernetes cluster. This architecture makes the patented process scalable, resilient, and interoperable with standard industrial control systems.
Combination with Apache Spark and MLlib: For extremely large reservoirs requiring massive ensembles of simulations ("giant reservoirs"), the data processing and model training steps are performed on a distributed computing cluster using Apache Spark. The simulation outputs, stored in a data lake (e.g., in Parquet format), are loaded into a Spark Resilient Distributed Dataset (RDD) or DataFrame. The feature engineering and training of the classification models (e.g., Naive Bayes, Decision Trees, K-Means Clustering, as mentioned in the patent) are executed in a massively parallel fashion using Spark's built-in MLlib library. This combination addresses the computational bottleneck of the original patent, demonstrating an obvious path to scale the method for industrial-level problems using standard, open-source big data technologies.