Nucliex Scientific Pipeline
From JPEG to Operational Intelligence
Overcoming vendor data limitations through signal reconstruction, synthetic augmentation, and deep learning. This document outlines the complete scientific and engineering pipeline that transforms constrained visual outputs into actionable inspection analytics.
Problem Statement
The Constraint
The current vendor NDT system outputs JPEG images only: no raw signal access, no structured data export, and no API or SDK. The manufacturer has confirmed that no intermediate data is available.
Our Position
This is NOT a dead end — it is a data engineering challenge. JPEG images contain encoded signal information that can be computationally recovered and analyzed using modern CV and ML techniques.
Our Approach
Treat JPEG as a compressed signal representation. Apply image digitization, physics-based reconstruction, and deep learning to extract quantitative inspection data with validated accuracy.
"Tying to JPG is a dead-end path. We need to do something of our own." — This page demonstrates exactly that: a self-contained scientific pipeline that transcends the vendor limitation and builds proprietary analytical capability.
Signal Reconstruction Pipeline
The crown jewel of our approach. Each vendor JPEG containing a signal plot is processed through an 8-stage pipeline to reconstruct a calibrated numerical time-series.
JPEG Input
Vendor NDT software output. Typical resolution 1920×1080, Q≈80. Contains signal plot, metadata overlays, and color-mapped amplitude.
Preprocessing
Grayscale conversion, noise reduction (Gaussian blur σ=1.5), contrast enhancement (CLAHE, clipLimit=2.0, tileGridSize=8×8).
Edge Detection
Canny edge detector (hysteresis thresholds 50 and 150), Hough line transform for axis detection, and grid-line removal via morphological operations.
Curve Extraction
Pixel-level trace following, connected component analysis. Sub-pixel interpolation using cubic splines for smooth signal recovery.
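As a rough illustration of trace following, the sketch below recovers one sub-pixel row position per image column. It assumes a grayscale image held as nested lists with the trace darker than the background, and it substitutes a simple intensity-weighted centroid for the connected-component and cubic-spline steps named above. The function name `extract_trace` and the `dark_threshold` default are hypothetical.

```python
from typing import Optional


def extract_trace(gray: list[list[int]], dark_threshold: int = 128) -> list[Optional[float]]:
    """Return a sub-pixel row position per column, or None where no trace pixel exists.

    Assumption: the trace is darker than `dark_threshold`, the background is lighter.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    trace: list[Optional[float]] = []
    for col in range(width):
        weight_sum = 0.0
        weighted_rows = 0.0
        for row in range(height):
            value = gray[row][col]
            if value < dark_threshold:
                # Darker pixels contribute more to the centroid.
                weight = float(dark_threshold - value)
                weight_sum += weight
                weighted_rows += weight * row
        trace.append(weighted_rows / weight_sum if weight_sum > 0 else None)
    return trace
```

Columns that return `None` are gaps in the trace (for example where a grid line was removed) and would need interpolation before any smoothing.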
Calibration
Axis mapping (pixel → physical units) via OCR on tick labels. Reference point alignment against known calibration blocks.
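The pixel-to-physical mapping can be sketched as a linear fit through two tick labels. This assumes a linear axis and that OCR has already produced two (pixel, value) pairs; `make_axis_mapper` is a hypothetical helper name.

```python
from typing import Callable


def make_axis_mapper(pixel_a: float, value_a: float,
                     pixel_b: float, value_b: float) -> Callable[[float], float]:
    """Build a pixel -> physical-unit mapping from two tick labels on a linear axis."""
    if pixel_a == pixel_b:
        raise ValueError("tick positions must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_physical(pixel: float) -> float:
        return value_a + (pixel - pixel_a) * scale

    return to_physical


# Image rows grow downward, so the vertical axis is usually inverted:
# row 900 is labelled 0.0 and row 100 is labelled 10.0 in this made-up example.
row_to_amplitude = make_axis_mapper(900, 0.0, 100, 10.0)
```

Using more than two ticks and a least-squares line would make the mapping more robust to a single OCR misread.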
Numerical Signal
Time-series reconstruction at ~0.1 mm spatial resolution. Savitzky-Golay smoothing (window=11, order=3) to suppress JPEG artifacts.
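A dependency-free sketch of the smoothing step follows. It uses the standard tabulated 11-point quadratic/cubic Savitzky-Golay coefficients and, as a simplifying assumption, leaves the first and last five samples untouched. In practice `scipy.signal.savgol_filter` with `window_length=11` and `polyorder=3` would be the usual choice; `savgol_11_3` is a hypothetical name.

```python
# Tabulated Savitzky-Golay smoothing coefficients for an 11-point window
# and a quadratic/cubic polynomial; they sum to 429.
SG_11_CUBIC = (-36, 9, 44, 69, 84, 89, 84, 69, 44, 9, -36)
SG_11_NORM = 429


def savgol_11_3(signal: list[float]) -> list[float]:
    """Smooth `signal` with an 11-point cubic Savitzky-Golay filter.

    Assumption: edge samples (first and last 5) are copied unchanged.
    """
    half = len(SG_11_CUBIC) // 2
    smoothed = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        smoothed[i] = sum(c * v for c, v in zip(SG_11_CUBIC, window)) / SG_11_NORM
    return smoothed
```

Because the filter fits a cubic locally, it leaves any cubic trend unchanged while attenuating isolated single-sample spikes of the kind JPEG block artifacts can introduce.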
Feature Engineering
Peak amplitude, FWHM, energy, SNR, zero-crossings, spectral centroid. Statistical features over sliding windows.
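Several of these features can be sketched in plain Python. The definitions below are common conventions chosen for illustration rather than the pipeline's confirmed ones: energy as the sum of squares times the sample spacing, and FWHM measured around the global maximum with linear interpolation. SNR and spectral centroid are left out, and all function names are hypothetical.

```python
from typing import Optional


def peak_amplitude(signal: list[float]) -> float:
    return max(abs(v) for v in signal)


def energy(signal: list[float], dx: float = 1.0) -> float:
    return sum(v * v for v in signal) * dx


def zero_crossings(signal: list[float]) -> int:
    """Count sign changes, ignoring samples that are exactly zero."""
    count = 0
    prev = 0.0
    for v in signal:
        if v == 0:
            continue
        if prev != 0 and (v > 0) != (prev > 0):
            count += 1
        prev = v
    return count


def fwhm(signal: list[float], dx: float = 1.0) -> Optional[float]:
    """Full width at half maximum of the global peak, or None if it cannot be measured."""
    if not signal:
        return None
    peak_idx = max(range(len(signal)), key=lambda i: signal[i])
    peak = signal[peak_idx]
    if peak <= 0:
        return None
    half = peak / 2.0
    i = peak_idx
    while i > 0 and signal[i] > half:
        i -= 1
    j = peak_idx
    while j < len(signal) - 1 and signal[j] > half:
        j += 1
    if signal[i] > half or signal[j] > half:
        return None  # the peak is cut off by the edge of the signal
    # Linear interpolation of the half-maximum crossing on each flank.
    left = i + (half - signal[i]) / (signal[i + 1] - signal[i])
    right = j - (half - signal[j]) / (signal[j - 1] - signal[j])
    return (right - left) * dx
```

Sliding-window statistics can then be obtained by applying these functions to consecutive slices of the reconstructed signal.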
ML Model
Gradient boosting on tabular features for severity regression. CNN on reconstructed 1D signal for defect classification.
Mathematical Foundation
The reconstructed curve is modeled as a cubic B-spline expansion, s(x) = Σᵢ cᵢ B³ᵢ(x), where B³ᵢ are cubic B-spline basis functions and cᵢ are coefficients fitted via least-squares to extracted pixel coordinates.
Wall thickness at position x, derived by integrating the reconstructed wall-loss signal over the pipe length.
Reconstruction Accuracy
Synthetic Data Generation
Physics-based simulation produces realistic NDT signals for any defect geometry, enabling training data generation at scale without requiring additional physical inspections.
MFL — Magnetic Flux Leakage
Magnetic flux density perturbation at position (x,y) near a defect of depth d. B₀ is the saturating field, α is the leakage coefficient, σ controls spatial spread.
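The governing equation is not reproduced in the text above, so the sketch below assumes one plausible form consistent with the listed symbols: a Gaussian-shaped perturbation ΔB(x, y) = B₀ · α · d · exp(−(x² + y²) / (2σ²)). This is a simplified stand-in for illustration, not the confirmed model, and `mfl_perturbation` is a hypothetical name.

```python
import math


def mfl_perturbation(x: float, y: float, depth: float,
                     b0: float, alpha: float, sigma: float) -> float:
    """Assumed Gaussian leakage model: flux perturbation at (x, y) relative to the defect center.

    b0    -- saturating field B0
    alpha -- leakage coefficient
    sigma -- spatial spread of the leakage field
    """
    radial_sq = x * x + y * y
    return b0 * alpha * depth * math.exp(-radial_sq / (2.0 * sigma * sigma))


# Sample a scan line across the defect center (all values are illustrative).
scan_line = [mfl_perturbation(x * 0.5, 0.0, depth=2.0, b0=1.5, alpha=0.1, sigma=3.0)
             for x in range(-20, 21)]
```

Under this assumed form the perturbation scales linearly with defect depth and peaks directly over the defect center.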
UT — Ultrasonic Pulse-Echo
Ultrasonic A-scan model with exponential attenuation α, center frequency f, and phase offset φ. Reflections from defect interfaces are superimposed on the base signal.
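The A-scan equation itself is not reproduced above. A minimal sketch, assuming the common damped-cosine form s(t) = A · exp(−αt) · cos(2πft + φ) with each reflection added as a delayed copy, could look like this. The `(delay, amplitude)` echo representation and the name `ut_ascan` are assumptions for illustration.

```python
import math


def ut_ascan(times: list[float], alpha: float, freq: float, phase: float,
             echoes: list[tuple[float, float]]) -> list[float]:
    """Superimpose damped-cosine echoes; each echo is (delay_seconds, amplitude)."""
    signal = []
    for t in times:
        total = 0.0
        for delay, amplitude in echoes:
            if t >= delay:
                local_t = t - delay  # time since this echo arrived
                total += (amplitude
                          * math.exp(-alpha * local_t)
                          * math.cos(2.0 * math.pi * freq * local_t + phase))
        signal.append(total)
    return signal
```

For example, a back-wall echo and a weaker, earlier defect echo would be passed as two entries in `echoes`, with the delay of each set by twice the reflector depth divided by the sound velocity.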
Synthetic Defect Catalog
Volumetric metal loss, circular or elliptical profile. Parameterized by depth d, diameter D, aspect ratio.
Linear, stress-oriented defect along pipe axis. Parameterized by length L, depth d, opening width w.
Fatigue-driven crack perpendicular to pipe axis. Parameterized by arc length θ, depth d.
Internal delamination — planar defect parallel to pipe wall. Modeled as an elliptical void at mid-wall.
Environmental corrosion on outer surface. Modeled as Gaussian depth profile with spatial correlation.
Mechanical deformation without metal loss. Modeled as a smooth indentation with curvature constraints.
GAN-Based Augmentation Pipeline
CycleGAN + ControlNet for unpaired image-to-image translation. Train on limited real defect images to generate diverse synthetic variations with controlled defect parameters. Domain randomization for background variation, noise levels, and rendering styles. Target: 3–5× effective dataset expansion.
Computer Vision Architecture
Multi-head architecture combining object detection, instance segmentation, and defect classification in a single forward pass for efficient inference.
Architecture diagram: 640×640 RGB input → EfficientNet-B4 → Feature Pyramid → three heads: YOLOv8 (detection), U-Net decoder (segmentation), FC layers (classification).
Model Specifications
Anomaly Detection (Unsupervised)
For scenarios with limited labeled data — particularly during early deployment — unsupervised anomaly detection provides a critical safety net, flagging unexpected patterns without requiring defect-specific training examples.
Approach: Variational Autoencoder (VAE)
- Train exclusively on defect-free NDT images
- Learn compact latent representation of "normal"
- Detect anomalies via elevated reconstruction error
- No defect labels required; the output is a binary normal/anomalous decision
Architecture Details
Latent Space Embedding Concept
Normal images cluster tightly in the 128-D latent space. Defective images fall outside this learned manifold, producing measurably higher reconstruction error. Visualization via t-SNE or UMAP provides interpretable 2D maps for quality assurance teams. The decision boundary is set at μ + 3σ of the reconstruction error on normal images; if that error is approximately Gaussian, about 99.87% of normal images fall below the threshold (the familiar 99.7% figure applies to the two-sided ±3σ interval).
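The thresholding rule can be sketched with the standard library alone. This assumes per-image reconstruction errors have already been computed by the VAE and that errors on normal images are roughly Gaussian; `fit_threshold` and `flag_anomalies` are hypothetical names.

```python
import statistics
from statistics import NormalDist


def fit_threshold(normal_errors: list[float], k: float = 3.0) -> float:
    """Threshold at mean + k * sample standard deviation of errors on defect-free images."""
    mu = statistics.fmean(normal_errors)
    sigma = statistics.stdev(normal_errors)
    return mu + k * sigma


def flag_anomalies(errors: list[float], threshold: float) -> list[bool]:
    return [e > threshold for e in errors]


# One-sided tail probability beyond +3 sigma under a Gaussian assumption:
# the expected fraction of normal images flagged by mistake.
expected_false_positive_rate = 1.0 - NormalDist().cdf(3.0)
```

If the error distribution turns out to be heavy-tailed, an empirical percentile of the validation errors is a safer threshold than μ + 3σ.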
Data Flow Architecture
End-to-end data pipeline from vendor JPEG capture to customer-facing inspection results, with processing queue, model registry, and feature store as core infrastructure.
- Vendor JPEG: NDT hardware output
- Ingestion API: FastAPI + validation
- Image Storage: S3-compatible (MinIO)
- Signal Reconstruction: OpenCV + SciPy pipeline
- Feature Extraction: Feature Store (pgvector)
- ML Inference: Model Registry + ONNX
- Results API: Structured JSON output
- Customer Portal: Next.js dashboard
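As an illustration of the validation step at ingestion, the sketch below checks an uploaded payload before it is stored. It relies only on the fact that JPEG files begin with the bytes FF D8 FF; the size limit, the error strings, and the name `validate_upload` are assumptions, and the function is framework-agnostic so it could be called from a FastAPI handler.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker


def validate_upload(payload: bytes, max_bytes: int = 20 * 1024 * 1024) -> list[str]:
    """Return a list of validation errors; an empty list means the upload is accepted."""
    errors: list[str] = []
    if not payload:
        errors.append("empty payload")
    elif not payload.startswith(JPEG_MAGIC):
        errors.append("not a JPEG (missing SOI marker)")
    if len(payload) > max_bytes:
        errors.append("payload exceeds size limit")
    return errors
```

A magic-byte check only rejects obviously wrong files; fully decoding the image during reconstruction remains the real test of integrity.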
Supporting Services
Processing Queue Architecture
Validation & Benchmarking
Target performance metrics are drawn from comparable industrial NDT inspection literature. All targets will be validated against API 5L / ASME reference standards during deployment.
| Task | Metric | Target | Method |
|---|---|---|---|
| Defect Detection | mAP@0.5 | > 0.85 | YOLOv8-nano |
| Wall Loss Estimation | MAE | < 2% WT | Regression CNN |
| Crack Classification | F1-score | > 0.80 | EfficientNet-B4 |
| Anomaly Detection | AUROC | > 0.90 | VAE (ResNet-18) |
| Signal Reconstruction | RMSE | < 5% | Cubic B-spline fit |
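Two of the metrics in the table can be made concrete with a short sketch. For detection, a prediction counts as a match when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. For wall loss, the sketch assumes "% WT" means mean absolute error expressed as a percentage of nominal wall thickness. Boxes are `(x1, y1, x2, y2)` tuples, and all function names are hypothetical.

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) with x2 > x1 and y2 > y1


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(prediction: Box, ground_truth: Box, threshold: float = 0.5) -> bool:
    return iou(prediction, ground_truth) >= threshold


def mae_percent_wt(predicted_mm: list[float], actual_mm: list[float],
                   nominal_wt_mm: float) -> float:
    """Mean absolute error as a percentage of nominal wall thickness (assumed definition)."""
    if not predicted_mm or len(predicted_mm) != len(actual_mm):
        raise ValueError("inputs must be non-empty and the same length")
    mae = sum(abs(p - a) for p, a in zip(predicted_mm, actual_mm)) / len(predicted_mm)
    return 100.0 * mae / nominal_wt_mm
```

Full mAP additionally ranks predictions by confidence and averages precision over recall levels and classes; the matching rule above is only its building block.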
Validation Protocol
5-fold stratified cross-validation on the labeled dataset. A hold-out test set (20%) is never seen during training. An additional blind test uses calibration blocks with known defect geometries. Inter-rater agreement (Cohen's κ) is measured between model predictions and assessments by certified NDT Level II inspectors.
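Cohen's κ compares observed agreement with the agreement expected by chance: κ = (pₒ − pₑ) / (1 − pₑ). A minimal sketch follows, treating the model and the inspector as two raters over the same items; the label strings in the example are made up.

```python
import math
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and the same length")
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of each rater's marginal frequency, summed over classes.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return math.nan  # undefined when both raters use a single identical class
    return (observed - expected) / (1.0 - expected)


model = ["crack", "crack", "pit", "pit"]
inspector = ["crack", "pit", "pit", "pit"]
kappa = cohens_kappa(model, inspector)
```

In this example the raters agree on 3 of 4 items (pₒ = 0.75) with chance agreement pₑ = 0.5, giving κ = 0.5.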
Technology Stack
- Language: Python 3.12
- Deep Learning: PyTorch 2.x
- Computer Vision: OpenCV
- Image Processing: scikit-image
- Backend: FastAPI
- Database: PostgreSQL + pgvector
- Frontend: Next.js 15
- Browser Inference: TensorFlow.js