Nucliex Scientific Pipeline

From JPEG to Operational Intelligence

Overcoming vendor data limitations through signal reconstruction, synthetic augmentation, and deep learning. This document outlines the complete scientific and engineering pipeline that transforms constrained visual outputs into actionable inspection analytics.

Signal Processing · Computer Vision · Deep Learning · Synthetic Data · NDT Physics

Problem Statement

The Constraint

The current vendor NDT system outputs JPEG images only: no raw signal access, no structured data export, no API or SDK. The manufacturer has confirmed that no intermediate data is available.

Our Position

This is NOT a dead end — it is a data engineering challenge. JPEG images contain encoded signal information that can be computationally recovered and analyzed using modern CV and ML techniques.

Our Approach

Treat JPEG as a compressed signal representation. Apply image digitization, physics-based reconstruction, and deep learning to extract quantitative inspection data with validated accuracy.

"Tying to JPG is a dead-end path. We need to do something of our own." — This page demonstrates exactly that: a self-contained scientific pipeline that transcends the vendor limitation and builds proprietary analytical capability.

Signal Reconstruction Pipeline

The crown jewel of our approach. Each vendor JPEG containing a signal plot is processed through an 8-stage pipeline to reconstruct a calibrated numerical time-series.

Step 1

JPEG Input

Vendor NDT software output. Typical resolution 1920×1080, Q≈80. Contains signal plot, metadata overlays, and color-mapped amplitude.

Step 2

Preprocessing

Grayscale conversion, noise reduction (Gaussian blur σ=1.5), contrast enhancement (CLAHE, clipLimit=2.0, tileGridSize=8×8).

Step 3

Edge Detection

Canny edge detector (threshold 50–150), Hough line transform for axis detection, grid line removal via morphological operations.

Step 4

Curve Extraction

Pixel-level trace following, connected component analysis. Sub-pixel interpolation using cubic splines for smooth signal recovery.
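One way the trace-following step could work is sketched below. It assumes the curve has already been isolated as a binary mask (non-zero where the trace was detected, grid lines removed) and takes the weighted centroid of each pixel column as the sub-pixel row position. The function name `extract_trace` and the toy mask are illustrative only; the cubic-spline interpolation mentioned above is not shown here.

```python
import numpy as np

def extract_trace(mask: np.ndarray) -> np.ndarray:
    """Return the sub-pixel row position of the trace in every column.

    mask: 2-D array, non-zero where the curve was detected.
    Columns that contain no trace pixels are returned as NaN.
    """
    weights = mask.astype(float)
    rows = np.arange(mask.shape[0], dtype=float)[:, None]  # row indices as a column vector
    col_sum = weights.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        centroid = (weights * rows).sum(axis=0) / col_sum
    centroid[col_sum == 0] = np.nan
    return centroid

# Toy example: a 2-pixel-thick diagonal stroke on a 6x4 canvas.
mask = np.zeros((6, 4), dtype=np.uint8)
for col in range(3):
    mask[col, col] = 1
    mask[col + 1, col] = 1
trace = extract_trace(mask)  # last column has no trace pixels
```

Columns returned as NaN mark gaps in the trace that a later interpolation stage would need to bridge.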

Step 5

Calibration

Axis mapping (pixel → physical units) via OCR on tick labels. Reference point alignment against known calibration blocks.
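The pixel-to-physical mapping can be illustrated with a small sketch, assuming linear axes and two tick labels per axis whose pixel positions and values have already been read (for example by OCR). The helper `make_axis_map` and all tick readings below are hypothetical.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Build a linear pixel -> physical-unit mapping from two labelled ticks."""
    if pixel_a == pixel_b:
        raise ValueError("reference ticks must sit at different pixel positions")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale

# Hypothetical readings: x-axis ticks "0" at column 100 and "50" (mm) at column 600;
# y-axis ticks "0" at row 900 and "100" (% amplitude) at row 100.
# Image rows grow downward, so the y scale comes out negative.
x_to_mm = make_axis_map(100, 0.0, 600, 50.0)
y_to_amp = make_axis_map(900, 0.0, 100, 100.0)

position_mm = x_to_mm(350)    # halfway between the two x ticks
amplitude = y_to_amp(500)     # halfway between the two y ticks
```

With these example ticks the x scale is 0.1 mm per pixel, which matches the spatial resolution quoted for the pipeline.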

Step 6

Numerical Signal

Time-series reconstruction at ~0.1 mm spatial resolution. Savitzky-Golay smoothing (window=11, order=3) to suppress JPEG artifacts.
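The smoothing step with the stated parameters can be sketched with SciPy's `savgol_filter`. The signal here is synthetic: a Gaussian-shaped indication plus white noise standing in for JPEG artifacts, which is an assumption for illustration and not a model of real compression noise.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 101)                      # position in mm, 0.1 mm spacing
clean = np.exp(-0.5 * ((x - 5.0) / 0.8) ** 2)        # hypothetical defect indication
noisy = clean + rng.normal(scale=0.03, size=x.size)  # stand-in for JPEG artifacts

# Window of 11 samples, cubic polynomial fit, as specified in the pipeline.
smoothed = savgol_filter(noisy, window_length=11, polyorder=3)

rmse_before = float(np.sqrt(np.mean((noisy - clean) ** 2)))
rmse_after = float(np.sqrt(np.mean((smoothed - clean) ** 2)))
```

Because the filter fits a cubic inside each window, it leaves any signal that is locally cubic unchanged, which is why it preserves peak shape better than a plain moving average of the same width.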

Step 7

Feature Engineering

Peak amplitude, FWHM, energy, SNR, zero-crossings, spectral centroid. Statistical features over sliding windows.
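Several of the listed features can be computed directly with NumPy. The sketch below assumes a uniformly sampled signal and covers peak amplitude, FWHM, energy, zero-crossings, and spectral centroid; SNR and the sliding-window statistics are left out. The function name `signal_features` and the test pulse are illustrative.

```python
import numpy as np

def signal_features(x: np.ndarray, s: np.ndarray) -> dict:
    """Compute a few scalar features from a uniformly sampled signal s(x)."""
    dx = x[1] - x[0]
    peak_idx = int(np.argmax(np.abs(s)))
    peak = float(np.abs(s[peak_idx]))

    # FWHM: width of the contiguous region around the peak above half the peak.
    above = np.abs(s) >= 0.5 * peak
    left = peak_idx
    while left > 0 and above[left - 1]:
        left -= 1
    right = peak_idx
    while right < len(s) - 1 and above[right + 1]:
        right += 1
    fwhm = float((right - left) * dx)

    energy = float(np.sum(s ** 2) * dx)
    zero_crossings = int(np.sum(np.signbit(s[:-1]) != np.signbit(s[1:])))

    # Spectral centroid: magnitude-weighted mean frequency (cycles per unit of x).
    spectrum = np.abs(np.fft.rfft(s))
    freqs = np.fft.rfftfreq(len(s), d=dx)
    centroid = float(np.sum(freqs * spectrum) / np.sum(spectrum))

    return {"peak": peak, "fwhm": fwhm, "energy": energy,
            "zero_crossings": zero_crossings, "spectral_centroid": centroid}

x = np.linspace(-10.0, 10.0, 2001)
s = np.exp(-0.5 * x ** 2)           # unit Gaussian pulse, sigma = 1
features = signal_features(x, s)    # FWHM should be close to 2.355 * sigma
```

The FWHM estimate is quantized to the sample spacing; interpolating the half-maximum crossings would refine it.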

Step 8

ML Model

Gradient boosting on tabular features for severity regression. CNN on reconstructed 1D signal for defect classification.

Mathematical Foundation

S(x) ≈ Σᵢ cᵢ · B³ᵢ(x)

where B³ᵢ are cubic B-spline basis functions and cᵢ are coefficients fitted via least-squares to extracted pixel coordinates.
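A least-squares cubic B-spline fit of this form can be sketched with SciPy's `make_lsq_spline`. The extracted coordinates are replaced here by a synthetic curve, and the number and placement of interior knots is an arbitrary choice for illustration; the source does not state how knots are selected.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

# Hypothetical extracted curve: pixel columns x and sub-pixel rows y.
x = np.linspace(0.0, 100.0, 201)
y = 40.0 + 10.0 * np.sin(x / 15.0)

k = 3                                          # cubic basis functions
interior = np.linspace(x[0], x[-1], 12)[1:-1]  # 10 interior knots (arbitrary choice)
# Full knot vector: boundary knots repeated k + 1 times around the interior knots.
knots = np.r_[(x[0],) * (k + 1), interior, (x[-1],) * (k + 1)]

spline = make_lsq_spline(x, y, knots, k=k)     # least-squares fit of the coefficients
coeffs = spline.c                              # one coefficient per basis function
max_residual = float(np.max(np.abs(spline(x) - y)))
```

Fewer knots give a smoother S(x) at the cost of a larger residual, so the knot count acts as the main tuning parameter in a fit like this.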

WT(x) = WT₀ − ∫₀ˣ S(t) dt

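Wall thickness at position x, obtained by subtracting the cumulative integral of the reconstructed signal S from the nominal thickness WT₀. For the units to balance, S must here be the wall-loss rate per unit length along the pipe (thickness lost per unit distance), not the local wall-loss depth itself.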
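The integral can be evaluated numerically with a cumulative trapezoidal rule. This sketch assumes S is sampled at known positions and represents wall loss per unit length, so that its integral has units of thickness; the nominal thickness and loss rate below are made-up values.

```python
import numpy as np

def wall_thickness(x, s, wt0):
    """WT(x) = WT0 - integral from 0 to x of S(t) dt, by cumulative trapezoids."""
    x = np.asarray(x, dtype=float)
    s = np.asarray(s, dtype=float)
    increments = 0.5 * (s[1:] + s[:-1]) * np.diff(x)
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    return wt0 - cumulative

x = np.linspace(0.0, 100.0, 1001)     # position along the pipe, mm
s = np.full_like(x, 0.002)            # hypothetical constant loss rate, mm per mm
wt = wall_thickness(x, s, wt0=10.0)   # assumed nominal wall thickness of 10 mm
```

With a constant rate of 0.002 mm per mm over 100 mm, the thickness falls linearly from 10.0 mm to 9.8 mm.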

Reconstruction Accuracy

Spatial resolution: ~0.1 mm (limited by pixel density)
Amplitude accuracy: ±3–5% of full scale
Limiting factor: JPEG compression (Q ≈ 75–85)
Smoothing filter: Savitzky-Golay (window=11, order=3)
Validation method: Comparison against API 5L calibration blocks

Synthetic Data Generation

Physics-based simulation produces realistic NDT signals for any defect geometry, enabling training data generation at scale without requiring additional physical inspections.

MFL — Magnetic Flux Leakage

B(x,y) = B₀ · [1 − α · exp(−d² / 2σ²)]

Magnetic flux density perturbation at position (x,y) near a defect of depth d. B₀ is the saturating field, α is the leakage coefficient, σ controls spatial spread.
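A sketch of how this model could be evaluated on a 2-D grid follows. It rests on one interpretive assumption: the d in the exponent is taken as the lateral distance from (x, y) to the defect center, because that is the only way the right-hand side varies with position. All parameter values (field strength, leakage coefficient, spread) are hypothetical.

```python
import numpy as np

def mfl_field(x, y, x0, y0, b0, alpha, sigma):
    """B = B0 * (1 - alpha * exp(-d^2 / (2 sigma^2))).

    Assumption: d is the distance from (x, y) to the defect center (x0, y0).
    """
    d2 = (x - x0) ** 2 + (y - y0) ** 2
    return b0 * (1.0 - alpha * np.exp(-d2 / (2.0 * sigma ** 2)))

axis = np.linspace(-10.0, 10.0, 201)   # mm
xs, ys = np.meshgrid(axis, axis)
field = mfl_field(xs, ys, x0=0.0, y0=0.0, b0=1.6, alpha=0.3, sigma=2.0)
```

Under this reading the field dips to B₀(1 − α) directly over the defect and recovers to B₀ a few σ away; tying α and σ to defect depth and diameter would be a further modeling step that the formula leaves open.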

UT — Ultrasonic Pulse-Echo

A(t) = A₀ · exp(−αt) · sin(2πft + φ)

Ultrasonic A-scan model with exponential attenuation α, center frequency f, and phase offset φ. Reflections from defect interfaces are superimposed on the base signal.
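The A-scan model and the superposition of a defect reflection can be sketched as below. The sampling rate, attenuation, center frequency, echo delay, and echo amplitude are all assumed values chosen for illustration, and the echo is modeled as a delayed, scaled copy of the same damped sinusoid.

```python
import numpy as np

def a_scan(t, a0, alpha, f, phi=0.0):
    """A(t) = A0 * exp(-alpha * t) * sin(2*pi*f*t + phi) for t >= 0, else 0."""
    t = np.asarray(t, dtype=float)
    decay = np.exp(-alpha * np.clip(t, 0.0, None))  # clip avoids growth for t < 0
    wave = a0 * decay * np.sin(2.0 * np.pi * f * t + phi)
    return np.where(t >= 0.0, wave, 0.0)

fs = 100e6                                  # sampling rate in Hz (assumed)
t = np.arange(0.0, 20e-6, 1.0 / fs)         # 20 microseconds of signal
base = a_scan(t, a0=1.0, alpha=5e5, f=5e6)  # 5 MHz transducer (assumed)
echo = a_scan(t - 8e-6, a0=0.4, alpha=5e5, f=5e6)  # hypothetical defect echo at 8 us
signal = base + echo
```

Varying the echo delay and amplitude is one simple way to generate a family of synthetic A-scans for different defect positions and reflectivities.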

Synthetic Defect Catalog

Corrosion Pit

Volumetric metal loss, circular or elliptical profile. Parameterized by depth d, diameter D, aspect ratio.

Axial Crack

Linear, stress-oriented defect along pipe axis. Parameterized by length L, depth d, opening width w.

Circumferential Crack

Fatigue-driven crack perpendicular to pipe axis. Parameterized by arc length θ, depth d.

Lamination

Internal delamination — planar defect parallel to pipe wall. Modeled as an elliptical void at mid-wall.

External Metal Loss

Environmental corrosion on outer surface. Modeled as Gaussian depth profile with spatial correlation.

Dent

Mechanical deformation without metal loss. Modeled as a smooth indentation with curvature constraints.

GAN-Based Augmentation Pipeline

CycleGAN + ControlNet for unpaired image-to-image translation. Train on limited real defect images to generate diverse synthetic variations with controlled defect parameters. Domain randomization for background variation, noise levels, and rendering styles. Target: 3–5× effective dataset expansion.

Computer Vision Architecture

Multi-head architecture combining object detection, instance segmentation, and defect classification in a single forward pass for efficient inference.

Input Image (640×640 RGB) → Backbone (EfficientNet-B4) → FPN (Feature Pyramid)
Detection Head (YOLOv8) → NMS → Defect Boxes
Segmentation Head (U-Net decoder) → Pixel Masks
Classification Head (FC layers) → Type + Severity
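The NMS stage after the detection head can be illustrated with a minimal greedy non-maximum suppression in NumPy. The IoU threshold of 0.5 and the example boxes are assumptions; the source does not specify the threshold used.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression. boxes: (N, 4) rows of x1, y1, x2, y2."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]            # highest score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0.0, None) * np.clip(y2 - y1, 0.0, None)
        iou = inter / (areas[best] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]      # drop boxes that overlap too much
    return keep

boxes = [[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)   # the second box overlaps the first and is suppressed
```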

Model Specifications

Input resolution: 640 × 640 RGB
Backbone: EfficientNet-B4 (ImageNet pretrained)
Detection head: YOLOv8-nano, anchor-free
Segmentation head: U-Net decoder with skip connections
Optimizer: AdamW, lr = 1e-4, weight_decay = 0.01
Scheduler: Cosine annealing, T_max = 200 epochs
Augmentation: Rotation ±15°, horizontal flip, color jitter, elastic deformation
Target mAP@0.5: > 0.85
Target F1: > 0.80
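The learning-rate schedule implied by these settings can be written out with the standard cosine annealing formula (no restarts). The minimum learning rate `eta_min = 0` is an assumption, since the specification gives only the base rate and T_max.

```python
import math

def cosine_lr(epoch, base_lr=1e-4, t_max=200, eta_min=0.0):
    """Cosine annealing: decay from base_lr to eta_min over t_max epochs."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))

schedule = [cosine_lr(epoch) for epoch in range(201)]
# Starts at 1e-4, passes through 5e-5 at epoch 100, and reaches eta_min at epoch 200.
```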

Anomaly Detection (Unsupervised)

For scenarios with limited labeled data — particularly during early deployment — unsupervised anomaly detection provides a critical safety net, flagging unexpected patterns without requiring defect-specific training examples.

Approach: Variational Autoencoder (VAE)

  • Train exclusively on defect-free NDT images
  • Learn compact latent representation of "normal"
  • Detect anomalies via elevated reconstruction error
  • No defect-type labels required; the model only separates normal from anomalous

Architecture Details

Encoder: ResNet-18 (pretrained, truncated at layer3)
Latent dim: 128-dimensional Gaussian
Decoder: Transposed convolutions, symmetric to encoder
Loss: MSE reconstruction + KL divergence (β = 0.5)
Threshold: μ + 3σ of reconstruction error on validation set
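The loss listed above can be written out in NumPy to make its two terms explicit. This sketch assumes a diagonal Gaussian posterior with a standard normal prior, a per-image mean for the MSE term, and a sum over latent dimensions for the KL term; the reduction choices are assumptions, as the source gives only the terms and β.

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var, beta=0.5):
    """MSE reconstruction + beta * KL(q(z|x) || N(0, I)), averaged over the batch."""
    mse = np.mean((x - x_hat) ** 2, axis=tuple(range(1, x.ndim)))          # per image
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)  # per image
    return float(np.mean(mse + beta * kl))

rng = np.random.default_rng(0)
x = rng.random((4, 1, 8, 8))          # toy batch of 4 single-channel images
mu = np.zeros((4, 128))               # 128-dimensional latent, as specified
log_var = np.zeros((4, 128))

perfect = vae_loss(x, x, mu, log_var)                   # exact reconstruction, prior-matched
shifted = vae_loss(x, x, np.ones((4, 128)), log_var)    # latent mean moved away from 0
```

In the second call only the KL term is non-zero: each of the 128 dimensions contributes 0.5, giving 64, which β = 0.5 scales to 32.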

Latent Space Embedding Concept

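Normal images cluster tightly in the 128-D latent space. Defective images fall outside this learned manifold, producing measurably higher reconstruction error. Visualization via t-SNE or UMAP provides interpretable 2D maps for quality assurance teams. If the reconstruction error of defect-free images were Gaussian, the one-sided threshold at μ + 3σ would flag about 0.13% of normal images as anomalous (99.87% specificity); real error distributions are often skewed, so the false-alarm rate should be confirmed empirically on the validation set.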
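The thresholding rule can be sketched as follows, using synthetic reconstruction errors in place of real validation data. For a Gaussian error distribution, the one-sided probability of exceeding μ + 3σ is about 0.13% (the familiar 99.7% figure is the two-sided ±3σ coverage), which the helper `gaussian_upper_tail` computes. Function names and error values are illustrative.

```python
import math
import numpy as np

def fit_threshold(val_errors, k=3.0):
    """Anomaly threshold: mean + k * standard deviation of validation errors."""
    val_errors = np.asarray(val_errors, dtype=float)
    return float(val_errors.mean() + k * val_errors.std())

def gaussian_upper_tail(k):
    """P(Z > k) for a standard normal Z."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

rng = np.random.default_rng(0)
val_errors = rng.normal(loc=0.020, scale=0.004, size=5000)  # synthetic stand-in
threshold = fit_threshold(val_errors)

new_errors = np.array([0.021, 0.050])   # hypothetical errors for two new images
flags = new_errors > threshold          # True marks an anomaly
tail = gaussian_upper_tail(3.0)         # expected false-alarm rate if errors are Gaussian
```

Since real reconstruction errors are rarely Gaussian, an empirical quantile of the validation errors is a common alternative to the μ + kσ rule.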

Data Flow Architecture

End-to-end data pipeline from vendor JPEG capture to customer-facing inspection results, with processing queue, model registry, and feature store as core infrastructure.

Vendor JPEG

NDT hardware output

Ingestion API

FastAPI + validation

Image Storage

S3-compatible (MinIO)

Signal Reconstruction

OpenCV + SciPy pipeline

Feature Extraction

Feature Store (pgvector)

ML Inference

Model Registry + ONNX

Results API

Structured JSON output

Customer Portal

Next.js dashboard

Supporting Services

Calibration DB — stores per-system calibration matrices and reference curves. Feeds into Signal Reconstruction stage.
Feature Store — PostgreSQL + pgvector for extracted features. Enables similarity search across historical inspections.
Model Registry — versioned ONNX models with metadata. Supports A/B testing and canary deployments.

Processing Queue Architecture

Async processing — JPEG upload triggers background job (Redis queue) for signal reconstruction + ML inference.
Priority routing — critical inspections (high-risk assets) are processed first with stricter confidence thresholds.
Human-in-the-loop — low-confidence predictions are routed to expert review queue before customer delivery.
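The routing rules above can be illustrated with an in-process sketch built on the standard library's `heapq`. It stands in for the Redis-backed queue and is not a Redis client; the class names, job identifiers, and confidence thresholds are all hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    priority: int                       # 0 = critical asset, 1 = routine
    seq: int                            # tie-breaker: first in, first out
    image_id: str = field(compare=False)

class InspectionQueue:
    """Priority queue: critical jobs first, submission order within a priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, image_id, critical=False):
        job = Job(0 if critical else 1, next(self._counter), image_id)
        heapq.heappush(self._heap, job)

    def next_job(self):
        return heapq.heappop(self._heap)

def route(job, confidence, routine_min=0.80, critical_min=0.95):
    """Send low-confidence predictions to expert review (thresholds are hypothetical)."""
    minimum = critical_min if job.priority == 0 else routine_min
    return "deliver" if confidence >= minimum else "expert_review"

jobs = InspectionQueue()
jobs.submit("img-001")
jobs.submit("img-002", critical=True)
jobs.submit("img-003")
first = jobs.next_job()                 # the critical job jumps the queue
decision = route(first, confidence=0.90)
```

The same confidence of 0.90 would be delivered for a routine job but is held for review here, reflecting the stricter threshold applied to high-risk assets.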

Validation & Benchmarking

Target performance metrics established from comparable industrial NDT inspection literature. All targets will be validated against API 5L / ASME reference standards during deployment.

Task | Metric | Target | Method
Defect Detection | mAP@0.5 | > 0.85 | YOLOv8-nano
Wall Loss Estimation | MAE | < 2% WT | Regression CNN
Crack Classification | F1-score | > 0.80 | EfficientNet-B4
Anomaly Detection | AUROC | > 0.90 | VAE (ResNet-18)
Signal Reconstruction | RMSE | < 5% | Cubic B-spline fit

Validation Protocol

5-fold stratified cross-validation on labeled dataset. Hold-out test set (20%) never seen during training. Additional blind test on calibration blocks with known defect geometries. Inter-rater agreement (Cohen's κ) measured between model predictions and certified NDT Level II inspector assessments.
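The agreement measure named in the protocol can be computed as in the sketch below, which implements Cohen's κ from its definition (observed agreement corrected for chance agreement). The two label lists are invented examples, not inspection results.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each rater's marginal frequency, per class.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical class throughout; kappa is degenerate
    return (observed - expected) / (1.0 - expected)

# Invented example: model predictions vs. an inspector's calls on ten indications.
model = ["crack", "crack", "pit", "pit", "none", "none", "none", "pit", "crack", "none"]
inspector = ["crack", "pit", "pit", "pit", "none", "none", "pit", "pit", "crack", "none"]
kappa = cohens_kappa(model, inspector)
```

Here the raters agree on 8 of 10 items and chance agreement is 0.33, so κ = (0.80 − 0.33) / (1 − 0.33) ≈ 0.70.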

Technology Stack

Language

Python 3.12

Deep Learning

PyTorch 2.x

Computer Vision

OpenCV

Image Processing

scikit-image

Backend

FastAPI

Database

PostgreSQL + pgvector

Frontend

Next.js 15

Browser Inference

TensorFlow.js
