Amir Soltani — AI · Computer Vision

● Medical AI

Automated Medical Image Segmentation

U-Net system for brain tumor segmentation on BraTS 2021. Active learning pipeline reducing annotation time by 35%. Deployed across 5 hospitals via MONAI framework.

0.92 DSC Score · −30% Workload · 5 Hospitals

U-Net · MONAI · Active Learning · BraTS 2021
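
The 0.92 DSC figure above refers to the Dice similarity coefficient, the usual overlap metric for segmentation masks. As a minimal sketch (the function name `dice_score` and the toy masks are illustrative assumptions, not taken from the project), it can be computed on flattened binary masks roughly like this:

```python
def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient for two flat binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps keeps the score defined (and equal to 1.0) when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


# Toy example: 4 predicted voxels, 4 ground-truth voxels, 3 of them shared
pred = [1, 1, 1, 1, 0, 0]
target = [0, 1, 1, 1, 1, 0]
score = dice_score(pred, target)  # 2 * 3 / (4 + 4), about 0.75
```

In a MONAI-based pipeline the library's own metric classes would normally be used instead; the sketch only shows what the number measures.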

● Medical AI

Tooth Detection, Segmentation & Numbering

Intelligent Dental Imaging System for Endodontic Surgery Prediction. Cascade R-CNN on OPG X-rays with contour analysis for overlapping tooth disambiguation.

96% Accuracy · −15 min/patient · +27% Numbering

Cascade R-CNN · Dental AI · OPG X-rays · CAD Integration

● Medical AI

Medical Image Enhancement & Super-Resolution

MRI denoising with SNR improvement from 12 dB to 18 dB on the IXI dataset. CT contrast enhancement via CLAHE for liver lesion detection. Integrated into PACS.

12→18 dB SNR · −22% Prep Time · PACS Ready

Few-Shot · CLAHE · 3D Segmentation · IXI Dataset
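
To put the 12 dB to 18 dB improvement in perspective, here is a small sketch of how SNR in decibels is commonly computed and what a 6 dB gain means as a power ratio. It assumes the power-ratio definition (10·log10 of signal power over noise power) and a clean reference signal; the source does not state which SNR definition the project used, and the helper names are hypothetical.

```python
import math


def snr_db(reference, noisy):
    """SNR in dB of `noisy` measured against a clean `reference` signal."""
    n = len(reference)
    signal_power = sum(r * r for r in reference) / n
    noise_power = sum((x - r) ** 2 for r, x in zip(reference, noisy)) / n
    if noise_power == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(signal_power / noise_power)


def db_gain_to_power_ratio(gain_db):
    """Convert a gain in dB into the equivalent signal-to-noise power ratio."""
    return 10.0 ** (gain_db / 10.0)


# Going from 12 dB to 18 dB is a 6 dB gain, i.e. roughly a 4x power ratio
ratio = db_gain_to_power_ratio(18 - 12)

# Toy check: unit signal with +/-0.1 noise has an SNR of about 20 dB
example = snr_db([1, 1, 1, 1], [1.1, 0.9, 1.1, 0.9])
```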

● Medical AI

Retinal Vessel Segmentation & Lesion Grading

Transformer-UNet hybrid for automated diabetic retinopathy grading on DRIVE & STARE datasets. Multi-scale feature fusion for accurate vessel tree extraction and microaneurysm detection.

0.94 AUC · 97.2% Sensitivity

ViT-UNet · DRIVE Dataset · Retina AI

● Computer Vision

3D Human Body Pose Estimation

Transformer-based pose prediction with robust occlusion handling. Real-time deployment via TensorRT with under 30ms latency; evaluated on the Human3.6M benchmark.

42mm MPJPE · <30ms Latency · −4% vs Baseline

Transformer · TensorRT · Human3.6M · Real-Time
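
The 42mm figure is an MPJPE (mean per-joint position error). As a rough sketch of what that metric measures, the snippet below averages the Euclidean distance between predicted and ground-truth 3D joints. The function name and the two-joint example are illustrative assumptions; Human3.6M evaluation protocols typically also align the root joint (or apply a rigid alignment) first, which this sketch omits.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean per-joint position error, in the same units as the inputs."""
    if len(pred_joints) != len(gt_joints) or not pred_joints:
        raise ValueError("need two non-empty joint lists of equal length")
    # math.dist gives the Euclidean distance between two (x, y, z) points
    errors = [math.dist(p, g) for p, g in zip(pred_joints, gt_joints)]
    return sum(errors) / len(errors)


# Toy example in millimetres: one perfect joint, one joint off by 50 mm
pred = [(0.0, 0.0, 0.0), (30.0, 40.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(pred, gt)  # (0 + 50) / 2 = 25.0
```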

● Computer Vision

ViUNet: Vision Transformer UNet for Segmentation

Novel architecture merging U-Net's localization strengths with ViT's global context modeling. Multi-scale attention mechanisms for robust feature representation across diverse imaging modalities.

Multi-Scale Attn · Global Context

ViT · U-Net · Attention · Custom Architecture

● Computer Vision

Multi-Modal Detection: Pose, Face & Hand

MediaPipe/OpenCV unified system for simultaneous pose, facial landmark, and hand gesture tracking. Low-light face detection optimized via histogram equalization. Deployed in AR applications.

Real-Time · 3-in-1 Pipeline

MediaPipe · OpenCV · AR · Landmark Detection
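
The low-light step mentioned above relies on histogram equalization. A dependency-free sketch of global equalization on 8-bit gray values follows; it is an illustration of the technique, not the project's code, and the function name is hypothetical. In an OpenCV pipeline, `cv2.equalizeHist` would normally do this on a real image.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization for a flat list of integer gray values."""
    if not pixels:
        return []
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    # Cumulative distribution of gray levels
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return list(pixels)  # constant image: nothing to stretch
    scale = (levels - 1) / (total - cdf_min)
    # Lookup table that spreads the occupied gray levels over the full range
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [lut[value] for value in pixels]


# A dark, low-contrast strip of pixels is stretched to the full 0..255 range
dark = [10, 10, 12, 12, 14, 14, 16, 16]
equalized = equalize_histogram(dark)
```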

● Computer Vision

Gym Workout Recognition via ST-GCN

Spatiotemporal Graph Convolutional Network (ST-GCN) for real-time gym exercise classification. ONNX deployment enables efficient inference in fitness applications, with 88% accuracy on UCF101.

88% Accuracy · UCF101 · ONNX Deployed

ST-GCN · ONNX · Action Recognition

● Computer Vision

Image Restoration: Real-ESRGAN & GFPGAN

Historical image restoration pipeline using Real-ESRGAN. Fine-tuned GFPGAN for high-fidelity facial reconstruction. HAT super-resolution model deployed in a production mobile application.

4× Super-Res · Mobile App

Real-ESRGAN · GFPGAN · HAT · Mobile Deploy

● Computer Vision

Few-Shot Fashion Recommendation System

Hybrid few-shot learning approach for visual fashion item retrieval and recommendation. Siamese networks combined with transformer embeddings for outfit compatibility scoring.

Few-Shot · Siamese Network

Few-Shot · Siamese Net · Transformers

● Detection

Thermal Object Detection: YOLO–Transformer Hybrid

Real-time colormap-invariant thermal detection framework. Hybrid YOLO–Transformer architecture achieving robust detection across varied thermal colormap configurations in industrial environments.

Colormap-Invariant · Real-Time

Thermal Imaging · YOLO · Transformer · Industrial AI

● Detection

License Plate Recognition with YOLOv7

End-to-end license plate detection and OCR pipeline built on YOLOv7. Optimized for varying lighting, angles, and international plate formats, with high-speed inference.

YOLOv7 · Multi-Format

YOLOv7 · OCR · LPR · Real-Time

● Detection

Multi-Object Tracking with DeepSORT & ByteTrack

Robust multi-object tracking pipeline combining YOLOv8 detection with DeepSORT and ByteTrack. Real-time speed estimation and trajectory analysis for surveillance and traffic monitoring.

MOTA 78.4% · 30+ FPS

DeepSORT · ByteTrack · YOLOv8 · Tracking
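
The speed-estimation step can be illustrated with a simple calculation on one track's centroid history. This is only a sketch under strong assumptions that the source does not confirm: one centroid per consecutive frame, a fixed frame rate, and a constant ground scale (`meters_per_pixel`, a hypothetical calibration value) with no perspective correction.

```python
import math


def estimate_speed_kmh(track, fps, meters_per_pixel):
    """Average speed of one track from per-frame centroid positions in pixels."""
    if len(track) < 2:
        return 0.0
    # Path length: sum of distances between consecutive centroids
    path_px = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    elapsed_s = (len(track) - 1) / fps
    meters_per_second = path_px * meters_per_pixel / elapsed_s
    return meters_per_second * 3.6


# Toy track: 5 px per frame over 3 frame intervals at 30 FPS, 0.05 m per pixel
track = [(100, 200), (103, 204), (106, 208), (109, 212)]
speed = estimate_speed_kmh(track, fps=30, meters_per_pixel=0.05)  # about 27 km/h
```

A real traffic deployment would usually replace the constant scale with a homography from image to road plane.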

● Detection

Industrial Defect Detection System

Automated quality control pipeline for manufacturing defect detection. Anomaly detection via PatchCore and supervised YOLOv8 for surface scratch, dent, and contamination classification.

99.1% Precision · Zero False-Pass

PatchCore · Anomaly Detection · Manufacturing

● Satellite / Remote Sensing

Scalable High-Res Object Detection via Sentinel

Rare-object detection and matching framework for large Sentinel satellite tiles. Sliding-window + attention-guided detection with geospatial coordinate output and GeoJSON export.

Large Tile · GeoJSON Output

Sentinel · GDAL · Rasterio · TorchGeo
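
The sliding-window and GeoJSON export steps described above can be sketched without any geospatial libraries. The snippet below computes window offsets for a large tile, maps a detection's pixel position to map coordinates with a GDAL-style six-element geotransform, and wraps it as a GeoJSON feature. All function names, the tile size, and the geotransform values are illustrative assumptions; the detector itself is not shown, and the coordinates are assumed to already be longitude/latitude.

```python
import json


def window_offsets(size, tile, stride):
    """Offsets along one axis so that windows of length `tile` cover [0, size)."""
    if size <= tile:
        return [0]
    offsets = list(range(0, size - tile, stride))
    offsets.append(size - tile)  # last window sits flush with the edge
    return offsets


def pixel_to_geo(col, row, geotransform):
    """Map a pixel position to map coordinates (GDAL geotransform order)."""
    x0, dx, rx, y0, ry, dy = geotransform
    return x0 + col * dx + row * rx, y0 + col * ry + row * dy


def detection_to_feature(col, row, score, geotransform):
    """Wrap one detection (pixel position plus score) as a GeoJSON Point feature."""
    x, y = pixel_to_geo(col, row, geotransform)
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": {"score": score},
    }


# Hypothetical 1000 x 1000 tile, 512-pixel windows, 256-pixel stride
offsets = window_offsets(1000, tile=512, stride=256)
windows = [(col, row) for row in offsets for col in offsets]

# Hypothetical geotransform: origin at lon 10, lat 50, 0.0001 degrees per pixel
geotransform = (10.0, 0.0001, 0.0, 50.0, 0.0, -0.0001)
feature = detection_to_feature(100, 200, 0.87, geotransform)
geojson_text = json.dumps({"type": "FeatureCollection", "features": [feature]})
```

With Rasterio or GDAL available, the geotransform would come from the opened dataset rather than being typed in by hand.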

● Satellite / Remote Sensing

Road Segmentation via Maxar Open Data & SAM2

Zero-shot and few-shot road network extraction from high-resolution Maxar imagery using Segment Anything Model 2. Automated pipeline for urban planning and infrastructure mapping.

SAM2 · Zero-Shot

SAM2 · Maxar · GeoPandas · Road Network

● Satellite / Remote Sensing

Fire & Smoke Detection — Industrial & Wildfire

Dual-domain detection system for fire and smoke in both industrial environments and wildfire satellite imagery. Temporal analysis for early-warning systems with drone integration capability.

Early Warning · Multi-Domain

Fire Detection · Temporal CV · Drone Integration

● Satellite / Remote Sensing

Change Detection in Multitemporal Satellite Imagery

Siamese network architecture for detecting land-use and structural changes between multitemporal satellite image pairs. Deployed for flood damage assessment and urban expansion monitoring.

Siamese Net · Flood Assessment

Change Detection · Siamese · Multitemporal

● Edge AI

YOLOv11-EdgeSuite: Universal Mobile Vision Toolkit

Android mobile vision toolkit built in Kotlin using YOLOv11 for detection and segmentation on smartphones. Real-time inference via TensorFlow Lite with camera API integration.

Android · Real-Time · TF Lite

YOLOv11 · Kotlin · TF Lite · Android

● Edge AI

Edge/Corner Detection on Raspberry Pi 4

GPU-accelerated Harris Corner and Sobel edge detection for real-time video processing on Raspberry Pi 4. Adaptive thresholding for robust low-light performance.

GPU Accelerated · Real-Time Video

Harris Corner · Sobel · Raspberry Pi 4 · OpenCV
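
For readers unfamiliar with the Sobel operator, here is a plain CPU reference sketch of the gradient-magnitude computation on a small grayscale grid. It is not the project's GPU-accelerated implementation; the function name and the toy image are assumptions, and on a Raspberry Pi the equivalent OpenCV call (`cv2.Sobel`) would normally be used.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2D list of gray values; border pixels stay 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    value = image[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * value
                    gy += SOBEL_Y[j][i] * value
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy image with a vertical step edge between the second and third columns
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)  # interior pixels next to the edge respond with 40.0
```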

● Edge AI

Autonomous Driving Perception — CARLA + Openpilot

Full AV perception stack integrating object detection, lane segmentation, and depth estimation within the CARLA simulator. Openpilot-compatible policy evaluation for urban and highway scenarios.

CARLA Sim · Full Stack

CARLA · Openpilot · Depth Est. · Lane Seg

● Edge AI

Sign Language Recognition — Real-Time ASL

MediaPipe Holistic + LSTM temporal classifier for real-time American Sign Language recognition. Deployed on embedded hardware with under 20ms latency and 94% word-level accuracy.

94% Word Acc. · <20ms Latency

MediaPipe · LSTM · ASL · Edge Deploy

Amir Soltani

Passion Meets Precision

Skills & Expertise

Featured Projects

Research Contributions

By The Numbers

GitHub Stats

Trophies

Top Languages

Let's Build Something Extraordinary
