Available for full-time roles

Dhananjay Sharma

Software engineer with 5+ years building production ML systems and real-time data pipelines — most recently as a research engineer at Stony Brook's Neurobiology Lab, where I wrote Python pipelines for calcium imaging analysis targeting a Nature publication.

4+ yrs industry exp
3.92 MS Data Science · Stony Brook
6 portfolio projects

about me

The story so far

I'm a software engineer with 4+ years of industry experience and an MS in Data Science from Stony Brook University (GPA 3.92, May 2026).

My background spans two distinct chapters. The first was production engineering: building fintech infrastructure at CGI and later at REGART, a RegTech startup in Amsterdam whose compliance platform served EY and HSBC across European financial markets. The second was research: joining Prof. Prerana Shrestha's Neurobiology lab at Stony Brook and, with no prior neuroscience background, building a fiber photometry analysis pipeline from scratch to study how fear memories form in mice.

Those two chapters have more in common than it might seem. Both required learning a domain fast enough to make decisions that actually mattered, then building something reliable enough that real people could depend on it daily.

I'm now looking for ML engineering and SDE roles in AI infrastructure, production ML systems, and applied AI — particularly problems where getting from research to production is the part nobody has fully solved yet.

🎓
MS Data Science · Stony Brook University
GPA 3.92 · Graduated May 2026 · IIT (ISM) Dhanbad undergrad (Math & Computing)
🏗️
Production Engineering Background
CGI Bangalore + REGART Amsterdam · Finance trading platform + RegTech compliance SaaS
🧪
Neuroscience Research
Fiber photometry pipeline · GCaMP8m imaging · Fear memory analysis · Behavioral assays

projects

Things I've built

Six production systems across AI observability, real-time infra, multi-agent LLMs, and MLOps. Not tutorials — shipped code.

DeepLabCut · Mouse Behavioral Analysis · Shrestha Lab
[Video gallery: DLC Video 1 · DLC Video 2 · DLC Video 3 · DLC Video 4]
MLOps · Fine-tuning · Live on HuggingFace
LLMOps Pipeline

End-to-end LLM lifecycle: QLoRA fine-tuning Phi-3-mini-4k on MedAlpaca (10,178 samples), automated eval gates with hard quality thresholds, FastAPI serving, and Prometheus + Grafana production monitoring. Fine-tuned model is live on HuggingFace Hub.

Model live on HF Hub · Automated eval gate
QLoRA · Phi-3-mini-4k · MLflow · FastAPI · Prometheus · Grafana · HuggingFace · Docker
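
To make "eval gate with hard quality thresholds" concrete, here is a minimal sketch of the idea. The metric names, threshold values, and the `Threshold` and `eval_gate` helpers are assumptions for illustration, not the project's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """A hard lower bound on one metric (names and values are illustrative)."""
    metric: str
    minimum: float


def eval_gate(metrics: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return failure messages; an empty list means the candidate passes."""
    failures = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            # A metric that was never computed counts as a failure.
            failures.append(f"{t.metric}: missing")
        elif value < t.minimum:
            failures.append(f"{t.metric}: {value:.3f} < {t.minimum:.3f}")
    return failures


# Hypothetical thresholds and scores, for illustration only.
THRESHOLDS = [Threshold("exact_match", 0.60), Threshold("rouge_l", 0.35)]
candidate = {"exact_match": 0.64, "rouge_l": 0.31}

failures = eval_gate(candidate, THRESHOLDS)
if failures:
    print("gate failed:", "; ".join(failures))
else:
    print("gate passed")
```

In a pipeline, a non-empty failure list would typically block the model from being promoted to serving.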
Multi-agent · LLM · Finance
FinSight

3-agent pipeline that generates investment research memos from live SEC EDGAR filings. Agent 1 fetches data, Agent 2 writes the memo, Agent 3 (Critic) fact-checks and flags hallucinations before output. Groq Llama 3.3 70B. Type a ticker, get a PDF memo in 30 seconds.

Critic Agent fact-check · Live SEC data
LangGraph · Groq Llama 3.3 70B · SEC EDGAR · pgvector · PostgreSQL · React · Docker
GraphRAG · Research Tool
NeuroAssist

Production RAG agent built for Shrestha Lab (SUNY Stony Brook). Routes questions across four retrieval strategies: paper RAG, code RAG, knowledge graph (NetworkX), and live PubMed search. CrossEncoder reranking. RAGAS faithfulness 0.9–1.0.

RAGAS 0.9–1.0 · 4-strategy routing
LangGraph · ChromaDB · NetworkX · CrossEncoder · PubMed API · FastAPI · Docker
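
The four-strategy routing can be pictured with a small sketch. The keyword rule below is a stand-in I chose for illustration; how the real router decides is not described here, and the retriever functions are placeholders rather than the project's code.

```python
from typing import Callable


# Placeholder retrievers: each stands in for one of the four strategies.
def paper_rag(q: str) -> str:
    return f"[paper RAG] {q}"


def code_rag(q: str) -> str:
    return f"[code RAG] {q}"


def knowledge_graph(q: str) -> str:
    return f"[knowledge graph] {q}"


def pubmed(q: str) -> str:
    return f"[PubMed search] {q}"


# Hypothetical keyword rules, checked in order; paper RAG is the fallback.
ROUTES: dict[str, tuple[tuple[str, ...], Callable[[str], str]]] = {
    "code_rag": (("function", "script"), code_rag),
    "knowledge_graph": (("related to", "connected"), knowledge_graph),
    "pubmed": (("latest", "recent", "pubmed"), pubmed),
}


def route(question: str) -> str:
    """Return the name of the strategy that should handle the question."""
    q = question.lower()
    for name, (keywords, _) in ROUTES.items():
        if any(k in q for k in keywords):
            return name
    return "paper_rag"


def answer(question: str) -> str:
    """Dispatch the question to the chosen retriever."""
    name = route(question)
    handler = ROUTES[name][1] if name in ROUTES else paper_rag
    return handler(question)


print(answer("Any recent studies on GCaMP8m?"))
```

A learned or LLM-based classifier could replace `route` without changing the dispatch step.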
ML · Time Series · Streamlit
Stock Price Prediction

End-to-end stock price forecasting app using Random Forest regression on Yahoo Finance data. Computes 20/50/200-day moving averages, trend identification, and volume indicators. Interactive Streamlit dashboard with next-day price predictions and technical overlays.

Random Forest · yfinance · Streamlit · scikit-learn · Pandas · Plotly
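
As a rough sketch of the moving-average features mentioned above, the snippet below computes 20/50/200-day simple moving averages with pandas on synthetic prices. The column names, the `add_moving_averages` helper, and the trend rule (short average above medium average) are assumptions for illustration, not necessarily what the app does.

```python
import numpy as np
import pandas as pd

WINDOWS = (20, 50, 200)


def add_moving_averages(close: pd.Series) -> pd.DataFrame:
    """Return closing prices alongside one simple moving average per window."""
    out = pd.DataFrame({"close": close})
    for w in WINDOWS:
        # rolling(w).mean() stays NaN until w observations are available.
        out[f"sma_{w}"] = close.rolling(window=w).mean()
    # Assumed trend rule: the 20-day average sits above the 50-day average.
    out["uptrend"] = out["sma_20"] > out["sma_50"]
    return out


# Synthetic, steadily rising prices stand in for downloaded market data.
close = pd.Series(np.linspace(100.0, 130.0, 250))
features = add_moving_averages(close)
print(features.tail(3))
```

Rows with NaN averages would usually be dropped before fitting a regressor on these features.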

experience

Where I've worked

Production systems, research pipelines, and regulatory tech — across three countries.

Stony Brook University · Shrestha Lab · Jun 2025 – May 2026
Senior Research Aide · Neurobiology & Behavior
  • Built full fiber photometry pipeline from scratch for GCaMP8m calcium imaging analysis in WT and cHET mice across SAA and LTM sessions
  • Implemented two-level median time-warping, group behavior panels, and CS AUC matched-epoch Excel outputs in modular Python package
  • Developed automated behavioral analysis pipelines for NOR and Open Field assays using DeepLabCut on SeaWulf HPC (SLURM, V100)
REGART · Amsterdam · 2020 – 2024
Software Developer · RegTech
  • Architected and developed compliance SaaS platform used by EY and HSBC employees across European financial institutions
  • Built data models for regulatory documents (MiFID II, EMIR) and optimized UI navigation for document management workflows
  • Ensured regulatory compliance across document ingestion, versioning, and audit trail features
CGI · Bangalore · 2018 – 2019
Software Developer · Fintech
  • Worked on a finance trading platform integrating real-time currency streaming and multi-window trading interfaces
  • Improved system performance and responsiveness through backend service optimization and frontend state management

skills

Tech stack

Not a buzzword dump — these are tools I've shipped with.

Languages
Python · Go · SQL · JavaScript · Bash
ML / AI
PyTorch · LangGraph · QLoRA · RAG · ChromaDB · pgvector · MLflow
Data Infra
Apache Kafka · PySpark · PostgreSQL · Redis · Airflow · dbt
Platform
Docker · SLURM / HPC · AWS · GraphQL · Prometheus
Computer Vision
PointPillars · BEVFusion · nuScenes · KITTI · DeepLabCut · OpenCV
Research
Fiber Photometry · GCaMP8m · RAGAS Eval · Statistical Analysis

contact

Let's talk

Open to opportunities

ML Engineering · SDE · AI Infrastructure · Applied AI