ML/AI · Mar 2026 – Present

NeoRx

Causal drug-target discovery: Pearl's do-calculus applied across 8 biomedical databases, with an end-to-end molecular generation pipeline.

Role
Solo author
Links
Python · PyTorch · PyTorch Geometric · DoWhy · RDKit · FastAPI · Postgres · Redis · Docker · Stable-Baselines3
Problem

What was broken.

ML-driven drug discovery is largely correlative: most pipelines learn associations rather than causes, which produces expensive false positives downstream. The goal was a pipeline that reasons causally about targets and generates molecules conditioned on desired properties.

Architecture

How it's wired.

A causal graph spans 8 biomedical sources. A SMILES-based BiGRU VAE (4.1M params) generates candidate molecules conditioned on QED, logP, and molecular weight. A hierarchical RL agent (UCB1 to choose among targets, the cross-entropy method, CEM, to search the latent space) explores a 244-dimensional observation space and is scored by a 6-objective reward. AutoDock Vina docking runs in parallel, and results feed back via Redis.
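A minimal sketch of the two-level search described above, under stated assumptions: UCB1 picks which target to work on, then CEM searches a continuous latent vector for that target. The function names, the toy quadratic rewards, the 4-dimensional latent space, and all hyperparameters are hypothetical stand-ins, not the project's actual agent, reward, or VAE.

```python
import math
import random


def ucb1_select(counts, means, c=math.sqrt(2)):
    """Return the index of the arm (target) with the highest UCB1 score."""
    # Pull every arm once before the confidence bound is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [m + c * math.sqrt(math.log(total) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def cem_search(score_fn, dim, iters=20, pop=64, elite_frac=0.25, seed=0):
    """Maximise score_fn over R^dim with a diagonal-Gaussian cross-entropy method."""
    rng = random.Random(seed)
    mean, std = [0.0] * dim, [1.0] * dim
    n_elite = max(1, int(pop * elite_frac))
    best_score, best_z = -math.inf, None
    for _ in range(iters):
        # Sample a population from the current Gaussian and rank it by score.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop)]
        scored = sorted(((score_fn(z), z) for z in samples),
                        key=lambda pair: pair[0], reverse=True)
        elites = [z for _, z in scored[:n_elite]]
        if scored[0][0] > best_score:
            best_score, best_z = scored[0]
        # Refit mean and std to the elite set (std floored to avoid collapse).
        mean = [sum(z[i] for z in elites) / n_elite for i in range(dim)]
        std = [max(1e-3, math.sqrt(sum((z[i] - mean[i]) ** 2 for z in elites) / n_elite))
               for i in range(dim)]
    return best_z, best_score


def hierarchical_search(target_rewards, dim=4, rounds=30, seed=0):
    """Outer loop: UCB1 over targets. Inner loop: a short CEM run per pull."""
    counts = [0] * len(target_rewards)
    means = [0.0] * len(target_rewards)
    for r in range(rounds):
        arm = ucb1_select(counts, means)
        _, score = cem_search(target_rewards[arm], dim, iters=5, pop=32, seed=seed + r)
        counts[arm] += 1
        means[arm] += (score - means[arm]) / counts[arm]  # running average
    return counts, means


def toy_reward(offset, centre):
    """Hypothetical stand-in for a per-target reward over latent vectors."""
    return lambda z: offset - sum((x - centre) ** 2 for x in z)
```

Calling `hierarchical_search([toy_reward(0.0, 0.0), toy_reward(2.0, 0.5)])` returns per-target pull counts and mean rewards; the outer bandit should spend most of its pulls on the target whose best reachable reward is higher. In a real pipeline the inner score would come from decoding the latent vector and evaluating the molecule, which this sketch does not attempt.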

Build

What I shipped.

  • Causal pipeline integrating Pearl's do-calculus across 8 biomedical DBs
  • SMILES-based VAE: BiGRU encoder/decoder, 4.1M params, 97% molecular validity
  • Hierarchical RL agent (UCB1 + CEM) with 6-objective adaptive reward
  • End-to-end molecular docking (PDB → PDBQT → Vina) with parallel multiprocessing
  • FastAPI microservice, 108 unit tests, GitHub Actions CI on Ubuntu + macOS
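The docking step in the list above can be sketched roughly as follows, assuming receptors and ligands have already been converted to PDBQT and that an AutoDock Vina executable is on the PATH as `vina` (the binary name, file paths, search box, and worker count are all assumptions). The helper names `vina_command`, `dock_one`, and `dock_batch` are hypothetical and not taken from the project.

```python
import subprocess
from multiprocessing import Pool


def vina_command(receptor, ligand, center, size, out, exhaustiveness=8):
    """Build the argument list for one AutoDock Vina run (does not execute it)."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        "vina",  # assumed executable name
        "--receptor", receptor,
        "--ligand", ligand,
        "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
        "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
        "--out", out,
        "--exhaustiveness", str(exhaustiveness),
        "--cpu", "1",  # one core per job; parallelism comes from the pool
    ]


def dock_one(cmd):
    """Run a single docking job and return (output path, process return code)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return cmd[cmd.index("--out") + 1], result.returncode


def dock_batch(commands, workers=4):
    """Run docking jobs in parallel worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(dock_one, commands)
```

`dock_batch` should be called from under an `if __name__ == "__main__":` guard so that it also works on platforms where `multiprocessing` spawns fresh interpreters. Pinning each job to one CPU and scaling by worker count is one reasonable choice; letting Vina use several cores per job with fewer workers is another.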
Outcomes

What changed.

0.474 · Mean F₁ across 7 disease validations
0 · False positives in validation set
97% · Molecular validity from VAE
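For reading the F₁ figure next to the false-positive count: with zero false positives, precision is 1 wherever any positive is predicted, so F₁ reduces to 2R / (1 + R) and is driven entirely by recall R. A small sketch of that arithmetic, assuming F₁ is computed per disease from true-positive, false-positive, and false-negative counts and then averaged with equal weight (the averaging scheme is an assumption, and the counts in the usage note are made up for illustration, not the project's data):

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_disease_counts):
    """Unweighted mean of per-disease F1 over (tp, fp, fn) tuples."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_disease_counts]
    return sum(scores) / len(scores)
```

For example, `f1_score(3, 0, 7)` has precision 1 and recall 0.3, giving 0.6 / 1.3, roughly 0.46: a conservative predictor with no false positives can still post a modest F₁ if it misses many true targets.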
Tradeoffs

Why these choices.

I picked SMILES over graph generation for training speed and library maturity. UCB1 + CEM beats vanilla PPO for sparse-reward molecular search. Self-contained HTML reports were a deliberate choice over a UI, made for the sake of reproducibility for collaborators.