ML/AI · Mar 2026 – Present

NeoRx

Causal drug-target discovery — Pearl's do-calculus across 8 biomedical databases, end-to-end molecular generation pipeline.

Role

Solo author

Links

GitHub

Python · PyTorch · PyTorch Geometric · DoWhy · RDKit · FastAPI · Postgres · Redis · Docker · Stable-Baselines3

Problem

What was broken.

Most ML pipelines for drug discovery are correlative: they learn associations, not causes, which leads to expensive false positives downstream. The goal is a pipeline that reasons causally about targets and generates molecules conditioned on desired properties.

Architecture

How it's wired.

A causal graph spans 8 biomedical sources. A SMILES-based BiGRU VAE (4.1M params) generates candidate molecules conditioned on QED, logP, and molecular weight. A hierarchical RL agent (UCB1 over targets, CEM in latent space) explores the 244-dim observation space against a 6-objective reward. AutoDock Vina docking runs in parallel; results feed back via Redis.
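The two-level search described above can be sketched with the standard library alone. This is a toy illustration, not the project's code: the target names, the 2-D latent space, and `toy_reward` are hypothetical stand-ins for the real targets, VAE latent space, and 6-objective reward. UCB1 chooses which target to work on, and CEM then searches the latent space for that target.

```python
import math
import random

# Hypothetical targets: (latent optimum, best achievable reward in [0, 1]).
TARGETS = {
    "target_a": ([1.0, -1.0], 0.9),
    "target_b": ([0.5, 0.5], 0.6),
    "target_c": ([-1.0, 0.0], 0.3),
}

def toy_reward(z, optimum, ceiling):
    """Stand-in reward: peaks at `optimum`, bounded by `ceiling`."""
    d2 = sum((a - b) ** 2 for a, b in zip(z, optimum))
    return ceiling * math.exp(-d2)

def ucb1_select(counts, means, c=math.sqrt(2)):
    """UCB1: try each arm once, then maximise mean + c * sqrt(ln(t) / n)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [m + c * math.sqrt(math.log(total) / n) for m, n in zip(means, counts)]
    return max(range(len(counts)), key=scores.__getitem__)

def cem_search(score_fn, dim, iters=20, pop=64, elite_frac=0.25, rng=None):
    """Cross-entropy method: refit a diagonal Gaussian to the top samples."""
    rng = rng or random.Random(0)
    mu, sigma = [0.0] * dim, [1.0] * dim
    n_elite = max(1, int(pop * elite_frac))
    best_z, best_score = None, -math.inf
    for _ in range(iters):
        samples = [[rng.gauss(m, s) for m, s in zip(mu, sigma)] for _ in range(pop)]
        ranked = sorted(samples, key=score_fn, reverse=True)
        top_score = score_fn(ranked[0])
        if top_score > best_score:
            best_z, best_score = ranked[0], top_score
        elites = ranked[:n_elite]
        mu = [sum(z[i] for z in elites) / n_elite for i in range(dim)]
        # Floor sigma so the search distribution never collapses to a point.
        sigma = [
            max(1e-3, math.sqrt(sum((z[i] - mu[i]) ** 2 for z in elites) / n_elite))
            for i in range(dim)
        ]
    return best_z, best_score

def run(rounds=30, seed=0):
    """Outer loop: UCB1 picks a target, a short CEM run yields its reward."""
    rng = random.Random(seed)
    names = list(TARGETS)
    counts, means = [0] * len(names), [0.0] * len(names)
    for _ in range(rounds):
        arm = ucb1_select(counts, means)
        optimum, ceiling = TARGETS[names[arm]]
        _, reward = cem_search(
            lambda z: toy_reward(z, optimum, ceiling), dim=2, iters=5, pop=32, rng=rng
        )
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running mean
    return dict(zip(names, counts)), dict(zip(names, means))

if __name__ == "__main__":
    print(run())
```

UCB1 assumes rewards bounded in [0, 1], which is why the toy reward is scaled that way; a real multi-objective reward would presumably need similar normalisation before it reaches the bandit.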

Build

What I shipped.

▸ Causal pipeline integrating Pearl's do-calculus across 8 biomedical DBs
▸ SMILES-based VAE: BiGRU encoder/decoder, 4.1M params, 97% molecular validity
▸ Hierarchical RL agent (UCB1 + CEM) with 6-objective adaptive reward
▸ End-to-end molecular docking (PDB → PDBQT → Vina) with parallel multiprocessing
▸ FastAPI microservice, 108 unit tests, GitHub Actions CI on Ubuntu + macOS
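To make the causal-versus-correlative distinction concrete, here is a minimal worked example of backdoor adjustment, one of the identification results that follows from do-calculus. The page does not say which identification strategy the pipeline uses, so treat this as an illustrative assumption; the variables and probabilities are invented. A confounder Z drives both X (say, high target expression) and Y (disease outcome), while X has no effect on Y at all.

```python
# Invented toy model: Z -> X and Z -> Y, with no arrow from X to Y.
P_Z = {0: 0.5, 1: 0.5}                  # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}         # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {                       # P(Y = 1 | X = x, Z = z), keyed by (x, z)
    (0, 0): 0.1, (1, 0): 0.1,
    (0, 1): 0.9, (1, 1): 0.9,
}

def p_x_given_z(x, z):
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1

def observational(x):
    """P(Y=1 | X=x): conditioning, so Z is weighted by P(Z | X=x)."""
    num = sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    den = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return num / den

def interventional(x):
    """P(Y=1 | do(X=x)) via backdoor adjustment: Z is weighted by P(Z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)

if __name__ == "__main__":
    assoc = observational(1) - observational(0)
    effect = interventional(1) - interventional(0)
    print(f"associational difference: {assoc:.2f}")
    print(f"causal effect:            {effect:.2f}")
```

In this toy model the associational difference is large while the adjusted effect is zero, which is the kind of false positive a purely correlative ranking would pass downstream.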

Outcomes

What changed.

0.474

Mean F₁ across 7 disease validations

False positives in validation set

97%

Molecular validity from VAE
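For reference, a mean F₁ across several validations is typically computed per validation and then averaged. The sketch below shows that arithmetic; the counts are made up for illustration and are not the project's results, and the page does not state whether the reported figure is averaged exactly this way.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom

def mean_f1(confusions):
    """Unweighted mean of per-validation F1 scores."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in confusions]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    # Hypothetical (TP, FP, FN) counts for three validations.
    example = [(8, 2, 2), (5, 0, 5), (0, 0, 3)]
    print(f"mean F1 = {mean_f1(example):.3f}")
```

Because the mean is unweighted, one validation with no recovered true positives pulls the average down sharply even when the others score well.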

Tradeoffs

Why these choices.

Picked SMILES over graph generation for faster training and more mature libraries. UCB1 + CEM beats vanilla PPO for sparse-reward molecular search. Chose self-contained HTML reports over a UI so collaborators can reproduce results.
