What was broken.
Last-mile delivery in Nairobi suffers from informal addressing, extreme traffic congestion (Thika/Mombasa Rds), and dynamic rider capacity. Existing GPS routing ignores time-of-day patterns and cannot handle multi-stop optimization or optimal rider-order matching.
How it's wired.
Multi-resolution H3 grid (res 6 zones, res 8 navigation) spans Nairobi. Implicit road graph with 10K+ edges stores OSM priors blended with empirical traces via Bayesian updating. Q-learning agent (current/destination hex + time bucket + day type) learns traffic-aware policies. Hungarian algorithm optimizes dispatch; hierarchical TSP solver handles multi-stop routes via zone clustering.
What I shipped.
- ▸Multi-resolution H3 spatial indexing system with OSMnx road network integration
- ▸Q-learning routing agent with curriculum learning (5→15 hex hops) & reward shaping
- ▸GPS trace ingestion pipeline with validation & Bayesian weight blending (kappa=10)
- ▸Hungarian dispatch optimizer with urgency-weighted cost matrices for rider-order matching
- ▸Hierarchical multi-stop router: zone clustering (res 6) + greedy TSP per cluster
- ▸Synthetic trace generator with temporal noise modeling for traffic patterns
- ▸End-to-end benchmarking: Q-policy vs H3-direct vs random walk (2000 episodes)
- ▸Convergence visualization & system validation suite with 5 health checks
What changed.
Why these choices.
Chose H3 over quadtree for uniform hexagon shapes and better nearest-neighbor queries. Q-learning over A* to learn traffic-aware shortcuts from empirical traces. Used 2-hour buckets instead of continuous time for computational tractability. PostgreSQL over MongoDB for ACID compliance on dispatch assignments. Optimistic Q-value initialization (10s) to encourage exploration early. Matplotlib reports vs Streamlit UI for reproducibility in Jupyter notebook environment.