Most of the world's rivers and streams are ungauged, meaning we have no direct flow measurements for them. This foundational paper, co-authored by our CTO, Alden Keefe Sampson, shows that deep learning can bridge that gap where traditional hydrological models have struggled.
The team trained Long Short-Term Memory (LSTM) networks on 531 basins from the CAMELS dataset, which provides roughly 30 years of daily rainfall-runoff records from catchments spanning the continental US. When tested on basins withheld entirely from training (the true prediction-in-ungauged-basins setting), the LSTM outperformed both the calibrated Sacramento Soil Moisture Accounting Model and NOAA's National Water Model reanalysis, each of which had access to basin-specific data the LSTM did not.
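To make "withheld entirely from training" concrete, here is a minimal sketch of a basin-level split, where whole basins (not individual days) are held out. The function name `basin_folds`, the placeholder basin IDs, and the fold count are assumptions for illustration, not details taken from the paper.

```python
import random


def basin_folds(basin_ids, k, seed=0):
    """Yield (train_basins, test_basins) pairs for a k-fold split over basins.

    Every basin lands in exactly one test fold, so a model evaluated on a
    test fold has never seen any data from those basins.
    """
    ids = sorted(basin_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    folds = [ids[i::k] for i in range(k)]  # k roughly equal groups
    for i, test in enumerate(folds):
        train = [b for j, fold in enumerate(folds) if j != i for b in fold]
        yield train, test


# Hypothetical IDs standing in for the 531 CAMELS basins.
basins = [f"basin_{i:03d}" for i in range(531)]
# k=12 is an illustrative choice only.
splits = list(basin_folds(basins, k=12))
```

Splitting by basin rather than by time is what makes the evaluation a stand-in for ungauged prediction: the test basins contribute no streamflow records to training.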
The result: a median Nash-Sutcliffe Efficiency (where 1.0 is a perfect fit) of 0.69 for the LSTM in ungauged basins, compared to 0.64 for the calibrated Sacramento Soil Moisture Accounting Model. This paper established a new benchmark for what's achievable in ungauged basins and underpins the approach at the core of HydroForecast.
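For readers unfamiliar with the metric, the sketch below shows how a Nash-Sutcliffe Efficiency and its median across basins could be computed. It uses the standard definition (one minus the ratio of squared model error to the variance of the observations about their mean). The function names and the toy numbers are assumptions for illustration and are not data from the paper.

```python
from statistics import mean, median


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency for one basin.

    1.0 is a perfect match; 0.0 means the model is no better than always
    predicting the mean of the observations; negative values are worse.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    obs_mean = mean(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_variance = sum((o - obs_mean) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - squared_error / obs_variance


def median_nse(per_basin):
    """Median NSE over a dict of basin_id -> (observed, simulated)."""
    return median(nse(obs, sim) for obs, sim in per_basin.values())


# Toy flows for three hypothetical basins.
toy = {
    "basin_a": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),  # perfect simulation
    "basin_b": ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),  # predicts the mean
    "basin_c": ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),  # anti-correlated
}
summary = median_nse(toy)
```

Reporting the median rather than the mean keeps a handful of very poorly simulated basins, whose NSE can be strongly negative, from dominating the summary.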