Stream temperature is thought to be a primary determinant of the macro‐spatial distribution of stream biota, but we lack temperature data for most streams. Past research often relied on surrogates, such as altitude, latitude and catchment area (ALCA) or air temperature, to study biota–temperature relationships. However, temperature surrogates may not accurately represent the thermal environments experienced by the biota and could thus produce misleading inferences regarding such relationships. In the absence of observations, modelled stream temperature could improve both predictions and interpretations of stream biodiversity patterns. We tested this hypothesis by relating stream benthic invertebrate assemblage structure and composition at 92 reference‐quality streams to ALCA, air temperature, modelled stream temperature and measured stream temperature, that is, a progression from a coarse surrogate to directly measured water temperature. Modelled stream temperatures were obtained from a U.S.A.‐wide model developed previously with data from 569 reference‐quality sites. Variation in taxonomic composition, measured with an ordination, was strongly and almost identically associated with both modelled and measured stream temperature, but was less strongly associated with ALCA and air temperature. We also built predictive niche models to assess how choice of thermal metrics affected model performance. Model performance was measured as the precision with which each model predicted the number of taxa at a site (i.e. observed‐to‐expected taxon ratios). Niche models that contained modelled or measured stream temperature were more precise than those based on air temperature and ALCA. In addition, we compared the predicted probabilities of occurrence from niche models based on modelled and on measured stream temperatures. The two produced predicted probabilities of occurrence that were statistically indistinguishable for most (79%) taxa.
Finally, we calculated thermal optima for each taxon as the abundance‐weighted average of stream temperature at sites where each taxon occurred. Thermal optima produced with modelled and measured stream temperatures were almost identical. Large‐scale models of stream temperature can be sufficiently accurate and precise to detect temperature‐driven patterns in stream biodiversity and should be useful in predicting the effects of climate change and other human‐caused thermal alterations on stream biodiversity.
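
As an illustration of the thermal optimum described above, the abundance‐weighted average can be sketched as follows. This is a minimal sketch, not the authors' code: the function name `thermal_optimum`, the site temperatures and the abundance counts are all hypothetical values chosen for illustration, and sites where a taxon is absent simply contribute zero weight.

```python
def thermal_optimum(abundances, temperatures):
    """Abundance-weighted mean stream temperature for one taxon.

    abundances   -- counts of the taxon at each site (hypothetical data)
    temperatures -- stream temperature at the same sites, same order
    Returns None when the taxon was not collected at any site.
    """
    if len(abundances) != len(temperatures):
        raise ValueError("abundances and temperatures must align by site")
    total = sum(abundances)
    if total <= 0:
        return None  # optimum is undefined for a taxon that never occurs
    weighted_sum = sum(a * t for a, t in zip(abundances, temperatures))
    return weighted_sum / total


# Hypothetical example: three sites, two taxa (illustrative values only).
site_temps_c = [8.0, 12.0, 18.0]   # stream temperature, degrees C
taxon_a = [10, 30, 0]              # mostly at the cooler sites
taxon_b = [0, 5, 15]               # mostly at the warmer sites

optimum_a = thermal_optimum(taxon_a, site_temps_c)
optimum_b = thermal_optimum(taxon_b, site_temps_c)
print(optimum_a, optimum_b)
```

Running the same function once with modelled and once with measured temperatures for the same sites would give the two sets of optima that the comparison above refers to.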