Spatial Estimation of Seasonal Variability in Chronic Obstructive Pulmonary Disease Prevalence Based on Geospatial Machine Learning Procedures

An Approach Using Earth Engine-Based Remote Sensing Data, Health Risk Behavior, and Socio-Demographic Variables Across Census Tracts in Pennsylvania

Abstract

Estimating disease prevalence at a sub-city neighborhood scale enables early, targeted interventions that can save lives and reduce public health burdens. However, accurately identifying neighborhood-level disease prevalence is challenging due to the cost-prohibitive nature of localized data collection, difficulties in accessing confidential medical records, and the complexity of interactions between multiple potential determinants.

To address these challenges and capture a multidimensional view of the determinants of chronic respiratory diseases, this study uses machine learning models to investigate the environmental, socio-demographic, and behavioral predictors of chronic obstructive pulmonary disease (COPD) at the census tract level in Pennsylvania. Localizing the analysis to this geographic granularity allows a more nuanced understanding of spatial disparities in respiratory health. Drawing on tobacco retailer density data, census records, health risk behavior surveys, seasonally classified land cover datasets, and satellite-derived vegetation indices, the study identifies the most significant predictors of COPD prevalence. It also evaluates the performance of various analytical techniques, including multilayer perceptron, support vector regression, and random forest regression.
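As a rough illustration of how the three model families named above could be compared, here is a minimal sketch using scikit-learn. It is not the study's pipeline: the data are synthetic, and the five predictors, hyperparameters, train/test split, and the RMSE metric are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for tract-level data (NOT the study's data).
# Each row is a hypothetical census tract; columns are hypothetical predictors.
rng = np.random.default_rng(0)
n_tracts = 400
X = rng.normal(size=(n_tracts, 5))
# Hypothetical prevalence (%) with linear terms, one interaction, and noise.
y = (
    6.0
    + 1.2 * X[:, 0]
    - 0.8 * X[:, 1]
    + 0.5 * X[:, 2] * X[:, 3]
    + rng.normal(scale=0.5, size=n_tracts)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The MLP and SVR are scale-sensitive, so they are wrapped with a scaler.
# All hyperparameters below are illustrative assumptions.
models = {
    "multilayer perceptron": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
    ),
    "support vector regression": make_pipeline(
        StandardScaler(), SVR(kernel="rbf", C=10.0)
    ),
    "random forest regression": RandomForestRegressor(
        n_estimators=200, random_state=0
    ),
}

# Fit each model and record its held-out root mean squared error.
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, pred)))

# Print models from lowest to highest error.
for name, score in sorted(rmse.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {score:.3f}")
```

With real tract data, a spatially aware validation scheme (for example, holding out whole counties) would likely be a better choice than a random split, because neighboring tracts tend to be correlated.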

GitHub Repository: MUSA550-PA-CRD-Prediction

Contributor:

Emily Zhou

Department of City and Regional Planning, Stuart Weitzman School of Design, University of Pennsylvania

Key Words:

chronic respiratory disease, support vector machine, random forest, multilayer perceptron, deep learning, Bayesian information criterion, Google Earth Engine, land cover, geospatial health
