Scope/Broad Description

Create factors/correlations for the numbers we are seeing for positive test rates and mortality percentages by city/county with other factors such as the covid tests that are being used in each city/county, testing/reporting methodologies , staffing levels, percentage of pop tested, etc.

Example: Florida and Florida counties that are right next to each other and have overlapping zip codes are having strikingly different positive test rates (6 % vs 13%).

Tecnique

factor analysis models / linear regression models

Description of the Experiment

Linear regression data modeling would involve identifying the factors potentially influencing the positive test rates . These would be implemented as the feature vector for the model. We would collect this feature data for the communities/regions we have positive test rates for. Then ,we would use this factor data and positive test rate data to "Train" and "test" our data model . Once the data model is accurately predicting the positive test rates, we can apply linear regression to identify how the factors are correlating with the positive test rate . To be accurate , we need to include as many potential factors as possible and get this data for as many regions as possible

Data Points

Some of the data is made publicly available at the case level by zipcode/county. Other info (like test used/methodologies/processes/staffing levels) would need to be gathered by calling the test centers in each county.

Variables to consider

Predicted

Predictors-Factors