Geographic Information System (GIS) mapping has become a popular and useful technique for identifying areas of need, as well as the health and social contributors to those needs. However, the level of data and the usefulness of the resulting images can be a major concern. Using de-identified individual residence data, Children’s Optimal Health (COH) creates neighborhood-level maps that reveal hot spots, in contrast to the diffuse image that results from the more limited data available at geopolitical levels (state, county, city, zip code). The maps COH produces illustrate true geographies that show concentrations of concern. Analyses can then be used strategically to target resources and action, enhance grant funding proposals, and track change over time. Time-series analysis is important to program and project evaluation and increases the ability to identify trends.
Our Methodology for Privacy Protection
Children’s Optimal Health employs several levels of privacy protection when mapping sensitive data, especially data protected under HIPAA or FERPA. Personal identifiers are removed from data tables, geographic points are randomly shifted, point data is rasterized, and points within small cells are redacted. When creating choropleth maps by zip code or census tract, we suppress data wherever a zone contains only a small number of points. When creating raster maps, we apply an algorithm that randomly shifts points before the raster is created. In this way, we protect privacy while preserving accuracy at the neighborhood level.
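The small-count suppression described above for choropleth maps can be sketched as follows. This is a minimal illustration, not COH's actual procedure: the threshold MIN_COUNT, the function name, and the zone labels are all assumptions, since the text does not state what counts as a "small" number of points.

```python
from collections import Counter

# Hypothetical threshold: the source does not say what "small" means.
MIN_COUNT = 5

def suppressed_counts(zone_ids, min_count=MIN_COUNT):
    """Count records per zone; counts below min_count are replaced with None."""
    counts = Counter(zone_ids)
    return {zone: (n if n >= min_count else None) for zone, n in counts.items()}

# Illustrative records, each tagged with a made-up zone label
# (standing in for a zip code or census tract).
records = ["zone_a"] * 12 + ["zone_b"] * 3 + ["zone_c"] * 7
result = suppressed_counts(records)
print(result)
```

A suppressed zone is reported as None rather than 0 here so that a map legend can distinguish "data withheld" from "no cases"; that choice is also an assumption.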
Raster Layer Methodology for Density and Proportion Maps
Many Children’s Optimal Health (COH) maps display the density distribution of a particular population of interest. Density maps show where high concentrations of the mapped population live. All COH density maps are rendered from raster datasets. Our GIS tool, ESRI’s ArcMap, supports a variety of ways to calculate and display density maps. We use a methodology that we believe strikes a proper balance between accuracy and ease of interpretation without compromising individual privacy.
We adjust our grid cell parameters to smooth the distribution of cell values so that hotspots are easier to distinguish visually, while retaining enough locality to be meaningful at the neighborhood level. All cells with small values are suppressed to protect individual privacy. The definition of “small” may change depending on the overall population density of the surrounding area. Density is expressed as the number of individuals within a circle of 300-yard radius.
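As a rough sketch of this kind of density surface, the following pure-Python example counts, for each grid cell, the points that fall within 300 yards of the cell center and then suppresses small values. It is not ArcMap's implementation. The cell size, the suppression threshold, and the choice to set suppressed cells to 0 are assumptions made for illustration, and coordinates are assumed to be planar and measured in yards.

```python
import math

RADIUS_YARDS = 300.0      # search radius stated in the text
CELL_SIZE_YARDS = 100.0   # hypothetical cell size
MIN_COUNT = 3             # hypothetical definition of a "small" value

def density_grid(points, n_rows, n_cols, cell_size=CELL_SIZE_YARDS,
                 radius=RADIUS_YARDS, min_count=MIN_COUNT):
    """Return a grid of point counts within `radius` of each cell center.

    Counts below `min_count` are suppressed (set to 0).
    """
    grid = []
    for row in range(n_rows):
        center_y = (row + 0.5) * cell_size
        values = []
        for col in range(n_cols):
            center_x = (col + 0.5) * cell_size
            count = sum(
                1 for (px, py) in points
                if math.hypot(px - center_x, py - center_y) <= radius
            )
            values.append(count if count >= min_count else 0)
        grid.append(values)
    return grid

# Four points clustered near (150, 150) and one isolated point at (950, 950).
points = [(140.0, 150.0), (160.0, 150.0), (150.0, 140.0), (150.0, 160.0),
          (950.0, 950.0)]
grid = density_grid(points, n_rows=10, n_cols=10)
print(grid[1][1], grid[9][9])
```

Because the search radius is larger than the cell, neighboring cells share points, which is what smooths the surface; the isolated point never reaches the threshold and so does not appear on the map.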
To meet privacy-protection requirements for individuals’ data, the latitude and longitude of each residence location are randomly shifted within a radius determined by the demographics of the surrounding geography. This shifting can introduce significant errors in density values at the cell level. At the neighborhood level, however, for example in a one-mile by one-mile zone, a shift of up to 300 feet does not significantly alter the overall distribution of the population.
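One common way to shift a point randomly within a fixed radius is sketched below. The text does not describe the algorithm COH uses, so this is only an assumed approach: it draws a uniformly random location inside a disk around the original point. For simplicity it assumes planar coordinates in feet rather than latitude and longitude, and the function name and sample coordinates are hypothetical.

```python
import math
import random

def shift_point(x, y, max_shift, rng):
    """Move (x, y) to a uniformly random location within max_shift of it."""
    # Taking the square root spreads shifted points evenly over the disk's
    # area instead of clustering them near the center.
    distance = max_shift * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return x + distance * math.cos(angle), y + distance * math.sin(angle)

rng = random.Random(0)  # seeded only so the example is repeatable
original = [(1000.0, 2000.0), (1500.0, 2500.0), (4000.0, 100.0)]
shifted = [shift_point(x, y, 300.0, rng) for x, y in original]

# Every point moves by no more than the 300-foot limit used in the text.
displacements = [
    math.hypot(sx - x, sy - y)
    for (x, y), (sx, sy) in zip(original, shifted)
]
print(displacements)
```

In practice the seed would not be fixed or published, since a known seed would allow the shifts to be reversed.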
Another kind of COH map that describes a population’s distribution is a proportion map. COH proportion maps display the ratio of a specific subset of a population to the full population.
Each proportion map is derived from two or more density maps. Since proportion maps display ratios, the calculation of a proportion map’s cell value involves dividing cell values from one set of density maps by the cell values of another set of density maps. For example, an obesity proportion map is derived by dividing the density map of overweight and obese students by the density map of all students with a BMI score, so that each individual cell’s count of students with high BMI is divided by its corresponding count of all students.
All density maps undergo a reclassification process before they are used to derive proportions. Small density cell values are reclassified to 0 (zero) in order to remove them from the calculation and thus protect confidentiality. Since this reclassification occurs before proportion values are calculated, there is no further need to hide any cells in the proportion maps to protect privacy.
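The reclassify-then-divide sequence can be illustrated with two small grids. This is a sketch under stated assumptions rather than the ArcMap workflow itself: the threshold MIN_COUNT is hypothetical, the cell values are invented for the example, and reporting None where the denominator is zero is a choice made here because the text does not say how such cells are displayed.

```python
MIN_COUNT = 3  # hypothetical definition of a "small" density value

def reclassify(grid, min_count=MIN_COUNT):
    """Set every cell value below min_count to 0."""
    return [[v if v >= min_count else 0 for v in row] for row in grid]

def proportion_grid(subset, full, min_count=MIN_COUNT):
    """Divide the reclassified subset grid by the reclassified full grid.

    Cells whose denominator is 0 after reclassification are returned as
    None (no value to display).
    """
    subset_r = reclassify(subset, min_count)
    full_r = reclassify(full, min_count)
    return [
        [(s / f) if f > 0 else None for s, f in zip(s_row, f_row)]
        for s_row, f_row in zip(subset_r, full_r)
    ]

# Invented cell counts: students with high BMI, and all students with a BMI score.
high_bmi = [[6, 2], [0, 10]]
all_students = [[12, 8], [1, 40]]

proportions = proportion_grid(high_bmi, all_students)
print(proportions)
```

In the top-right cell the subset count of 2 is reclassified to 0 before division, so the proportion shown is 0 rather than 2/8, and the small count is never exposed.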