Wednesday, December 14, 2022

Calculating Zonal Statistics for Overlapping Polygons in ArcGIS

There are many circumstances where we are interested in obtaining area statistics yet the areas overlap. These might involve extracting land use data in cases where some of the features of interest are overlapping or even highly overlapping.  There are three ways to go about this in ArcGIS, all of which are deeply flawed. One approach is zonal statistics.  However, the standard zonal statistics tool will lead to erroneous results when the zones overlap. The second approach is to use a moving window followed by extracting the values to points. This approach is useful in many cases, especially if you intend to use the raster generated from this step for some other purpose, such as predictive mapping (i.e. habitat modeling). However, this approach is slow and may not be needed if you are just interested in the values themselves. The third approach is to loop through all points, buffer them, clip the raster, and append the values to a table. This approach is also slow as it involves looping through what may be hundreds or even thousands of features.

The solution demonstrated here involves using the zonal statistics tool to calculate mean values accounting for overlapping areas. The values are then joined back using a spatial join and the dissolve tool using a weighted summation. The result is a tool that is faster than the other approaches and works for irregular polygons, such as nested watersheds.

Here is a brief tutorial on how to do this in ArcGIS:

Step 1 – Generate polygons through buffering or watershed delineation

 

In the example above I have seven points (in green) which I’ve buffered to produce seven overlapping buffers shown by the green lines. The underlying raster is the Hansen et al. 2013 global tree cover layer. I’m hoping to get average tree cover for each buffer. In this case I buffered each point using a 30 km buffer opting not to dissolve any of the buffers. Other options might involve delineating irregular polygons such as nested watersheds.

Step 2 – Use the Extract by Mask tool to clip the raster and reduce the size

 

This is an optional step, but by reducing the size of the raster that the zonal statistics will run on you can speed the process up. In this example I’ve clipped the raster to the buffered polygons. 

Step 3 – Run the Union tool using the Join Attributes = ALL

In this case the number of polygons has increased from seven to forty.


Step 4 – Run the Zonal Statistics as Table tool

Using the zonal statistics as table tool I calculate the average tree cover in each of the polygons. I use the union polygon from the previous step as the feature zone data.  I specify OBJECTID (for geodatabases, FID for shapefiles).  The value raster is the tree cover. I select MEAN as the statistic. Note that the zonal statistics operation does not perform the operation on duplicate geometries. Hence, in this case we get twenty rows rather than the full forty that are in the unioned polygon feature class.

Step 5 – Join the zonal statistics table back to the unioned polygons

To get the information from the zonal statistics table back to the unioned polygons I perform  a join. I choose FID (OID if geodatabase) as the first join field, select the zonal statistics table, and select OBJECTID_1 as the second join field and select the “keep only matching records” option.

Step 6 – Export to a new feature class

Right click on the unioned polygons with the join and export to a new feature class. Add the new feature class back into the map.

Step 7 – Add a new field called Prop and calculate it from the AREA field

Create a new field called Prop and specify type=double. Calculate Prop by dividing AREA by a value that represents the entire area. Using the map I identified polygons that made up an entire buffer. In this case the value was 0.232486.

Step 8 – Perform a spatial join to join the feature class from the previous step to the original buffers

Perform a spatial join with the buffers as the target features and the join feature class as the feature class from the previous step. Select the JOIN_ONE_TO_MANY option, keep all target features, and CONTAINS for the match option.

Spatial join - target = original buffers, join = exported, one to many, CONTAINS

Step 9 – Create a new field called Mult and calculate it as MEAN * Prop

Create a new field called Mult and calculate it as MEAN * Prop.

Step 10 – Dissolve the spatial join feature class using the TARGET_FID

Use the spatial join output from the previous step as the input feature class. Use the Target_FID as the dissolve field.  For the statistics field select Mult and then for the statistic select SUM. The map at the bottom of the page shows high tree cover in red.

 

 

Step 11 – Run the feature to point tool to convert the buffers back into points

Run the feature to point tool to convert the buffers back into points. Leave the Inside check box UNCHECKED. Symbolize the new point features SUM_MULT field or better yet create a new field and give it a better name.  See the example below. The larger green circles represent higher levels of tree cover within the 30 km buffers which is shown with the white colors on the underlying raster map.