This is a study of gerrymandering in Alabama. We will test three shape-based compactness scores and assess the representativeness of districts based on prior presidential elections and race. We will then extend prior studies by calculating the representativeness of the convex hull of each district polygon.
- Key words: gerrymandering, compactness, convex hull, Alabama, political representation
- Subject: Social and Behavioral Sciences: Geography: Geographic Information Sciences
- Date created: 2025-02-17
- Date modified: 2025-02-17
- Spatial Coverage: Alabama OSM:161950
- Spatial Resolution: Census Block Groups
- Spatial Reference System: EPSG: 4269, NAD 1983 geographic coordinate system
- Temporal Coverage: 2020-2024 population and voting data
- Temporal Resolution: Decennial census

This is an original study based on literature on gerrymandering metrics.
This study is exploratory in design. Its goal is to evaluate the usefulness of a new gerrymandering metric based on the convex hull of a congressional district, comparing representative capability inside the convex hull with that inside the congressional district itself.
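The shape-based side of this comparison can be illustrated on a toy polygon. The sketch below is a minimal example, assuming the Polsby-Popper score and the convex hull area ratio as two representative compactness scores. This plan does not name its three scores, so that choice, the L-shaped geometry, and all variable names are hypothetical.

```r
library(sf)

# Hypothetical toy "district": an L-shaped polygon in planar units (no CRS)
l_shape <- st_polygon(list(rbind(
  c(0, 0), c(4, 0), c(4, 1), c(1, 1), c(1, 4), c(0, 4), c(0, 0)
)))
district <- st_sfc(l_shape)

# Area and perimeter of the district
area <- as.numeric(st_area(district))
perimeter <- as.numeric(st_length(st_cast(district, "MULTILINESTRING")))

# Convex hull of the district and its area
hull <- st_convex_hull(district)
hull_area <- as.numeric(st_area(hull))

# Polsby-Popper: district area relative to a circle with the same perimeter
polsby_popper <- 4 * pi * area / perimeter^2

# Convex hull ratio: share of the hull that the district actually occupies
hull_ratio <- area / hull_area

print(c(polsby_popper = polsby_popper, hull_ratio = hull_ratio))
```

Both scores range from 0 to 1, with values near 1 indicating a more compact shape. With real districts in a geographic coordinate system, the area and length functions return values with units, which is why the sketch converts them with as.numeric().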
I plan on using the following R packages:

- groundhog: for reproducible computational environments (consistent versions of R and its packages)
- here: for reproducible path names
- tidyverse: includes dplyr for database-style data frames
- sf: provides support for spatial vector data, implementing the OSGeo simple features standards we are accustomed to
- stars: spatial-temporal raster data in R
- tmap: thematic maps, including static maps or interactive leaflet maps
## Loading required package: conflicted
## Loading required package: groundhog
## groundhog says: No default repository found, setting to 'http://cran.r-project.org/'
## Attached: 'Groundhog' (Version: 3.2.2)
## Tips and troubleshooting: https://groundhogR.com
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## here() starts at /Users/samuelbarnard/Desktop/springTerm25/openGISci/OR-Gerrymander-Alabama
##
## Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
##
## Linking to liblwgeom 3.0.0beta1 r16016, GEOS 3.11.0, PROJ 9.1.0
## Warning in fun(libname, pkgname): GEOS versions differ: lwgeom has 3.11.0 sf
## has 3.13.0
## Warning in fun(libname, pkgname): PROJ versions differ: lwgeom has 9.1.0 sf has
## 9.5.1
##
## Attaching package: 'lwgeom'
##
## The following object is masked from 'package:sf':
##
## st_perimeter
##
## Successfully attached 'tidyverse_2.0.0'
## Successfully attached 'here_1.0.1'
## Successfully attached 'sf_1.0-19'
## Successfully attached 'tmap_4.0'
## Successfully attached 'tidycensus_1.7.1'
## Successfully attached 'knitr_1.49'
## Successfully attached 'lwgeom_0.2-14'
## Successfully attached 'markdown_1.13'
## Successfully attached 'htmltools_0.5.8.1'
We plan on using the following data sources.

Set up the districts file from districts.gpkg:
## Driver: GPKG
## Available layers:
## layer_name geometry_type features fields crs_name
## 1 districts21 Multi Polygon 7 4 WGS 84
## 2 districts23 Multi Polygon 7 4 NAD83
## 3 precincts20 Multi Polygon 1972 8 NAD83
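A listing like the one above is what sf's st_layers() reports for a geopackage. The following is a minimal, self-contained sketch, assuming sf is installed; it uses a hypothetical temporary geopackage with toy layers (toy_a, toy_b) rather than districts.gpkg, so that it runs without the study data.

```r
library(sf)

# Hypothetical one-point layer in NAD83 (EPSG:4269), for illustration only
toy <- st_sf(id = 1, geometry = st_sfc(st_point(c(-86.8, 32.8)), crs = 4269))

# Write two layers into one temporary geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_a", quiet = TRUE)
st_write(toy, gpkg, layer = "toy_b", quiet = TRUE)

# List the layers, their geometry types, feature counts, and CRS names
layers <- st_layers(gpkg)
print(layers)

# Read a single layer by name
toy_b <- st_read(gpkg, layer = "toy_b", quiet = TRUE)
```

In the study itself, the path would instead point at districts.gpkg, for example built with here(), and the layer argument would name precincts20 or districts23.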
districts.gpkg
precincts20
- Title: Voting Precincts 2020
- Abstract: Alabama voting data for 2020 elections by precinct.
- Spatial Coverage: Alabama OSM:161950
- Spatial Resolution: voting precincts
- Spatial Reference System: EPSG: 4269, NAD 1983 geographic coordinate system
- Temporal Coverage: voting precincts used for tabulating the 2020 election
- Temporal Resolution: annual election (2020)
- Lineage: Saved as geopackage format. Processing prior to download is explained in validation report and readme
- Distribution: Data available at Redistricting Data Hub with free login.
- Constraints: Permitted for noncommercial and nonpartisan use only. Copyright and use constraints explained here
- Data Quality: State any planned quality assessment
- Variables: For each variable, enter the following information. If you have two or more variables per data source, you may want to present this information in table form (shown below)
  - Label: variable name as used in the data or code
  - Alias: intuitive natural language name
  - Definition: Short description or definition of the variable. Include measurement units in description.
  - Type: data type, e.g. character string, integer, real
  - Accuracy: e.g. uncertainty of measurements
  - Domain: Expected range of Maximum and Minimum of numerical data, or codes or categories of nominal data, or reference to a standard codebook
  - Missing Data Value(s): Values used to represent missing data and frequency of missing data observations
  - Missing Data Frequency: Frequency of missing data observations: not yet known for data to be collected

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
VTDST20 | … | Voting district ID | … | … | … | … | … |
GEOID20 | … | Unique geographic ID | … | … | … | … | … |
G20PRETRU | … | total votes for Trump in 2020 | … | … | … | … | … |
G20PREBID | … | total votes for Biden in 2020 | … | … | … | … | … |
Load variables:
## Reading layer `precincts20' from data source
## `/Users/samuelbarnard/Desktop/springTerm25/openGISci/OR-Gerrymander-Alabama/data/raw/public/alabama_dataset/districts.gpkg'
## using driver `GPKG'
## Simple feature collection with 1972 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.47323 ymin: 30.14442 xmax: -84.88825 ymax: 35.00803
## Geodetic CRS: NAD83
districts23
- Title: US Congressional Districts 2023
- Abstract: Alabama congressional districts for the 2024 election.
- Spatial Coverage: Alabama OSM:161950
- Spatial Resolution: congressional districts
- Spatial Reference System: EPSG: 3857, WGS 1984 Web Mercator projection
- Temporal Coverage: districts approved in 2023 for use in 2024.
- Temporal Resolution:
- Lineage: Loaded into QGIS as an ArcGIS feature service layer and saved in geopackage format. Extraneous data fields were removed and the Fix Geometries tool was used to correct geometry errors.
- Distribution: Alabama State GIS via ESRI feature service
- Constraints: Public Domain data free for use and redistribution.
- Data Quality: State any planned quality assessment
- Variables:

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
DISTRICT | … | US Congressional District Number | … | … | … | … | … |
POPULATION | … | total population (2020 census) | … | … | … | … | … |
WHITE | … | total white population (2020 census) | … | … | … | … | … |
BLACK | … | total Black or African American population (2020 census) | … | … | … | … | … |
Load variables:
## Reading layer `districts23' from data source
## `/Users/samuelbarnard/Desktop/springTerm25/openGISci/OR-Gerrymander-Alabama/data/raw/public/alabama_dataset/districts.gpkg'
## using driver `GPKG'
## Simple feature collection with 7 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.47323 ymin: 30.14443 xmax: -84.88825 ymax: 35.00803
## Geodetic CRS: NAD83
Map 2023 districts
## ℹ tmap mode set to "plot".
##
##
## ── tmap v3 code detected ───────────────────────────────────────────────────────
##
## [v3->v4] `tm_text()`: migrate the layer options 'just' to 'options =
## opt_tm_text(<HERE>)'
## [tm_text()] Argument `on_surface` unknown.
blockgroups2020
- Title: Block Groups 2020
- Abstract: Vector polygon geopackage layer of Census block groups and demographic data.
- Spatial Coverage: Alabama OSM:161950
- Spatial Resolution: census block groups
- Spatial Reference System: EPSG: 4269, NAD 1983 geographic coordinate system
- Temporal Coverage: 2020 census
- Temporal Resolution: 10 year census (2020)
- Lineage: Data downloaded from US Census API “pl” public law summary file using tidycensus in R
- Distribution: US Census API
- Constraints: Public Domain data free for use and redistribution.
- Data Quality: State any planned quality assessment
- Variables:

Label | Alias | Definition | Type | Accuracy | Domain | Missing Data Value(s) | Missing Data Frequency |
---|---|---|---|---|---|---|---|
GEOID | … | code to uniquely identify block groups | … | … | … | … | … |
P4_001N | … | total population, 18 years or older | … | … | … | … | … |
P4_006N | … | total: not Hispanic or Latino, Population of one race, Black or African American alone, 18 years or older | … | … | … | … | … |
P5_003N | … | Total institutionalized population in correctional facilities for adults, 18 years or older | … | … | … | … | … |
Load data:
Acquire decennial census data in block groups using the tidycensus package. First, query metadata for the pl public law data series.

The issue in the 2023 court cases on Alabama’s gerrymandering was a racial gerrymander discriminating against people identifying as Black or African American. Therefore, we will analyze people of voting age (18 or older) identifying as Black or African American, either as one race or in any combination with other races.

This data is found in public law data series table P3.

Query table P3 on "race for the population 18 years and over".
## Reading layer `block_groups' from data source
## `/Users/samuelbarnard/Desktop/springTerm25/openGISci/OR-Gerrymander-Alabama/data/raw/public/block_groups.gpkg'
## using driver `GPKG'
## Simple feature collection with 3925 features and 83 fields (with 1 geometry empty)
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -88.47323 ymin: 30.22333 xmax: -84.88908 ymax: 35.00803
## Geodetic CRS: NAD83
I have previously conducted an analysis with this data using QGIS to determine compactness along with race and party affiliation data.
However, I only conducted my analysis with an area-weighted re-aggregation approach, and did not incorporate convex hulls.
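For reference, the general idea behind area-weighted re-aggregation can be sketched in R. This is a minimal illustration of the technique on toy squares, not the earlier QGIS workflow: the geometries, the pop counts, and the single target district "A" are all hypothetical, and the method assumes population is spread evenly within each source unit.

```r
library(sf)

# Helper to build an axis-aligned rectangle (hypothetical toy geometry)
sq <- function(x0, y0, x1, y1) {
  st_polygon(list(rbind(
    c(x0, y0), c(x1, y0), c(x1, y1), c(x0, y1), c(x0, y0)
  )))
}

# Two source units with population counts, and their full areas
src_geom <- st_sfc(sq(0, 0, 2, 2), sq(2, 0, 4, 2))
sources <- st_sf(
  pop = c(100, 200),
  src_area = as.numeric(st_area(src_geom)),
  geometry = src_geom,
  agr = "constant"
)

# One target district that covers half of the first unit and all of the second
target <- st_sf(
  district = "A",
  geometry = st_sfc(sq(1, 0, 4, 2)),
  agr = "constant"
)

# Cut the source units by the target, then weight counts by the share of area kept
pieces <- st_intersection(sources, target)
pieces$weight <- as.numeric(st_area(pieces)) / pieces$src_area
pieces$pop_aw <- pieces$pop * pieces$weight

# Sum the weighted counts per target district
est <- aggregate(pop_aw ~ district, data = st_drop_geometry(pieces), FUN = sum)
print(est)
```

Here half of the first unit's 100 people and all of the second unit's 200 people are assigned to district A. The same weighting could be applied to the area inside a district's convex hull.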
“This study is explicitly an investigation to the modifiable areal unit problem. Aspects of the study are extremely sensitive to the combination of edge effects and scale, whereby complex borders formed by natural features, e.g. coastlines or rivers, vary greatly in perimeter depending on the scale of analysis. We hope that in part, this study establishes a method that is more robust (less sensitive) to the threats to validity caused by scale and edge effects in studies of gerrymandering and district shapes.”
districts23 needs to be re-projected to the EPSG:4269 NAD 1983 coordinate system using st_transform() for the purpose of geodesic analysis.

From here, we can calculate the percentage of the population identifying as Black using mutate().

Census data (blockgroups2020) also needs to be re-projected from the WGS 1984 geographic coordinate system to the NAD 1983 geographic coordinate system.
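The re-projection step can be sketched with st_transform(). This is a minimal example, assuming a single hypothetical point in Web Mercator (EPSG:3857) stands in for a real layer; the coordinates are illustrative values chosen to fall within Alabama.

```r
library(sf)

# Hypothetical point in Web Mercator (EPSG:3857), in metres
pt_3857 <- st_sfc(st_point(c(-9660000, 3860000)), crs = 3857)

# Re-project to the NAD 1983 geographic coordinate system (EPSG:4269)
pt_4269 <- st_transform(pt_3857, 4269)

# Confirm the new CRS and inspect the longitude / latitude in degrees
print(st_crs(pt_4269)$epsg)
print(st_coordinates(pt_4269))
```

The same call works on a whole sf data frame, so a layer such as districts23 could be passed in place of the toy point.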
Find the total of people identifying as Black or African American as one race or any combination of multiple races.
X | name | label |
---|---|---|
151 | P3_004N | !!Total:!!Population of one race:!!Black or African American alone |
158 | P3_011N | !!Total:!!Population of two or more races:!!Population of two races:!!White; Black or African American |
163 | P3_016N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; American Indian and Alaska Native |
164 | P3_017N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Asian |
165 | P3_018N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Native Hawaiian and Other Pacific Islander |
166 | P3_019N | !!Total:!!Population of two or more races:!!Population of two races:!!Black or African American; Some Other Race |
174 | P3_027N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; American Indian and Alaska Native |
175 | P3_028N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Asian |
176 | P3_029N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Native Hawaiian and Other Pacific Islander |
177 | P3_030N | !!Total:!!Population of two or more races:!!Population of three races:!!White; Black or African American; Some Other Race |
184 | P3_037N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Asian |
185 | P3_038N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander |
186 | P3_039N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; American Indian and Alaska Native; Some Other Race |
187 | P3_040N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Asian; Native Hawaiian and Other Pacific Islander |
188 | P3_041N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Asian; Some Other Race |
189 | P3_042N | !!Total:!!Population of two or more races:!!Population of three races:!!Black or African American; Native Hawaiian and Other Pacific Islander; Some Other Race |
195 | P3_048N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Asian |
196 | P3_049N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander |
197 | P3_050N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; American Indian and Alaska Native; Some Other Race |
198 | P3_051N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Asian; Native Hawaiian and Other Pacific Islander |
199 | P3_052N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Asian; Some Other Race |
200 | P3_053N | !!Total:!!Population of two or more races:!!Population of four races:!!White; Black or African American; Native Hawaiian and Other Pacific Islander; Some Other Race |
205 | P3_058N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander |
206 | P3_059N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Asian; Some Other Race |
207 | P3_060N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander; Some Other Race |
208 | P3_061N | !!Total:!!Population of two or more races:!!Population of four races:!!Black or African American; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
211 | P3_064N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander |
212 | P3_065N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Asian; Some Other Race |
213 | P3_066N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander; Some Other Race |
214 | P3_067N | !!Total:!!Population of two or more races:!!Population of five races:!!White; Black or African American; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
216 | P3_069N | !!Total:!!Population of two or more races:!!Population of five races:!!Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
218 | P3_071N | !!Total:!!Population of two or more races:!!Population of six races:!!White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Some Other Race |
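A list like the one above can be selected by label rather than typed by hand. The sketch below is a minimal example, assuming a small stand-in metadata table with name and label columns. The labels for P3_004N and P3_011N are copied from the table above, while the other two rows are illustrative stand-ins; in the study the table would come from tidycensus.

```r
# In the study, the metadata would come from tidycensus, for example:
# vars <- tidycensus::load_variables(2020, "pl")

# Stand-in metadata table (hypothetical subset, for illustration only)
vars <- data.frame(
  name = c("P3_001N", "P3_003N", "P3_004N", "P3_011N"),
  label = c(
    "!!Total:",
    "!!Total:!!Population of one race:!!White alone",
    "!!Total:!!Population of one race:!!Black or African American alone",
    "!!Total:!!Population of two or more races:!!Population of two races:!!White; Black or African American"
  ),
  stringsAsFactors = FALSE
)

# Keep table P3 variables whose label mentions Black or African American
in_p3 <- startsWith(vars$name, "P3_")
is_black <- grepl("Black or African American", vars$label, fixed = TRUE)
black_vars <- vars$name[in_p3 & is_black]

print(black_vars)
```

Restricting to names that start with P3_ matters because other tables in the same data series also carry labels mentioning Black or African American.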
- Black is the sum of all 32 columns shown above, in which any of the racial categories by which someone identifies is Black or African American.
- Total is a copy of the population 18 years or over, variable P3_001N.
- PctBlack is calculated as Black / Total * 100.
- CheckPct is calculated as the percentage of the population 18 years or older that is either White alone (P3_003N) or Black or African American as calculated above. In Alabama, we can expect this to be close to 100% for most block groups, and it should never exceed 100%.
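These four calculations can be sketched with dplyr. This is a minimal example on hypothetical data: the two toy block groups and their counts are invented for illustration, and black_vars holds only two of the 32 columns to keep the example short.

```r
library(dplyr)

# Hypothetical block group counts (population 18 years or over)
bg <- data.frame(
  GEOID   = c("bg1", "bg2"),
  P3_001N = c(1000, 500),  # total
  P3_003N = c(600, 100),   # White alone
  P3_004N = c(350, 380),   # Black or African American alone
  P3_011N = c(10, 5)       # White; Black or African American
)

# Illustrative subset of the 32 columns that include Black or African American
black_vars <- c("P3_004N", "P3_011N")

bg_calc <- bg %>%
  mutate(
    Black    = rowSums(across(all_of(black_vars))),
    Total    = P3_001N,
    PctBlack = Black / Total * 100,
    CheckPct = (P3_003N + Black) / Total * 100
  ) %>%
  select(GEOID, Black, Total, PctBlack, CheckPct)

print(bg_calc)

# Sanity check described above: the combined share should never exceed 100%
stopifnot(all(bg_calc$CheckPct <= 100))
```

A block group with a Total of 0 would produce NaN for both percentages, so such rows may need separate handling before mapping.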
blockgroups_calc.gpkg
## Deleting layer `blockgroups_calc' using driver `GPKG'
## Writing layer `blockgroups_calc' to data source
## `/Users/samuelbarnard/Desktop/springTerm25/openGISci/OR-Gerrymander-Alabama/data/derived/public/blockgroups_calc.gpkg' using driver `GPKG'
## Writing 3925 features with 6 fields and geometry type Multi Polygon.
Map the percentage of the population 18 or over that is Black or African American.
## ℹ tmap mode set to "plot".
Map the approved 2023 districts over the Black population.
## ℹ tmap mode set to "view".
## Registered S3 method overwritten by 'jsonify':
## method from
## print.json jsonlite
##
## Variable bgcol and bgcol_alpha not supported by view mode