=======
Scoring
=======

The ``scoring`` module of ``gerrychain`` is a collection of functions that can be
used in conjunction with the ``Partition`` class to create more complex updaters
beyond what is provided natively in the ``gerrychain`` library. This module also
provides a number of methods for analyzing election results of an ensemble
generated by a ReCom chain.

For this tutorial, we will be working with the following shapefile of the state
of Maryland:

[Figure: MD Shapefile]

Scoring Districting Plans
-------------------------

Let's start with the imports that we will need for this section:

.. code-block:: python

    from gerrychain import Graph, Partition, Election
    from gerrytools.scoring import *
    import pandas as pd
    import geopandas as gpd

All of our scores are functions that take a GerryChain ``Partition`` and produce
either a numerical (plan-wide) score or a mapping from district or election IDs
to numeric scores. For our examples, we will use a 2020 Maryland VTD shapefile to
build our underlying dual graph, since the shapefile has the demographic and
electoral information that our scores rely on.

.. code-block:: python

    graph = Graph.from_file("MD_vtd20/")
    elections = ["PRES12", "SEN12", "GOV14", "AG14", "COMP14",
                 "PRES16", "SEN16", "GOV18", "SEN18", "AG18", "COMP18"]

    # use our list of elections above to create `Election` updaters for each contest
    # Ex: in our shapefile, the column `PRES12R` refers to the votes Mitt
    # Romney (R) received in the 2012 Presidential general election
    updaters = {}
    for e in elections:
        updaters[e] = Election(e, {"Dem": e+"D", "Rep": e+"R"})

The :meth:`~gerrytools.scoring.demographic_updaters` function returns a dictionary
of ``Tally`` updaters that track the number of people in a given demographic
group. You can pass a list with as many demographic groups as you wish:

.. code-block:: python

    demographic_updaters(["TOTPOP20", "VAP20"])

Which should return something like:

.. code-block:: console

    {'TOTPOP20': , 'VAP20': }

We can then add these to the updaters for our partition and continue as normal:

.. code-block:: python

    # add updaters that track total population, total voting age population,
    # and Black, Hispanic, and white voting age population
    updaters.update(demographic_updaters(["TOTPOP20", "VAP20", "BVAP20", "HVAP20", "WVAP20"]))

    # create the partition on which we'll generate scores
    # since `MD_CD_example.csv` is a CSV with `GEOID20` -> district assignment,
    # we need to replace the `GEOID20`s with integer node labels to match the graph's nodes.
    geoid_to_assignment = pd.read_csv("data/MD_CD_example.csv", header=None).set_index(0).to_dict()[1]
    assignment = {n: geoid_to_assignment[graph.nodes[n]["GEOID20"]] for n in graph.nodes}
    partition = Partition(graph, assignment, updaters)

Partisan scores
---------------

All our partisan scores require at least a list of elections (we'll use the
``elections`` list defined above). Some of them additionally require the user to
specify a POV party (in our case, either ``Dem`` or ``Rep``). All of these
partisan scores return a dictionary that maps election names to the score for
that election; it is up to the user to aggregate (often by summing or averaging)
the scores across elections.

For a simple example, let's use the score function that returns the number of
Democratic seats won in each election.

.. code-block:: python

    seats(elections, "Dem")

This will return:

.. code-block:: console

    Score(name='Dem_seats', apply=functools.partial(, election_cols=['PRES12', 'SEN12', 'GOV14', 'AG14', 'COMP14', 'PRES16', 'SEN16', 'GOV18', 'SEN18', 'AG18', 'COMP18'], party='Dem', mean=False), dissolved=False)

Note that the output of ``seats(elections, "Dem")`` is of type ``Score``, which
functions like a Python ``namedtuple``: for any object ``x`` of type ``Score``,
``x.name`` returns the name of the score, and ``x.apply`` returns a function that
takes a ``Partition`` as input and returns the score. See below:

.. code-block:: python

    seats(elections, "Dem").name

returns

.. code-block:: console

    'Dem_seats'

and

.. code-block:: python

    seats(elections, "Dem").apply(partition)

returns

.. code-block:: console

    {'PRES12': 6, 'SEN12': 6, 'GOV14': 4, 'AG14': 6, 'COMP14': 6, 'PRES16': 6, 'SEN16': 6, 'GOV18': 4, 'SEN18': 6, 'AG18': 6, 'COMP18': 8}

Note that we can just as easily find the number of Republican seats:

.. code-block:: python

    seats(elections, "Rep").apply(partition)

This gives us

.. code-block:: console

    {'PRES12': 2, 'SEN12': 2, 'GOV14': 4, 'AG14': 2, 'COMP14': 2, 'PRES16': 2, 'SEN16': 2, 'GOV18': 4, 'SEN18': 2, 'AG18': 2, 'COMP18': 0}

Moreover, we can pass ``mean=True`` to return the average of the score over all
elections, rather than a dictionary:

.. code-block:: python

    seats(elections, "Rep", mean=True).apply(partition)

Some partisan scores (``mean_median``, ``efficiency_gap``, ``partisan_bias``,
``partisan_gini``) do not require the user to specify the POV party in the call.
This is not because there isn't a POV party, but because these functions call
GerryChain functions that automatically set the POV party to be the **first**
party listed in the updater for that election. Since we always list ``Dem`` first
in this tutorial, ``Dem`` will be the POV party for these scores. Keep this in
mind when setting up your updaters and your partition.

.. code-block:: python

    # Positive values denote an advantage for the POV party
    efficiency_gap(elections).apply(partition)

which will give us

.. code-block:: console

    {'PRES12': -0.027366954931038075, 'SEN12': -0.1112428189930485, 'GOV14': -0.016952521996415275, 'AG14': 0.0664089504401374, 'COMP14': -0.03643474212627552, 'PRES16': -0.04564932242915228, 'SEN16': -0.02799189191120642, 'GOV18': 0.09144998629410322, 'SEN18': -0.12475998763996132, 'AG18': -0.06082242557828398, 'COMP18': 0.05664447794898745}

If you know you want to use a lot of scores, it can be helpful to make a list of
the scores of interest, like so:

.. code-block:: python

    partisan_scores = [
        seats(elections, "Dem"),
        seats(elections, "Rep"),
        # signed_proportionality(elections, "Dem", mean=True),
        # absolute_proportionality(elections, "Dem", mean=True),
        efficiency_gap(elections, mean=True),
        mean_median(elections),
        partisan_bias(elections),
        partisan_gini(elections),
        # Note that `eguia` takes several more arguments; see the documentation for more details
        eguia(elections, "Dem", graph, updaters, "COUNTYFP20", "TOTPOP20"),
    ]

Now, we can make use of the ``summarize()`` function to evaluate all the scores
on this partition:

.. code-block:: python

    partisan_dictionary = summarize(partition, partisan_scores)
    partisan_dictionary["mean_median"]

This will return

.. code-block:: console

    {'PRES12': 0.02205704780736839, 'SEN12': 0.04184519796735442, 'GOV14': 0.0128224074264629, 'AG14': 0.03372274606966308, 'COMP14': 0.026622499095666607, 'PRES16': 0.03478025159124121, 'SEN16': 0.03829214902714728, 'GOV18': 0.0195942524690087, 'SEN18': 0.037782714199074086, 'AG18': 0.03906798945053658, 'COMP18': 0.036168324606223434}

and

.. code-block:: python

    partisan_dictionary["mean_efficiency_gap"]

gives us

.. code-block:: console

    -0.02151975008383212

Demographic Scores
------------------

Our demographic scores return a dictionary that maps districts to demographic
information, either population counts or shares.

.. code-block:: python

    # `demographic_tallies()` takes a list of the demographics you'd like to tally
    tally_scores = demographic_tallies(["TOTPOP20", "BVAP20", "HVAP20"])
    tally_dictionary = summarize(partition, tally_scores)
    tally_dictionary

This will return a dictionary that looks like this:

.. code-block:: console

    {'TOTPOP20': {1: 771992, 7: 772346, 8: 772421, 6: 771907, 3: 773001, 4: 772893, 5: 771418, 2: 771246}, 'BVAP20': {1: 50513, 7: 186256, 8: 84454, 6: 285475, 3: 106681, 4: 258794, 5: 334253, 2: 82315}, 'HVAP20': {1: 40466, 7: 36221, 8: 27363, 6: 44099, 3: 45359, 4: 144187, 5: 43594, 2: 110973}}

And

.. code-block:: python

    # `demographic_shares()` takes a dictionary where each key is a total demographic column
    # that will be used as the denominator in the share (usually either `TOTPOP20` or `VAP20`)
    # and each value is a list of demographics on which you'd like to compute shares
    share_scores = demographic_shares({"VAP20": ["BVAP20", "HVAP20"]})
    share_dictionary = summarize(partition, share_scores)
    share_dictionary

returns

.. code-block:: console

    {'BVAP20_share': {1: 0.08427654278144459, 7: 0.3075109503392005, 8: 0.1389347687326854, 6: 0.463149987751003, 3: 0.18038569170027308, 4: 0.4331758821894971, 5: 0.5577436821598711, 2: 0.13770530746350554}, 'HVAP20_share': {1: 0.06751399798455716, 7: 0.05980131717762746, 8: 0.045014707140366, 6: 0.07154549893977225, 3: 0.07669701811787184, 4: 0.2413438137099663, 5: 0.07274213867961521, 2: 0.1856474650446164}}

Two things to note:

Both :meth:`~gerrytools.scoring.demographic_tallies` and
:meth:`~gerrytools.scoring.demographic_shares` return *lists* of ``Score`` s (one
for each demographic of interest), so if we want to score just one demographic,
we have to index into the list in order to call ``.apply()``:

.. code-block:: python

    demographic_tallies(["BVAP20"])[0].apply(partition)

which returns

.. code-block:: console

    {1: 50513, 7: 186256, 8: 84454, 6: 285475, 3: 106681, 4: 258794, 5: 334253, 2: 82315}

Moreover, you can only use these scores on demographic columns that were already
tracked as ``Tally`` updaters when we instantiated our partition; a column that
was never tracked won't work. ``WVAP20`` does work here, because we included it
in our ``demographic_updaters`` call above:

.. code-block:: python

    demographic_tallies(["WVAP20"])[0].apply(partition)

gives us

.. code-block:: console

    {1: 457669, 7: 320218, 8: 458845, 6: 234283, 3: 348325, 4: 127814, 5: 178346, 2: 275860}

Our last demographic score is :meth:`~gerrytools.scoring.gingles_districts`,
which takes in a dictionary of the same type as ``demographic_shares`` as well as
a ``threshold`` between 0 and 1.
Just like the other two demographic scores, it returns a list of ``Score`` s, but
here each ``Score`` represents the number of districts where the demographic
group's share is above the ``threshold``. (When the threshold is 0.5, the
default, these districts are called *Gingles' Districts*.)

.. code-block:: python

    gingles_scores = gingles_districts({"VAP20": ["BVAP20", "HVAP20"]}, threshold=0.5)
    gingles_dictionary = summarize(partition, gingles_scores)
    gingles_dictionary

which returns

.. code-block:: console

    {'BVAP20_gingles_districts': 1, 'HVAP20_gingles_districts': 0}
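
To make the counting concrete, here is a minimal pure-Python sketch of what a
threshold count of this kind computes. It is not the library's implementation:
``count_districts_above`` is a hypothetical helper, it assumes "above" means
strictly greater than the threshold, and the shares are rounded from the
``share_dictionary`` output shown earlier.

.. code-block:: python

    # Illustrative sketch only: `count_districts_above` is a hypothetical helper,
    # not a gerrytools function.
    def count_districts_above(shares, threshold=0.5):
        """Count the districts whose group share is strictly above `threshold`."""
        return sum(1 for share in shares.values() if share > threshold)

    # District -> share, rounded from the `share_dictionary` output shown earlier.
    bvap_shares = {1: 0.0843, 7: 0.3075, 8: 0.1389, 6: 0.4631,
                   3: 0.1804, 4: 0.4332, 5: 0.5577, 2: 0.1377}
    hvap_shares = {1: 0.0675, 7: 0.0598, 8: 0.0450, 6: 0.0715,
                   3: 0.0767, 4: 0.2413, 5: 0.0727, 2: 0.1856}

    print(count_districts_above(bvap_shares))                 # 1 (district 5 only)
    print(count_districts_above(hvap_shares))                 # 0
    print(count_districts_above(bvap_shares, threshold=0.4))  # 3 (districts 4, 5, 6)

With the default threshold, these counts agree with the ``gingles_dictionary``
output above; lowering the threshold to 0.4 shows how sensitive the count is to
that choice.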