Hey there!
This repository contains all the data and code I used for my short San Antonio Express-News story on where wind turbines are concentrated in Texas.
Questions we answered using the data:
- How many turbines currently operate across the country?
- How many currently operate in Texas?
- Which Texas counties have the most turbines?
- Which wind farms are the largest in Texas?
The analysis and results can be found in my `wind-turbine-analysis.ipynb` notebook.
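For a rough idea of how the questions above map onto the data, here is a minimal sketch using only the Python standard library. It is not the code from the notebook. The sample rows are made up, the `summarize` helper is hypothetical, and it assumes each row is one turbine, with the `t_state`, `t_county` and `p_name` columns described in the data dictionary.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample using USWTDB-style column names (not real data).
SAMPLE_CSV = """case_id,t_state,t_county,p_name
1,TX,Nolan County,Example Farm A
2,TX,Nolan County,Example Farm A
3,TX,Taylor County,Example Farm A
4,TX,Scurry County,Example Farm B
5,IA,Story County,Example Farm C
"""


def summarize(rows, state="TX", top_n=3):
    """Count turbines overall, in one state, and by county and project."""
    rows = list(rows)
    in_state = [r for r in rows if r["t_state"] == state]
    # Each row is one turbine, so counting rows counts turbines.
    by_county = Counter(r["t_county"] for r in in_state)
    by_project = Counter(r["p_name"] for r in in_state)
    return {
        "national_total": len(rows),
        "state_total": len(in_state),
        "top_counties": by_county.most_common(top_n),
        "top_projects": by_project.most_common(top_n),
    }


summary = summarize(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summary)
```

To try this on the real file, you would swap the in-memory sample for an opened CSV. Note that this sketch ranks wind farms by turbine count, which is only one way to define "largest"; project capacity (`p_cap`) is another.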
If you're a reporter from another news organization and want to use this data and analysis to create a quick story for your own local market, please do! I'd love to see what you come up with, especially if you add a little twist to it! Reach out to me on social media.
Not the most exciting analysis here, but I wanted to get my feet wet with sharing my work on GitHub. I'm hoping to do more of this in the future.
All the Python libraries used in this project are listed in `requirements.txt`. You can install them by running `pip install -r requirements.txt` in your terminal.
I encourage you to reproduce this project. You can rerun my notebook by running `nbexec wind-turbine-analysis.ipynb` in your terminal (shoutout to The Markup for putting nbless on my radar!).
You can find the data dictionary for the data used in this project on the USGS' website here.
But for quick reference, I'm including the data dictionary below:
Key | Value Type | Key Description |
---|---|---|
case_id | number (integer) | Unique stable identification number. |
faa_ors | string | Unique identifier for cross-reference to the Federal Aviation Administration (FAA) digital obstacle files. |
faa_asn | string | Unique identifier for cross-reference to the FAA obstruction evaluation airport airspace analysis dataset. |
usgs_pr_id | number (integer) | Unique identifier for cross-reference to the 2014 USGS turbine dataset. |
t_state | string | State where turbine is located. |
t_county | string | County where turbine is located. |
t_fips | string | State and county fips where turbine is located, based on spatial join of turbine points with US state and county. |
p_name | string | Name of the wind power project that the turbine is a part of. Project names are typically provided by the developer; some names are identified via other internet resources, and others are created by the authors to differentiate them from previous projects. Values that were unknown were assigned a name based on the county where the turbine is located. |
p_year | number (integer) | Year that the turbine became operational and began providing power. Note this may differ from the year that construction began. |
p_tnum | number (integer) | Number of turbines in the wind power project. |
p_cap | number (float) | Cumulative capacity of all turbines in the wind power project in megawatts (MW). |
t_manu | string | Turbine manufacturer - name of the original equipment manufacturer of the turbine. |
t_model | string | Turbine model - manufacturer's model name of each turbine. |
t_cap | number (integer) | Turbine rated capacity - stated output power at rated wind speed from manufacturer, ACP, and/or internet resources in kilowatts (kW). |
t_hh | number (float) | Turbine hub height in meters (m). |
t_rd | number (float) | Turbine rotor diameter in meters (m). |
t_rsa | number (float) | Turbine rotor swept area in square meters (m²). |
t_ttlh | number (float) | Turbine total height from ground to tip of a blade at its apex in meters (m). |
retrofit | number (integer) | Indicator of whether the turbine has been partially retrofit after initial construction (e.g., rotor and/or nacelle replacement). 0 indicates no known retrofit. 1 indicates yes known retrofit. |
retrofit_year | number (integer) | Year in which the turbine was partially retrofit. |
t_conf_atr | number (integer) | Level of confidence in the turbine attributes. 1—No confidence: no attribute data beyond total height and year, 2—Partial confidence: incomplete information or substantial conflict between data sources, 3—Full confidence: complete information, consistent across multiple data sources. |
t_conf_loc | number (integer) | Level of confidence in turbine location. 1—No confidence: no turbine shown in image; image has clouds; imagery older than turbine built date, 2—Partial confidence: image shows a developed pad with concrete base and/or turbine parts on the ground, 3—Full confidence: image shows an installed turbine. |
t_img_date | number (integer) | Date of image used to visually verify turbine location. Note if source of image is NAIP, the month and day were set to 01/01. |
t_img_srce | string | Source of image used to visually verify turbine location. |
xlong | number (float) | Longitude of the turbine point, in decimal degrees. |
ylat | number (float) | Latitude of the turbine point, in decimal degrees. |
eia_id | number (integer) | Plant ID from Energy Information Administration (EIA). |
I used this script to extract the Texas state border outline from the U.S. Census Bureau's TIGER/Line® shapefiles for the animated Flourish map.
You can find the Texas state border outline in `output/texas.geojson`.
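If you want to pull out a different state's border, the idea can be sketched like this. This is a hedged illustration and not the script linked above. It assumes the TIGER/Line states layer has already been converted to a GeoJSON FeatureCollection and that each feature carries the Census `STATEFP` attribute (Texas is `"48"`). The `extract_state` helper and the placeholder geometries are hypothetical.

```python
import json


def extract_state(feature_collection, statefp="48"):
    """Return a new FeatureCollection with only the features for one state."""
    features = [
        feature
        for feature in feature_collection["features"]
        if feature.get("properties", {}).get("STATEFP") == statefp
    ]
    return {"type": "FeatureCollection", "features": features}


# Made-up stand-in for a states layer; the geometries are placeholders,
# not real borders.
states = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"STATEFP": "48", "NAME": "Texas"},
            "geometry": {"type": "Point", "coordinates": [-99.0, 31.0]},
        },
        {
            "type": "Feature",
            "properties": {"STATEFP": "40", "NAME": "Oklahoma"},
            "geometry": {"type": "Point", "coordinates": [-97.5, 35.5]},
        },
    ],
}

texas = extract_state(states)
print(json.dumps(texas, indent=2))
```

Changing `statefp` to another state's FIPS code would give you that state's features instead.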