For the analysis I have chosen a dataset from the EPA, the early version of the 2008 Toxics Release Inventory, "a publicly available EPA database that contains information on toxic chemical releases and waste management activities reported annually by certain industries as well as federal facilities."
The dataset comes as a tab-delimited .csv file and can be downloaded from data.gov.
This dataset will be used for a series of exercises in analysis and visualization. The results of the work, along with the commands used to achieve those results, will be posted here.
The data downloads as a self-extracting .zip file. This will automatically decompress when executed on a Windows machine; on a Mac or another system you might have to do a little more work.
Also, to save some time and learn from my mistakes, you should open the data sheet in Excel or another spreadsheet program and add a new column to the start of the file that will hold a unique id number for each row.
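If you would rather not edit the file by hand, the same kind of id column can be added once the data is in R. This is only a minimal sketch: `toy` is a small made-up data frame standing in for the real dataset, and the column name `id` is my own choice, not something the dataset defines.

```r
# Toy stand-in for the TRI data; the real file has many more columns.
toy <- data.frame(County = c("WASHTENAW", "WAYNE", "WASHTENAW"),
                  Chemical = c("LEAD", "ZINC", "TOLUENE"))

# Put a sequential row id in front of the existing columns.
toy <- cbind(id = seq_len(nrow(toy)), toy)
```

The id simply counts rows in file order, so it should be added before any sorting or subsetting if you want it to point back to the original rows.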
Step 1: Load the ggplot library (assuming it is already installed on your system).
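The command for this step would look something like the line below. I am assuming the package in question is ggplot2, which provides the ggplot(), geom_bar() and coord_polar() functions used later in this exercise.

```r
# Load ggplot2. If it is not installed yet, run install.packages("ggplot2") once first.
library(ggplot2)
```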
Step 2: Read the text file into a data object in R. This command is based on a Windows version of R, where each backslash in the path has to be doubled. Mac, Linux or other users write their paths with single forward slashes instead.
chem = read.delim("C:\\Users\\dnfehren\\Desktop\\tri_2008_US_v08.txt")
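Forward slashes also work in R on Windows, and file.path() will build a path for you. The sketch below shows the same read.delim() call in a form that runs on any system. The file name `tri_sample.txt` and its three lines of content are made up for illustration; they are not the real TRI file.

```r
# Write a tiny tab-delimited file to a temporary folder so the example runs anywhere.
tmp <- file.path(tempdir(), "tri_sample.txt")
writeLines(c("County\tChemical",
             "WASHTENAW\tLEAD",
             "WAYNE\tZINC"), tmp)

# read.delim() expects tab separators and a header row by default.
sample_chem <- read.delim(tmp)
str(sample_chem)
```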
Step 1: Attach the data object to your R workspace; this will save some typing later.
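For the data loaded earlier, the command would presumably be attach(chem). The sketch below shows what attaching does, using a made-up data frame called `chem_demo` so that it runs on its own.

```r
# Toy stand-in for the chem data object.
chem_demo <- data.frame(County = c("WASHTENAW", "WAYNE"),
                        Chemical = c("LEAD", "ZINC"))

# After attaching, columns can be used by name without the chem_demo$ prefix.
attach(chem_demo)
first_county <- as.character(County[1])

# Detach when finished so the column names do not linger on the search path.
detach(chem_demo)
```

Note that subset() looks up column names inside the data frame it is given, so it works with or without attaching.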
Step 2: Grab just the rows of the dataset that deal with Washtenaw County (or choose your own county), and assign it to a new data object.
local_chem <- subset(chem, County == 'WASHTENAW')
Step 1: Create the initial ggplot object by telling ggplot where your data is coming from and basic aesthetic information about the factors and fill color.
pie <- ggplot(local_chem, aes(x=factor(1), fill = factor(Chemical)))
Step 2: A pie chart is really a bar chart mapped using polar coordinates, so the next layer of the graph adds a bar geometric element of a specific width and with a black border.
pie = pie + geom_bar(width=5, color="black")
Step 3: The last step is to layer the polar coordinate system on top of those bar geometries to get the appearance of a pie. The angle of the pie slice, theta, is taken from the y coordinate in what would have been a bar chart.
pie = pie + coord_polar(theta="y")
Step 4: Display the pie.
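At the R prompt, typing the name of the plot object (here, pie) draws it; inside a script, print(pie) is the safer choice. The sketch below runs the whole pie recipe from start to finish on a made-up data frame, so it can be tried without the TRI file. The names `toy` and `pie_demo` are mine, and I have used width = 1 here rather than the width used above.

```r
library(ggplot2)

# Toy stand-in for local_chem: one row per reported chemical release.
toy <- data.frame(Chemical = c("LEAD", "LEAD", "ZINC", "TOLUENE"))

# A single stacked bar, counted by chemical, then bent into a circle.
pie_demo <- ggplot(toy, aes(x = factor(1), fill = factor(Chemical))) +
  geom_bar(width = 1, color = "black") +
  coord_polar(theta = "y")

# Draw the chart on the current graphics device.
print(pie_demo)
```

Each slice is sized by the number of rows for that chemical, so LEAD takes up half of this toy pie.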
Pie Maker R Script: this can be loaded in R and used to reproduce this exercise's commands.