For this tutorial we will be taking a look at climate data for Washington D.C. We will also be looking at data relating to greenhouse gas emissions, population size, and total energy usage for the district. The primary goal is to estimate the rate of temperature change we can expect in the area due to greenhouse gas emissions (GHGE) as well as overall climate change. The secondary goal is to test the correlation of D.C.'s temperature change with various factors. These goals will be accomplished using linear regression techniques along with the Pandas DataFrame structure.
Going into this, it is recommended to have some familiarity with the Pandas DataFrame structure; a tutorial on its usage can be found here and the documentation can be found here. Essentially, the DataFrame is a way of structuring and holding data in a table format for easier analysis.
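If you have never worked with one before, here is a minimal sketch (with made-up city and temperature values, not part of the climate analysis) showing how a DataFrame is built from a dictionary and how a column is pulled out by name.
import pandas as pd
demo = pd.DataFrame({'CITY': ['DC', 'NYC'], 'AVG TEMP (°F)': [58.5, 55.1]}) #Each dictionary key becomes a column
demo['AVG TEMP (°F)'] #Columns are accessed by name, much like a dictionary lookup
demo.head() #.head() previews the first few rows; we will use it throughout this tutorial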
The second important library we will be using is sklearn's linear regression module. Its usage is fairly simple to pick up, and there will be instructions on its usage in this tutorial, but in case you would like some experience with it before going in, here is a simple tutorial using preloaded data.
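As a warm-up, the sketch below fits a LinearRegression on a handful of invented points just to show the fit/predict pattern we will rely on later; the numbers are arbitrary and only for demonstration.
import numpy as np
from sklearn import linear_model
X_demo = np.array([[1], [2], [3], [4]]) #sklearn expects X as a 2D array: one row per sample
y_demo = np.array([2.1, 3.9, 6.2, 7.8]) #Toy targets that roughly follow y = 2x
demo_regr = linear_model.LinearRegression().fit(X_demo, y_demo)
demo_regr.predict([[5]]) #Should come out close to 10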
For this particular tutorial we will be using Matplotlib's pyplot module, for which documentation may be found here. There are a number of plotting libraries available, but pyplot allows a more seamless use of the linear regression tool when customizing regression results. We will be using three distinct pyplot functions in this tutorial: subplot, scatter, and plot. There will be comments in the code about how and why we use these, but for now it's good to have a basic familiarity with what they do.
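If you want a feel for those three functions before we use them on real data, here is a small sketch with made-up numbers: subplot carves the figure into panels, scatter draws individual points, and plot draws connected lines.
import matplotlib.pyplot as plt
xs = [1, 2, 3, 4]
ys = [2, 3, 5, 4]
plt.subplot(1, 2, 1) #A grid of 1 row and 2 columns; this call selects panel 1
plt.scatter(xs, ys) #Individual points, like the temperature readings later on
plt.subplot(1, 2, 2) #Panel 2 of the same grid
plt.plot(xs, ys) #A connected line, like the regression lines later on
plt.show()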
import pandas as pd
import matplotlib.pyplot as plt
import pylab
import numpy as np
from sklearn import model_selection, linear_model, datasets
import operator
import random
import statsmodels.api as sm
from scipy import stats
from sklearn.linear_model import Ridge
from yellowbrick.regressor import ResidualsPlot
The D.C. climate data comes from the NOAA website, where it was available as a PDF. That PDF was downloaded and converted into a .csv file using an external program prior to being used in this project. The .csv file will be included in the project links. The columns giving seasonal averages per year, along with the semi-annual averages, were dropped because this project will be looking at either monthly data specifically or yearly averages.
The second dataset is from the CAIT (Climate Analysis Indicators Tool) website. It holds greenhouse gas emissions for each US state over a period of about 20 years.
#Setting up Green House Gas table
gas_data = pd.read_csv("CAIT 2.0 U.S. States GHG Emissions - csv.csv")
#Pulling relevant columns
rep_data = gas_data[['State', 'Year', 'Total GHG Emissions Including LUCF (MtCO2e)', 'Total CO2 (excluding LUCF) (MtCO2e)'
,'Total CH4 (MtCO2e)', 'Total N2O (MtCO2e)', 'State GDP (Million US$ (chained 1997/2005))',
'Population (People)', 'Total Energy Use (Thous. tonnes oil eq. (ktoe))']]
rep_data = rep_data.dropna(axis='columns')
#Getting DC specific data
for index, row in rep_data.iterrows():
    if row['State'] != 'District Of Columbia':
        rep_data = rep_data.drop(index)
gas_data = rep_data
#Setting up Temperature Table
temp_data = pd.read_csv("dcatemps.csv")
temp_data = temp_data.drop(['ANN','WINTER','SPRING','SUMMER','AUTUMN', '1ST HALF', '2ND HALF'], axis = 1)
temp_data = temp_data.dropna()
months = list(temp_data.columns.values)
years = temp_data['YEAR']
del months[0]
del years[len(years) - 1]
frame = pd.DataFrame()
#Rearranging the table for line graphing
for mon in months:
    data = temp_data[mon]
    del data[len(data) - 1] #Dropping the last value so data and years stay the same length
    data = np.array(data)
    sers = pd.Series(data, index = years)
    frame = pd.concat([frame, sers.to_frame().T], ignore_index = True) #DataFrame.append was removed in newer pandas, so concat is used to add each month as a row
frame['MONTH'] = months
frame['MONTH LABEL'] = [0,1,2,3,4,5,6,7,8,9,10,11]
frame = frame.set_index('MONTH') #Setting the index allows the graph later on to set Y-values as month
frame.head()
This table (frame) is a reorganization of the data so that each column contains a year's worth of temperature data. This was done in order to graph each year as a line later on; graphing rows in the structure of temp_data is difficult for matplotlib to do.
temp_data.head()
This second table (temp_data) holds the data in its original format, where each row contains the year and that year's twelve monthly averages.
gas_data.head()
This third table holds the District of Columbia greenhouse gas output data from 1990 - 2011, the population changes during that time, the GDP, and the total energy usage. In this table the unit of measurement for GHG emissions is million metric tons of carbon dioxide equivalent (MtCO2e), and thousand tonnes of oil equivalent (ktoe) for the energy usage. Using equivalent measurements allows for easier standardization across values and emission types. For instance, the GHG emissions are all measured in CO2 equivalent even though we are also measuring CH4 and N2O; roughly speaking, a tonne of CH4 is counted as the number of tonnes of CO2 that would produce a comparable warming effect. Knowing that all these values are measured in this same way makes the data comparable even if they're measuring strictly different things.
For this first part we will do a simple line graph plotting the changes in monthly temperature by year. This section will use our rearranged dataset "frame". As discussed, we changed its format so that each year's monthly averages can be pulled out as a column instead of a row, which makes this graph easier to draw. We will also be sampling the dataset: instead of using every year, we will be selecting a subset (in this case 35 years) to represent the whole dataset. We do this because showing every year we have data for would require almost 150 lines of varying colors, plus markers to identify those colors. That sheer amount of data wouldn't provide more insight and would instead hurt reader comprehension. When presenting data, take into account whether or not you need to show all of it in order to get your point across.
plt.rcParams['figure.figsize'] = 15, 10
#Line plots the graph data based on set of years
def plot_parts(years):
    x = months
    for year in years:
        y = frame[year]
        plt.plot(x, y, label = year)
    pylab.legend(loc = 'upper left')
    plt.title("Washington DC Average Monthly Temps from {} - {}".format(min(years), max(years)))
    plt.xlabel("Month")
    plt.ylabel('Temperature (°F)')
    plt.show()
#Selecting a sample of 35 years because adding all of them inhibits graph comprehension
select = []
while len(select) < 35:
    val = random.choice(years.tolist())
    if val not in select:
        select.append(val)
plot_parts(sorted(select))
This section will have our introduction to linear regression! For this part of the project we will be graphing each month's data individually, as well as plotting a regression line to illustrate how the temperature is trending. We will also be making our first scatter plot and using subplot for the first time! The purpose of this section is to practice running linear regression over a single-variable dataset, as well as plotting that single variable along with the regression line. The regression object has a method .predict(X) that, given an input value X, outputs the corresponding estimated value. This will be key for graphing our regression line later.
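Before running this on the temperature data, here is a small sketch of the fit/predict/plot pattern on invented numbers, showing how predicting over the X values produces the points of the regression line; none of these values come from the climate dataset.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
X_line = np.array([[0], [1], [2], [3], [4]]) #Toy single-variable data, reshaped to one column per sample
y_line = np.array([1.0, 2.9, 5.2, 7.1, 8.8]) #Made-up targets that roughly follow y = 2x + 1
line_regr = linear_model.LinearRegression().fit(X_line, y_line)
plt.scatter(X_line, y_line) #The raw points
plt.plot(X_line, line_regr.predict(X_line), color = "black") #The regression line: predicted y for each x
plt.show()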
#This function runs 100 linear regressions and returns a dictionary with each score as the key and a list containing the
#regression and the X & y test and training sets
def regr_lst(X, y):
    best_regr = {}
    for x in range(0, 100):
        lst = []
        #It's important to split the data into testing and training data, training data is what will be used to create the
        #regression while testing data is what's used to determine how well that regression fits.
        X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=.3)
        regr = linear_model.LinearRegression().fit(X_train, y_train)
        lst.append(X_train)
        lst.append(X_test)
        lst.append(y_train)
        lst.append(y_test)
        lst.append(regr)
        best_regr[regr.score(X_test, y_test)] = lst
    return best_regr
try:
    i = 1
    for column in temp_data:
        if column != "YEAR":
            X = temp_data[column] #X is formatted like this [x1, x2,...,xn]
            X = np.array(X).reshape((-1, 1)) #for single value regression we need X to look like this [[x1],[x2],...,[xn]]
            y = temp_data['YEAR']
            best_regr = regr_lst(X, y) #Getting the dictionary of regressions
            best_lst = best_regr[max(best_regr.keys())] #Getting the regression with the highest score
            X_test = best_lst[1]
            regr_l = best_lst[4]
            plt.subplot(3, 4, i) #For subplot(rows, cols, i), rows and cols set the grid shape, so 3 rows of 4 plots
            plt.subplots_adjust(wspace=.4, hspace = .4)
            i += 1
            plt.scatter(x = temp_data['YEAR'], y = temp_data[column]) #Graphing the given data
            plt.plot(regr_l.predict(X_test), X_test, color = "black") #Graphing the black regression line
            plt.title(column)
            plt.xlabel("Year")
            if column == 'JAN' or column == 'MAY' or column == 'SEP':
                plt.ylabel("Temp (°F)") #Having a y-label on each row instead of on each graph is easier to read
except ValueError:
    pass
plt.show()
Now that we've plotted all the data by month, it's time to plot all the data by year. The above graphs all appear to be trending upwards, but are they? If so, by how much? That's what analyzing the yearly changes can help us figure out. We will also be making use of sklearn's score function for linear regression. This function gives a score for how well the regression fits the actual data: it runs an R^2 analysis to determine "goodness-of-fit". In addition to that analysis, we will be making a residuals plot to give visual evidence of fit. If the residuals are randomly distributed, that is evidence that our model is a good fit; if they show a clear pattern, then we have a problem. You can read more about residuals plots here and here.
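For a sense of what .score reports, here is a rough hand-computed sketch of R^2 on made-up numbers; sklearn does this calculation for us, so this block is only illustrative.
import numpy as np
y_actual = np.array([60.0, 61.5, 62.0, 63.5]) #Made-up observed values
y_pred = np.array([60.5, 61.0, 62.5, 63.0]) #Made-up predictions from some regression
ss_res = np.sum((y_actual - y_pred) ** 2) #Residual sum of squares
ss_tot = np.sum((y_actual - y_actual.mean()) ** 2) #Total sum of squares around the mean
print(1 - ss_res / ss_tot) #R^2, i.e. what regr.score(X_test, y_test) reports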
#Adds average yearly temp column to table
def add_avg_col(table):
    avgs = []
    for index, row in table.iterrows():
        total = row.sum() - row['YEAR'] #Sum of the 12 monthly values (subtracting out the YEAR column)
        avg = total / 12
        avgs.append(avg)
    return avgs
temp_data['YEAR AVG'] = add_avg_col(temp_data)
X1 = temp_data['YEAR AVG'] #Setup for this regression is similar to the first due to the use of only one variable
X1 = np.array(X1).reshape((-1, 1))
y1 = temp_data['YEAR'].values
best_regr1 = regr_lst(X1, y1)
max_lst1 = best_regr1[max(best_regr1.keys())]
X_train1 = max_lst1[0]
X_test1 = max_lst1[1]
y_train1 = max_lst1[2]
y_test1 = max_lst1[3]
regr_l1 = max_lst1[4]
#We will be adding the score of the regression to the graph to provide some context for how well it fits the data
text = "".join(("D.C. SCORE USING R^2 TEST: %.2f"%(100 * regr_l1.score(X_test1,y_test1)), "%"))
ax = plt.scatter(X_test1, y_test1, color = "red")
ax = plt.plot(X_test1, regr_l1.predict(X_test1), color = "black")
ax = plt.title("Washington D.C. Average Yearly Temperature")
ax = plt.xlabel("Average Temperature(°F)")
ax = plt.ylabel("Year")
plt.annotate(text, xy=(0.05, 0.95), xycoords='axes fraction', fontsize = 10)
plt.show()
#This section will allow us to plot the residuals to help determine if our regression model is a good fit
ridge = Ridge()
visualizer = ResidualsPlot(ridge, hist = False)
visualizer.fit(X_train1, y_train1)
visualizer.score(X_test1, y_test1)
visualizer.poof()
For this section we will be using regression to predict future temperatures given just past temperatures. New to us this section is the .coef_ attribute. This returns the coefficients of the regression: in the single-variable case it returns one coefficient, with two variables it returns two coefficients, and so on. Basically it gives us the rate of change for the regression line. Keep in mind that this estimate is based purely on the historical temperature data; it doesn't account for outside factors such as how emissions or the pace of climate change might shift going forward.
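As a quick sketch of what .coef_ looks like on invented data: with one feature it is a one-element array holding the slope, and with two features it holds two coefficients, one per feature.
import numpy as np
from sklearn import linear_model
X_one = np.array([[1], [2], [3], [4]])
y_one = np.array([3.0, 5.0, 7.0, 9.0]) #Exactly y = 2x + 1
print(linear_model.LinearRegression().fit(X_one, y_one).coef_) #One coefficient, about 2
X_two = np.array([[1, 10], [2, 20], [3, 35], [4, 38]]) #Two made-up features per sample
y_two = np.array([12.0, 23.0, 39.0, 45.0])
print(linear_model.LinearRegression().fit(X_two, y_two).coef_) #Two coefficients, one per feature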
def predict_range(regr_l):
    years_lst = []
    pred_lst = []
    for x in range(2018, 2100):
        years_lst.append(x)
        pred_lst.append(regr_l.predict([[x]]))
    pred_lst = [item for sublist in pred_lst for item in sublist]
    text = "".join(("SCORE USING R^2 TEST: %.2f"%(100 * regr_l.score(X_test,y_test)), "%"))
    avg = "".join(("RATE OF PREDICTED TEMP INCREASE: %.2f"%regr_l.coef_[0], "°F", "/YEAR")) #.coef_ is an array, so we take its single element
    plt.plot(pred_lst, years_lst, '.r-', markersize = 12)
    plt.title("Predicted Yearly Average Temperature of Washington DC from 2018 - 2099")
    plt.ylabel("Year")
    plt.xlabel("Predicted Temperature (°F)")
    plt.annotate(text, xy=(0.05, 0.95), xycoords='axes fraction', fontsize = 10)
    plt.annotate(avg, xy=(0.05, 0.9), xycoords='axes fraction', fontsize = 10)
    plt.show()
X = temp_data['YEAR']
X = np.array(X).reshape((-1, 1))
y = temp_data['YEAR AVG'].values
best_regr = regr_lst(X, y)
max_lst = best_regr[max(best_regr.keys())]
X_test = max_lst[1]
y_test = max_lst[3]
regr_l = max_lst[4]
predict_range(regr_l)
Okay, now that we've looked at past and future temperatures around D.C., it's time to look at external factors that could contribute to these temperature changes! We will be using subplots again, and this time we will be looking at greenhouse gases, the population of D.C., and the amount of energy used by the district. We will be visualizing this data in order to get an idea of how these factors have changed over the last few decades and to provide context for our measured temperature data if it turns out that they have some correlation.
transplant = {}
#Pulling temperature data from the other table
for index, row in temp_data.iterrows():
    if row['YEAR'] >= 1990:
        transplant[row['YEAR']] = row['YEAR AVG']
rows = []
for index, row in gas_data.iterrows():
    if row['Year'] in transplant:
        rows.append(transplant[row['Year']])
gas_data['Year Average Temp °F'] = rows
#Setting up the graphs (some repetition)
#GHG Plot
plt.subplot(3, 1, 1)
plt.subplots_adjust(wspace=.4, hspace = .4)
plt.scatter(x = gas_data['Year'], y = gas_data['Total CO2 (excluding LUCF) (MtCO2e)'], color = "blue", label = "Total CO2")
plt.scatter(x = gas_data['Year'], y = gas_data['Total N2O (MtCO2e)'], color = "black", label = "Total N2O")
plt.scatter(x = gas_data['Year'], y = gas_data['Total CH4 (MtCO2e)'], color = "red", label = "Total CH4")
plt.gca().legend(('Total CO2','Total N2O', 'Total CH4'))
plt.title('Top 3 GHG Emissions in D.C.')
plt.xlabel('Year')
plt.ylabel('MtCO2e')
#Population Plot
plt.subplot(3,1,2)
plt.plot(gas_data['Year'],gas_data['Population (People)'])
plt.title('D.C. Population Over Time')
plt.xlabel('Year')
plt.ylabel('Population')
#Energy Usage Plot
plt.subplot(3,1,3)
plt.plot(gas_data['Year'], gas_data['Total Energy Use (Thous. tonnes oil eq. (ktoe))'])
plt.title('Total Energy Usage of D.C. (In Thousand Tons of Oil Equivalent)')
plt.xlabel('Year')
plt.ylabel('KTOE')
plt.show()
Now it's time to do regression on multiple variables! Because of the limitations of visualization we won't be making a graph for this section. Instead we will be using the statsmodels API, which provides its results in the form of summary charts. From these summary charts we can get numerical data to validate or contradict our hypothesis. Our current hypothesis is that at least one, if not many, of these factors affect temperature averages in the D.C. area.
gasses = gas_data.loc[:, 'Total CO2 (excluding LUCF) (MtCO2e)':'Total N2O (MtCO2e)'].values
pop = gas_data['Population (People)'].values
energy = gas_data['Total Energy Use (Thous. tonnes oil eq. (ktoe))'].values
each = gas_data.loc[:, 'Total CO2 (excluding LUCF) (MtCO2e)':'Total Energy Use (Thous. tonnes oil eq. (ktoe))'].values
y = gas_data['Year Average Temp °F'].values
test = sm.add_constant(gasses) #Creating the design matrix (with an intercept term) from our dataset
estimate = sm.OLS(y, test) #Generating regressor
res = estimate.fit() #Fitting our regressor
print("\t\t TESTING GHG EMISSIONS ON TEMP INFLUENCE\n", '-' * 75, "\n\n", res.summary())
In the summary we can see x1, x2, and x3; these represent the coefficients in the regression, where x1 is the Total CO2, x2 is the Total CH4, and x3 is the Total N2O. The P > |t| column gives each coefficient's p-value. If a p-value is <= .05 then, by the usual convention, you can reject the null hypothesis, which suggests that the corresponding gas has a relationship to the temperature. In this case, none of the values were < .05, which seems to indicate that, taken together, there is no detectable relationship. The conclusion will explain why this may or may not be entirely true.
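If you prefer to pull those p-values out programmatically instead of reading them off the summary table, the statsmodels results object exposes them directly; this sketch assumes the res object from the cell above is still in scope, and the gas names are just labels matching the column order used to build the test matrix.
#res.pvalues lines up with [const, x1, x2, x3] from the summary above
for name, p in zip(['const', 'Total CO2', 'Total CH4', 'Total N2O'], res.pvalues):
    print("%s p-value: %.3f" % (name, p), "-> reject null at .05" if p <= .05 else "-> fail to reject")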
test = sm.add_constant(pop)
estimate = sm.OLS(y, test)
res = estimate.fit()
print("\t\t TESTING POPULATION ON TEMP INFLUENCE\n", '-' * 75, "\n\n", res.summary())
test = sm.add_constant(energy)
estimate = sm.OLS(y, test)
res = estimate.fit()
print("\t\t TESTING ENERGY CONSUMPTION ON TEMP INFLUENCE\n", '-' * 75, "\n\n", res.summary())
This section tested the change in energy usage in the D.C. area, and its p-value is < .05, which would seem to indicate that it has a relationship to the change in temperature. This is the only variable to indicate such a relationship so far.
test = sm.add_constant(each)
estimate = sm.OLS(y, test)
res = estimate.fit()
print("\t\t TESTING ALL ON TEMP INFLUENCE\n", '-' * 75, "\n\n", res.summary())
All of the sets, with the exception of energy usage over time, have a p-value > .05 when tested individually. This doesn't necessarily confirm or reject the null hypothesis for the different sets: because we started measuring greenhouse gas emissions relatively recently, we have limited data for specific cities like D.C. When all the factors are included in one model, CO2 has a p-value of .01, which would reject the null hypothesis; N2O, population, and total energy consumption also all fall under .05. This can be misleading because, once again, we have a relatively small dataset, and adding multiple variables to a small dataset can skew the results. However, just based on the results of this limited test, it would appear that tracking D.C. energy usage would be a good predictor of the rate of temperature increase.
Thank you so much for looking through this tutorial. I hope you learned something from it, whether something simple like how to plot data or something more complex like regression testing. Interpreting climate data is a field that is becoming more and more important as the effects of climate change accelerate, and being able to correlate the factors driving that acceleration is an important step in slowing it down. Furthermore, the ability to accurately predict future outcomes from past events is a useful skill to have in general.