A Look into Public Service, Residential and Educational Amenities by Sub district

For this assignment I decided to look into the variation in education, public service, and residential amenities by sub district type. Sub district types include business, harbor park, industrial, mixed use, and residential, among others. Neighborhoods differ in the proportion of each sub district type they contain, and those proportions are often correlated with restrictiveness. Specifically, one of my previous findings was that neighborhoods with a high proportion of residential sub districts were more likely to be restrictive. This could mean that sub districts are a unit of measurement worth looking at if we want to understand the dynamics of land use in areas where we see certain sub districts clustered. To begin, I created new variables that combine the amenities within each of the education, residential, and public service categories. Then I aggregated these new variables by sub district before proceeding with my analysis.
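A minimal sketch of that preparation step in base R is below. The data frame `zone_toy` and its indicator columns (`school`, `library`, `police`, `clinic`) are made up for illustration; only the `subdistric`, `totedu` and `totpub` names appear in the actual analysis, and the real data set will have its own amenity columns.

```r
# Toy parcel-level data; every column except `subdistric` is hypothetical.
zone_toy <- data.frame(
  subdistric = c("Residential", "Residential", "Business", "Mixed Use", "Business"),
  school  = c(1, 0, 0, 1, 0),
  library = c(0, 1, 0, 0, 0),
  police  = c(0, 0, 1, 0, 1),
  clinic  = c(1, 0, 0, 1, 1)
)

# Combine related amenity columns into one total per category.
zone_toy$totedu <- rowSums(zone_toy[, c("school", "library")])
zone_toy$totpub <- rowSums(zone_toy[, c("police", "clinic")])

# Aggregate the totals by sub district type (one row per type).
agg <- aggregate(cbind(totedu, totpub) ~ subdistric, data = zone_toy, FUN = sum)
print(agg)
```

The result has one row per sub district type, which is the shape the tests later in this post operate on.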

To begin, I wanted to see if there was a relationship between public service amenities and education amenities. There are a number of different ways to check the relationship between these amenities. One of the simplest is to run a t-test. A t-test compares only two groups at a time, so looking at education and public service amenities aggregated by sub district, we can see whether there is a statistically significant difference between these two amenity types:

Paired t-test
data:  edag$totedu and puag$totpub
t = -0.25993, df = 9, p-value = 0.8008
95 percent confidence interval:
-106.7314   84.7314
mean of the differences -11

These results give no evidence of a statistically significant difference between the two. With a p-value of .80 we cannot reject the null hypothesis that the mean paired difference is zero, and the 95 percent confidence interval (-106.7 to 84.7) comfortably includes zero. The t-test is not our only tool when looking at sub districts and the different amenities.
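As a quick consistency check, the confidence interval and p-value can be rebuilt from the other numbers in the output. This is only a sketch that uses the rounded values printed above, so the results agree with the output to about two decimal places rather than exactly.

```r
# Values reported in the paired t-test output.
mean_diff <- -11
t_stat    <- -0.25993
df        <- 9          # 10 sub district types, so 10 pairs

# t = mean_diff / se, so the standard error can be recovered from the output.
se <- mean_diff / t_stat

# 95% interval: mean_diff plus or minus the critical t value times se.
t_crit <- qt(0.975, df)
ci <- mean_diff + c(-1, 1) * t_crit * se

# Two-sided p-value for the reported statistic.
p_val <- 2 * pt(-abs(t_stat), df)

print(round(ci, 2))
print(round(p_val, 4))
```

The interval is wide relative to the mean difference of -11, which is why the test cannot distinguish that difference from zero.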

Another tool we can use is the chi-squared test, which checks whether these categories of amenities are independent of sub district type. Looking at public service amenities we have an interesting result:

Pearson's Chi-squared test
data:  puag$totpub and puag$subdistric
X-squared = 80, df = 72, p-value = 0.2424

With a p-value of .24, the test is not significant, so we cannot reject the hypothesis that these variables are independent of one another. This is surprising, as one might expect public service amenities to be concentrated in certain sub district types, such as commercial or mixed use spaces, but this does not appear to be the case. The residential case looks like an even stronger candidate: one category of amenity is residential and one of the sub district types is also residential. Here one would logically expect a dependent relationship, but when we run a chi-squared test we find:

Pearson's Chi-squared test
data:  resagg$rtot and resagg$subdistric
X-squared = 90, df = 81, p-value = 0.2313

The p-value of .23 once again gives us no grounds to reject independence. There may be a number of reasons for this, but first it may be helpful to look at this relationship graphically so we can visually observe trends.

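library(reshape2)  # provides melt()
library(ggplot2)

# Reshape rows 2 to 10 of resagg to long format: one row per sub district and amenity column
subdismelt <- melt(resagg[2:10, ], id.vars = "subdistric")
ggplot(subdismelt, aes(variable, value)) +
  geom_bar(aes(fill = subdistric), position = "dodge", stat = "identity")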

[Figure: subdis, grouped bar chart of residential amenity values by sub district type]

This graph shows that it is not just residential sub district types that have residential amenities. Residential is also the most common sub district type by a large margin. So although I have previously observed that restrictiveness tracks sub district type, that pattern does not necessarily hold when aggregating by amenity type. We can also run ANOVA using this data and get a result:

educsubdis <- aov(education ~ subdistric, data = zone)

              Df Sum Sq Mean Sq F value    Pr(>F)
subdistric     9 43.545  4.8383   83.44 < 2.2e-16 ***
Residuals   1627 94.342  0.0580

The results are not meaningful, because ANOVA assumes a continuous response and here we are comparing two categorical variables. This is why the chi-squared test was used, as it is actually suited to this data set, which is almost entirely categorical data.
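One detail of the chi-squared test is worth sketching with toy data: when `chisq.test` is given two vectors, it first cross-tabulates them, so each distinct value becomes its own category. The labels and totals below are invented. They are chosen only so that there are 10 sub district types and 9 distinct totals, which happens to give the same df = 72 and X-squared = 80 as the public service output earlier in this post.

```r
# Toy aggregated data: 10 sub district types with made-up totals (9 distinct values).
subdistric <- paste0("type_", 1:10)   # hypothetical labels
totpub     <- c(5, 12, 12, 40, 3, 77, 21, 8, 150, 60)

# With two vectors, chisq.test() builds table(totpub, subdistric) internally,
# so every distinct total is treated as a separate category.
tab <- table(totpub, subdistric)

# Every cell count is 0 or 1, so R warns that the approximation may be poor.
res <- suppressWarnings(chisq.test(totpub, subdistric))

print(dim(tab))        # rows = distinct totals, columns = sub district types
print(res$parameter)   # degrees of freedom = (rows - 1) * (cols - 1)
print(res$statistic)
print(res$p.value)
```

Because the statistic here depends only on how many totals are distinct, results computed from one aggregated row per sub district should be read with caution. A test on a table of parcel-level categories, for example `table(zone$subdistric, zone$education)`, would use actual counts in each cell, though whether that suits this data set is an assumption that has not been checked here.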
