This week, I expanded my analysis of the relationship between residential property character and culture. I wanted to continue exploring the effect of having “cultural” sites in a neighborhood. For now, cultural sites include schools, churches, and libraries. This definition is still a work in progress and I expect to adjust it moving forward; in particular, the inclusion of universities likely skews some of the results.

I began by making a new binary variable that is 1 when a census tract has more than one cultural site and 0 when it has one or fewer.

TADCT <- read.csv("/TADCT.csv")

TADCT$CULT_YES <- ifelse(TADCT$Cult_count > 1, 1, 0)

summary(TADCT$CULT_YES)

Min. 1st Qu. Median Mean 3rd Qu. Max.

0.0000 0.0000 1.0000 0.5899 1.0000 1.0000

Because the variable is coded 0/1, its mean is the share of tracts coded 1. The summary shows that about 59% of census tracts in Boston have more than one cultural site.
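As a quick check on that reading of the mean, here is a minimal sketch on made-up data. The data frame toy and its counts are hypothetical; only the column names Cult_count and CULT_YES come from the analysis above.

```r
# Hypothetical toy data: six tracts with made-up cultural-site counts
toy <- data.frame(Cult_count = c(0, 1, 2, 3, 5, 1))

# Same recoding as above: 1 if more than one cultural site, else 0
toy$CULT_YES <- ifelse(toy$Cult_count > 1, 1, 0)

# The mean of a 0/1 variable is the share of rows coded 1
share <- mean(toy$CULT_YES)
share

# table() gives the underlying counts behind that share
counts <- table(toy$CULT_YES)
counts
```

Running table() alongside the mean makes it easy to see how many tracts fall in each group before comparing them.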

I next ran a t-test (t.test in R) to compare median residential value between tracts with and without more than one cultural site.

t.test(resValue_median~CULT_YES, data=TADCT)

data: resValue_median by CULT_YES

t = 1.3661, df = 78.846, p-value = 0.1758

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-43396.78 233278.77

sample estimates:

mean in group 0 mean in group 1

512173.2 417232.2

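The sample means differ between the two groups: tracts with more than one cultural site have a mean of about $417,000, compared with about $512,000 for tracts with one or fewer. The direction matches earlier analysis, in which tracts with more cultural sites had lower median home values. However, the p-value is 0.18 and the 95 percent confidence interval for the difference includes 0, so the difference is not statistically significant and could be due to chance.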
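In the output above, the difference in sample means (512173.2 minus 417232.2, about 94,941) sits at the midpoint of the confidence interval, and the interval spans 0. The sketch below shows how those pieces can be pulled out of a t.test result. It runs on simulated data, not the Boston tracts; the data frame toy and all of its values are assumptions made for illustration.

```r
set.seed(1)  # simulated data only; not the Boston tract values

# Hypothetical toy data with the same column names as above
toy <- data.frame(
  CULT_YES = rep(c(0, 1), each = 40),
  resValue_median = c(rnorm(40, mean = 500000, sd = 300000),
                      rnorm(40, mean = 420000, sd = 150000))
)

# t.test() defaults to the Welch test (unequal variances)
fit <- t.test(resValue_median ~ CULT_YES, data = toy)

# Difference in sample means (group 0 minus group 1)
diff_means <- unname(fit$estimate[1] - fit$estimate[2])

# The interval is centered on that difference; if it contains 0,
# the difference is not significant at the 5% level
ci <- fit$conf.int
includes_zero <- ci[1] < 0 && ci[2] > 0

diff_means
ci
fit$p.value
includes_zero
```

Checking whether the interval contains 0 gives the same answer as comparing the two-sided p-value to 0.05, which is a useful sanity check when reading the printed output.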

Next I conducted an ANOVA statistical test to compare means of residential parcel values by neighborhood.

Anova <- aov(resValue_median~BRA_PD, data=TADCT)

summary(Anova)

Df Sum Sq Mean Sq F value Pr(>F)

BRA_PD 16 6540283795839 408767737240 3.274 0.000065 ***

Residuals 156 19475545536842 124843240621

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5 observations deleted due to missingness

The summary of the ANOVA shows statistically significant differences in mean residential value between neighborhoods (F = 3.274, p < 0.001). ANOVA only says that at least some neighborhoods differ, so I followed it with a Tukey test, which makes pairwise comparisons between neighborhoods. The sample below shows the estimated difference in means for several neighborhoods relative to East Boston; a positive value means the first neighborhood has the higher mean.

Fenway/Kenmore-East Boston 119815.833

Hyde Park-East Boston -16012.500

Jamaica Plain-East Boston 117075.449

Mattapan-East Boston -8616.667

North Dorchester-East Boston -21960.417

Roslindale-East Boston 31747.756

Roxbury-East Boston -14513.640

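As expected, Fenway/Kenmore and Jamaica Plain have higher mean values than East Boston, by roughly $120,000 and $117,000. Mattapan and Roxbury have somewhat lower mean values than East Boston. This excerpt shows only the estimated differences, not the adjusted p-values, so it does not by itself show which of these gaps are statistically significant.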
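The Tukey call itself is not shown above. One way it could be run is with TukeyHSD() from base R on the fitted aov object, then filtering to the East Boston comparisons. This is a sketch on simulated data with three made-up districts, not necessarily the exact steps used here; toy, fit, and vs_east are hypothetical names.

```r
set.seed(2)  # simulated data only; not the Boston tract values

# Hypothetical toy data: three districts, 20 tracts each, stored as a factor
toy <- data.frame(
  BRA_PD = factor(rep(c("East Boston", "Fenway/Kenmore", "Mattapan"), each = 20)),
  resValue_median = c(rnorm(20, mean = 400000, sd = 50000),
                      rnorm(20, mean = 520000, sd = 50000),
                      rnorm(20, mean = 390000, sd = 50000))
)

fit <- aov(resValue_median ~ BRA_PD, data = toy)

# TukeyHSD() returns one matrix per factor with columns diff, lwr, upr, p adj
tukey <- TukeyHSD(fit)$BRA_PD

# Keep only the pairwise comparisons that involve East Boston
vs_east <- tukey[grepl("East Boston", rownames(tukey)), , drop = FALSE]
vs_east
```

The p adj column is what indicates whether a given pairwise difference is significant after adjusting for the number of comparisons.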

I next constructed a chart to compare the residential home values.

library(reshape2)  # provides melt()

library(ggplot2)  # provides ggplot() and the geoms used below

melted <- melt(TADCT[c(15,13)], id.vars=c("BRA_PD"))

means <- aggregate(value~BRA_PD,data=melted,mean)

names(means)[2] <- "mean"

ggplot(data=means, aes(x=BRA_PD, y=mean)) + geom_bar(stat="identity", position="dodge", fill="blue") + ylab("Mean")

ses <- aggregate(value~BRA_PD, data=melted, function(x) sd(x, na.rm=TRUE)/sqrt(sum(!is.na(x))))

names(ses)[2] <- "se"

means <- merge(means, ses, by="BRA_PD")

means <- transform(means, lower=mean-se, upper=mean+se)

# Disabled: means has no Type column at this point (only BRA_PD, mean, se, lower, upper)

# levels(means$Type) <- c("Downtown", "Industrial/Institutional", "Park", "Residential")

bar <- ggplot(data=means, aes(x=BRA_PD, y=mean)) + geom_bar(stat="identity", position="dodge", fill="blue") + xlab("Neighborhood") + ylab("Mean") + ggtitle("Residential Parcel Values by Neighborhood") + theme(axis.text.x = element_text(angle = 90, hjust = 1))

bar + geom_errorbar(aes(ymax=upper, ymin=lower),position=position_dodge(.9))

This graph shows the mean residential value for each neighborhood, with error bars of plus or minus one standard error. The South End and Back Bay each stand out for having very high mean values, but also wide error bars, which means those means are estimated less precisely.
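Since the error bars depend on the standard error, it is worth being careful about the sample size used in that calculation when values are missing. The sketch below uses a made-up vector x to show the difference between counting all elements and counting only the non-missing ones. With the formula interface, aggregate() drops rows with missing values by default, so the distinction mostly matters when the same function is reused elsewhere.

```r
# Hypothetical values with one missing entry
x <- c(10, 12, 14, NA, 16)

# length(!is.na(x)) counts every element, including the NA
n_all <- length(!is.na(x))

# sum(!is.na(x)) counts only the non-missing values
n_obs <- sum(!is.na(x))

# Standard error of the mean using the non-missing count
se <- sd(x, na.rm = TRUE) / sqrt(n_obs)

n_all
n_obs
se
```

Dividing by the larger count would make the standard error look smaller than it is, which would shrink the error bars.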