City Exploration 3: Blue Bikes

Mia Scholes 12/6/23

Background

I have been analyzing Blue Bike data from September 2022. In past weeks, I’ve looked at station use as defined by trips starting at each station within the month, and factors that may predict how well a station performs. I formed three new measures that I thought might explain the success of a station. These new measures were:

  1. Station Density: The average distance to the three nearest other Blue Bike stations. The idea behind this measure was that stations near many possible end points are more useful than those that are more isolated.
  2. Distance to an MBTA stop: Distance, in meters, to the closest major (subway or high-frequency bus) stop. I hypothesized that connection to other modes of transportation makes a station more useful, especially to help solve the “last mile” problem.
  3. Distance to the protected bike network / Connectivity score: Distance from each station to Boston’s bike network. I used a .shp file containing the network and split it into the overall bike network and the protected bike network. I found that proximity to the protected network in particular is correlated with more use of a Blue Bike station.
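
As a rough illustration of the first measure, the sketch below computes the average distance from each station to its three nearest neighbors. It is written in Python for convenience and is not the code used in this analysis. The function names (`haversine_m`, `station_density`) and the coordinates are hypothetical, and it assumes at least two stations with latitude and longitude in degrees.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6_371_000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def station_density(stations, k=3):
    """Map each station name to the mean distance (m) to its k nearest other stations."""
    density = {}
    for name, (lat, lon) in stations.items():
        dists = sorted(
            haversine_m(lat, lon, olat, olon)
            for other, (olat, olon) in stations.items()
            if other != name
        )
        nearest = dists[:k]  # fewer than k values if there are fewer than k other stations
        density[name] = sum(nearest) / len(nearest)
    return density

# Hypothetical coordinates for illustration only, not real Blue Bike data.
stations = {
    "A": (42.3400, -71.0900),
    "B": (42.3410, -71.0900),
    "C": (42.3400, -71.0910),
    "D": (42.3390, -71.0900),
    "E": (42.3600, -71.0600),
}
density = station_density(stations)
```

A lower value means a station sits in a denser cluster. In this toy data, station "E" is far from the other four and so gets a much larger value than "A".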

When I ran a multivariate regression with these three measures across each station in Boston, I found that they account for only approximately 30% of the variation in trips beginning at each station. I then added planning district to the regression, and the R² value jumped to about 0.52. Still, these factors did not explain use as much as I’d expected them to. So I turned to the residuals, or the variation not explained by the model. The basis of this City Exploration is virtually visiting and attempting to understand the Blue Bike stations that most underperformed and overperformed the model’s expectations.
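
To make the terms concrete, here is a minimal sketch of an ordinary least squares fit that returns coefficients, residuals, and R². It is a plain-Python illustration rather than the R model used for this analysis. The helper names (`solve`, `fit_ols`) and every number in the example are made up, and it assumes the predictors are not perfectly collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by least squares.

    Returns (coefficients, residuals, r_squared); coefficients[0] is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]  # actual minus predicted
    mean_y = sum(y) / len(y)
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, residuals, 1 - ss_res / ss_tot

# Hypothetical rows: (station density, distance to MBTA, distance to protected network), in meters.
rows = [(300, 150, 20), (450, 400, 250), (200, 90, 10),
        (600, 700, 500), (350, 300, 40), (500, 200, 300)]
trips = [5200, 1800, 7400, 600, 3100, 2500]  # made-up monthly trip counts
beta, residuals, r2 = fit_ols(rows, trips)
```

A positive residual means a station had more trips than the model predicted, and a negative residual means it had fewer. An R² of 0.30 means the predictors account for about 30% of the variation in trips across stations.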

Analysis: Data Guidance

To understand which stations over/underperformed expectations, I took the regression run without planning district data and merged its residuals with information on each station, its trip count, and the factors accounted for in the model. Because Blue Bike station density was not found to be significant, it was excluded. When sorted by the size of residual, the following stations overperformed the most:

Table 1: Most overperforming stations, as measured by trip count, in Sept 2022

As well as the stations that most underperformed:

Table 2: Most underperforming stations, as measured by trip count, in Sept 2022

Importantly, all of the overperformers were still less than 400 meters from both the protected bike network and an MBTA stop, and each had more than 6,500 trips in the month. In other words, these stations did not overperform in spite of poor predictor values. They scored well on the model’s independent variables and still had trip counts far beyond what the model would predict.
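
The merge-and-sort step behind the two tables can be sketched as follows. This is an illustrative Python version, not the code behind the tables. The station names, trip counts, and residuals are placeholders, and `rank_by_residual` is a hypothetical helper.

```python
def rank_by_residual(records, n=6):
    """Return (n largest positive residuals, n most negative residuals)."""
    over = sorted(records, key=lambda s: s["residual"], reverse=True)[:n]
    under = sorted(records, key=lambda s: s["residual"])[:n]  # most negative first
    return over, under

# Placeholder station info and model residuals, listed in the same order.
station_info = [
    {"name": "Station A", "trips": 7000},
    {"name": "Station B", "trips": 2400},
    {"name": "Station C", "trips": 300},
    {"name": "Station D", "trips": 4100},
    {"name": "Station E", "trips": 900},
]
residuals = [3200.0, -150.0, -2100.0, 800.0, -900.0]

# Attach each residual to its station record, then rank.
merged = [dict(info, residual=res) for info, res in zip(station_info, residuals)]
over, under = rank_by_residual(merged, n=2)
```

Here `over` holds Station A and Station D, and `under` holds Station C and Station E, each with its trip count still attached for comparison.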

I also found it notable that five of the six most underperforming stations are in East Boston. Strangely, the second-largest underperformer was a station in South Dorchester, Shawmut T Stop, that is extremely close to both the MBTA and the protected bike network and yet had fewer trips than any of the East Boston stations. Because there were so many peculiarities in the residuals that I wanted to explore, I took a virtual approach to looking at some of these over- and underperforming areas.

Exploration

Overperforming Stations

The first stop was simple. Forsyth St. at Huntington Avenue is the main Blue Bike station on Northeastern’s campus, and I can personally vouch that in September it experiences incredible turnover. It makes sense that students, a group less likely to be connecting to mass transit and more likely to cycle despite a lack of protected lanes, use the Forsyth St. station so much more than the model would predict.

Some of the other stations on the overperforming list tell a similar story. The third station on the list, Commonwealth Ave. at Agganis Way, is on Boston University’s campus next to their basketball arena. The fifth overperformer at Ruggles is the second of Northeastern’s on-campus stations.

Underperforming Stations

First, I wanted to look at the one underperforming station not in East Boston. The Shawmut T Stop station, despite being exceptionally close to both the protected bike network and an MBTA stop, gets very little ridership.

Figures 1 & 2: The Shawmut Blue Bike station (on map and aerial view)

At first glance, this Blue Bike station is notably far away from downtown Boston. Anyone in this area is almost certainly going to take the T toward downtown and use their car for any other trips around South Dorchester. In theory, this station should help solve the “last mile” problem for people getting between the T stop and their homes. And it’s less than a meter away from the protected bike network, which should make those trips easy, right? Let’s look back at the protected network in Boston. Dorchester seems to have a couple of tiny strips of protected lanes connected to nothing.

Figure 3: Location of Shawmut T stop Blue Bike station relative to protected bike network

It happens that Sharp St., just south of the bike station, is a shared use path, meaning it can only be used by bikes and pedestrians. This qualifies it as protected bike infrastructure, though it connects users to nothing around them. This points to a clear flaw in the protected network measure: distance to a protected segment is only meaningful when that segment is actually part of a connected network.
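
One possible way to tighten the measure, which I did not do in this analysis, would be to drop protected segments that belong to a very small connected piece before measuring distance. The sketch below groups segments that share endpoints and keeps only those whose connected piece reaches a minimum total length. The segment representation, the node names, the helper names, and the 2,000-meter cutoff are all assumptions made for illustration.

```python
from collections import defaultdict

def component_lengths(segments):
    """For each (node_a, node_b, length_m) segment, return the total length
    of the connected component that the segment belongs to."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union the two endpoints of every segment.
    for a, b, _ in segments:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Sum segment lengths per component root.
    totals = defaultdict(float)
    for a, _, length in segments:
        totals[find(a)] += length
    return [totals[find(a)] for a, _, _ in segments]

def drop_isolated(segments, min_network_m=2000.0):
    """Keep only segments whose connected component is at least min_network_m long."""
    totals = component_lengths(segments)
    return [seg for seg, total in zip(segments, totals) if total >= min_network_m]

# Hypothetical segments: three linked lanes plus one short path connected to nothing.
segments = [
    ("n1", "n2", 900.0),
    ("n2", "n3", 1200.0),
    ("n3", "n4", 700.0),
    ("n9", "n10", 150.0),
]
connected = drop_isolated(segments)
```

With these made-up numbers, the three linked lanes (2,800 meters in total) are kept and the isolated 150-meter path is dropped, so a station next to that path would no longer count as close to the protected network.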

The underperforming stations in East Boston add to this theory as well. Although these stations are, on the surface, close to both transit and protected bike lanes, it is very hard to actually get anywhere by bike from Eastie. Because no bridge into downtown Boston is accessible by bike, and the neighborhood is fairly well served by both transit and car infrastructure, East Boston bike stations are not as useful as the established measures may make them seem.

Conclusion

This exploration in particular made it clear to me how important it is to vet your measures. It would be easy to dismiss the case of the South Dorchester bike station and use it to say that the people in Dorchester may be uninterested in biking regardless of the factors analyzed, when in reality the way those variables presented in R was very different from the way they presented in the real world.

I think that there are many factors that contribute to the use of a Blue Bike station beyond what I analyzed here. In addition to cleaning up the existing measures, in future analysis I would add median age of tract surrounding each station or presence of a college campus, distance to Boston’s CBD, and age of the bike dock and surrounding bike infrastructure – as mode shift is a process that takes time.

