Building Latent Constructs

My latent construct classifies Airbnb listings on the basis of the amenities they include. For this purpose, I define a luxurious Airbnb as one that has a pool and/or a gym.

Using the string detect command on the amenities text, we can see that there are 1233 unique listings with pools. Now we do the same for gyms.

We see that there are 2803 listings with gyms. It is interesting that more listings have a gym than a pool. One reason could be Boston’s weather: since it’s cold most months, people are less likely to want to use a pool. In most other cities or states, I would expect the opposite pattern.
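The pool and gym flags described above can be sketched as follows. This is a minimal illustration in plain Python, not the original analysis code: the sample rows, the field names (`id`, `neighbourhood`, `amenities`) and the helper `has_amenity` are all assumptions made for the example.

```python
# Hypothetical sample rows; the real data would come from the Airbnb listings file.
listings = [
    {"id": 1, "neighbourhood": "West End", "amenities": "{TV,Pool,Gym,Wifi}"},
    {"id": 2, "neighbourhood": "Fenway", "amenities": "{TV,Gym,Kitchen}"},
    {"id": 3, "neighbourhood": "Jamaica Plain", "amenities": "{Wifi,Pool}"},
    {"id": 4, "neighbourhood": "Chinatown", "amenities": "{Wifi,Kitchen}"},
]

def has_amenity(amenities, keyword):
    """Case-insensitive substring check, similar in spirit to a string-detect call."""
    return keyword.lower() in amenities.lower()

for row in listings:
    row["pool"] = has_amenity(row["amenities"], "pool")
    row["gym"] = has_amenity(row["amenities"], "gym")
    # The latent construct: a listing counts as luxurious if it has a pool and/or a gym.
    row["luxury"] = row["pool"] or row["gym"]

# Count unique listings per flag (sets guard against duplicated listing ids).
pool_ids = {row["id"] for row in listings if row["pool"]}
gym_ids = {row["id"] for row in listings if row["gym"]}
luxury_ids = {row["id"] for row in listings if row["luxury"]}

print(len(pool_ids), len(gym_ids), len(luxury_ids))
```

One caveat with plain substring matching: a keyword such as "pool" would also match an amenity like "Pool table", so the counts are worth spot-checking against the raw amenities text.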

Now that we have this data, I want to aggregate it by neighborhood to see where these listings are concentrated. Are they clustered in a particular region, or spread fairly evenly across the city?
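A minimal sketch of this aggregation, assuming each listing has already been reduced to a (neighborhood, 0/1 pool flag) pair. The rows below are made up for illustration and `mean_by_neighbourhood` is a hypothetical helper name; with a 0/1 flag, the mean is simply the share of listings in that neighborhood that have a pool.

```python
from collections import defaultdict

# Hypothetical (neighbourhood, has_pool) pairs; 1 means the listing has a pool.
rows = [
    ("West End", 1), ("West End", 1),
    ("Jamaica Plain", 1), ("Jamaica Plain", 0),
    ("Fenway", 0), ("Fenway", 0),
]

def mean_by_neighbourhood(rows):
    """Return the mean of the 0/1 flag for each neighbourhood."""
    totals = defaultdict(lambda: [0, 0])  # neighbourhood -> [sum of flags, listing count]
    for hood, flag in rows:
        totals[hood][0] += flag
        totals[hood][1] += 1
    return {hood: s / n for hood, (s, n) in totals.items()}

pool_means = mean_by_neighbourhood(rows)

# Rank neighbourhoods from highest to lowest mean, the order a bar chart would use.
ranked = sorted(pool_means.items(), key=lambda kv: kv[1], reverse=True)
for hood, mean in ranked:
    print(f"{hood}: {mean:.2f}")
```

The same function can be reused for the gym flag by passing (neighborhood, has_gym) pairs instead.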

The listings are plotted by neighborhood mean, that is, the average of the pool indicator across each neighborhood's listings. The highest mean is in West End, followed by Jamaica Plain. This gives a fair idea of which areas have pools and which don’t. Similarly, we plot a neighborhood-wise graph for Airbnb listings with gyms.

In the neighborhood-wise graph, gyms appear comparatively less common than pools, even though the overall count of gym listings is higher. West End has the highest mean, followed by Chinatown and Fenway. West End could plausibly be treated as an outlier here as well. For the main analysis, it is important to merge these two plots and see whether there are any overlaps in the data.

From the merged plot, we see a fairly strong association between listings having a pool and listings having a gym. The two means are usually within 0.1 of each other, with exceptions in West End, Fenway and Chinatown. Since the gym mean is higher in all three of these areas, this suggests that gyms are more common than pools there, possibly because people prefer having gyms in their Airbnbs.
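The comparison read off the merged plot can also be checked numerically. The sketch below uses made-up neighborhood means purely for illustration (they are not the values from the analysis), flags neighborhoods where the two means differ by more than 0.1, and computes a Pearson correlation by hand. The names `pool_means`, `gym_means` and `pearson` are assumptions for this example.

```python
from math import sqrt

# Illustrative neighbourhood means only; NOT the values from the actual plots.
pool_means = {"West End": 0.30, "Fenway": 0.10, "Chinatown": 0.15,
              "Jamaica Plain": 0.25, "Back Bay": 0.12}
gym_means = {"West End": 0.60, "Fenway": 0.35, "Chinatown": 0.40,
             "Jamaica Plain": 0.20, "Back Bay": 0.18}

hoods = sorted(pool_means)  # fixed order so both series stay aligned

# Gap between gym and pool means; a positive gap means gyms are more common.
gaps = {h: gym_means[h] - pool_means[h] for h in hoods}
exceptions = [h for h in hoods if abs(gaps[h]) > 0.1]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson([pool_means[h] for h in hoods], [gym_means[h] for h in hoods])

print("Neighbourhoods with a gap above 0.1:", exceptions)
print("Correlation between pool and gym means:", round(r, 2))
```

A correlation across neighborhood means only describes how the two amenities move together between areas; it does not by itself say whether individual listings tend to offer both.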

