Please refer to this post for the scraping code.

Data Analysis

As always, let’s load the goodies and create a theme for ggplot2.

Next, we will remove the first row, which was created when the container was set up before scraping.

Airbnb doesn’t have the same date issue as Microsoft. It seems like Glassdoor changed the way it reported dates sometime prior to 2014. So converting the dates for Airbnb is very simple.
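Here is a minimal sketch of that kind of date conversion, written in Python for illustration rather than as the code used in this post. The date format ("Jun 3, 2015") and the sample values are assumptions, since the raw Glassdoor strings aren't shown here.

```python
from datetime import datetime

# Hypothetical raw date strings; the real scraped format may differ.
raw_dates = ["Jun 3, 2015", "Dec 21, 2014", "Jan 9, 2016"]

def parse_review_date(text, fmt="%b %d, %Y"):
    """Convert one scraped date string into a date object."""
    return datetime.strptime(text.strip(), fmt).date()

review_dates = [parse_review_date(d) for d in raw_dates]
print(review_dates[0].isoformat())
```

If the scraped strings use a different layout, only the `fmt` argument should need to change.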

Then we separate the columns into two.

Next is text processing.

As Airbnb is pretty new, there aren’t that many reviews. So I simply scraped all of the reviews, including the international ones. Let’s categorize them into “Domestic” and “International.”
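One way to sketch this categorization, in Python and purely as an illustration: treat a location ending in a comma plus a two-letter code as U.S. based. That location format is an assumption about how Glassdoor displays locations, and the sample values are made up.

```python
import re

# Assumption: U.S. locations look like "San Francisco, CA", while
# international ones end differently (e.g. with a country in parentheses).
US_STATE_SUFFIX = re.compile(r",\s*[A-Z]{2}$")

def categorize(location):
    """Label a location string as "Domestic" or "International"."""
    # Missing or empty locations fall through to "International" here.
    if location and US_STATE_SUFFIX.search(location.strip()):
        return "Domestic"
    return "International"

# Hypothetical sample locations.
locations = ["San Francisco, CA", "Dublin, Co. Dublin (Ireland)", "Portland, OR"]
groups = [categorize(loc) for loc in locations]
print(groups)
```

A rule this simple would mislabel any non-U.S. location that happens to end in a two-letter code, so checking against an explicit list of state abbreviations would be safer.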

Let’s see the number of reviews by month.
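As a rough Python sketch (not the original plotting code), counting reviews per month boils down to grouping the converted dates by year and month. The dates below are hypothetical stand-ins for the real date column.

```python
from collections import Counter
from datetime import date

# Hypothetical review dates, standing in for the converted date column.
review_dates = [
    date(2014, 3, 5),
    date(2014, 3, 20),
    date(2014, 4, 2),
    date(2015, 1, 15),
]

# Key each review by "YYYY-MM" so the keys sort chronologically as strings.
reviews_per_month = Counter(d.strftime("%Y-%m") for d in review_dates)

for month in sorted(reviews_per_month):
    print(month, reviews_per_month[month])
```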

The reviews only started in 2014, and each month there have been only between 10 and 20 of them. That’s on the thin side, so it won’t be necessary to separate the reviews by Group. I’ll simply process the entire dataset for the n-gram chart.
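To make the n-gram idea concrete, here is a small Python sketch that counts two-word phrases (bigrams) across all reviews at once. The stop-word list and the review snippets are assumptions for illustration only, not the data or the code behind the charts.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "are", "with", "for", "in"}

def bigrams(text):
    """Return pairs of consecutive words that remain after stop words are dropped."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    return list(zip(words, words[1:]))

# Hypothetical review snippets.
reviews = [
    "Smart people and travel credits",
    "Free food, travel credits, smart people",
]

# Pool every review together, since the data is too thin to split by group.
bigram_counts = Counter(bg for review in reviews for bg in bigrams(review))
top_bigrams = bigram_counts.most_common(2)
print(top_bigrams)
```

Note that this version ignores punctuation, so a pair can span a comma. Splitting on sentence boundaries first would avoid that.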

Let’s start with the Pros.

“Travel Coupons,” “Travel Credits,” “Smart People,” “Free Foods,” “Health Insurance.” Okay, it’s about the same theme as Microsoft. Perks are important. Smart people love working with smart people. The differences are the Travel Credits… and, most importantly, that compensation is missing. Hm.

Could compensation be in the Cons?

Okay, “Hyper Growth,” “Career Development,” “Middle Management,” “Upper Management.” I’m not surprised by those comments. But “Growing Pains”?

We need to dig a bit deeper into that. Let’s filter Cons_2 down to only the comments containing the phrase.
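A filter like this can be sketched in Python as a case-insensitive substring match. The rows below are hypothetical; only the column name Cons_2 comes from this post.

```python
# Hypothetical rows; "Cons_2" mirrors the column name used in this post.
reviews = [
    {"Cons_2": "Typical growing pains of a startup"},
    {"Cons_2": "Long hours before launches"},
    {"Cons_2": "Growing Pains everywhere as teams double"},
]

def contains_phrase(text, phrase):
    """Case-insensitive substring match."""
    return phrase.lower() in text.lower()

# Keep only the rows whose Cons_2 text mentions the phrase.
matches = [row for row in reviews if contains_phrase(row["Cons_2"], "growing pains")]
print(len(matches))
```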

There are 27 comments with “Growing Pains.” I took a glimpse at the dataframe and found that they were mostly about the hyper growth of the company. I guess that wore down the culture. But hey, at least compensation is not in the Cons, so employees seem to be satisfied with their salary.