Tag Archives: python

Exploring querying parquet with Hive, Impala, and Spark

At Automattic, we have a lot of data from WordPress.com, our flagship product, which has over 90 million users and 100 million blogs. Our data team is constantly analyzing that data to discover how we can better serve our users. In 2015, one of our big focuses has been improving the new user experience. […]

Posted in wordpress

Why doesn’t Mac update standard UNIX utilities?

I am currently teaching a course on programming for linguists. We are using Python, but for the first few classes I have been going over some standard UNIX utilities such as cd and ls, plus using regular expressions with grep and sed. I actually don’t use sed that much. I tend to reach for Perl, […]

Posted in linguistics, linux, mac osx, perl | 1 Comment