Category: linguistics

MacOS tip of the day: Dictionary

If you’re like me, you find yourself looking up words in the dictionary fairly frequently. For me, this is particularly true, since I am an American living in Germany, and even though I speak German quite well, there are few days that go by where I don’t need to look up a word or two.…

May 10, 2021
UNIX tip of the day: duplicate and replace lines with awk

Today I got some data I wanted to add to my machine learning training datasets for named entity recognition. My system is designed to be used with output from automatic speech recognition (ASR). It is often hard to know whether ASR output will contain hyphens (e.g. email vs. e-mail), so frequently I…

January 18, 2019
Vim regex-fu for LaTeX

When writing a beamer presentation with LaTeX, I organize my presentation into sections and subsections. Frequently, the title of the first frame (slide) in a subsection has the same name as the subsection. Let’s say I start off with the following structure: \section[corpora]{Accessing text corpora} \subsection[gutenberg]{The Gutenberg Corpus} \subsection[chat]{The web and chat Corpus} \subsection[brown]{The Brown…

September 24, 2009
Why doesn’t Mac update standard UNIX utilities?

I am currently teaching a course on programming for linguists. We are using Python, but for the first few classes I have been going over some standard UNIX utilities such as cd and ls, plus using regular expressions with grep and sed. I actually don’t use sed that much. I tend to reach for Perl,…

September 15, 2008
Bash one-liners to the rescue

Lately I find myself using handy bash one-liners more and more. I think that this is where Unix/Linux can really start to shine. There are so many little programs that each do one thing, and do it well. But the ability to combine these together through pipes means you have extremely flexible and powerful…

July 15, 2008
