If you are a close friend of mine, you’ll likely know that I don’t like emoji. I know that I hold a minority opinion on this topic, so I don’t often bring it up, because I don’t foresee it being very useful or making me popular. However, I recently owned up to it at work, based on a writing prompt in my blogging group, which asks: “What’s something you believe that other people disagree with you about?” Thus I have finally decided to air my views on emoji publicly.
Background – on language and writing systems
My training is in linguistics. One fact about language which is commonly taught in most introductory linguistics classes is the fact that human language is virtually infinite. Noam Chomsky, the founder of generative linguistics, wrote:
An essential property of language is that it provides the means for expressing indefinitely many thoughts and for reacting appropriately in an indefinite range of new situations.
Noam Chomsky, Aspects of the Theory of Syntax, 1965, p. 6
While Chomsky is taught more in most linguistics classes today, Humboldt wrote something similar over 100 years prior:
For language is quite peculiarly confronted by an unending and truly boundless domain, the essence of all that can be thought. It must therefore make infinite employment of finite means
Wilhelm von Humboldt, On Language: on the diversity of human language construction and its influence on the mental development of the human species, 1836, p. 91
Chomsky’s theory of generative grammar was essentially a system to describe the combination of the finite means, in a way that could be applied to all languages. While 19th century and early 20th century linguistics had been concentrated primarily on the description and comparison of individual languages, Chomsky sought to discover universals of language. Most linguists today would agree that all languages have these four levels – phonology, morphology, syntax, and semantics. Most any linguistics curriculum will have at least one course in each.
- Phonology – which sounds are considered unique (phonemes) and how can they be combined
- Morphology – what is the smallest unit of meaning (morphemes) and how can they be combined into meaningful word and sub-word units
- Syntax – how can words be combined into sentences
- Semantics – why a sentence receives a certain logical interpretation
Each of these levels can be described by a finite set of symbols and rules (or constraints, in Optimality Theory). That is how we end up with the infinite creativity of language. All humans know and use these levels of language daily, though on a mostly unconscious level. Other than at cocktail parties full of linguists, most people don’t actively discuss phonology and morphology in their free time. However, most people are aware that a word which violates the phonotactics (the rules for combining sounds) of their language will sound strange. For example, English does not allow the consonant cluster kn at the start of a word, unlike German, which does. That is the humor behind the joke in Monty Python and the Holy Grail, when the Frenchmen call the Englishmen “silly knights” (pronounced as [knɪgɪt], instead of [nait]). English, being a Germanic language, did once allow the kn consonant cluster. At some point the k was simply deleted, but the spelling was not updated to reflect the change in pronunciation.
Also, I think that all of my readers will be able to understand the word unbloggable, which I cannot find in any dictionary, yet which should be readily understood as “not able to be blogged”. To understand this, one must know that the prefix un- means “not”, and the suffix -able means “can be done”. One must also be familiar with the word blog, which originated as a shortened form of web log, a kind of diary written on the World Wide Web. Finally, whether you know it explicitly or not, most any noun can easily be turned into a verb in English, simply by treating it as one. Just to be complete, I’ll also mention the importance of the spelling with the double gg, to indicate a “short o” sound instead of a “long o”. Compare e.g. “stop” and “stoppable”.
Note that when linguists talk about language, we are primarily talking about spoken language (including sign languages). Spoken language is acquired during childhood, simply by exposure. Written language must be explicitly taught. No one knows exactly when human language first came about (most estimates are between 50,000 and 500,000 years ago), but many agree that a prerequisite for the development of language was the lowering of the larynx, which allows for a much richer set of sounds to be created. Most other mammals have a vocal tract with essentially one cavity, which can create sounds with one resonant frequency (formant) at a time. The lowered larynx of Homo sapiens gives us essentially two resonating cavities, which can produce three to five resonant frequencies at a time (because of harmonic frequencies). This makes possible languages such as English, which has 15 distinct vowels.
At this point we have covered some of the basic building blocks of language. For thousands of years, humans were able to take advantage of these advanced combinatorial skills to create increasingly complex thoughts. Through the use of language, they could pass down knowledge, such as which plants were poisonous and which were not, from one generation to the next much more rapidly than other species. This was useful, but as large civilizations began to emerge in Egypt and Mesopotamia around 5000 years ago, the need for a more scalable system of communication arose, and thus writing was born. Most of the modern alphabets, including Latin, Greek, Hebrew, Arabic, Cyrillic, and likely Brahmic scripts, all originated from the hieroglyphs developed in Egypt starting around 2800 B.C.E. Hieroglyphs were mostly logograms, i.e. a “written word”, from the Greek logos (word) + gramma (thing written). In other words, every hieroglyph represented one word or concept. Emoji are another form of logogram, as are Hanzi, used in writing Chinese. The big drawback with logograms is that in order to represent an infinite number of possible words, you need an infinite number of pictures. It is simply not practical.[1] At some point, 24 Egyptian hieroglyphs called phonetic complements were introduced to help guide pronunciation of the pictographs. The Proto-Sinaitic alphabet was based on these phonetic complements, and later developed into the Phoenician alphabet, upon which most all other alphabets are based. The Phoenician alphabet contained about 24 distinct letters, and was likely the first phonemic alphabet. Finally, we had a way to represent the infinite nature of language using a small finite set of symbols!
The origins of Emoji
For most of written history (going back about 5000 years), very few people knew how to read and write. Part of that had to do with technology. Writing on clay tablets was very difficult and time-consuming. The switch to papyrus around 2500 years ago was a big step, as was the switch to paper around 1500 years ago. Nonetheless, in Europe from 500 to 1500, primarily only monks could write, and they wrote not in their native language, but in Latin. After Gutenberg popularized the movable type printing press in 1450, and the various Christian reform movements in Europe increased the practice of writing in one’s own native language as opposed to Latin, writing really started to take off. With increased access to education, world literacy rose from about 12% in 1820 to over 86% in 2015.[2]
While the goal of a written language is to record spoken language, and we have the excellent tool of an alphabet to do so, throughout most of written history, we still have had a dichotomy between spoken and written language. Spoken language has mostly been used for synchronous communication (and before the invention of the telephone, radio, television, internet, etc. exclusively so), while written language was used for asynchronous communication. Thus the style of written and spoken language has historically been quite different. Synchronous communication is often time-constrained, resulting in the use of time-saving devices like contractions. Perhaps more importantly, face-to-face communication has the benefit of audiovisual communication. By supplementing spoken language with hand and facial gestures, as well as contextual knowledge of the physical environment, a terse statement may actually reveal a lot of information. Humor, in particular sarcasm or facetiousness, is often communicated through intonation, which is absent in written communication.
With the proliferation of near real-time written communication via email, text messaging, Twitter, and the like, written communication has often become much more synchronous, and in many cases has replaced situations in which spoken language was previously used, such as telephony. Writers noticed that written language lacked the gestures which aid comprehension in spoken language, and started to emulate them through the use of emoticons, such as ;) (winking face, often used to indicate sarcasm). I spend much of my day in semi-synchronous communication at work via Slack and P2 and I occasionally do like to make a joke or two, and thus use the winky face to clarify that I am joking, lest the recipient of my message be offended, or some other miscommunication happen. I also like the use of +1 to indicate simple agreement, which in face-to-face communication might be accomplished via a head nod in many cultures.[3]
While I feel that a spoken conversation is often more efficient than a synchronous text-based conversation, there are definitely benefits to text-based communication, and the tools of the internet have certainly allowed us to increase the speed of innovation dramatically. Thus I see the use of near synchronous written language as mostly a positive innovation. We have already noted some of the drawbacks, and emoticons helped us overcome them.
Emoji go too far
That is where I think we should have stopped – emoticons – used to clarify written language or emulate common gestures. Unfortunately, we didn’t stop there. At some point, plain-text emoticons evolved into image-based emoji. At first, it was the same set of winking-face, smiley-face etc., but then they kept adding and adding. And then some applications like Slack allowed people to create their own custom emoji, and people found this really fun, so they started using them all the time. They also allow for animated emoji, which I find to be very distracting, and which I have heard can actually be seizure-inducing for some neurodiverse people. When I first joined Automattic in 2015, I quickly got tired of all of the emoji in Slack. In particular, people were using them to replace words, such as “I’m going for a 🏃” (i.e. “I’m going for a run”). For me, it is much easier to read the word “run” than to squint at a tiny picture and try to guess what it means. At some point I realized that Slack allows you to turn off emoji, which substantially improved the quality of my work life.
A year ago or so, an emoji fan at work decided to add all the custom emojis people had added to the company Slack instance to our internal blogs (P2) as well. People don’t use them quite as often on P2s as on Slack, but then sure enough I happened upon this gem today, about what this person was planning to do this week:
More [ketchup bottle image], [telephone image] + [desktop calendar image]
I assume that this first emoji is a ketchup bottle, meaning to “catch up” on things they had missed, the 20th-century phone icon probably means having phone calls, and the little squarish item looks sort of like a desktop calendar of the kind that was popular 30 years ago, so I guess that might mean meetings?
As far as I know, there is no feature to turn these off for P2, so I started looking for a solution of my own. I realized that these are not Unicode emoji, but rather just small inline images. The person who added the feature was nice enough to add the custom names as alt text to the images, so I was able to write a user script[4] which replaces the images with their alt text, which gave me this:
More :ketchup:, :slack call: + :agenda:.
I find this much easier to read (though not exactly great prose). The programming language Perl is often lambasted as being a “write once, read never” language, because its syntax is often so terse. I find that text full of emoji is similar. The writer probably has a specific intention when using a peach emoji, but the reader is forced to guess from one of several possible interpretations (apparently the peach emoji is sometimes used to represent the human buttocks, because of a similar appearance). Just as using jargon and abbreviations obfuscates the meaning of a text, so do emoji. If one’s goal in writing something down is to efficiently and effectively convey meaning to the reader, emoji should be avoided.
Conclusion
I would like to thank Daniel Reeves, Harold Felty, and Ulrike Steindl for reading a draft of this essay. In particular, Ulrike pointed out the key difference in my emoji examples. Adding a winky face to a sentence serves as an intonational or pragmatic guide to the reader, while replacing content words such as nouns and verbs with pictures is a completely different sort of use of emoji. I posit that replacing text with emoji decreases language comprehension, while supplementing text with the occasional use of emoji as pragmatic markers increases language comprehension, with the following caveat:
If you find yourself punctuating every sentence with :-) to avoid misunderstandings, learn to write better.
Matthew Butterick, Practical Typography
Footnotes
* According to Betteridge’s law, the answer is no
[1] In practice, writing systems like Hanzi only use several thousand characters commonly, and many words are represented as a combination of two or more characters, similar to a syllabary.
[2] https://ourworldindata.org/literacy
[3] In parts of the eastern Mediterranean such as Cyprus, Turkey, Greece, and Bulgaria, nodding your head actually means no.
[4] For reference, here is the script I used:
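// Replace each custom emoji image with its name, taken from the alt text
document.querySelectorAll('img.emoji').forEach(item => {
  const text = ':' + item.alt + ':';
  const theParent = item.parentElement;
  // Use the style property: the parent may have no style attribute to overwrite
  theParent.style.fontFamily = 'monospace';
  // textContent inserts the name as plain text rather than parsing it as HTML
  theParent.textContent = text;
});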
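The same idea can be tried outside the browser. The following is only a rough sketch, assuming the emoji are plain img tags with an emoji class and a double-quoted alt attribute. The function name emojiImagesToText and the sample markup are made up for illustration, and matching HTML with a regular expression is a simplification; a real HTML parser would be more robust.

```javascript
// Hypothetical helper, not part of the user script above.
// Replaces <img> tags that carry the "emoji" class with their alt text
// wrapped in colons, e.g. <img class="emoji" alt="ketchup"> becomes :ketchup:
function emojiImagesToText(html) {
  // Match each <img ...> tag as a whole
  // (simplification: assumes no ">" inside attribute values)
  return html.replace(/<img\b[^>]*>/gi, tag => {
    const classMatch = tag.match(/\bclass\s*=\s*"([^"]*)"/i);
    const altMatch = tag.match(/\balt\s*=\s*"([^"]*)"/i);
    const isEmoji = classMatch !== null &&
      classMatch[1].split(/\s+/).includes('emoji');
    // Leave non-emoji images and images without alt text untouched
    if (!isEmoji || altMatch === null || altMatch[1] === '') return tag;
    return ':' + altMatch[1] + ':';
  });
}

// Made-up markup resembling the example from the essay
const sample = 'More <img class="emoji" alt="ketchup" src="k.png">, ' +
  '<img class="emoji" alt="slack call" src="c.png"> + ' +
  '<img class="emoji" alt="agenda" src="a.png">.';
console.log(emojiImagesToText(sample));
```

Working on a string rather than the live page makes the replacement easy to try in Node, but in the browser the DOM version above is the more natural fit.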
P.S. I could have made this essay much shorter and to the point, but I was inspired by my colleague Conrad Lee who gave a great talk about diversion, including a reference to the novel The Life and Opinions of Tristram Shandy, Gentleman, in which the main character does not appear until volume three, due to the narrator’s many diversions.