Posted 06.12.2024

[Disclaimer: This post includes a shameless plug for my VSCode theme that (obviously) represents my views]

Introduction

Nothing beats coding with a perfectly tweaked editor, with every single shortcut, behavior and visual element painstakingly customized to your exact taste and needs.

But getting to that point involves a lot of experimentation, trying out different plugins, themes and tools. The appearance and highlighting of the code itself is naturally a major aspect of creating your perfect coding experience, and there are a lot of choices.

So imagine my surprise when, while looking up some general approaches and techniques used for syntax highlighting, I stumbled upon a number of articles and websites talking about how programmers should turn off syntax highlighting entirely, claiming that it’s unneeded or somehow even a bad idea.

I was honestly not expecting anyone to have such an extreme stance on something so universal, but then again, I also firmly believe that “everyone is doing it so it has to be great” is rubbish reasoning and that it’s always worth checking out attitudes outside my comfort zone.

So I considered the arguments. And I tried out programming without any highlighting for a bit.^[1] In the end, I was not convinced. But I do think I got a good idea of where many of the criticisms stem from and why others might prefer it. And this alone is still worthy of discussion.

Not all Syntax Highlighting is equal.

Technology is always evolving. This means that when criticizing technology, we should be careful of the pitfall that is criticizing an already outdated standard.

The articles I read were not exactly ancient, but old enough that many of their examples for “poor” syntax highlighting are (by now) not actually as common anymore. When I talk about Syntax Highlighting, I am talking about a fully featured modern approach, so for clarification let’s take a step back and look at how this particular technology has changed over time.

The Dark Ages

In the early days, the idea of syntax highlighting mainly centered around the approach of colouring specific keywords. Like emphasizing “var” or “function” in case you somehow forget that they create variables and functions. I fully agree that this mostly missed the point.

As a programmer you don’t need “help” to understand the grammar of the language, which is static and is generally parsed passively after a while. Instead, it would be far more useful to know what hides behind all the identifiers that you and your fellow programmers define during programming.^[2]

Syntax highlighting may also play some inference games like flagging anything as a function if you call it with parentheses. That’s cute for reading code but doesn’t help you while programming as you already need to know if something is a function before you would try invoking it.
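To make the pattern-matching approach concrete, here is a minimal sketch in TypeScript. It is not how any particular editor works; `naiveHighlight`, the token kinds and the tiny keyword list are all made up for illustration.

```typescript
// Hypothetical token kinds; real editors use their own scope names.
type NaiveTokenKind = "keyword" | "function" | "identifier";

interface NaiveToken {
  text: string;
  kind: NaiveTokenKind;
}

// A deliberately tiny keyword list, for illustration only.
const KEYWORDS = new Set(["var", "let", "const", "function", "return", "if", "else"]);

// Classify every identifier purely by text patterns,
// without any knowledge of how the name was declared.
function naiveHighlight(source: string): NaiveToken[] {
  const tokens: NaiveToken[] = [];
  const identifier = /[A-Za-z_$][A-Za-z0-9_$]*/g;
  let match: RegExpExecArray | null;
  while ((match = identifier.exec(source)) !== null) {
    const text = match[0];
    const rest = source.slice(match.index + text.length);
    let kind: NaiveTokenKind = "identifier";
    if (KEYWORDS.has(text)) {
      kind = "keyword";
    } else if (/^\s*\(/.test(rest)) {
      // The "inference game": parentheses follow, so guess "function".
      kind = "function";
    }
    tokens.push({ text, kind });
  }
  return tokens;
}

const sample = "const handler = flipPage; handler();";
for (const token of naiveHighlight(sample)) {
  console.log(`${token.text}: ${token.kind}`);
}
```

For this input, `flipPage` is labeled a plain identifier because it is only referenced, while `handler` gets two different labels depending on whether parentheses follow it. The colour reflects how a name is written at that spot, not what it actually is.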

While it’s clear that this approach is suboptimal, note that the criticism of keyword/token-based highlighting being mostly superfluous is a criticism of the lack of improvement provided, not of an active drawback. The difference between something that doesn’t help and something that causes a problem is significant, because the former suggests that the issue is one of execution rather than an entirely misguided concept. In other words: It doesn’t sound to me like the solution is no or less syntax highlighting. It sounds like the solution is better syntax highlighting.

And as we’ll see, technology is getting better.

From Syntax to Semantics

The major difference that sets modern approaches to syntax highlighting apart from their naive origins is the concept of Semantic Highlighting, which relies on a language server parsing the actual code instead of only matching text patterns. This offers better insights, as it enables the editor to “remember” additional information, for example how a token was initialized. Semantic highlighting can tell you if a property of an object is read-only, if a variable name refers to a function and if that function is asynchronous or read-only, and so on. It can do so at any point in the code, allowing colors to convey information that might not be readily apparent from reading that particular portion of code. This can be especially helpful in large projects.

Maybe bringing up syntax highlighting that highlights more than syntax is moving the goalposts. But this more elaborate approach is also what many editors and IDEs are already utilizing.

Note how in the above example the semantic highlighting on the right is keeping track of the exact type of the variables, as well as modifiers such as readonly.
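The same idea can be sketched in text form. The snippet below is a rough TypeScript illustration under stated assumptions, not an actual language server: the symbol table is hard-coded and hypothetical, and `semanticLabel` is an invented helper. The type and modifier names are borrowed from the standard semantic token names that VS Code documents.

```typescript
// A few token types and modifiers, named after standard VS Code semantic tokens.
type SemanticType = "variable" | "function" | "property";
type SemanticModifier = "readonly" | "async";

interface SymbolInfo {
  type: SemanticType;
  modifiers: SemanticModifier[];
}

// Hypothetical symbol table. A real language server would derive this
// from the declarations, which in this sketch are assumed to be:
//   const limit = 10;
//   async function fetchPage() { /* ... */ }
//   let cursor = 0;
const symbols = new Map<string, SymbolInfo>([
  ["limit", { type: "variable", modifiers: ["readonly"] }],
  ["fetchPage", { type: "function", modifiers: ["async"] }],
  ["cursor", { type: "variable", modifiers: [] }],
]);

// Build a selector-like label for a name, no matter where it is used.
function semanticLabel(name: string): string {
  const info = symbols.get(name);
  if (info === undefined) {
    return "unknown";
  }
  return [info.type, ...info.modifiers].join(".");
}

// Imagine a use site far away from the declarations:
//   const next = fetchPage;
// There are no parentheses here, so a pattern matcher learns nothing.
console.log(semanticLabel("fetchPage")); // function.async
console.log(semanticLabel("limit"));     // variable.readonly
console.log(semanticLabel("cursor"));    // variable
```

The lookup depends only on what was recorded at the declaration, so the label (and therefore the colour) stays the same at every use site, with or without parentheses.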

Dismantling the Training Wheels

Now we’re getting into the more abstract arguments which highlight (pun intended) some interesting attitudes about coding in general. One of them is an analogy that I have heard multiple times while researching the criticism leveled at Syntax Highlighting: It involves likening the functionality to training wheels on a bicycle. This always bugged me.

Analogies have a tendency to come with baggage. They will always have some aspect in which their implications diverge from what they’re describing and this becomes a problem when someone starts to attribute properties to a concept which only apply to the metaphorical part of the comparison. In this particular example that baggage gets outright weaponized.

“With training wheels on your bike you don’t have to think about the balance of your bike. With Syntax highlighting in your code, you don’t have to think as much about the contents of your code,” that’s how the comparison goes. Fair enough. Now the argument continues: “But we all know that training wheels are terribly impractical and cumbersome and we teach kids to ditch them as soon as they can. So programmers should do the same.”

The issue is, of course, that only the beneficial parallel, the removal of a cognitive load, has been established between the two concepts. The negative ones are nebulously implied by association.

And even when other more reasonable arguments are presented afterwards, the groundwork set by the initial analogy seems manipulative. In particular, the term “training wheels” automatically carries an association of childishness, which in turn creates a reluctance to take an opposing position that has been framed as defending something inherently immature.

But let’s forget sneaky implications and societal expectations and think about the facts of the matter: Why exactly are training wheels considered bad? Off the top of my head, I can think of a number of reasons. The weight is distributed differently, the maneuverability is much worse since you can’t really lean into corners, and they also introduce additional friction impacting braking and accelerating, besides being generally worse in rough terrain. So yes, training wheels on bikes are a liability for experienced bicycle riders… but what are the development equivalents of those weaknesses when it comes to code?

This is where the misplacement takes place since obviously highlighting doesn’t impact the capability of the code itself in any way.^[3]

Are there, for example, languages that compile differently or perform worse when syntax highlighting is applied to them?

At this point there could be an accusation of intentional misinterpretation: clearly, the analogy was not aimed at code capabilities but at developer capabilities, that is, at the way programmers who employ syntax highlighting presumably think about the code because they don’t need to think “as much” about it. But once again, I have never seen any real examples of which processes could actually be inhibited.

Are there any programming styles or techniques that programmers using more syntax highlighting just inherently can’t get their head around or can’t apply efficiently? Functional programming? Generator functions? Dependency injection? Is there anything besides anecdotal proof?

Another argument is an anti-complexity one. Similar to advocating “self documenting” code, the premise is that anything that “requires” external help to parse is inelegant and over-engineered. You should just know what everything in your code is and does and if you don’t know just by looking at it, your code sucks and you should rewrite. I’m not entirely opposed to this view but keeping things simple only goes so far once the size of a codebase grows.

It is also a criticism that could be leveled at basically any tool that fetches information from other parts of the project like autocompletion, skip-to-definition functionality, linting, even project-wide type checking in general. There are certainly those who take the argument to that extreme, developers who prefer to work entirely with the simplest of text editors without additional features.

However, this is a philosophy that is at odds not just with the specific topic of syntax highlighting, but with the entirety of the usual modern programming approach of utilizing intelligent code editors and IDEs, a sort of programming primitivism. And that’s a bigger discussion altogether.

On one hand, we could do with less complexity. On the other hand, with the right tools, some complexity can be automated to a degree that it no longer even appears complex to the programmer. Sometimes this approach results in bloat. But of all productivity tools, syntax highlighting is among the least invasive, so it feels like the wrong place to start optimizing.

Who’s the Target Audience?

Something else that tends to sneak in when people discuss syntax highlighting like some “training wheel” or another sort of crutch, is the idea that it’s a tool designed to help beginners.

From this perspective it is very clearly flawed. The practical issues of understanding code that most novice coders experience, such as idiosyncratic language features, untangling control flow or the general architecture of a complex app are all mostly outside the scope of syntax or even basic semantics. It’s not going to help you to know whether a function is async or not if you don’t already understand the difference in functionality.

The other hidden implication of the “beginner” argument is that experience will eventually offset the advantage that highlighting provided at first, and at that point it would have nothing to add to your understanding of the code, becoming pointless.

From these two premises, we can see how some form a bleak conclusion about syntax highlighting: It doesn’t actually help beginners, and if you are not a beginner you don’t need it.

But that is only true if we accept both premises. I don’t believe that beginners are the main “audience” of syntax highlighting because I also don’t believe that it’s aiming to compensate for a lack of knowledge that experienced developers have.

We should not think of such tools as bridging a gap from one skill level to another, but rather as compounding whatever skill level you already possess. It might be true that with increasing experience the load that is shifted onto the highlighting lessens, but would that truly lead to a complete neutralization of benefits? From the same premises, you can also make the argument that, when more information is directly present, the more you know, the better you can make use of that information.

Aside About Natural Language and Vocabulary

There was at least one article which included a quip showing some English sentences with their different kinds of words (verbs, nouns, adjectives and so on) all coloured differently, and observed that this was unnecessary and didn’t help in understanding those sentences any better. It was an elegant comparison, which is why it’s worth addressing why it doesn’t stand up to scrutiny: What is ignored here is that natural languages and constructed/formal languages are completely different beasts.

Because of the complexity and ambiguity of natural language, as well as the cultural aspects involved in learning it, it’s interpreted and absorbed very differently than a formal language. It’s true that distinct words have distinct meanings, but to understand the overall meaning we have to (and naturally do) read between the lines instead of relying on the specific grammatical classification of a word. The language itself is a construct that is defined culturally. But formal languages like programming languages act more as a pure framework: essentially, the core of the language is small and strictly defined, and different projects construct an individual vocabulary rather than adding to the language in general. Both of these factors, strictness of design and high variation between projects, keep highlighting relevant.

The Dreaded Information Overload

About the most grounded claim I have heard as a counterargument against colorful highlighting is that “too much” information leads to confusion and diminished focus.

I can see how particularly bright and saturated color schemes could have this effect on someone, but at the same time, this isn’t a problem that I experienced. So how to proceed? Arguing that something is a matter of preference is fine, but it’s not exactly enlightening on its own. And I suspect that dealing with information density is something that goes further than a simple like or dislike. So let’s talk about it.

Now as you might have guessed, I’m on team maximalism. I think more information is better and most utilizations of semantic highlighting still don’t go far enough.^[4]

I want as much information as I can have as soon as I can have it, and that includes getting it in the form of colours. Sure, different autocomplete and code analysis tools besides direct highlighting could tell me all I need to know about a variable or other token. But why should I be forced to put in the extra “work” of hovering over a piece of code for a tooltip or triggering the autocompletion when that info could already be there? It doesn’t unnerve or confuse me when there’s information on the screen that I don’t need now. If I really don’t need it, I can just not focus on it.^[5]

To elaborate on why this works for me, let me construct an analogy of my own, the analogy of the ketchup on the cluttered dinner table.^[6]

When there is a lot of stuff on the table, I just don’t care. Everything on it just merges together into the collective fact of “There is stuff on the table”. No details. If you ask me just a minute after I left the table about what was actually on the table, I couldn’t tell you. I’ll remember the things I used, like a salt shaker,^[7] but if there was for example a bottle of ketchup on the table and I didn’t use it, I won’t know that it was even there. Same for anything else. Was there mayo? Was there butter? Who knows. To me there was only “stuff”. Even while I’m sitting at the table I might not notice that the ketchup is right in front of my face until I specifically start looking for it.

I realize that this might be a somewhat extreme case, but filtering and grouping stimuli into easily “digestible” units is what everyone’s brain does to some extent. You don’t register every hair on someone’s head individually, the hair you see is already a generalization.

From my armchair psychology standpoint, it seems like different people have different thresholds at which they naturally generalize information. A natural default resolution if you will. Anything more granular requires not exactly effort but intentional focus. I’ll consciously look for the ketchup to separate it from the other clutter on the table, just like I’d intentionally look for the meaning of a specific color to separate it from the general abstraction of “there is colored text”.

This might sound like a handicap so far, and it certainly has its downsides,^[8] but there’s an obvious trade-off to having a “higher resolution” when it comes to information density: There are more distinct pieces of information to keep track of at once, requiring more effort.

To such a “high res” person^[9] the additional information can seem annoying or even intrusive. To them, the action that requires focus is no longer the act of homing in on the 30% of information that you need at the time, but rather actively and constantly shutting out the 70% that you don’t need.

I’m not saying that this type of perception should be seen as bad or OCD or anything like that. I wouldn’t even be surprised if it was actually more common than my personal, heavily “compressed” way of perceiving details. What I am saying is that, clearly, there’s a spectrum here and many articles touting some “objectively” best approach don’t seem to acknowledge that people could fall on any point on said spectrum. Or they might realize that those differences exist but claim that it’s less grating to deal with much less information than you are capable of processing than to deal with more.

Yet, even if that was the case, I doubt this would justify creating a development experience with minimum information density in mind. That’s “lowest common denominator” thinking, and especially in a field where highly personal customization is abundant, this seems pointlessly restrictive. Why let precious mental bandwidth go to waste because of an issue that might not even affect you?

Conclusion

Subjectivity is often invoked as a thought-terminating cliché to discredit attempts at criticizing or overhauling any long-standing practice. This is not what I want to accomplish here. I simply feel that it’s worth having a variety of solutions for a variety of people.

And if we understand what causes the difference in views, we can also understand along which lines we can implement improvements to the status quo.

For me, this meant realizing that I could deal with many more colours than most color schemes provide and that it was worth maximizing the amount of information that highlighting can offer.

But for some, turning it down or off might very well be the right choice.

^[1] There is a nice VSCode extension for it called Syntax Off. While (spoiler, I guess) I disagree with the no highlighting sentiment, it's still worth trying out… if only to see how much you end up hating it.

^[2] Obviously, naming conventions can go a long way here. I would expect “flipPage” or anything that sounds like an action to be a function and “isInteger” to be a boolean (or at least something that returns one), as well as class names being capitalized for example, but there are so many factors that could potentially be relevant that “encoding” all of them in the names themselves would become very cumbersome.^[a]

^[3] There are tools for which the training wheel analogy actually makes sense, such as WYSIWYG Editors. I remember Dreamweaver CS4^[b] injecting hideous amounts of bloated JS and generating just kind-of-working ugly HTML. It was something that actually introduced an additional layer of abstraction with inherent weaknesses that you couldn't overcome without getting past the whole visual approach.

^[4] [Shameless plug incoming!] Because of this, I have designed, and am still maintaining, a Visual Studio Code colour theme called Semantic Rainbow that aims to provide distinct highlighting for any and all semantic tokens and modifier combinations.^[c]

^[5] I also take this stance with UIs in general, which I guess could be its own little rant. In short, the compulsion to make everything minimalist and “clean” and squirrel all functionality away into weird subdivided menus makes the user experience unnecessarily complicated compared to just showing the tools. Here, we're only talking about colours, which do not even take up more space on the screen.

^[6] I sincerely hope to avoid the manipulativeness of the “Training Wheels” metaphor, as most people have dinner tables, and people of all ages can like ketchup.^[d]

^[7] This is purely hypothetical. In reality, I pretty much never put extra salt on my food.

^[8] I feel like perceiving information in too large chunks can be especially detrimental for debugging. Sometimes, an obvious error slips by because it's located in a piece of code that I (unintentionally) skim rather than re-parse whenever I look at it because of the intuitive impression that I already know what's there.

^[9] If you must know, my primary reference here is actually my wife.

^[a] At this point, we can also ask ourselves if remembering a very detailed naming convention to determine the token type is any more efficient than remembering a colour pattern.

^[b] Perhaps the newer versions are better at this, but nothing could make me go back.

^[c] It's the colourful theme used in the header image.

^[d] If you disagree, feel free to mentally replace “ketchup” with a condiment of your choice.

Modal
Marginalia

Syntax Highlighting and Focus
