Obfuscation Techniques: Visually Similar Characters


One of the nastier tricks in our obfuscation toolbelt is the fact that text has a lot of visually similar characters. If we sprinkle enough of them around, we can end up with some tough-to-read code!


Similar Looking Characters

Previously, I had talked about the benefit of naming conventions and how abandoning them could allow for fun obfuscation techniques like shadowing built-in functions. Well, as it turns out, there are potentially far more effective techniques for rendering code practically unreadable. And, you’ll be surprised at just how easy they are to implement.

One such technique is to consider characters that are similar looking and use the hell out of them in your naming conventions. For example, capital ‘O’ and ‘0’ are very similar looking, so it would be possible to litter these throughout your variable and function names for maximum chaos.

In fact, there are a lot of characters that look similar. A particularly nasty pair is lowercase ‘l’ and ‘1’. Depending on the font, these can look identical. For example, you might have tried to learn the Unix/Linux command line where ‘ls’ is a common command. When I’ve shown this command to students, it’s common for them to think it says “1s” and not “ls”, leading to hilarious bugs.

As it turns out, there are also visually similar character combinations. For example, ‘d’ kind of looks like ‘cl’ at a glance. Meanwhile, ‘m’ looks like ‘rn’. And of course, any form of 1337 (i.e., leet) speak can work as lookalikes.

Here’s a quick table I put together of character lookalikes that you can abuse in your own code. For a more complete reference, I’d recommend the following source on homoglyph detection.

Character    Lookalike
l            1
O            0
d            cl
m            rn
b            6
S            5

Some of these might look immediately different, but you’ll never notice them when they’re buried in code. Or worse, as I’ll describe shortly, spammed!
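To make the table a bit more concrete, here’s a minimal sketch of how one might check whether two names collapse into the same thing once the lookalikes are undone. The names `LOOKALIKES`, `skeleton`, and `looks_like` are made up for this example, and the approach is a rough simplification rather than real homoglyph detection.

```python
# Lookalike pairs taken from the table above: (character, lookalike).
LOOKALIKES = [("l", "1"), ("O", "0"), ("d", "cl"), ("m", "rn"), ("b", "6"), ("S", "5")]


def skeleton(name: str) -> str:
    """Collapse every lookalike into the character it imitates."""
    # Single-character lookalikes are handled first, so "c1" turns into
    # "cl" before the two-character "cl" -> "d" pass runs.
    for char, lookalike in sorted(LOOKALIKES, key=lambda pair: len(pair[1])):
        name = name.replace(lookalike, char)
    return name


def looks_like(a: str, b: str) -> bool:
    """Report whether two names share the same skeleton."""
    return skeleton(a) == skeleton(b)


print(looks_like("rnodel", "model"))  # True
print(looks_like("c1ose", "dose"))    # True
print(looks_like("hello", "world"))   # False
```

Keep in mind that the skeleton is only useful for comparison. An innocent name like "class" also collapses (to "dass"), which is fine as long as both sides go through the same function.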

Spamming Similar Looking Characters

Whenever we talk about naming conventions, we usually look for ways to make our code more readable. As I stated before, by abandoning naming conventions, we can make our code much harder to read. To demonstrate this, let’s look at a function that prints “Hello, World!”:

def hello_world():
  print("Hello, World!")

Any time someone wants to use this function, they would call it by name:

if is_first_program:
  hello_world()

This is, of course, very easy to read. It’s practically in plain English. To make things much harder, we can rename our function using our similar looking characters:

def O00O00OO0O00OOOO00O0O():
  print("Hello, World!")

Now, it’s still fairly obvious what the function does when you read its definition, but it’s absolutely not obvious at the call site:

if is_first_program:
  O00O00OO0O00OOOO00O0O()

And guess what! We can do this to every single thing we can name, such as variables, functions, and classes.

if OO000OOOOO00O00O:
  O00O00OO0O00OOOO00O0O()

Good luck figuring out what this chunk of code does! Here’s the same code block using a different pair of lookalikes, ‘l’ and ‘1’:

if ll1l11l111lll1l1ll1l:
  lllll1l1llll1l1l1ll()

It’s an absolute masterpiece, isn’t it?
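Of course, nobody wants to type these names by hand. As a rough sketch of how the spam could be generated, the snippet below builds a rename table that maps readable names to random runs of lookalike characters. The helpers `spam_name` and `build_rename_table`, the default length of 20, and the fixed seed are all assumptions made for illustration. Applying the table to real source code is left out.

```python
import random


def spam_name(rng: random.Random, alphabet: str = "O0", length: int = 20) -> str:
    """Build one identifier out of the given lookalike characters."""
    # Python identifiers cannot start with a digit, so the first
    # character is drawn from the non-digit part of the alphabet.
    letters = [c for c in alphabet if not c.isdigit()]
    first = rng.choice(letters)
    rest = "".join(rng.choice(alphabet) for _ in range(length - 1))
    return first + rest


def build_rename_table(names, alphabet: str = "O0", seed: int = 0) -> dict:
    """Map each readable name to a unique spammed name."""
    rng = random.Random(seed)
    table = {}
    used = set()
    for name in names:
        candidate = spam_name(rng, alphabet)
        # Regenerate on the off chance that two names collide.
        while candidate in used:
            candidate = spam_name(rng, alphabet)
        used.add(candidate)
        table[name] = candidate
    return table


for original, spammed in build_rename_table(["hello_world", "is_first_program"]).items():
    print(f"{original} -> {spammed}")
```

Passing `alphabet="l1"` would give the second flavor shown above.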

Possible Limitations

One of the downsides of rewriting all your symbols with visually similar characters is that most IDEs have no trouble resolving these names for their users. A simple CTRL+CLICK on a function call will take the reader straight to the function definition.
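To see why tooling isn’t fooled, here’s a small sketch that uses Python’s built-in `ast` module to pair a call with its definition, which is a very rough approximation of what a go-to-definition feature does. The sample source reuses the names from earlier, and the `OO000OOOOO00O00O = True` assignment is an assumption added so the snippet is complete.

```python
import ast

# Obfuscated sample: the names are copied from the examples above.
source = '''
def O00O00OO0O00OOOO00O0O():
  print("Hello, World!")

OO000OOOOO00O00O = True
if OO000OOOOO00O00O:
  O00O00OO0O00OOOO00O0O()
'''

tree = ast.parse(source)

# Record the line on which each function is defined.
definitions = {
    node.name: node.lineno
    for node in ast.walk(tree)
    if isinstance(node, ast.FunctionDef)
}

# Record every call made through a plain name, along with its line.
calls = [
    (node.func.id, node.lineno)
    for node in ast.walk(tree)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
]

# Match calls to definitions by exact name, lookalikes and all.
for name, line in calls:
    if name in definitions:
        print(f"{name} called on line {line}, defined on line {definitions[name]}")
```

To a parser, two names are either the same string or they aren’t, so lookalikes only slow down the human.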

That said, this technique combined with a variety of other obfuscation techniques discussed in this series could easily push any source code into unreadable territory. Of course, as I always say, these techniques are more for fun, and I wouldn’t recommend actually employing them as a form of security.

With that said, let’s go ahead and call it for the day. If you found this article interesting, I’d encourage you to keep browsing the rest of the series.

And if you really want to take your support to a new level, we have a variety of ways to support the site. Otherwise, take care!

Obfuscation Techniques (6 Articles)—Series Navigation

In our field, everyone likes to talk about best practices, but it’s sometimes difficult to make the case for what is and isn’t a best practice. On the other hand, I think it’s very easy to come up with ways to make code worse, but who would want to read about ways to make their code bad? I’ll tell you who! People who want to obfuscate their code, or at least that’s what I tell myself. Regardless, that’s my cover story for putting together this new series, and I figure it’ll be a lot of fun.

Jeremy Grifski

Jeremy grew up in a small town where he enjoyed playing soccer and video games, practicing taekwondo, and trading Pokémon cards. Once out of the nest, he pursued a Bachelors in Computer Engineering with a minor in Game Design. After college, he spent about two years writing software for a major engineering company. Then, he earned a master's in Computer Science and Engineering. Today, he pursues a PhD in Engineering Education in order to ultimately land a teaching gig. In his spare time, Jeremy enjoys spending time with his wife, playing Overwatch and Phantasy Star Online 2, practicing trombone, watching Penguins hockey, and traveling the world.
