While considering the various obfuscation techniques, I was thinking about the role comments play in readability. After all, we’ve all heard that writing comments is a best practice, yet it’s very easy to comment poorly. So, what if we actually weaponized our comments against our readers? That seems like an effective obfuscation technique.
Why We Comment Code
When it comes to writing code, the truth is that code is not self-documenting, no matter how hard we try. For instance, we can try to write extremely explicit code by taking care to use good naming conventions, as in the following example from a thread on Stack Overflow:
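float computeDisplacement(float timeInSeconds) {
    const float gravitationalForce = 9.81f;
    // 0.5f avoids the integer division in (1 / 2), and the square is written out because ^ is XOR in C-like languages
    float displacement = 0.5f * gravitationalForce * (timeInSeconds * timeInSeconds);
    return displacement;
}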
But even with a well-written function like this, there are limitations. For example, from a design by contract perspective, there might be a precondition that we care about which is not obvious from the code snippet. Off the top of my head, I would imagine that not all inputs are valid: a negative time makes no physical sense, and a large enough time will overflow the float.
In some languages, we could potentially identify preconditions by adding asserts or including guard clauses. However, we create a new problem. Why do those preconditions exist? That’s where writing comments becomes important. We use them to signal intent behind a segment of code to our readers.
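As a rough sketch of that idea, here is how a guard clause and an intent comment might look in a Python port of the function above. The names `GRAVITY` and `compute_displacement`, and the choice of `ValueError`, are my own assumptions for illustration rather than anything from the original thread.

```python
import math

GRAVITY = 9.81  # m/s^2, the same constant used in the example above


def compute_displacement(time_in_seconds: float) -> float:
    # Guard clause: a negative or non-finite time has no physical meaning here,
    # yet the formula would still return a plausible-looking number for a
    # negative input, so we reject bad input up front.
    if not math.isfinite(time_in_seconds) or time_in_seconds < 0:
        raise ValueError("time_in_seconds must be a finite, non-negative number")
    return 0.5 * GRAVITY * time_in_seconds ** 2
```

The guard clause tells the reader what the precondition is, while the comment above it explains why the precondition exists.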
Therefore, from an obfuscation perspective, we can undermine the purpose of comments to our advantage. On one hand, we can leave all comments out and let the user stumble into the variety of pitfalls related to “self-documenting code” (e.g., overflowing the displacement calculation). On the other hand, we could use comments to send all the wrong signals to the reader. In this article, we’ll take a look at the latter.
Writing Malicious Comments
To write truly evil comments, you have to get in the mind of the reader. In other words, what types of things do you think someone reading code cares about? In the remainder of this section, we’ll look at a couple of nasty commenting techniques.
Misleading Documentation
Personally, most of the comments I write are just documentation (e.g., a list of parameters with descriptions, a description of the return value, etc.). Therefore, one obfuscation strategy would be to intentionally mislead the reader into thinking a function does something that it does not. A silly example from one of my own APIs would be something like this:
def add_code(code: str, lang: str):
    """
    A convenience method which adds a code block to the document.

    :param code: the text for the code block, will be autoformatted
    :param lang: the spoken language of the user
    """
As you can probably imagine from the docs, the purpose of this method is to add a code block to a document, which happens to be Markdown. If you’re familiar with Markdown, then you might be familiar with the fenced code block syntax. In this syntax, all you need is the code you want in a code block and the programming language to feed to the syntax highlighter.
Of course, somewhat comically, I have actually lied in my docs above about the purpose of those parameters. For example, the code you pass in should be preformatted; there is no autoformatting. Likewise, I don’t think it matters what language you or anyone else speaks. However, what does matter is the programming language to signal to the syntax highlighter.
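For contrast, here is a minimal sketch of what honest documentation for such a function might look like. To be clear, `fenced_code_block` is a hypothetical standalone helper that I made up for illustration; it is not the actual implementation behind `add_code`, and it simply assumes the standard Markdown fenced code block syntax.

```python
# Three backticks, built this way so the snippet can itself sit inside a fence
FENCE = "`" * 3


def fenced_code_block(code: str, lang: str) -> str:
    """
    Builds a Markdown fenced code block (hypothetical helper for illustration).

    :param code: the preformatted text for the code block; it is inserted as-is
    :param lang: the programming language tag to pass to the syntax highlighter
    :return: the fenced code block as a single string
    """
    return f"{FENCE}{lang}\n{code}\n{FENCE}"


print(fenced_code_block("print('Hello, World!')", "python"))
```

Compare the two parameter descriptions with the lies above: the truthful versions are barely longer, which is part of what makes the misleading ones so easy to slip in.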
Irrational Intent
As I mentioned previously, coders don’t usually care how a code segment works. They can typically figure that out by looking at it (or the docs). Instead, they often care more about the intent behind the design. Therefore, we can take advantage of the reader by lying about why certain segments of our code exist.
Something I thought of quickly would be to include a code segment that does nothing but leave a comment signaling that it has tremendous value. For example, you might have seen this hilarious comment before:
# you may think that this function
# is obsolete, and doesn't seem to do
# anything. and you would be correct.
# but when we remove this function
# for some reason the whole program
# crashes and we cant figure out why,
# so here it will stay
The thread from above also references many real-world examples, such as:
// if you delete this it will stop working in IE8
while (false) {}
However, in my mind, that requires you to go beyond commenting and actually include some useless code. Therefore, the comments really should be lying about the intent behind useful code. For example, after digging through some of my own code, I found the following private method:
def _get_indent_size(self, item_index: int = -1) -> int:
    """
    Returns the number of spaces that any sublists should be indented.

    :param int item_index: the index of the item to check
        (only used for ordered lists); defaults to -1
    :return: the number of spaces
    """
    if not self._ordered:
        return 2
    # Ordered items vary in length, so we adjust the result based on the index
    return 2 + len(str(item_index))
The purpose of this method is to give us an integer representing the indentation from the current list to a sublist. In this case, that indentation defaults to two spaces. However, when using an ordered list (e.g., 1, 2, 3, …, 11, 12, 13), the width of the ordered portion is dynamic. After all, the number “1” takes up one space while the number “14” takes up two spaces. As you can see, the intent behind the design is clearly documented by the following comment:
# Ordered items vary in length, so we adjust the result based on the index
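To make the arithmetic concrete, here is a standalone sketch of the same rule. The function `indent_size` and its `ordered` flag are stand-ins I introduced so the snippet runs without the surrounding class; the logic is copied from the method above.

```python
def indent_size(ordered: bool, item_index: int = -1) -> int:
    """Standalone stand-in for _get_indent_size (hypothetical; no class state)."""
    if not ordered:
        return 2
    # The marker "14" is one character wider than "1", so its sublist
    # has to shift one more space to the right
    return 2 + len(str(item_index))


for index in (1, 14, 100):
    print(index, indent_size(True, index))
```

For an ordered list, index 1 yields an indent of 3 spaces, index 14 yields 4, and index 100 yields 5, while an unordered list always yields 2.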
So, how could we go about lying here? My first thought was to rewrite the comment to just lie about what the code does:
# Calculate the indentation as a function of the pixel width of the number
But this doesn’t really describe intent. So, what if we keep the idea of pixel width but change the original comment slightly:
# Numbers have different pixel widths, so we adjust the result based on the index
Now, rather than the comment pointing to the width of the indent in terms of spaces, the reader is going to think we calculate indentation as a function of pixel widths. It’s subtle, but it’s sure to confuse the reader quite a bit.
Useless Comments
Finally, a silly obfuscation technique is to just litter the code with useless comments. You got an if statement? Tell everyone that it’s an if statement. Created a variable? Let everyone know you created a variable!
// Creates an integer that stores 7
int x = 7;

// An if statement
if (condition) {
    ...
}

// Adds one to i
i = i + 1;
Surely, you can take this to an extreme and make your code completely unreadable.
Drawbacks of Relying on Comments for Obfuscation
One of the downsides of using comments to aid in obfuscation is that compilers and similar tools typically delete comments by default. So a trick someone might use to remove all of the comments is to compile your program and then decompile it. All of the useful naming conventions will most likely be gone but so will the comments.
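You don’t even need a full compile and decompile cycle to see this effect. As a small sketch, Python’s own `ast` module discards comments when it parses source code, so a round trip through it produces a comment-free copy. This assumes Python 3.9 or later for `ast.unparse`, and the `strip_comments` helper and sample source are my own inventions for illustration.

```python
import ast

source = '''
def add_one(i):
    # Subtracts one from i (a deliberately misleading comment)
    return i + 1
'''


def strip_comments(code: str) -> str:
    """Round-trips code through the AST, which never stores comments."""
    return ast.unparse(ast.parse(code))


cleaned = strip_comments(source)
print(cleaned)
```

The printed function still returns `i + 1`, but the lying comment is gone. Note that docstrings are string literals rather than comments, so they survive this particular round trip.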
With that said, this series is mostly for fun, and I wouldn’t recommend putting much stock in obfuscation anyway. If you enjoyed it, there’s a lot more where that came from, such as the following:
- Obfuscation Techniques: The Yoda Conditional
- Rock Paper Scissors Code Golf
- How to Obfuscate Code in Python: A Thought Experiment
As always, you can take your support even further by heading over to my list of ways to grow the site. Otherwise, have a good one!