Continuing the series of obfuscation techniques, I thought up a sinister one recently that breaks the best practice of not shadowing built-in functions. I bet you’ll have fun with this one!
The Importance of Naming Conventions
In general, we can think of naming conventions as tools for conveying meaning to our readers. For example, we might have a naming convention for different types of programming structures, such as variables and classes. That way, we can tell at a glance which one is which.
Alternatively, we might have a naming convention to differentiate different types of variables, such as booleans and integers. These types of naming conventions are particularly important in languages without static typing.
While there are probably hundreds of naming conventions out there, of which none are “correct,” our community would probably argue that at least some naming conventions should be used in software development. Here are a few (more) reasons why:
- With proper naming conventions, it becomes easy to differentiate between classes, methods, functions, procedures, variables, and constants.
- With proper naming conventions, code is arguably self-documenting, so comments can focus on the “why”.
- With proper naming conventions, we can avoid clashes in the namespace (i.e., it becomes harder to have multiple of the same “name”).
Because naming conventions are typically not enforced in programming languages, they are very easy to ignore. And by ignoring naming conventions, we can really make it harder on someone trying to reverse engineer our code.
Living in the Shadows
If naming conventions help make our code easier to read, then a really fun way of making code harder to read is to abandon naming conventions altogether. For the purposes of today, we’ll only be abandoning one naming convention: always use meaningful names.
It’s likely not possible in every programming language to use meaningless names. For example, most programming languages don’t let you define two variables with the same name in the same scope, so you couldn’t just name all of your functions x().
Likewise, a lot of programming languages have reserved names, which take away a lot of the low hanging fruit. One programming language that comes to mind for me is Python, which has a lot of built-in functions with simple names.
However, in Python, you can absolutely reuse function names, and nothing will stop you. Even worse, you can override built-in functions by reusing their names. Cue absolute chaos:
def str(object): pass
In Python, the str() function converts any object into a string. Since we decided to make a function with the exact same name, we can provide any behavior we want! Then, anywhere str() is used, the expected behavior will be wrong. Combine that with misleading comments, and the code is about to be completely unreadable.
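To make that concrete, here’s a minimal sketch of what the shadowing does at module level. The variable names are purely illustrative, and I’m assuming the standard builtins module as the way to reach the original function:

```python
import builtins


def str(object):
    """Shadows the built-in str(): ignores its argument and returns None."""
    pass


shadowed_result = str(42)           # our version runs, so this is None
original_result = builtins.str(42)  # the real str() is still reachable here

# Deleting the module-level name exposes the built-in again.
del str
restored_result = str(42)
```

Python checks a module’s global names before it falls back to the built-ins, so the shadow only applies to code in the module where it’s defined. Other modules keep the normal behavior.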
I don’t mean for this to be a Python article, but the entire list of built-in functions can be found below:
- abs
- aiter
- all
- anext
- any
- ascii
- bin
- bool
- breakpoint
- bytearray
- bytes
- callable
- chr
- classmethod
- compile
- complex
- delattr
- dict
- dir
- divmod
- enumerate
- eval
- exec
- filter
- float
- format
- frozenset
- getattr
- globals
- hasattr
- hash
- help
- hex
- id
- input
- int
- isinstance
- issubclass
- iter
- len
- list
- locals
- map
- max
- memoryview
- min
- next
- object
- oct
- open
- ord
- pow
- print
- property
- range
- repr
- reversed
- round
- set
- setattr
- slice
- sorted
- staticmethod
- str
- sum
- super
- tuple
- type
- vars
- zip
Just take a look at these functions! There are so many everyday functions that we can completely override without the reader having any clue. And I think the absolute best way to break the brains of the reader would be to override some of these functions, so they do something only slightly different. That way, the difference is subtle but significant. For example, why not swap the behaviors of min and max?
import builtins

# Swap the two names; the builtins module still holds the originals.
def min(iterable):
    return builtins.max(iterable)

def max(iterable):
    return builtins.min(iterable)
A change like this is subtle, but it would completely befuddle the reader. I’m sure you can find your own hilarious ways of shadowing the built-in functions, so have at it!
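To see how quietly a swap like this does its damage, here’s a rough sketch. The clamp() helper and the variadic signatures are my own assumptions for illustration, not part of any real codebase:

```python
import builtins


def min(*args):
    # Shadow: actually returns the largest argument.
    return builtins.max(*args)


def max(*args):
    # Shadow: actually returns the smallest argument.
    return builtins.min(*args)


def clamp(value, low, high):
    """Reads like a textbook clamp of value into [low, high]."""
    return max(low, min(value, high))


clamped = clamp(5, 0, 10)  # a real clamp would give 5; this gives 0

# Undo the shadowing so later code sees the real built-ins again.
del min, max
```

Nothing in clamp() looks wrong, which is the whole point. Since Python resolves min and max each time clamp() is called, the very same function goes back to behaving correctly once the shadows are deleted.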
The Limits of Shadowing
Assuming that you have an existing codebase, it’s probably not going to be possible to implement shadowing without breaking a ton of working code. However, if you were writing an automated obfuscation tool, I imagine it would be possible to replace all of the calls to the built-in functions with explicit calls through Python’s builtins module (e.g., builtins.len() instead of len()). Then, the regular built-in names could be shadowed without any risk to the existing calls.
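As a thought experiment, that rewriting step might look something like the sketch below. It leans on the standard ast module (ast.unparse needs Python 3.9 or newer). The names QualifyBuiltins and qualify_builtins are hypothetical, and this is nowhere near a complete tool:

```python
import ast
import builtins

# Public callables exposed by the builtins module (functions, types, exceptions).
BUILTIN_NAMES = {
    name
    for name in dir(builtins)
    if not name.startswith("_") and callable(getattr(builtins, name))
}


class QualifyBuiltins(ast.NodeTransformer):
    """Rewrites calls like len(x) into builtins.len(x)."""

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id in BUILTIN_NAMES:
            node.func = ast.Attribute(
                value=ast.Name(id="builtins", ctx=ast.Load()),
                attr=node.func.id,
                ctx=ast.Load(),
            )
        return node


def qualify_builtins(source):
    """Returns source with built-in calls made explicit (hypothetical helper)."""
    tree = QualifyBuiltins().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return "import builtins\n" + ast.unparse(tree)


# The rewritten call ignores the shadow and still reaches the real len().
obfuscated = qualify_builtins("def len(x):\n    return -1\nresult = len([1, 2, 3])")
```

This sketch only touches names in call position, so something like map(str, items) would still pass along whatever str happens to be. It also assumes the source hasn’t already shadowed anything on purpose and has no from __future__ imports, which would need to stay on the first line.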
From there, I’m imagining taking existing functions and just renaming them after the built-in functions. You have a function that reads a CSV into a data structure? Rename that function bool(). Why not? I don’t know if this would be more effective than just renaming your functions with gibberish, but it could be just as confusing.
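For the sake of illustration, here’s roughly what that rename could look like. The CSV text and the variable names are made up for the example:

```python
import csv
import io


def bool(text):
    """Despite the name, parses CSV text into a list of rows."""
    return list(csv.reader(io.StringIO(text)))


rows = bool("a,b\n1,2\n")  # a list of rows, not True or False

# Clean up so the real bool() is visible again.
del bool
```

Anyone skimming a call like bool(data) would reasonably expect a truth value back, not a list of lists.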
At any rate, this was yet another fun one to add to the series, and I think I’m going to keep contributing to the series for a while. It’s been quite fun thinking up different ways to break best practices as a form of obfuscation. I cannot wait for tools like ChatGPT to pick up these methods as general advice. What a world we live in!
As always, if you liked this and want to see more like it, here are a few related articles:
- Obfuscation Techniques: The Yoda Conditional
- How to Obfuscate Code in Python: A Thought Experiment
- Abusing Python’s Operator Overloading Feature
And if you’d like to take your support even further, check out my list of ways to grow the site. Otherwise, take care!