Software development is a weird thing that comes with strange rules that seemingly change all the time. This can be particularly frustrating to new folks who learn rules like the use of == to check for equality in Java. Turns out, it’s not really that simple.
Understanding the == Operator
Typically, when educators first introduce equality in Java, we refer to the primitive types. For instance, if we want to check the equality between two numbers, we teach students to use the == operator as follows:
5 == 6; // false
4 == 4; // true
And of course, students very quickly internalize == as the equals operator.
Unfortunately, this very quickly breeds misconceptions and bad habits. For example, a quick trap that students run into is the use of == with doubles. Any idea what the following code returns?
.1 + .2 == .3;
Intuition argues that the result is true since the math checks out. Unfortunately, doubles don’t adhere to our brand of math, so they sometimes result in nasty rounding errors. As a result, this expression returns false.
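One common way around this is to compare doubles within a tolerance rather than with ==. Here is a minimal sketch; the helper name nearlyEqual and the tolerance of 1e-9 are assumptions for illustration, and the right tolerance depends on the problem at hand.

```java
class DoubleEquality {
    // Hypothetical helper: true when a and b differ by no more than epsilon.
    static boolean nearlyEqual(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                  // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9)); // true
    }
}
```

A fixed tolerance like this is fine for small classroom examples, though it may not suit values that are very large or very close to zero.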
Beyond using == with doubles, it’s not uncommon for students to also write boolean expressions like this:
boolean isTall = true;
isTall == true; // true, but could be simplified to isTall or !isTall depending on use
And of course, this idea extends to reference types, where we generally discourage students from comparing with == altogether. Otherwise, students might be confused why the following code does not work:
Square s1 = new Square(4);
Square s2 = new Square(4);
if (s1 == s2) { ... } // Reference types compare memory locations, not object contents
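To compare contents instead, a class typically overrides equals() (and hashCode() alongside it). The Square class is never shown in this article, so the version below is a hypothetical sketch that assumes a square is defined by a single integer side length.

```java
import java.util.Objects;

// Hypothetical Square: assumed to hold nothing but an integer side length.
final class Square {
    private final int side;

    Square(int side) {
        this.side = side;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;               // same object
        if (!(other instanceof Square)) return false; // null or a different type
        return side == ((Square) other).side;         // compare contents
    }

    @Override
    public int hashCode() {
        return Objects.hash(side); // equal squares must share a hash code
    }
}

class SquareDemo {
    public static void main(String[] args) {
        Square s1 = new Square(4);
        Square s2 = new Square(4);
        System.out.println(s1 == s2);      // false: two separate objects
        System.out.println(s1.equals(s2)); // true: same side length
    }
}
```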
Yet, even if students fully understand all of these rules around equality, a weird situation can arise with one type in particular: strings. In the remainder of this article, we’ll take a peek at exactly why strings can sometimes behave like primitive types with equality.
The Curious Case of String Equality
If you write a simple Java program to compare two seemingly equivalent strings, you may be surprised to find that == works just fine. The following example illustrates just that:
String a = "Jeremy";
String b = "Jeremy";
a == b; // true
Without knowing that == compares the addresses of reference types, you might not be surprised by this result. Surely, the two strings are equal. Yet, for folks who know the secrets of equality, this result can be quite perplexing. How can two objects created in separate variable definitions have the same address? Ordinary objects don’t work like this.
As it turns out, the secret lies in the way that Java handles strings through a process known as string interning. Every time a String literal is created in a Java program, the compiler interns it implicitly. Basically, this means that any strings created in the form above will be checked against a collection of strings already declared. If a copy exists, no new memory is allocated. Instead, an alias is created.
Now, aliasing is generally not considered a problem with immutable reference types like strings. After all, you can’t change the value of a string (well, that’s not technically true, but that’s a topic for another time), so there’s no risk associated with the alias. But, it does raise a question about the consequences of aliasing as it relates to equality. In this case, new folks might be tricked into a false sense of security around equality with strings in Java. See, there are a lot of ways in which two of the same string may actually have separate memory addresses. For instance, what if we use the String constructor explicitly?
String a = "Jeremy";
String b = "Jeremy";
String c = new String("Jeremy");
a == b; // true
a == c; // false
And as you can imagine, these aren’t the only ways strings can be created. What about concatenation?
String a = "Jeremy";
String b = "Jeremy";
String c = "Jere" + "my";
a == b; // true
a == c; // true
Weirdly, if one of the components of the concatenation is a variable, we’re back to failure:
String a = "Jeremy";
String b = "Jeremy";
String suffix = "my";
String c = "Jere" + suffix;
a == b; // true
a == c; // false
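The difference between the last two examples comes down to what the compiler can evaluate ahead of time. Concatenating two literals is a compile-time constant, so the result is interned like any other literal, while concatenating with an ordinary variable happens at runtime and produces a new object. The sketch below puts the cases side by side; the class and method names are made up for illustration. It also adds one case not covered above: declaring the variable final makes it a constant, which brings the comparison back to true.

```java
class ConcatDemo {
    // Literal + literal is folded by the compiler into the constant "Jeremy".
    static boolean constantConcat() {
        String c = "Jere" + "my";
        return "Jeremy" == c;
    }

    // A final String initialized with a literal is also a compile-time constant.
    static boolean finalVariableConcat() {
        final String suffix = "my";
        String c = "Jere" + suffix;
        return "Jeremy" == c;
    }

    // A non-final variable forces the concatenation to happen at runtime.
    static boolean runtimeConcat() {
        String suffix = "my";
        String c = "Jere" + suffix;
        return "Jeremy" == c;
    }

    public static void main(String[] args) {
        System.out.println(constantConcat());      // true
        System.out.println(finalVariableConcat()); // true
        System.out.println(runtimeConcat());       // false
    }
}
```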
And, of course, the example that spawned this whole article comes from my course where students are asked to create a loop that prompts a user to continue. If the user enters “y”, the loop is meant to keep going. Yet, if we use == on user-generated input, our loop condition will fail:
while (userInput == "y") { // This will never return true
    userInput = in.nextLine();
}
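Here is one way the loop might look with equals() instead. This is a sketch rather than the actual course solution: the method name countYes is made up, and the Scanner reads from a fixed string so that the example runs without a keyboard. In a real program it would wrap System.in.

```java
import java.util.Scanner;

class ContinueLoop {
    // Counts how many times the user answers "y" before answering anything else.
    static int countYes(Scanner in) {
        int count = 0;
        // Calling equals() on the literal also avoids a NullPointerException.
        while (in.hasNextLine() && "y".equals(in.nextLine())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Simulated user input: "y", "y", then "n".
        Scanner in = new Scanner("y\ny\nn\n");
        System.out.println(countYes(in)); // 2
    }
}
```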
Funnily enough, Java actually allows you to account for this by interning strings manually using their intern() method. While I’m sure there’s some perfect use case for this, it does feel like a bit of a hack. That’s why I tend to advise students to stick to the equals() method with all reference types.
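For the curious, here is a small sketch of what manual interning looks like next to equals(). The class and method names are made up for illustration.

```java
class InternDemo {
    // Compares a literal against an explicitly constructed copy with ==.
    static boolean rawCopyMatches() {
        String copy = new String("Jeremy");
        return "Jeremy" == copy;
    }

    // intern() returns the pooled string, which is the same object as the literal.
    static boolean internedCopyMatches() {
        String copy = new String("Jeremy");
        return "Jeremy" == copy.intern();
    }

    // equals() compares contents and works no matter how the string was created.
    static boolean equalsMatches() {
        String copy = new String("Jeremy");
        return "Jeremy".equals(copy);
    }

    public static void main(String[] args) {
        System.out.println(rawCopyMatches());      // false
        System.out.println(internedCopyMatches()); // true
        System.out.println(equalsMatches());       // true
    }
}
```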
Why Would Java Do This?
String interning is a bit of a weird feature given that most folks don’t know it’s happening or why. As far as why, there doesn’t seem to be a ton of literature on it. The claims I’ve seen argued tend to be around space efficiency (i.e., having fewer string objects floating around). Other folks have mentioned that == is actually faster than equals(), so it might be advantageous in performance-sensitive software like games. Of course, like I mentioned, that feels like a hack that could very easily break.
Regardless, my perspective is that of an educator who has to navigate the sorts of misconceptions that can spawn out of a feature like this. As a result, I generally don’t like the idea of String interning, or at least I would like it a lot more if it were applied to all strings by default. Alternatively, equality could be overloaded to perform the equals() method automatically on reference types. This is something that Python does really well, and you can still test for identity using the is keyword. That way, == becomes the equality operator, not sometimes the identity operator, and educators everywhere would be saved from navigating the misconceptions related to equality.
Anyway, that’s about all I have time to cover today. Take care! And, I’ll see you very soon.