The Problem Enums Are Intended to Solve


It’s a special day when I cover a Java topic. In this one, we’re talking about Enums, and the problem(s) they are intended to solve.


Introducing the Problem: Storing Categorical Data

Recently, I was giving my students an introduction to Enums (which I pronounce as “ee-numb”) because we use them in some of our APIs. However, I didn’t give us enough time to actually cover the topic, so this article is intended to go more in depth for the folks who were interested.

Typically, the way that I introduce enums as a concept is to describe a scenario where enums might be useful and ask students for their approaches. For example, imagine we want to make an object to store a hockey player. There is a lot of information that we might want to store about that player, but for the sake of argument, how might you store the player’s position?

In the world of data science, the player’s position would be referred to as categorical data or nominal data. Categorical data is generally limited in what can be done with it, but in the data visualization space, categories are useful for doing things like coloring points in a scatterplot or grouping data for analysis. As a result, I’m just going to refer to this as the problem of storing categorical data.

Therefore, given the hockey player example, how might you store categorical data? Generally, there are two answers that most folks (i.e., my students) give. The “obvious” answer is to use a string while the slightly less obvious answer is to use an integer. Next, we’ll take a look at these options in detail.

Using Strings to Track Categorical Data

Let’s say we use a String to store our player’s position. That might look as follows in Java:

public class HockeyPlayer {
  private String position;
}

After all, we could store the positions easily as strings (e.g., “Goaltender”, “Center”, “Left Winger”, etc.). But, what’s the problem? In general, I try to stay away from strings whenever possible because of a really common problem: typos. As a silly example of where things can go wrong, consider the following method:

public class HockeyPlayer {
  private String position;
  private int goals;
  private int assists;
  private int blocks;
  private int saves;

  public int getPrimaryMetric() {
    switch (this.position) {
      case "Left Winger":
      case "Right Winger":
      case "Centre":
        return this.goals;
      case "Left Defenseman":
      case "Right Defenseman":
        return this.blocks;
      case "Goaltender":
        return this.saves;
      default:
        return this.blocks;
    }
  }
}

Like I said, it’s a silly example, but let’s imagine that forwards, defensemen, and goalies have some primary statistic that they’re categorized by. It’s silly because defensemen aren’t really categorized by their blocked shots, but you get the idea.

Anyway, have you spotted the bug yet? If not, it’s the “Centre” typo, and it’s a subtle one. After all, “Centre” is a reasonable spelling in British English. Typically, however, you will see it spelled as “Center” in the context of hockey. As a consequence, a player whose position is stored as “Center” falls through to the default case, and we return blocks for centers rather than goals.

Now, how long do you think it would take you to spot this bug? If you do proper testing, it might not take you very long to uncover it. However, it’s more likely that this is a private helper method that you’re using throughout your code. In that case, how long do you think it will take you to notice this bug?

Of course, typos aren’t the only problems strings have. You also need to remember to compare them with .equals() rather than the equality operator (==), which compares references instead of contents. Likewise, there are plenty of ways that strings can end up with zero-width characters in them or other strange characters that resemble more common characters (e.g., the Greek question mark, which looks like a semicolon). As a result, I generally try to avoid strings whenever possible.
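To make those two pitfalls concrete, here is a minimal sketch. The class and method names (StringPitfalls, sameReference, sameContents) are made up for illustration, and new String(...) is only used to force a separate object with the same contents:

```java
class StringPitfalls {
    // Compares object references, not the characters inside the strings.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares the actual contents, character by character.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String expected = "Center";
        // A distinct String object that holds identical text.
        String fromInput = new String("Center");
        System.out.println(sameReference(expected, fromInput)); // false
        System.out.println(sameContents(expected, fromInput));  // true

        // A zero-width space (U+200B) is invisible when printed but still breaks equality.
        String sneaky = "Center\u200B";
        System.out.println(sameContents(expected, sneaky)); // false
        System.out.println(sneaky.length());                // 7
    }
}
```

In both failing cases the strings look identical on screen, which is exactly what makes these bugs hard to spot.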

Using Integers to Track Categorical Data

So, what should we do instead? Another common option is to use integers. The idea is that there is less room for mistakes with integers. After all, it’s possible to use the wrong number, but a number can’t be misspelled the way a string can. Here’s the same code with integers instead:

public class HockeyPlayer {
  private int position;
  private int goals;
  private int assists;
  private int blocks;
  private int saves;

  public int getPrimaryMetric() {
    switch (this.position) {
      case 0:
      case 1:
      case 2:
        return this.goals;
      case 3:
      case 4:
        return this.blocks;
      case 5:
        return this.saves;
      default:
        return this.blocks;
    }
  }
}

Of course, we now have a new problem. What do these numbers mean? Sure, you could litter your code with comments to make it clear what each number means, but you’d have to do that everywhere. Also, comments tend to go out of date, meaning they might be lying about what a number means.

Using Constants to Track Categorical Data

Ultimately, both strings and integers have problems on their own, so you might cleverly realize that the solution is to use constants. That way, you can still use “strings” in the sense that you have descriptive variable names. And, you can also continue to use integers if you’d like:

public class HockeyPlayer {
  private int position;
  private int goals;
  private int assists;
  private int blocks;
  private int saves;

  private static final int LEFT_WINGER = 0;
  private static final int RIGHT_WINGER = 1;
  private static final int CENTER = 2;
  private static final int LEFT_DEFENSEMAN = 3;
  private static final int RIGHT_DEFENSEMAN = 4;
  private static final int GOALTENDER = 5;

  public int getPrimaryMetric() {
    switch (this.position) {
      case LEFT_WINGER:
      case RIGHT_WINGER:
      case CENTER:
        return this.goals;
      case LEFT_DEFENSEMAN:
      case RIGHT_DEFENSEMAN:
        return this.blocks;
      case GOALTENDER:
        return this.saves;
      default:
        return this.blocks;
    }
  }
}

Generally, I think this is a fine compromise, but it does mean that the player’s position is still just an integer. Therefore, it is still possible to litter your code with magic numbers. It’s also possible to use the wrong constants by accident just because they’re the same type (e.g., LEFT_CIRCLE referring to the area of the ice in place of LEFT_WINGER).
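As a rough sketch of that last point, consider what the compiler lets through when everything is an int. The class name ConstantMixUp, the describePosition method, and the rink-area constants (LEFT_CIRCLE, CENTER_ICE) are all hypothetical and exist only for this illustration:

```java
class ConstantMixUp {
    // Player positions, a subset of the constants from the example above.
    static final int LEFT_WINGER = 0;
    static final int GOALTENDER = 5;

    // Hypothetical rink-area constants, invented here for illustration.
    static final int LEFT_CIRCLE = 0;
    static final int CENTER_ICE = 1;

    static String describePosition(int position) {
        switch (position) {
            case LEFT_WINGER:
                return "Left Winger";
            case GOALTENDER:
                return "Goaltender";
            default:
                return "Unknown";
        }
    }

    public static void main(String[] args) {
        // Compiles and runs even though LEFT_CIRCLE is not a position at all.
        System.out.println(describePosition(LEFT_CIRCLE)); // Left Winger
        // Magic numbers are accepted too, including ones with no meaning.
        System.out.println(describePosition(42)); // Unknown
    }
}
```

Nothing here is a compile error, so the mistake only shows up at runtime, if it shows up at all.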

Using Enums to Track Categorical Data

To solve this problem, we introduce enums. Like constants, enums give each category a descriptive name. However, enums bring in a new benefit: type checking. In Java, every enum is its own type (a class, in fact, whose values also carry an integer ordinal). So, not only do we eliminate typos and readability issues by using enums, but we also eliminate the issue of using the wrong constants. Here’s what that might look like:

public class HockeyPlayer {
  private Position position;
  private int goals;
  private int assists;
  private int blocks;
  private int saves;

  public enum Position {
    LEFT_WINGER, RIGHT_WINGER, CENTER, 
    LEFT_DEFENSEMAN, RIGHT_DEFENSEMAN, 
    GOALTENDER
  }

  public int getPrimaryMetric() {
    switch (this.position) {
      case LEFT_WINGER:
      case RIGHT_WINGER:
      case CENTER:
        return this.goals;
      case LEFT_DEFENSEMAN:
      case RIGHT_DEFENSEMAN:
        return this.blocks;
      case GOALTENDER:
        return this.saves;
      default:
        return this.blocks;
    }
  }
}

Now, anywhere you want to specify a player’s position, you have a type you can request (i.e., Position). Suddenly, positions are type checkable, which makes it much harder to make mistakes. Only Position values can be passed in, so the only mistake you can make is passing the wrong position.
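Here is a small sketch of what that type checking buys you. The wrapper class PositionTypeCheck and the methods isForward and parse are assumptions made for this example; Position.valueOf is the standard method Java generates for every enum:

```java
class PositionTypeCheck {
    enum Position {
        LEFT_WINGER, RIGHT_WINGER, CENTER,
        LEFT_DEFENSEMAN, RIGHT_DEFENSEMAN,
        GOALTENDER
    }

    // Only Position values are accepted; ints and strings are rejected at compile time.
    static boolean isForward(Position position) {
        switch (position) {
            case LEFT_WINGER:
            case RIGHT_WINGER:
            case CENTER:
                return true;
            default:
                return false;
        }
    }

    // Converts text to a Position; an unknown name throws IllegalArgumentException
    // instead of silently falling into a default case.
    static Position parse(String text) {
        return Position.valueOf(text);
    }

    public static void main(String[] args) {
        System.out.println(isForward(Position.CENTER)); // true
        // isForward(2);        // does not compile: int is not a Position
        // isForward("Center"); // does not compile: String is not a Position
        try {
            parse("CENTRE");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: CENTRE");
        }
    }
}
```

Notice that the “Centre” typo from earlier now fails loudly at the boundary where text becomes a Position, rather than quietly returning the wrong statistic.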

Enums are also pretty cool because they’re just classes. As a result, you can add a constructor, fields, and even methods. That way, if you want to store any additional data alongside the enum values, you can. For example, maybe you want to store a human-readable version of the constant name (e.g., “Left Winger”) for places where you might display it. Likewise, abbreviations are common (e.g., “LW”).
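As a sketch of that idea, here is one way the Position enum might carry a display name and an abbreviation. The field and method names are my own choices for this example, and the abbreviations other than “LW” are assumptions:

```java
class PositionLabels {
    enum Position {
        LEFT_WINGER("Left Winger", "LW"),
        RIGHT_WINGER("Right Winger", "RW"),
        CENTER("Center", "C"),
        LEFT_DEFENSEMAN("Left Defenseman", "LD"),
        RIGHT_DEFENSEMAN("Right Defenseman", "RD"),
        GOALTENDER("Goaltender", "G");

        private final String displayName;
        private final String abbreviation;

        // Enum constructors are implicitly private; each constant above calls this once.
        Position(String displayName, String abbreviation) {
            this.displayName = displayName;
            this.abbreviation = abbreviation;
        }

        public String getDisplayName() {
            return this.displayName;
        }

        public String getAbbreviation() {
            return this.abbreviation;
        }
    }

    public static void main(String[] args) {
        Position position = Position.LEFT_WINGER;
        System.out.println(position.getDisplayName() + " (" + position.getAbbreviation() + ")");
        // prints "Left Winger (LW)"
    }
}
```

The nice part of this design is that the display text lives in exactly one place, so there is no second list of strings to keep in sync with the constants.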

Finally, the last thing I’ll say to sell Enums is that they usually play really nicely with modern IDEs. In other words, Enums give you the gift of autocomplete, and your IDE might even suggest the correct one automatically. This happens because you give the IDE a lot more semantic information with Enums than you do with constants or strings. Likewise, an IDE might even help you generate the Enum in the first place—sort of like how Excel will generate rows of data by inferring a pattern.

Should You Use Enums?

Broadly speaking, Enums are one of those niche features that you’ll rarely use. More often than not, they’re going to add complexity to code that may otherwise be only a short script. That said, if you have categorical data that you’re using a lot, it’s probably a good idea to look into Enums.

When students are working on their own projects, I sometimes recommend Enums for situations where categorical data is stored, such as:

  • Days of the Week (e.g., MONDAY, TUESDAY, etc.)
  • Months in a Year (e.g., JANUARY, FEBRUARY, etc.)
  • Cardinal Directions (e.g., NORTH, SOUTH, EAST, WEST)
  • Planets in the Solar System (e.g., EARTH, VENUS, MARS, etc.)
  • Status of Processes/Events (e.g., ERROR, READY, RUNNING, etc.)
  • Sports Teams (e.g., PENGUINS, RANGERS, BLUE JACKETS, etc.)
  • Suits for Cards (e.g., HEARTS, DIAMONDS, etc.)
  • Colors (e.g., RED, GREEN, BLUE)

However, sometimes there are dozens of categories, and it doesn’t make sense to list them all out. Likewise, sometimes there are a few categories you have in mind but maybe the user wants to make some custom categories (e.g., categories for line items in a bank account). In that case, I would stick with strings or at the very least provide an OTHER value in the enum. Finally, Enums seem to be unpopular with folks who need to save as much space as possible, which makes sense because they’re basically classes masquerading as integers.
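If you go the OTHER route, a minimal sketch might look like the following. The Category values and the fromLabel helper are hypothetical and only meant to show the fallback idea:

```java
import java.util.Locale;

class CategoryFallback {
    // Hypothetical categories for line items in a bank account.
    enum Category { GROCERIES, RENT, UTILITIES, OTHER }

    // Maps free-form text to a known category, falling back to OTHER
    // when the text is missing or does not match any constant.
    static Category fromLabel(String label) {
        if (label == null) {
            return Category.OTHER;
        }
        try {
            return Category.valueOf(label.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Category.OTHER;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromLabel("rent"));         // RENT
        System.out.println(fromLabel("Pet Supplies")); // OTHER
    }
}
```

The tradeoff is that everything unrecognized collapses into one bucket, so if users genuinely need their own categories, strings are probably still the better fit.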

Of course, because I personally use Enums so infrequently, I don’t really have many experiences where they went wrong. As a result, I’ll point you to some other folks who say to avoid them (e.g., Why you shouldn’t use Enums in your Code and Why you shouldn’t use Enums!). Be aware that Enums are different in every programming language, so the critiques you’ll see in the links above might only apply to specific languages.

With that said, let’s call it a day here. As always, if you liked this article, there’s definitely more where that came from.

Likewise, feel free to show your support by heading over to my list of ways to grow the site. Otherwise, we’ll see you next time!

Coding Tangents (44 Articles)—Series Navigation

As a lifelong learner and aspiring teacher, I find that not all subjects carry the same weight. As a result, some topics can fall through the cracks due to time constraints or other commitments. Personally, I find these lost artifacts to be quite fun to discuss. That’s why I’ve decided to launch a whole series to do just that. Welcome to Coding Tangents, a collection of articles that tackle the edge case topics of software development.

In this series, I’ll be tackling topics that I feel many of my own students have been curious about but never really got the chance to explore. In many cases, these are subjects that I think deserve more exposure in the classroom. For instance, did you ever receive a formal explanation of access modifiers? How about package management? Version control?

In some cases, students are forced to learn these subjects on their own. Naturally, this forms a breeding ground for misconceptions which are made popular in online forums like Stack Overflow and Reddit. With this series, I’m hoping to get back to the basics where these subjects can be tackled in their entirety.

Jeremy Grifski

Jeremy grew up in a small town where he enjoyed playing soccer and video games, practicing taekwondo, and trading Pokémon cards. Once out of the nest, he pursued a Bachelors in Computer Engineering with a minor in Game Design. After college, he spent about two years writing software for a major engineering company. Then, he earned a master's in Computer Science and Engineering. Today, he pursues a PhD in Engineering Education in order to ultimately land a teaching gig. In his spare time, Jeremy enjoys spending time with his wife and kid, playing Overwatch 2, Lethal Company, and Baldur's Gate 3, reading manga, watching Penguins hockey, and traveling the world.
