A Java Style Guide

3. Identifier naming

In this section, the style guide defines a convention for naming identifiers in the code.

One of the most well-known naming conventions is perhaps the "Hungarian notation", developed by Charles Simonyi in the 1970s and later widely adopted at Microsoft. The idea behind the Hungarian notation is to name identifiers to reflect their data types, e.g. whether the identifier is a char, a long or perhaps a pointer to a pointer to some structure (pointers are of course not an issue in Java). See chapter 7 for more details on the Hungarian notation and other style guides and naming conventions.

In the naming convention proposed in this style guide the main idea is to name identifiers in a way that clearly expresses their role and scope in the code, which often is much more important than their data types.

3.1 General principles

The main principle for naming identifiers is simple: name things in the source code so that their role, i.e. what they are used for, is clear. It is much easier to understand source code if the names of variables, parameters, methods and other identifiers give the reader a clue to what they are used for than if the names are non-descriptive and you have to disassemble the algorithm to understand what each identifier is used for.

If, for instance, a Set is used to store file names, give it a name that communicates this fact:

    Set<String> aFileNames;
    File aCurrentFile;
    ...
    aFileNames.add(aCurrentFile.getName());

is much easier to understand than

    Set<String> set;
    File f;
    ...
    set.add(f.getName());

If a name consists of several words, they are concatenated using the InfixCaps style, where every word starts with a capital letter and the other letters are in lowercase. No underscores are used to separate the words. An exception to this rule is the names of packages; see section 3.3.

Another general principle is to use abbreviations only if they are generally accepted, and to use whole words in all other cases. It is better to use too many characters in a name than too few, since short names easily become cryptic or ambiguous. A name like FldNmBf is far less clear than FieldNameBuffer, and could just as well be interpreted as FolderNumberBuffer. Fans of the Hungarian notation usually claim that code should be easy to read, not to recite. Our naming convention springs from the belief that a name that can actually be read aloud has a much better chance of being intelligible than a name that can't be pronounced.

Also, be sure to use terminology from the problem domain of the source code rather than using terms that are more general. If an organisation calls their customers clients, the corresponding class should be named Client, not Customer. A related issue is to find the correct English term when working with domains in other languages. For instance, the Swedish financial term börspost translates to lot, not to stock batch which may be a tempting first try, but will only be confusing for a person with knowledge of the domain.

3.2 Prefixes

A major part of this naming convention is the use of one-letter prefixes to express the scope of an identifier in the source code. This is a naming convention method that is used by other code style guides as well, e.g. the one used by Apple in the MacApp framework and Microsoft's style guide for MFC.

The prefix letter should be in lowercase and the rest of the name should use the InfixCaps naming style, e.g. aResult or fCounter. The following prefix letters are used:

a for local variables
c for class variables
f for instance variables
k for constants
p for method parameters

The other types of identifiers do not have prefix letters in their names.
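
As an illustration, the following sketch uses all five prefixes in one small class. The class MatchCounter and all of its members are hypothetical names invented for this example; they are not taken from any library.

```java
// Hypothetical example class; all names are invented for illustration.
class MatchCounter
{
    // k: constant
    static final int kMaxMatchesPerCall = 100;

    // c: class (static) variable, shared by all instances
    static int cInstancesCreated = 0;

    // f: instance variables
    private final byte fPattern;
    private int fTotalMatches = 0;

    // p: parameter (constructors follow the same rule as methods here)
    MatchCounter(byte pPattern)
    {
        fPattern = pPattern;
        cInstancesCreated++;
    }

    int countMatches(byte[] pBytes)
    {
        // a: local variable
        int aFound = 0;
        for (int i = 0; i < pBytes.length; i++)
        {
            if (pBytes[i] == fPattern && aFound < kMaxMatchesPerCall)
                aFound++;
        }
        fTotalMatches += aFound;
        return aFound;
    }

    int getTotalMatches()
    {
        return fTotalMatches;
    }
}
```

Inside countMatches, the scope of every identifier can be read directly from its name, without looking at any declaration.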

If you can easily see the scope of an identifier in a code snippet, it will often improve the understanding of the code. Take the following statement as an example:

    if (theBytes[currentPos] == pattern[matchedThisFar])

Suppose we are trying to analyse what this comparison means. To understand how an algorithm is implemented, it is important to understand how the various identifiers are used. It is then of great help to know whether an identifier is a variable local to the method, a parameter or perhaps an instance variable of the class. Without prefixes, we must check the method's local variable declarations (which may appear anywhere in the body), the method's head and the declarations of the instance variables to find this out.

If we use prefix letters, this information is directly available to us:

    if (pBytes[aCurrentPos] == fPattern[fMatchedThisFar])

There are other benefits with identifier prefixes as well. Consider this all too common pattern for a setter:

    private String name;

    public void setName(String name)
    {
        this.name = name;
    }

Now assume a typo in the parameter name:

    private String name;

    public void setName(String nmae)
    {
        this.name = name;
    }

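The code still compiles, but the setter now assigns the field to itself and silently ignores its argument. A similar bug was present in java.lang.annotation.AnnotationTypeMismatchException in JDK 1.5 (but was fixed in 1.6), where an accessor called itself instead of returning the field: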

    private final String foundType;
    ...
    public String foundType() {
        return this.foundType();
    }
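
With the prefix convention, the field and the parameter of a setter never share a name, so the typo shown above is caught by the compiler instead of slipping through. A minimal sketch, using a hypothetical class Person:

```java
// Hypothetical class, for illustration only.
class Person
{
    private String fName;

    public void setName(String pName)
    {
        // Field and parameter have distinct names, so no "this." is needed.
        // Had the parameter been mistyped as pNmae, this line would not
        // compile, because pName would be an undefined symbol.
        fName = pName;
    }

    public String getName()
    {
        return fName;
    }
}
```

The accidental recursion shown above also becomes harder to write by mistake, since the field fFoundType and the method foundType no longer have the same name.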

3.3 Packages

Names of packages are in lowercase only and should be short, preferably consisting of a single word.

Packages are created in hierarchies, where the first two levels always are the reversed domain name of the company that the code is developed by or for, e.g. org.myire or com.thecustomer. This is a de facto standard in Java. Note that using uppercase for the first part (e.g. COM) is no longer recommended.

Classes that are designed for more general use, i.e. are not specific to a project or an application, should be placed in package hierarchies that correspond to the core Java packages. For instance, a subclass of InputStream should be placed in org.myire.io and a JDBC utility class should be put in the org.myire.sql package.

Packages belonging to a specific project or application should be placed in packages where the project's name is the level in the hierarchy that follows immediately after the reversed domain name, e.g. com.thecustomer.theproject.

3.4 Classes and interfaces

Classes and interfaces have names that begin with a capital letter and otherwise follow the general principle for identifier naming. If the class is a subclass of another class, this can be indicated by giving it a name that is a specialisation of the superclass' name. Note that names of classes and interfaces do not begin with a prefix letter.

Class names should be nouns whereas names of interfaces should be either nouns or adjectives. If an interface's main purpose is to indicate that the classes implementing the interface share a common characteristic, the name should be an adjective, e.g. Serializable. Interfaces that are used to provide a uniform way to access different implementations of a functionality should have names that are nouns, e.g. Connection or Enumeration.

Examples:

class Query
class BetterFrame extends Frame
interface Sortable
interface ArticleResultSet
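
The rules above could be applied as in the following sketch. All type names are hypothetical and chosen only to show an adjective interface, a noun interface, a noun class and a subclass whose name specialises the name of its superclass.

```java
// All names below are hypothetical and chosen only to illustrate the rules.

// Adjective: marks a characteristic shared by the implementing classes.
interface Resettable
{
    void reset();
}

// Noun: a uniform way to access different implementations.
interface MessageSource
{
    String getNextMessage();
}

// Class names are nouns.
class Greeter implements MessageSource
{
    public String getNextMessage()
    {
        return "hello";
    }
}

// The subclass name is a specialisation of the superclass name.
class CountingGreeter extends Greeter implements Resettable
{
    private int fMessageCount = 0;

    @Override
    public String getNextMessage()
    {
        fMessageCount++;
        return super.getNextMessage();
    }

    public int getMessageCount()
    {
        return fMessageCount;
    }

    @Override
    public void reset()
    {
        fMessageCount = 0;
    }
}
```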

3.5 Methods

The name of a method doesn't begin with a capital letter but otherwise follows the InfixCaps style (also known as CamelCase), e.g. countStackFrames. Note that methods don't have prefix letters.

Method names should be active verbs followed by a noun if applicable, e.g. getRadius. Do not use nouns only since that can cause ambiguity. For instance, does the method radius return the radius or does it perhaps calculate it or maybe even set it?

The name should be as descriptive as possible and try to communicate to the reader what the method does. Avoid being too general; checkQuerySyntax is much more descriptive than check. Also avoid generic words that add no meaning, like stuff or info. It is hard to understand what methods like checkStuff or getInfo actually do.

Use the word is as the first word in methods that return a boolean value representing whether an object is in a certain state or has a certain attribute, e.g. isConnected. This is especially important with words that have the same form for nouns and verbs. A method called empty could very well check if a list is empty, but it could also empty the list of all its elements.

Be consistent when naming methods. If the method that sets the width is called setWidth, the method used for setting the height should not be called useHeight. In general, methods that are used for getting and setting attributes of a class should have names that follow the pattern getX and setX.
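
The method naming rules can be summarised in a small sketch. The class TaskQueue and its methods are hypothetical and exist only to show the patterns side by side.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; the names only illustrate the naming rules.
class TaskQueue
{
    private final List<String> fTasks = new ArrayList<>();
    private int fMaxSize = 10;

    // Getter and setter follow the same getX/setX pattern.
    int getMaxSize()
    {
        return fMaxSize;
    }

    void setMaxSize(int pMaxSize)
    {
        fMaxSize = pMaxSize;
    }

    // Active verb followed by a noun. Returns false if the queue is full.
    boolean addTask(String pTask)
    {
        if (fTasks.size() >= fMaxSize)
            return false;
        return fTasks.add(pTask);
    }

    // The leading "is" makes it clear that this method only queries state ...
    boolean isEmpty()
    {
        return fTasks.isEmpty();
    }

    // ... whereas a method named just "empty" could mean either operation.
    void removeAllTasks()
    {
        fTasks.clear();
    }
}
```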

Sadly enough, the JDK classes are very inconsistent when it comes to method names. In JDK 1.1, there was an attempt to introduce the getX/setX pattern. For instance, in java.awt.Component the method bounds was deprecated in favour of getBounds. However, later additions such as the java.nio package have returned to the old practice of having the same noun as method name for both getters and setters. The inconsistency found in java.lang.StringBuffer, where the getter is called length and the setter is called setLength, should serve as a warning of what happens when there are no clear rules to follow.

3.6 Constants

Constants, i.e. final class or instance variables, have names that begin with the prefix letter k. Groups of related constants should have the same word(s) at the beginning to indicate this. It is easier to understand that the following constants are related:

kFieldID
kFieldTotalBalance
kFieldValidUntil

than to understand that these constants are related:

kIDField
kTotalBalanceField
kValidUntilField
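
As a sketch of how such a group might be declared and used, consider the hypothetical class AccountRecord below. The field indices and labels are invented for this example.

```java
// Hypothetical class; field indices and labels are invented for illustration.
class AccountRecord
{
    // The shared leading word "Field" groups the constants.
    static final int kFieldID = 0;
    static final int kFieldTotalBalance = 1;
    static final int kFieldValidUntil = 2;

    static String getFieldLabel(int pField)
    {
        switch (pField)
        {
            case kFieldID:
                return "ID";
            case kFieldTotalBalance:
                return "Total balance";
            case kFieldValidUntil:
                return "Valid until";
            default:
                throw new IllegalArgumentException("Unknown field: " + pField);
        }
    }
}
```

A side effect of the common leading word is that the related constants end up next to each other in any alphabetically sorted list of members.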

Many core Java classes have taken up the de facto standard from C and C++ to use uppercase letters only for names of constants, e.g. Calendar.WEDNESDAY. Unfortunately, this is not the case in all of the core Java classes; there are variants like Color.green (although the variant Color.GREEN was introduced later).

To use uppercase letters only for constants may be appealing to programmers with a C and C++ background, and is in itself a functional approach. However, to be consistent with the other naming rules we use the prefix letter convention also for constants.

The rule to use InfixCaps for constants is normally the first rule in this style guide that people disagree with, since many programmers tend to favour using uppercase letters for constant names. If this is the case for you, this rule may be a good candidate for local adaptation. As long as every team member uses the same convention, it really doesn't matter if they use InfixCaps or uppercase only for constant names. Be flexible. Be consistent. Be happy.

3.7 Class variables

Class variables, or static data members if you like, have names that begin with a c.

An example:

    static int cCounter;

3.8 Instance variables

Instance variables, also known as field variables, member variables or data members, have names that begin with an f, e.g. fQueryText.

When naming instance variables, be sure not to reuse names from instance variables in a superclass. Although this is allowed in Java, it only leads to ambiguous code. If there is a reason to have an instance variable in a subclass, a descriptive name that is different from the one used in the superclass can always be found.
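
The ambiguity is easy to demonstrate. In the sketch below, all class and field names are hypothetical. Because Java resolves fields by the declared type of the reference rather than by the runtime type of the object, the same object yields two different values for what looks like the same field.

```java
// Hypothetical classes demonstrating why hiding a superclass field is confusing.
class Shape
{
    protected String fLabel = "shape";
}

class HidingCircle extends Shape
{
    // Same name as in the superclass: this hides Shape.fLabel, it does not replace it.
    protected String fLabel = "circle";
}

class NamedCircle extends Shape
{
    // A distinct, descriptive name leaves no doubt about which field is meant.
    protected String fCircleLabel = "circle";
}

class FieldHidingDemo
{
    static String readThroughSuperclass()
    {
        Shape aShape = new HidingCircle();
        // Fields are resolved by the declared type, so this reads Shape.fLabel.
        return aShape.fLabel;
    }

    static String readThroughSubclass()
    {
        HidingCircle aCircle = new HidingCircle();
        // Here the declared type is HidingCircle, so its own fLabel is read.
        return aCircle.fLabel;
    }
}
```

Here readThroughSuperclass returns "shape" while readThroughSubclass returns "circle", although both read from a HidingCircle object. With NamedCircle the question never arises.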

3.9 Parameters

The names of method parameters start with a p, e.g. pHeight or pImageID.

3.10 Local variables

The prefix for names of local variables in methods is a lowercase a, which comes from the term automatic variable and reveals this style guide's C language roots. The reason a lowercase l is not used is that it can easily be confused with a capital I in some typefaces.

Don't confuse the prefix a with the indefinite article a. The correct name is aOutputStream, not anOutputStream.

The naming convention for local variables also applies for common identifiers such as temporary variables and counters. Follow the general principle described above and use descriptive names like aNumMatches and aTemporaryFileName instead of n and tmp. An exception to this rule may be loop index variables that have no additional scope, where the use of i and j is so commonly accepted that it doesn't impact on readability. However, there are cases where you should consider if more descriptive names for the loop variables could improve readability. Take this example:

    for (int i = 0; i < pRows.length; i++)
        for (int j = 0; j < pColumns.length; j++)
            if (i > j)
            ...

The meaning of the comparison if (i > j) may not be completely clear at first glance, although seasoned programmers will have little trouble recognising the nested loop idiom. However, if we use more descriptive names, the meaning of the comparison is beyond doubt for anyone:

    for (int aRowIndex = 0; aRowIndex < pRows.length; aRowIndex++)
        for (int aColumnIndex = 0; aColumnIndex < pColumns.length; aColumnIndex++)
            if (aRowIndex > aColumnIndex)
            ...
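
For completeness, here is a runnable variant of the loop with descriptive index names. The class GridUtil, the method countCellsBelowDiagonal and the choice to count the matching cells are assumptions made for this sketch; the original example does not say what happens inside the if statement.

```java
// Hypothetical helper: counts the cells that lie below the main diagonal,
// i.e. the cells where the row index is greater than the column index.
class GridUtil
{
    static int countCellsBelowDiagonal(Object[] pRows, Object[] pColumns)
    {
        int aNumCells = 0;
        for (int aRowIndex = 0; aRowIndex < pRows.length; aRowIndex++)
        {
            for (int aColumnIndex = 0; aColumnIndex < pColumns.length; aColumnIndex++)
            {
                if (aRowIndex > aColumnIndex)
                    aNumCells++;
            }
        }
        return aNumCells;
    }
}
```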
