Finding and Comparing objects
Comparing objects
In some cases, our program must retrieve data from memory and compare it. Think of finding out whether a certain transaction occurs in multiple datasets, computing aggregate scores for the same person, item, or identifier across multiple datasets, or keeping track of how often certain events occur during a simulation. To do this, we need two things: a structured way to store data (such as an array, ArrayList or ...), and a way to compare new objects to old ones.
Please consider the following example:
Integer i1 = new Integer(12);
Integer i2 = new Integer(12);
System.out.println(i1 == i2);
System.out.println(i1.equals(i2));
will print
false
true
Just like in real life, there is a distinction between two variables with the same contents (copies) and variables referring to the exact same object (references).
The == operator tests what is stored in the two variables. For primitive types, this is the actual value, whilst for non-primitive types, this is a reference (or memory address). Let us look into this difference a bit deeper.
Using the code below, let us look at how the variables are stored in memory at run time, to gain a better understanding of this behavior.
int a = 12;
int b = 12;
Integer i1 = new Integer(12);
Integer i2 = new Integer(12);
Integer i3 = i2;
In practice, the memory of a computer is linear. It is a very long array of bits. Although arrows are visually appealing, when we compare references we actually compare memory addresses, like in the picture below.
Here, the linear computer memory is displayed as a sequence of numbers and words. Since a and b are primitive types, their actual values are stored in memory. When we compare a == b, we compare the numbers 12 and 12.
Since i1 is a non-primitive, the variable holds the memory address of the object it references. The 49 indicates that the object can be found from position 49 onwards. The object referenced by i2 is stored in memory from position 63 onwards. When we compare i1 == i2, we compare 49 with 63, so the result is false. Since i3 was assigned from i2, it holds the same address, 63, and i2 == i3 is true.
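The comparisons described above can be checked with a small sketch. The class name ReferenceComparisonDemo and the helper method compare are made up for this illustration; the variables are the ones from the example. Note that the Integer constructor has been deprecated since Java 9; it is used here only because it forces two separate objects to be created, as the text assumes.

```java
// Hypothetical demo class; only the variables a, b, i1, i2, i3 come from the text.
class ReferenceComparisonDemo {

    // new Integer(int) is deprecated since Java 9. It is kept here on purpose:
    // every call to 'new' creates a separate object, which is what we want to show.
    @SuppressWarnings({"deprecation", "removal"})
    static boolean[] compare() {
        int a = 12;
        int b = 12;
        Integer i1 = new Integer(12);
        Integer i2 = new Integer(12);
        Integer i3 = i2; // copies the reference, not the object

        return new boolean[] {
            a == b,   // primitives: compares the values 12 and 12
            i1 == i2, // references: two separate objects
            i2 == i3  // references: the same object
        };
    }

    public static void main(String[] args) {
        boolean[] result = compare();
        System.out.println("a == b   : " + result[0]); // true
        System.out.println("i1 == i2 : " + result[1]); // false
        System.out.println("i2 == i3 : " + result[2]); // true
    }
}
```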
When we want to compare the contents of objects, the equals method should be used, instead of the == operator. In the Introduction to Programming course, you learned that the Object class contains a method equals(Object other). The default implementation of this method does the same as the == operator, but it can be overridden. Most classes that you are familiar with, such as String and Integer, do so. The fact that it is overridden in the Integer class is the reason why in the example above, i1.equals(i2) returns true.
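To see the default behavior in isolation, consider a minimal sketch with a class that does not override equals. The class PlainTransaction and its field id are hypothetical and exist only for this illustration.

```java
// Hypothetical class for illustration; it deliberately does NOT override equals,
// so it inherits the default implementation from Object.
class PlainTransaction {
    private final int id;

    PlainTransaction(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    public static void main(String[] args) {
        PlainTransaction t1 = new PlainTransaction(7);
        PlainTransaction t2 = new PlainTransaction(7); // same contents, separate object
        PlainTransaction t3 = t1;                      // same object as t1

        // The inherited equals compares references, just like ==.
        System.out.println(t1.equals(t2)); // false
        System.out.println(t1.equals(t3)); // true
    }
}
```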
For strings, equals works as expected in that it declares two strings identical in content to be 'equal' even if they are two separate objects. The String class has replaced the default equals with its own implementation.
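As a short illustration of the difference for strings, the sketch below creates two separate String objects with the same content. The class name StringEqualsDemo and its two helper methods are assumptions made for this example.

```java
// Hypothetical demo class for illustration.
class StringEqualsDemo {

    // Compares the references: true only if both variables refer to one object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    // Compares the contents using the equals method that String overrides.
    static boolean sameContents(String x, String y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        // 'new' forces two separate String objects with identical content.
        String s1 = new String("dataset-A");
        String s2 = new String("dataset-A");

        System.out.println(sameObject(s1, s2));   // false
        System.out.println(sameContents(s1, s2)); // true
    }
}
```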
If we want to compare objects of our own classes using the equals method, then it must be defined inside the class. The method accepts an Object-type reference as a parameter, which can refer to any object.
We suggest that you do not try to write the code for an overridden equals method yourself, but use IntelliJ or Eclipse to generate it for you, since doing so correctly is non-trivial. If you inspect the generated code, you will typically see the following elements. The comparison first looks at the references: if the argument points to the same object in memory, the two are equal. If they are two separate objects, the parameter object's type is checked with the instanceof operator: if its type does not match the type of our class, the objects cannot be equal. Then, if the class is the same, a cast is performed, after which the instance variables of the two objects are compared against each other.
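The steps above can be sketched as follows. This is only an illustration under stated assumptions, not the exact code an IDE would generate: the class Transaction and its fields id and description are hypothetical, and the class is declared final so that the instanceof check cannot be confused by subclasses. A hashCode method based on the same fields is included because equals and hashCode should be overridden together.

```java
import java.util.Objects;

// Hypothetical class for illustration; the fields are assumptions.
final class Transaction {
    private final long id;
    private final String description;

    Transaction(long id, String description) {
        this.id = id;
        this.description = description;
    }

    @Override
    public boolean equals(Object other) {
        // Step 1: same reference means same object, so they are equal.
        if (this == other) {
            return true;
        }
        // Step 2: check the type; instanceof is also false when other is null.
        if (!(other instanceof Transaction)) {
            return false;
        }
        // Step 3: cast, then compare the instance variables one by one.
        Transaction that = (Transaction) other;
        return id == that.id && Objects.equals(description, that.description);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, so equal objects get equal hash codes.
        return Objects.hash(id, description);
    }

    public static void main(String[] args) {
        Transaction t1 = new Transaction(42L, "coffee");
        Transaction t2 = new Transaction(42L, "coffee");
        Transaction t3 = new Transaction(43L, "tea");

        System.out.println(t1 == t2);            // false
        System.out.println(t1.equals(t2));       // true
        System.out.println(t1.equals(t3));       // false
        System.out.println(t1.equals("coffee")); // false
    }
}
```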
There is one more issue to take into account when you override the equals method: if we do so, we should take care of the hashCode method as well. To understand why this is the case, you have to understand what the contract of the hashCode method is, and what it is used for, which we discuss in the next section, Equals and Hashcode.
Test your knowledge
In this quiz, you can test your knowledge on the subjects covered in this chapter.
Suppose a fellow student has implemented a class Dataset. Suppose you have obtained two references to two Dataset objects held in variables a and b.
Which of the following statements are true?
- If a == b returns true, then a.equals(b) must also return true.
- If a == b returns false, it is still possible that a.equals(b) returns true.
- Assuming a and b are not null, it is guaranteed that a == b and a.equals(b) always return the same.
- Assuming a and b are not null, it is possible that a == b and a.equals(b) always return the same.
- If a.equals(b) returns false, then a == b must also return false.