One of the most useful (and most fundamental) concepts when learning to code is looping. Loops are all around us, in virtually every language we know, and even if your language of choice doesn’t include primitive loops, there is likely a way to execute the same block of code a fixed number of times, or to implement some basic means of iterating.
The first programming language that I ever wrote code in was Python, but the first programming language I really got a firm grasp on, and that I still use to this day, is Java.
Java has evolved immensely over its history: JDK 9 introduced modules, which are another way of organizing a codebase; JDK 8 introduced lambda expressions (a feature that had existed in many other languages, including C# and JavaScript, for some time); and JDK 5 introduced the enhanced for-loop.
JDK 5’s last update was over a decade ago, but regardless of how old it is, I can’t help but notice that people still use C-Style for-loops, and I find very heavy usage of them in code bases in other languages as well, such as C#. I’m writing this post as a caution to those who continue to use this loop, and to explain why, whenever I see a C-Style loop in a code review, I’ll almost always ask for it to be refactored.
My first big gripe about C-Style for-loops is that they are extremely prone to bugs. Almost every line you write inside one can introduce an error, and the loop header alone leaves room for several bugs before you even get to the body. Let’s look at a simple example, printing the numbers in an array:
int[] nums = new int[] { 1, 2, 3, 4, 5 };
for (int i = 0; i < nums.length; i++)
{
    System.out.println(nums[i]);
}
There’s technically nothing wrong with this code. We created an array, looped over each of the elements in that array, and printed each number out to the console. However, this loop is prone to all kinds of bugs. This code works without defects or error IF and ONLY IF:
- i is not mutated within the loop body, and nothing is stopping it from being manipulated either temporarily or permanently
- i is always incremented by 1 after each execution of the loop
- the < condition doesn’t change
- i’s initial value does not change
If any of these things changes for any reason, we run the risk of encountering ArrayIndexOutOfBoundsExceptions, not looping over each element, or skipping elements within the array. For the simple task of iterating over 5 numbers, this seems like a lot of risk to impose at runtime.
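
The failure modes listed above are easy to reproduce. Here is a minimal sketch, assuming a hypothetical LoopPitfalls class whose method names are invented for illustration and are not from the original example. Each method breaks exactly one of the conditions:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class LoopPitfalls {

    // Baseline: every condition holds, so each element is visited once.
    static List<Integer> visitAll(int[] nums) {
        List<Integer> seen = new ArrayList<>();
        for (int i = 0; i < nums.length; i++) {
            seen.add(nums[i]);
        }
        return seen;
    }

    // The index is mutated inside the body, so every other element is skipped.
    static List<Integer> mutatedIndex(int[] nums) {
        List<Integer> seen = new ArrayList<>();
        for (int i = 0; i < nums.length; i++) {
            seen.add(nums[i]);
            i++; // accidental extra increment
        }
        return seen;
    }

    // "<=" instead of "<" reads one slot past the end of the array.
    static boolean offByOneThrows(int[] nums) {
        try {
            for (int i = 0; i <= nums.length; i++) {
                int ignored = nums[i];
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Starting at 1 instead of 0 silently skips the first element.
    static List<Integer> wrongStart(int[] nums) {
        List<Integer> seen = new ArrayList<>();
        for (int i = 1; i < nums.length; i++) {
            seen.add(nums[i]);
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] nums = { 1, 2, 3, 4, 5 };
        System.out.println(visitAll(nums));       // [1, 2, 3, 4, 5]
        System.out.println(mutatedIndex(nums));   // [1, 3, 5]
        System.out.println(offByOneThrows(nums)); // true
        System.out.println(wrongStart(nums));     // [2, 3, 4, 5]
    }
}
```

All four methods compile cleanly. The compiler offers no help with any of these mistakes.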
We can mitigate all of these issues by using an alternative way of looping over this structure. Since JDK 5, developers have been able to take advantage of enhanced for-loops, also referred to as for-each loops (yet another construct that’s always been available in C#).
Using an enhanced for-loop, the code now looks like this:
int[] nums = new int[] { 1, 2, 3, 4, 5 };
for (int num : nums)
{
    System.out.println(num);
}
This code iterates over the array, and on each pass of the loop it assigns the next value to the variable num, so we can use that number without having to access the array by index! This addresses all of the concerns we had about C-Style for-loops in terms of stability.
Enhanced for-loops can be used with arrays and with any data structure that implements Java’s Iterable interface, and because the Collection interface extends Iterable, we can use them for every List and Set in our code.
Speaking of Collections, let’s discuss my next big gripe: performance.
Let’s start this discussion with another example. I have a List that I’m iterating over and I want to print each element in the List, so let’s start with a C-Style loop:
List<Integer> nums = new ArrayList<>();
nums.add(1);
nums.add(2);
nums.add(3);
nums.add(4);
nums.add(5);
for (int i = 0; i < nums.size(); i++)
{
    System.out.println(nums.get(i));
}
Again, same as before: nothing too fancy about this code, and it works fine. Now, let’s switch from using ArrayList to LinkedList:
List<Integer> nums = new LinkedList<>();
nums.add(1);
nums.add(2);
nums.add(3);
nums.add(4);
nums.add(5);
for (int i = 0; i < nums.size(); i++)
{
    System.out.println(nums.get(i));
}
This produces the exact same output, and at face value the only real difference is that we’re using a different implementation of the List interface. However, that’s an extremely crucial difference in this situation.
ArrayList is what it sounds like: it’s an implementation of the List interface that uses an array as its underlying structure. This means that reading an element by index happens in constant time.
However, LinkedList is a linked data structure. With an ArrayList, the compromise for performance happens when adding elements to the list: when the underlying array is full, it has to “grow” by creating a new, larger array and copying all the elements over to it, and then garbage collection takes care of the old one. LinkedLists are really efficient during the population of the list because appending a node always happens in constant time. However, this comes at the cost of accessing the data by index: each .get call has to walk the links one by one to reach the element, so it takes time proportional to the index you’re trying to access (java.util.LinkedList starts from whichever end is closer, but the cost is still linear). Calling .get for every index inside a loop therefore makes the loop as a whole O(n^2), n being the size of the list.
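
To make that cost concrete, here is a minimal sketch that counts how many links an indexed loop has to follow. HopCountingList is a hypothetical, deliberately simplified singly linked list written for this illustration. It is not java.util.LinkedList, which is doubly linked and walks from the nearer end, so the real numbers would be roughly half of these.

```java
// Hypothetical, simplified list used only to count link traversals.
class HopCountingList {

    private static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    private Node head;
    private Node tail;
    private int size;
    long hops; // links followed by get(), tracked for illustration only

    // Appending is constant time: the tail reference is always at hand.
    void add(int value) {
        Node node = new Node(value);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    int size() {
        return size;
    }

    // Indexed access has to walk from the head every single time.
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
            hops++;
        }
        return current.value;
    }

    // Runs a C-Style loop over n elements and reports the links followed.
    static long hopsForIndexedLoop(int n) {
        HopCountingList list = new HopCountingList();
        for (int v = 0; v < n; v++) {
            list.add(v);
        }
        for (int i = 0; i < list.size(); i++) {
            list.get(i);
        }
        return list.hops;
    }

    public static void main(String[] args) {
        System.out.println(hopsForIndexedLoop(5));    // 10
        System.out.println(hopsForIndexedLoop(1000)); // 499500
    }
}
```

The totals follow n(n-1)/2: the list grew 200 times larger, while the work grew roughly 50,000 times larger.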
Just by switching from ArrayList to LinkedList, we took the time complexity of this loop from linear to quadratic. However, enhanced for-loops can help mitigate this issue:
List<Integer> nums = new LinkedList<>();
nums.add(1);
nums.add(2);
nums.add(3);
nums.add(4);
nums.add(5);
for (Integer num : nums)
{
    System.out.println(num);
}
But why is this more beneficial in terms of time complexity? Remember: Lists are Collections, which are Iterable. Each implementation of List comes with a built-in Iterator which can be used to efficiently iterate over the list. Normally, Iterators are designed to be as efficient as possible. An Iterator is typically declared inside the class it iterates over, which gives it access to the internal members of the data structure and allows it to do things like follow the links within a LinkedList directly, so that the next link can be fetched in constant time without having to walk the list from the start the way a .get call would.
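
As a rough illustration of that idea, here is a minimal sketch of a linked list whose Iterator lives inside the class and holds a reference to the current node. SimpleLinkedList and its join helper are hypothetical names made up for this example. The real java.util.LinkedList is more involved, but the principle is the same.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical minimal list; implementing Iterable makes it usable in an enhanced for-loop.
class SimpleLinkedList<T> implements Iterable<T> {

    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private Node<T> head;
    private Node<T> tail;

    void add(T value) {
        Node<T> node = new Node<>(value);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
    }

    @Override
    public Iterator<T> iterator() {
        // The anonymous class can see the private head field of the enclosing list.
        return new Iterator<T>() {
            private Node<T> current = head;

            @Override
            public boolean hasNext() {
                return current != null;
            }

            @Override
            public T next() {
                if (current == null) {
                    throw new NoSuchElementException();
                }
                T value = current.value;
                current = current.next; // one link per call, no walk from the head
                return value;
            }
        };
    }

    // Uses an enhanced for-loop over the custom list.
    static String join(SimpleLinkedList<String> list) {
        StringBuilder out = new StringBuilder();
        for (String item : list) {
            if (out.length() > 0) {
                out.append(",");
            }
            out.append(item);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> list = new SimpleLinkedList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(join(list)); // a,b,c
    }
}
```

Because each call to next() follows exactly one link, a full pass over n elements costs n steps in total.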
Why wouldn’t we just use ArrayList everywhere if this was the case though? You can control whether or not your API uses ArrayList, sure, but that won’t stop other engineers from using some other kind of implementation in theirs, so guard yourself against these types of caveats by using the right loops.
What are some other ways we could achieve the same kind of results without using for-loops? JDK 8 introduced lambdas and Streams, which aim to vastly improve our code by allowing us to manipulate structures by passing functions to them to perform work. Iterable also gained a forEach method in the same release, so we can write code like this to do the same kind of work:
nums.forEach((num) -> System.out.println(num));
This is a lambda that just calls System.out.println on a variable called num, which is bound to the next element in the List on each invocation. We can also simplify this code even further with some syntactic sugar, a method reference:
nums.forEach(System.out::println);
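
The forEach calls above cover the print-everything case. Streams become more useful once there is filtering or transformation to do. Here is a minimal sketch, assuming a hypothetical doubledEvens helper invented for this illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the original example.
class StreamSketch {

    // Keeps the even numbers and doubles them, with no index or loop variable in sight.
    static List<Integer> doubledEvens(List<Integer> nums) {
        return nums.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(doubledEvens(nums)); // [4, 8]
    }
}
```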
Apart from enhanced for-loops and lambdas, the only other thing I could recommend is using the actual Iterator interface in conjunction with a while loop:
Iterator<Integer> numIterator = nums.iterator();
while (numIterator.hasNext()) {
    System.out.println(numIterator.next());
}
This has fewer issues than a C-Style for-loop does, but it can still cause problems if .next is called more than once within the while block. Nevertheless, it’s just as efficient as an enhanced for-loop. If I remember correctly, the Java Language Specification defines an enhanced for-loop over an Iterable in terms of its Iterator, so the two compile down to essentially equivalent bytecode.
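
To show what that equivalence looks like, here is a minimal sketch that writes the same sum twice, once with an enhanced for-loop and once in roughly the Iterator form it expands to. The class and method names are hypothetical. The last method demonstrates the double-next pitfall mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical demo class; names are illustrative only.
class ForEachDesugared {

    static int sumEnhanced(List<Integer> nums) {
        int total = 0;
        for (Integer num : nums) {
            total += num;
        }
        return total;
    }

    // Roughly what the enhanced for-loop above expands to for an Iterable.
    static int sumExplicit(List<Integer> nums) {
        int total = 0;
        for (Iterator<Integer> it = nums.iterator(); it.hasNext(); ) {
            Integer num = it.next();
            total += num;
        }
        return total;
    }

    // Pitfall: a second next() that is not guarded by hasNext().
    static boolean doubleNextThrows(List<Integer> nums) {
        Iterator<Integer> it = nums.iterator();
        try {
            while (it.hasNext()) {
                it.next();
                it.next(); // runs off the end when the size is odd
            }
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> nums = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(sumEnhanced(nums));      // 15
        System.out.println(sumExplicit(nums));      // 15
        System.out.println(doubleNextThrows(nums)); // true
    }
}
```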
So now that I’ve gone on and on about how you shouldn’t use a C-style for-loop, let’s talk about when you should actually use one.
There might be some cases when you simply cannot use an enhanced for-loop, or where the other methods I outlined don’t make logical sense. Maybe your code has to execute a finite number of times, and it’s not based on a collection or data structure of any kind. Maybe it’s based on some count, or a requirement stating that something needs to be done X times.
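
Here is a minimal sketch of that kind of count-driven work, where no collection is involved and a counter is the natural fit. The CountedLoop class and both of its methods are hypothetical examples, not taken from a real code base.

```java
import java.util.function.IntPredicate;

// Hypothetical demo class; names are illustrative only.
class CountedLoop {

    // Repeats a unit a fixed number of times; there is nothing to iterate over.
    static String repeat(String unit, int times) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < times; i++) {
            out.append(unit);
        }
        return out.toString();
    }

    // Tries an operation up to maxAttempts times and returns the attempt
    // number that succeeded, or -1 if none did.
    static int attemptsUntilSuccess(int maxAttempts, IntPredicate succeedsOn) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (succeedsOn.test(attempt)) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3));                              // ababab
        System.out.println(attemptsUntilSuccess(5, attempt -> attempt == 3)); // 3
        System.out.println(attemptsUntilSuccess(2, attempt -> attempt == 3)); // -1
    }
}
```

In both methods the counter is the data: it is the number of repetitions or the attempt number, not a position in some other structure.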
I spoke about time complexity in this post, but truth be told, enhanced for-loops are not the single most efficient loop there is. In cases involving the need to iterate over data, yes, enhanced for-loops are your best bet, but they come at a slight performance cost of needing to keep track of the state of the Iterator, as well as the additional calls to get the info out of the Iterator. For applications where performance is critical, both in terms of memory and time consumption, it might not be wise to utilize Iterators or enhanced for-loops.
Thanks for letting me air out my grievances about one of our oldest language constructs! If you enjoyed this post, I’d love to hear your feedback on Twitter or Dev! Be sure to check out my blog for other posts like this!