FirstOrDefault() is a Code Smell

Conceptually, every computer program is a series of data transformations. Most algorithms take some sequence of data as input, massage it in some way and output another sequence. These transformations are often expressed with the canonical functions Filter, Map and Reduce or, in C# terms, the equivalent IEnumerable extension methods: Where(), Select() and Aggregate().
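
As a minimal sketch of that Filter/Map/Reduce correspondence, the snippet below chains the three extension methods. The class and method names (PipelineDemo, SumOfEvenSquares) are made up for illustration only.

```csharp
using System.Linq;

public static class PipelineDemo
{
    // Sums the squares of the even numbers in the input.
    public static int SumOfEvenSquares(int[] numbers) =>
        numbers
            .Where(n => n % 2 == 0)              // Filter: keep the even numbers
            .Select(n => n * n)                  // Map: square each of them
            .Aggregate(0, (sum, n) => sum + n);  // Reduce: add everything up
}
```

For the input { 1, 2, 3, 4, 5 } the even numbers are 2 and 4, their squares are 4 and 16, so SumOfEvenSquares returns 20.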

There are also cases, though, when an algorithm needs to find a single element in the input collection based on a logical condition. Here, too, there are several options to consider. However, I think many developers reach for just one of them pretty much mechanically: the IEnumerable extension method FirstOrDefault(). I see that as a bad habit. We should be more thoughtful when selecting the right method for the task at hand. In the following sections, I will introduce FirstOrDefault() along with some alternatives that may prove to be a better choice in particular circumstances. Concretely, the alternatives I’m about to examine are First(), SingleOrDefault() and Single().

Note: All of the methods presented in this article have two overloads that concern us here: one that takes no arguments besides the source collection (also known as a “niladic” method) and one that takes a single argument (a “monadic” method) of type Func<TSource, bool>, a predicate that the collection elements are checked against. The examples below use the monadic version, but the statements hold for the parameterless overload as well, without loss of generality. I will also skip the asynchronous versions as they don’t change anything in the semantics.

First, let’s define a simple Person class to work with:

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

and a small collection People of Person objects as test data:

private static IEnumerable<Person> People => new[]
{
    new Person{Id = 1, Name = "John"},
    new Person{Id = 2, Name = "Alice"},
    new Person{Id = 3, Name = "John"}
};

Using this simple setup, let’s move to presenting each of the methods individually.

FirstOrDefault()

Signature
public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) 

FirstOrDefault() returns the first element that satisfies the predicate. If none of the elements matches the predicate, the method returns the default value for the element type of the collection, e.g. null for reference types, 0 for integers, etc.

Example
People.FirstOrDefault(p => p.Name == "John"); // John with Id 1
People.FirstOrDefault(p => p.Name == "Bob"); // null

The first call returns the first person named John. The second call looks for a Person named Bob. No such person exists, so the method returns null, which is the default value for any reference type.

It’s worth mentioning one detail regarding performance here. When the first FirstOrDefault() call above reaches the first John, the method returns without traversing the rest of the collection. This can be beneficial in some cases, but it should not be the only reason to choose FirstOrDefault() if there is a better logical alternative.
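
One way to observe this early exit is to count how often the predicate is invoked. The following is only a sketch: the ShortCircuitDemo class and its CountInspected helper are hypothetical names, and the test data repeats the People collection defined earlier.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ShortCircuitDemo
{
    private static IEnumerable<Person> People => new[]
    {
        new Person { Id = 1, Name = "John" },
        new Person { Id = 2, Name = "Alice" },
        new Person { Id = 3, Name = "John" }
    };

    // Returns how many elements FirstOrDefault() inspected while searching for the name.
    public static int CountInspected(string name)
    {
        var inspected = 0;
        People.FirstOrDefault(p =>
        {
            inspected++;            // runs once for every element the method looks at
            return p.Name == name;
        });
        return inspected;
    }
}
```

CountInspected("John") returns 1 because the very first element matches, CountInspected("Alice") returns 2, and CountInspected("Bob") returns 3 because the whole collection has to be scanned before the default value is returned.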

First()

Signature
public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

First() is stricter than FirstOrDefault(). It also returns the first element matching the predicate, but the difference shows when no element satisfies the condition: in that case it throws an InvalidOperationException.

Example
People.First(p => p.Name == "John"); // John with Id 1
People.First(p => p.Name == "Bob"); // throws System.InvalidOperationException

SingleOrDefault()

Signature
public static TSource SingleOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

SingleOrDefault() is similar to FirstOrDefault(). The difference in behavior is that it throws an InvalidOperationException if more than one element matches the predicate.

Example
People.SingleOrDefault(p => p.Name == "Alice"); // Alice with Id 2
People.SingleOrDefault(p => p.Name == "Bob"); // null
People.SingleOrDefault(p => p.Name == "John"); // throws System.InvalidOperationException

Single()

Signature
public static TSource Single<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

Single() is the most restrictive method of all. It returns a result only if exactly one element matches the predicate. If no element is found, or if more than one element matches, Single() throws an InvalidOperationException.

Example
People.Single(p => p.Name == "Alice"); // Alice with Id 2
People.Single(p => p.Name == "George"); // throws System.InvalidOperationException
People.Single(p => p.Name == "John"); // throws System.InvalidOperationException
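
To see the four methods side by side, the sketch below runs each of them against the same predicate and reports whether it returned an element, returned null or threw. LookupComparison, Describe and Compare are illustrative names, not part of LINQ, and the test data repeats the People collection from above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class LookupComparison
{
    private static IEnumerable<Person> People => new[]
    {
        new Person { Id = 1, Name = "John" },
        new Person { Id = 2, Name = "Alice" },
        new Person { Id = 3, Name = "John" }
    };

    // Runs a lookup and describes its outcome as text.
    private static string Describe(Func<Person> lookup)
    {
        try
        {
            var person = lookup();
            return person == null ? "null" : $"{person.Name} (Id {person.Id})";
        }
        catch (InvalidOperationException)
        {
            return "throws";
        }
    }

    // Outcome of all four methods for the given name, in a fixed order.
    public static string Compare(string name)
    {
        Func<Person, bool> match = p => p.Name == name;
        return string.Join("; ", new[]
        {
            "FirstOrDefault: " + Describe(() => People.FirstOrDefault(match)),
            "First: " + Describe(() => People.First(match)),
            "SingleOrDefault: " + Describe(() => People.SingleOrDefault(match)),
            "Single: " + Describe(() => People.Single(match))
        });
    }
}
```

With the test data, Compare("Alice") reports "Alice (Id 2)" for all four methods. Compare("Bob") yields "FirstOrDefault: null; First: throws; SingleOrDefault: null; Single: throws", and Compare("John") yields "FirstOrDefault: John (Id 1); First: John (Id 1); SingleOrDefault: throws; Single: throws".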

Why is FirstOrDefault() a Code Smell?

A “code smell” is something you should pay extra attention to when you see it. This doesn’t mean it is always wrong or necessarily represents bad design. There is nothing wrong with FirstOrDefault() per se and it definitely has its use cases, but I do believe we should be careful when we encounter it, and the reason is quite simple. What percentage of real-world use cases logically need only the first element of a lookup? In other words, how often do we know that potentially many elements match our condition, yet intentionally take only the first one and ignore the rest? I think far less often than the actual usage of FirstOrDefault() that I’ve seen.

A single element lookup should be a lot more common. An example is looking something up by its unique identifier, which we do all the time. This can be a user, a product, a book or a thousand other things. In those cases, we don’t expect more than one element. We either find the thing we are looking for and get a single element, or we get nothing if the element is missing. Either way, FirstOrDefault() doesn’t make much sense. Single() or SingleOrDefault() would be a lot more appropriate.

One usage pattern is a direct sign that FirstOrDefault() should be replaced with First() or Single(): the method is called, but the result is never checked against the default value. Instead, we immediately access a member of the returned object. For example:

var person = People.FirstOrDefault(p => p.Name == "John");
var name = person.Name;

This piece of code is problematic in either of two possible situations. First, it may actually be possible for the person not to be found, in which case we get a NullReferenceException. The fix is to check the result against null before using it. Second, the system state may guarantee that we always find a person. In that case we should use a more assertive method such as First() or Single() to express that expectation clearly.
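
Both situations can be sketched as follows. The NameLookup class, its two methods and the fallback parameter are hypothetical, and the lookup is done by Id here on the assumption that Id is unique.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class NameLookup
{
    // Situation 1: a missing person is a legitimate outcome,
    // so the default value is checked before the result is used.
    public static string FindNameOrFallback(IEnumerable<Person> people, int id, string fallback)
    {
        var person = people.SingleOrDefault(p => p.Id == id);
        return person == null ? fallback : person.Name;
    }

    // Situation 2: the person must exist. Single() throws an
    // InvalidOperationException if that expectation is ever violated.
    public static string GetName(IEnumerable<Person> people, int id) =>
        people.Single(p => p.Id == id).Name;
}
```

The second variant fails at the lookup itself with a message about the sequence, rather than with a NullReferenceException on some later line.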

Why is FirstOrDefault() Overused?

So, why do developers tend to use FirstOrDefault() a little too often? Here are some reasons…

Data Duplication

Sometimes we just get duplicated data in our system. The truth is that without hard constraints at the storage level, we will eventually get duplicates. In those cases, although logically we are searching for a single record, we have to use FirstOrDefault(), because SingleOrDefault() would throw an exception on a duplicated record. The recommendation here is quite obvious: try to introduce stronger consistency guarantees at the database level. Of course, this is easier said than done, as it may require some daunting cleanup tasks upfront. Aside from that, I would recommend logging a warning message every time you encounter more than one record where exactly one is expected logically. Hopefully this will make the data duplication problem more visible so you can deal with it in time.
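
A possible shape for such a "tolerant" lookup is sketched below. FirstOrDefaultWithWarning is not a LINQ method but a hypothetical extension, and the logWarning callback stands in for whatever logger the application actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TolerantLookup
{
    // Returns the first match (or the default value) and reports a warning
    // when a second match exists, instead of throwing like SingleOrDefault().
    public static T FirstOrDefaultWithWarning<T>(
        this IEnumerable<T> source,
        Func<T, bool> predicate,
        Action<string> logWarning)
    {
        // Take(2) stops enumerating as soon as a second match has been found.
        var matches = source.Where(predicate).Take(2).ToList();
        if (matches.Count > 1)
        {
            logWarning($"Expected at most one matching {typeof(T).Name} but found more; using the first.");
        }
        return matches.FirstOrDefault();
    }
}
```

With the test data from above, People.FirstOrDefaultWithWarning(p => p.Name == "John", Console.WriteLine) would return the John with Id 1 and write one warning, while the same call for Alice or Bob would log nothing.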

Fear

What’s one of the biggest fears for developers? That’s midnight calls from management getting you on the line with fifteen more individuals you’ve never heard of before. They all expect you to fix the “showstopper” ASAP. You check the logs and immediately see this:

Unhandled Exception: System.InvalidOperationException: Sequence contains more than one matching element
   at System.Linq.Enumerable.SingleOrDefault[TSource](IEnumerable`1 source, Func`2 predicate)

You used SingleOrDefault() with the best of intentions, to express your contract explicitly. But this got you into trouble. Does it mean you should go back to using FirstOrDefault() everywhere? I don’t think so. You just have to reconsider the guarantees you really have in your system. As already mentioned, if you don’t have a database constraint, you will get duplicates at some point. In these cases just use FirstOrDefault() and log a warning when duplicates show up. If you do have the uniqueness guarantees, stick with the “single” semantic methods. Remember: it’s fine to be defensive, but being overly defensive will harm your codebase.

Ignorance

On a Friday afternoon you may prefer to think about the great weekend ahead of you rather than the best method to use. I get it, but this doesn’t change the fact that your code is your artifact. It shows your professionalism and your attention to detail. You may not find the best method to use on a Friday afternoon, but you had better make sure you do that first thing on Monday morning.

Summary

In this article I closely examined the pitfalls of FirstOrDefault() alongside three other methods that may be a better choice depending on the use case – namely First(), SingleOrDefault() and Single(). I hope I managed to convince you that you should be extra careful when you come across FirstOrDefault() in your codebase and probably consider substituting it with a more appropriate method that would convey your contract more naturally.

I would even challenge you to do a search through your codebase for FirstOrDefault and examine the first ten usages. I’m pretty sure at least five of them would be better off with SingleOrDefault(). Let me know if I’m right in the comments section below.

Thanks for reading!

André Vieira
4 years ago

Good read, mate.

You’ve touched on it when describing FirstOrDefault(), but I think it’s worth highlighting the performance implication of SingleOrDefault(): as it needs to verify the uniqueness of the given element in the enumerable, it would have to iterate it all — in the worst case — after finding the first match, checking for a second occurrence.

When dealing with large collections, that may be deemed as not affordable, even though it offers the most logical sense.

Anyway, thanks for the food for thought.

André Vieira
4 years ago

“Have you been in a position where using FirstOrDefault rather than SingleOrDefault helped a lot with a performance issue?”

No, actually.

I was just making an exercise; someone dealing with a very large collection through LINQ to Objects could end up avoiding to employ SingleOrDefault() because of the performance penalty. (Although I tend to agree with your “bigger fish to fry” remark.)

Yes, asymptotically speaking, both are O(n), but average cases in real-world examples could show gains in the order you hypothesized (“10 times faster for example”).

Anyway, edge case exercise notwithstanding, your point regarding the unrestricted usage of FirstOrDefault() that we see in practice is absolutely correct.
