List vs set - What's the difference?

Asked by AashnaSaito in Salesforce, asked on Feb 23, 2023

 I have always assumed that it is better to use a Set in my query filters instead of a List. For example:

Set<Id> parentIds = generateParentIds();
List<Child__c> children = [SELECT Id FROM Child__c WHERE Parent__c IN :parentIds];
My reasoning was:

Set can be more easily (and quickly) generated from a Map via keySet().

Set ought to consume less memory than List (no longer sure this is true).

null values can more easily be removed from a Set.

Set cannot contain any duplicate values.

What are the actual benefits of using a Set instead of a List? What are the drawbacks? What factors help decide? For example, I know that if the field is nillable, the ability to remove null matters much more, because it helps avoid a table scan.
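A minimal sketch of the de-duplication and null-removal points from the question. It uses the standard Contact object and its OwnerId field as hypothetical stand-ins for Child__c and Parent__c, purely so that it can run as anonymous Apex without custom objects; the record values are made up for illustration.

```apex
// Hypothetical in-memory records; nothing is inserted into the database.
Id me = UserInfo.getUserId();
List<Contact> contacts = new List<Contact>{
    new Contact(LastName = 'A', OwnerId = me),
    new Contact(LastName = 'B', OwnerId = me), // same owner as 'A'
    new Contact(LastName = 'C')                // OwnerId left null
};

// Collect the lookup values; the duplicate owner collapses to one entry.
Set<Id> ownerIds = new Set<Id>();
for (Contact c : contacts) {
    ownerIds.add(c.OwnerId);
}

// A single call drops the null entry if there is one.
ownerIds.remove(null);

// Bind the cleaned set into the filter, as in the question.
List<Contact> owned = [SELECT Id FROM Contact WHERE OwnerId IN :ownerIds LIMIT 10];
```

With a List, the same clean-up would need an explicit loop or an intermediate Set, which is the convenience the question is describing.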

List vs set
Use a List when:
Ordering is important
You want to access elements by index (zero-based)

You want to be able to sort the elements

You need multiple dimensions
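The List-specific points above can be sketched as follows. The variable names and values are hypothetical and only serve to illustrate index access, sorting, array notation and nesting.

```apex
List<Integer> scores = new List<Integer>{30, 10, 20};

// Zero-based index access; array syntax and get() read the same element.
Integer first = scores[0];
Integer alsoFirst = scores.get(0);

// Insertion order is kept until the list is sorted in place.
scores.sort();

// Array notation can be used interchangeably with List<Integer>.
Integer[] asArray = scores;

// Multiple dimensions through nesting.
List<List<Integer>> grid = new List<List<Integer>>{
    new List<Integer>{1, 2},
    new List<Integer>{3, 4}
};
Integer cell = grid.get(1).get(0);
```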

Use a Set when:

Ordering is not important

You want the elements to be unique (as defined by the field values for sObjects, or by the equals and hashCode methods for other types)

You only need to iterate over all the elements or test whether a given element is present in the collection. See note* on ordering.

You want to do set-wise operations, such as union (addAll), intersection (retainAll), and relative complement (removeAll)

You already have a Map where the keys are the values you want, allowing you to use Map.keySet() rather than iterating and adding. This can be useful when combined with a SOQL query that directly populates the Map.
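The set-wise operations and the Map.keySet() shortcut listed above might look like this in practice. This is a sketch with hypothetical names and values; note that addAll, retainAll and removeAll modify the set they are called on, so each result starts from a copy.

```apex
Set<String> a = new Set<String>{'red', 'green', 'blue'};
Set<String> b = new Set<String>{'blue', 'yellow'};

// Union: copy a, then add everything from b.
Set<String> unionSet = new Set<String>(a);
unionSet.addAll(b);

// Intersection: keep only the elements that are also in b.
Set<String> intersection = new Set<String>(a);
intersection.retainAll(b);

// Relative complement: remove everything that is in b.
Set<String> difference = new Set<String>(a);
difference.removeAll(b);

// Membership test, the other operation a Set is well suited to.
Boolean hasGreen = a.contains('green');

// Keys of an existing Map are already available as a Set.
Map<String, Integer> stock = new Map<String, Integer>{'red' => 5, 'blue' => 0};
Set<String> stockedColors = stock.keySet();
```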

One important distinction is that a Set is an "unordered collection of elements", whereas a List maintains the order in which elements were added and can be sorted. This matters whenever you want to work on records in a particular sequence. It also means a List can be accessed by index, whereas a Set can only be iterated over. Array notation can be used interchangeably with a List.

A set is an unordered collection—you can’t access a set element at a specific index. You can only iterate over set elements. Source

This partially comes down to the Set using the equals and hashCode methods to determine uniqueness. I've been caught out a few times with Sets and Map keys where a field change on an sObject changes its hashCode. As a result, the same record can be added to the Set or the Map keys a second time. So relying on a Set to remove duplicates is unreliable if field values on the sObjects are modified in between. The same is true if you implement hashCode yourself on a custom class.
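A small sketch of that pitfall, using in-memory Account records with made-up names. The first part shows field-based uniqueness; the second part mutates a record after it has been added. The debug statements are left without expected values on purpose: run it in your own org to see how the mutated record is treated.

```apex
// Two distinct instances with identical field values count as one element.
Set<Account> accounts = new Set<Account>();
accounts.add(new Account(Name = 'Acme'));
accounts.add(new Account(Name = 'Acme'));
System.debug('Unique accounts: ' + accounts.size());

// Mutating a field after insertion changes the value the hash is based on.
Account acct = new Account(Name = 'Globex');
Set<Account> tracked = new Set<Account>{ acct };
acct.Name = 'Globex Corp';

// The set may no longer recognise the record it already holds...
System.debug('Still found: ' + tracked.contains(acct));

// ...in which case adding it again grows the set instead of being ignored.
tracked.add(acct);
System.debug('Size after re-adding: ' + tracked.size());
```

De-duplicating on a value that does not change, such as the record Id in a Set<Id>, sidesteps the problem.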

Unlike most of Apex, a Set of Strings is case-sensitive. From the docs:

If the set contains String elements, the elements are case-sensitive. Two set elements that differ only by case are considered distinct.
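For illustration, a short sketch of what that means for de-duplication. The values are hypothetical, and lower-casing before adding is just one possible way to get case-insensitive behaviour.

```apex
// Three spellings that differ only by case are three distinct elements.
Set<String> names = new Set<String>{'Acme', 'ACME', 'acme'};
Integer distinctCount = names.size();

// Lookups are case-sensitive as well.
Boolean foundMixedCase = names.contains('aCmE');

// Normalising the case first collapses them into a single element.
Set<String> normalized = new Set<String>();
for (String n : names) {
    normalized.add(n.toLowerCase());
}
```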

As for memory usage, the definitive answer would be to monitor the heap size while using a Set and a List in turn. That would give the best indication for your specific scenario.
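One way to do that is to compare the heap before and after building each collection. This is only a rough sketch: the element count and the 'id-' values are arbitrary assumptions, and Limits.getHeapSize() reports an approximate figure, so treat the numbers as indicative.

```apex
Integer n = 10000; // hypothetical collection size; adjust to your scenario

// Heap consumed while building the List.
Integer before = Limits.getHeapSize();
List<String> asList = new List<String>();
for (Integer i = 0; i < n; i++) {
    asList.add('id-' + i);
}
Integer listCost = Limits.getHeapSize() - before;

// Heap consumed while building a Set with the same values.
before = Limits.getHeapSize();
Set<String> asSet = new Set<String>();
for (Integer i = 0; i < n; i++) {
    asSet.add('id-' + i);
}
Integer setCost = Limits.getHeapSize() - before;

System.debug('List: ' + listCost + ' bytes, Set: ' + setCost + ' bytes');
```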

It's been a while since I've done a data structures course. It looks like the Apex Set is based on Java's HashSet:

Unlike Java, Apex developers do not need to reference the algorithm that is used to implement a set in their declarations (for example, HashSet or TreeSet). Apex uses a hash structure for all sets. Source

Quick empirical test (each snippet run on its own, since both declare ids):

List<String> ids = new List<String>{'foo','bar'};
System.debug('Heap: ' + Limits.getHeapSize() + '/' + Limits.getLimitHeapSize());
// Output: Heap: 1065/6000000

Set<String> ids = new Set<String>{'foo','bar'};
System.debug('Heap: ' + Limits.getHeapSize() + '/' + Limits.getLimitHeapSize());
// Output: Heap: 1065/6000000

This would suggest that there is no difference in heap usage between the two. I suspect this is due to the negligible size of my collections; there could also be some form of optimization going on for small sets.

* Up until Summer '15 the ordering of items in Sets was non-deterministic. See Predictable Iteration Order for Unordered Collections.


