This library provides alternative collection interfaces and implementations other than the ones shipped with the
Java Collections Framework in the java.util
package. The library follows a different set of principles,
which are listed below.
The collection library provides fine-granular interfaces for the collection classes. Interfaces are separated in read-only and modifyable parts which makes it possible to take a collection with its read-only interface as a method argument signaling that the method is not going to modify the argument's contents. For each read-only interface there is also a immutable variant which goes a step further in guaranteeing that the collection content never changes. This guarantees thread-safety and allows for easy sharing of collection instances.
Classes implement all the methods of their interfaces (no throwing of UnsupportedOperationException
).
The only exception to this is Iterator
-- we use java.util.Iterable
to take advantage of the for-each loop --
which does not implement the remove()
operation if returned by a read-only interface (some collections implement a modifyingIterator()
method which allows removal during iteration).
Each collection provides only those operations which it can implement efficiently. This generally means in O(log n)
where n
is the number of elements in the collection (or O(m * log(n + m))
for operations that take m
elements
as argument). This helps writing programs that scale well by forcing to choose the collection that supports the
needed operations efficiently.
The implementations provide reasonable compromises between memory and CPU consumption. It is probably not the fastest nor the most memory efficient out there but up to a certain level both goals are well aligned and the implementation are quite efficient in both dimensions.
The collections generally cannot contain null
elements. null
is used in some places to signal the absence of
an element. Allowing null
as element whould make these conditions ambiguous.
The library has been developed with Java 8 which is the minimum version needed. Due to the backwards-compability, it should also work with higher versions of Java -- this has not been tested though. If you run into issues please report them.
The only dependency is the JSR305 annotations which are mainly used to document where null
is allowed or returned.
Unit tests are written using JUnit 5, Mockito and Truth.
Collection
is the base interface for all the single-dimension collections. Basically, it just allows to query for the
size and to iterate its elements.
Collections are divided into those which have a well-defined order of the elements (OrderedCollection
) and those
that do not (UnOrderedCollection
). Note that these interfaces define exactly how equals
must be implemented.
An ordered collection can never be equal to an unordered one and vice versa. When writing APIs this property is usally
essential (especially for testing and mocking), therefore a sub-type of OrderedCollection
or UnOrderedCollection
should be preferred over using Collection
.
The interfaces OrderedSet
, Sequence
, List
, Set
and Bag
are the main interfaces to choose from. They provide
different sets of operations, so you want to prefer one over the other based on your needs. If all you
need to do is collecting elements that you will later iterate over and you do not care about duplicates, the
ArrayList
is your friend. If you want to eliminate duplicates, it is most likely the HashSet
. For more advanced
uses see the following table.
Name | Ordered | Duplicates | Containment Check | Natural Order | Get by Index | Insert at Front |
---|---|---|---|---|---|---|
ArrayList | Yes | Yes | No | No | Yes | No |
HashBag | No | Yes | Yes | No | No | No |
HashList | Yes | Yes | Yes | No | Yes | No |
HashSet | No | No | Yes | No | No | No |
IndexedHashSet | Yes | No | Yes | No | Yes | No |
TreeList | Yes | Yes | Yes | No | Yes | Yes |
TreeSequence | Yes | Yes | Yes | Yes | No | No |
TreeSet | Yes | No | Yes | Yes | No | No |
LinkedSequence | Yes | Yes | No | No | No | Yes |
IntrusiveLinkedSequence | Yes | Yes | No | No | No | Yes |
'Natural Order' means that the elements are always kept ordered according to a given ordering relation.
The TreeList
has the additional advantage that is supports insertion (and removal) at an arbitrary index,
not just at front and back like the LinkedSequence
. The IntrusiveLinkedSequence
allows to
customize the link objects that compose its doubly linked list and removal by pointer to the link.
Not listed above is the ConcurrentIntrusiveLinkedSequence
which has the same properties as IntrusiveLinkedSequence
but supports concurrent operations. All the other classes above are not thread-safe, you need to properly guard them
against concurrent access if multiple threads read or modify them.
One thing that you might find odd at first when used to work with the Java Collections Framework is that
interfaces like List
or Set
do not provide any modification methods to add, remove elements etc.
This is on purpose. To be able to modify a collection you need to have a reference to the concrete type.
So, instead of
public class NameRepository {
private final Set<String> usedNames = new HashSet<>();
public void registerName(String name) {
if (!usedNames.add(name)) { // Ooops, no 'add' on Set...
throw new IllegalStateException("Already registered");
}
}
}
write
public class NameRepository {
private final HashSet<String> usedNames = new HashSet<>();
public void registerName(String name) {
if (!usedNames.add(name)) { // ... but there is on HashSet!
throw new IllegalStateException("Already registered");
}
}
}
This allows you to limit the code that has write-access to the collection as strictly as possible, while still allowing to pass a collection around:
public class NameRepository<Foo> {
//...
List<String> getAllNamesAlphabeticallySorted() {
// Passes 'usedNames' as read-only interface to the 'sort' method.
return CollectionUtil.sort(usedNames);
}
}
In this example CollectionUtil.sort
takes a Collection
argument indicating that it does not need
to modify its contents (though it would be principally possible by casting it). Instead, it creates
and returns a new ArrayList
to hold the sorted elements.
Another way to create collection instances is through their builders. A builder can be passed around
if elements need to be collected from different places. Most collections have a static newBuilder()
method
that returns a CollectionBuilder
for the collection.
An example collecting all permutations of a string in a recursive implementation:
class Permutations {
public static ImmutableSet<String>
74B7
permutations(String s) {
ImmutableHashSet.Builder<String> permutations = ImmutableHashSet.newBuilder();
permutations("", s, permutations);
return permutations.build();
}
private static void permutations(String prefix, String s, CollectionBuilder<String, ?> builder) {
if (s.length() == 1) {
builder.add(prefix + s);
} else {
for (int i = 0; i < s.length(); ++i) {
permutations(prefix + s.charAt(i), s.substring(0, i) + s.substring(i + 1), builder);
}
}
}
}
Builders are the only way to create instances of any implementation of ImmutableCollection
. Immutable collections
are guaranteed to never change after creation. You can also use the static methods of ImmutableCollections
like listOf(...)
, setOf(...)
etc which create immutable collections indirectly by using the builders.
Maps are associative containers represented by the Map
interface. The two mutable main implementations are
HashMap
and TreeMap
which use hashing respectively a balanced tree to store the entries.
Implementations of MultiMap
and ListMultiMap
can be used when multiple values have to be associated
with the same key. MultiMap
stores the values in a set (so, no order and no duplicates) while ListMultiMap
stores the values in a list.
Note that for efficiency reasons, Map<K,V>
does not implementIterable<Entry<K,V>>
. This would require to
either permanently store Entry
objects or create them on-the-fly during iteration. To iterate all entries
in a map, use the EntryIterator<K,V>
returned by the entryIterator()
method.
A speciality of the library are the persistent collections -- collections which are immutable but it is possible
to efficiently create new collections based on an existing one with one element added, removed or changed! To
distinguish these operations from the ones of the mutable collections, they are called with
and without
instead
of add
or remove
for example.
public class SnapshotableNameRepository<Foo> {
private PersistentSet<String> usedNames = PersistentHashSet.empty();
public void registerName(String name) {
PersistentSet<String> updatedNames = usedNames.with(name);
if (usedNames == updatedNames) { // 'with' returns the same object if there was no change
throw new IllegalStateException("Already registered");
}
usedNames = updatedNames;
}
public ImmutableSet<String> snapshot() {
return usedNames; // No need to copy 'usedNames' as they are immutable!
}
}
Persistent implementations are significantly slower that their mutable counterparts. But all their operations
are still O(log n)
so they are much faster than creating copies when their size is large.
To interact with classes from java.util
, there are adapters in org.povworld.collection
that translate
between the different interfaces:
JavaAdapters
wraps collections of this library tojava.util.List
,java.util.Set
orjava.util.Collection
. Note that the wrappers are read-only.ListAdapter
andSetAdapter
wrapjava.util.List
andjava.util.Set
intoList
andSet
interfaces of this library.