Implementing and Using an (Array Backed) Bag

A bag (a.k.a. multi-set) is a simple data structure that stores things. That’s it. There is no effort to maintain a specific order, sort elements, or check to maintain a limit on the number of duplicate entries. This type of data structure is extremely useful because you can put objects in it, you can iterate over each element, and you can take stuff out of it. This implementation is backed by an Array, although some multi-sets basically implement a Map with object keys (elements) and int values (counters). While other Map-like implementations are helpful because they provide very fast methods to test for inclusion and obtain counts of specific elements, ArrayList-like bag implementations are helpful for providing even faster adds, removes, and iterations. And, although my example is a class, the general idea behind it is so simple that you can implement the idea using direct array manipulation with little effort.

Its strength comes from the fact that you do not need to maintain any specific order. When you remove an element from the beginning or middle of the collection, you can grab an element from anywhere to fill its spot rather than shifting every element that comes after it. If you maintain an array with no gaps, then you can easily find the last element simply by using “elements[size - 1]”. Just move the last element to the gap you create, null the slot of the old last element, and decrement the size variable. (Maintaining a gap-less array.)

Insertion can be done with one array element assignment. Removal can be done with two array element assignments and one read. Iteration via direct array access could also be implemented using a standard for loop through array elements.

import java.util.Arrays;
import java.util.Iterator;

public class Bag<E> implements Iterable<E>
{
  protected E arr[] = (E[])new Object[1024];
  protected int size = 0;

  public void add(E e)
  {
    if(size == arr.length) 
    {
      expand();
    }
    arr[size++] = e;
  }

  protected void expand()
  {
    Object temp[] = new Object[arr.length * 2];
    System.arraycopy(arr, 0, temp, 0, arr.length);
    arr = (E[])temp;
  }

  public void clear()
  {
    Arrays.fill(arr, null);
    size = 0;
  }

  public int size()
  {
    return size;
  }

  public void resize(int capacity)
  {
    Object temp[] = new Object[capacity];
    System.arraycopy(arr, 0, temp, 0, capacity);
    arr = (E[])temp;
    size = capacity;
  }

  public int getCapacity()
  {
    return arr.length;
  }

  public Iterator<E> iterator()
  {
    return new BagIterator();
  }

  private class BagIterator implements Iterator<E>
  {
    int i = 0;

    public boolean hasNext()
    {
      return i < size;
    }

    public E next()
    {
      return arr[i++];
    }

    public void remove()
    {
      arr[i - 1] = arr[size - 1];
      arr[size - 1] = null;
      i--;
      size--;
    }
  }
}

Design decisions for this class:

I chose to make this class as minimal as possible. There is no get method because there is no meaningful way to index elements and most useful operations will use an iterator instead of random access. If I needed specialized operations, then I would modify this class. There is no remove(o) method because it is more efficient to remove elements while iterating over the collection (like a LinkedList.) The Collection class’s remove(o) method has to find the object first whether it’s fast as in a HashSet or slow as in an ArrayList. If you’re implementing a similar class you could write such methods, but keep in mind it will force you to do a lot of redundant searching. I also exclude them for the sake of simplicity. There’s no need for remove(index) either for the same reason there is no need for get(index). Removal should be done through an Iterator. Clear is provided because it’s a special case and is such a common operation.

It’s expected you already know Java’s Iterator pattern and the advantage of using it over get(index)/remove(index) for anything besides ArrayLists. Since so many use cases in game programming involve visiting each element in a set of Objects exactly once per loop, using an Iterator is often more elegant and efficient. (Set with a small ‘s’. You can use a bag like a normal Set as a user of the class by taking care not to add duplicates just as you could with a regular array or ArrayList or by modifying the add method.) Consider this:

for(Iterator<Goomba> itr = bag.iterator(); itr.hasNext(); )
{
  Goomba g = itr.next();
  if(g.isSquished() || g.isOffScreen())
  {
     itr.remove();
     g.dispose();
     continue;
  }
  g.step();
  Object o = g.checkCollision();
  if(o == mario)
  {
    mario.damage();
  }
  else
  {
    g.reverse();
    g.step();
  }
}

This is often better than any Collection if you want to store elements with the compactness of an ArrayList, have fast removal as in a LinkedList, and need to pick a data structure for use in a game loop. Those of you using an ArrayList because you are too lazy to maintain element counts or worry about array capacities can also use this a generic container for objects. In fact, the point of this type of data structure is to provide a generic container with no special properties.

Final note:

This is meant to act as a reference. I could have chosen from many other variations of this data structure, but chose to write the simplest form for the sake of demonstration. I did not implement Collection because in practice you may only need the methods provided here. Someone else could modify it to conform with the Collections interface. You might not need it, but its wouldn’t be hard at all if you did. Sometimes I’m more likely to pass around Iterable references than Collections, since that’s the lowest level interface I really need. I did not include comments. The code should be clear and basically self documenting. Kappa also posted a thread titled The Bag (Fast Object Collection) but it was apparently older than 180 days and I thought it better to start a new topic with the longest post at the top. Check out kappa’s implementation for an alternative. His code also includes comments and JavaDoc.