"Better" LinkedList implementation

Hey, everyone.

Today I found myself in need of a least-recently-used (LRU) cache. Essentially, I wanted a cache of the last 10 000 used objects; when a new object was to be loaded, I wanted to unload the oldest element in the cache. Before, the cache was only 96 elements, at which point looping through the list to find the least recently used element was perfectly fine for the load I had, but with 10 000 objects that situation changed.

I found that LinkedHashMap could be used as a makeshift cache, as it can be set to update the internal order of elements when they are accessed, but it just seemed inefficient and weird for what I actually wanted to accomplish here. Some research made me realize that a simple linked list was a good fit for this, as a doubly linked list can easily be maintained in LRU order. Basically, whenever an element in the cache is used, it is moved to the beginning of the list (an O(1) operation on a linked list), meaning that the cache stores objects sorted from most recently used to least recently used. Then, to evict the least recently used object, I just remove the tail of the list, again an O(1) operation.

Now, I hope we all know how bad Java’s LinkedList class is. Not only does it generate garbage every time an object is added to it, but moving an object to the beginning of a list is an O(n) operation, as I’d need to use list.remove(object), which has to scan through the list until it finds the object in question and can remove it. A proper linked list implementation would store the previous and next references in the object itself, meaning that finding the previous and next objects becomes an O(1) operation, and the element can be removed without going through the entire list. So… I wrote a minimal linked list that could do what I needed and not much more.

As mentioned, the main difference is that my linked list class does not allocate Node objects to store the next and previous references, but instead stores those references in the elements themselves. This leads to a couple of peculiarities.

  • All elements must implement an interface that allows the list to set the previous and next pointers (or alternatively extend a class which implements those functions for it).
  • You can’t place the same element in a linked list twice, as each element can only track one previous and next neighbor.
  • For the same reason, you can’t even use the same element in two DIFFERENT lists at the same time.

In my case, none of these were an issue, so I went ahead and implemented it. I even added a dedicated method for moving an element to the start of the list, since that’s the hot operation for the LRU case. The class provides the following functions, all of which are O(1):

  • addFirst(e)
  • addLast(e)
  • moveToFirst(e)
  • remove(e)
  • removeFirst()
  • removeLast()

There is no random access function. The correct way of looping through the list is to simply call element.getNext() until it returns null (the current size of the list is tracked though). The somewhat complicated usage of generics is there to allow for type safety, both when extending the Element class and when working with FastLinkedList.

Usage example:



private class MyElement extends FastLinkedList.SimpleElement<MyElement>{ //Note this definition here, it's particularly clever =P
    ...
}


FastLinkedList<MyElement> list = new FastLinkedList<>();
list.addFirst(...);
...
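
Looping through the list then looks something like this (no iterator objects, no garbage):

for(MyElement e = list.getFirst(); e != null; e = e.getNext()){
    //do something with e
}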

Performance test on 100 000 randomly shuffled elements:

Adding 100 000 objects to the list:
LinkedList: 1.617 ms
FastLinkedList: 0.627 ms
~3x faster

Move 2000 random elements from their current position to the start of the list:
LinkedList: 203.315 ms
FastLinkedList: 0.118 ms
Insanely faster (O(n) -> O(1))

Remove 2000 random elements from the list:
LinkedList: 175.541 ms
FastLinkedList: 0.094 ms
Insanely faster (O(n) -> O(1))

Remove remaining 98 000 objects using removeFirst():
LinkedList: 0.298 ms
FastLinkedList: 2.78 ms
Noticeably slower in this test due to worse cache coherency. FastLinkedList needs to access each element’s data to find the next and previous node in the list. As the elements are shuffled, this ends up jumping around randomly in memory. Java’s LinkedList instead creates Node objects for each element, which end up sequentially in memory as they’re recreated each iteration. This is not an issue with real-life data, and is the cost of going garbage-less.

Code:

package engine.util;

import engine.util.FastLinkedList.Element; //this import is required

public class FastLinkedList<E extends Element<E>> {
	
	private E head, tail;
	private int size;
	
	
	public FastLinkedList() {}

	
	public void addFirst(E element){
		if(size == 0){
			head = element;
			tail = element;
		}else{
			head.setPrevious(element);
			element.setNext(head);
			head = element;
		}
		size++;
	}
	
	public void addLast(E element){
		if(size == 0){
			head = element;
			tail = element;
		}else{
			tail.setNext(element);
			element.setPrevious(tail);
			tail = element;
		}
		size++;
	}
	
	public void moveToFirst(E element){
		
		if(element == head){
			return;
		}
		
		E prev = element.getPrevious();
		E next = element.getNext();
		
		prev.setNext(next); //prev cannot be null thanks to if-statement above
		if(next != null){
			next.setPrevious(prev);
		}else{
			//element was tail, update the tail.
			tail = prev;
		}
		
		element.setPrevious(null);
		element.setNext(head);
		head.setPrevious(element);
		head = element;
	}
	
	public void remove(E element){
		E prev = element.getPrevious();
		E next = element.getNext();
		
		if(prev != null){
			prev.setNext(next);
		}else{
			//prev == null means element was head.
			head = next;
		}
		
		if(next != null){
			next.setPrevious(prev);
		}else{
			//next == null means element was tail.
			tail = prev;
		}
		
		element.setPrevious(null);
		element.setNext(null);
		
		size--;
	}
	
	public E removeFirst(){
		
		E h = head;
		
		
		E next = h.getNext();
		if(next != null){
			next.setPrevious(null);
		}else{
			//next and prev are null, list is now empty
			tail = null;
		}

		h.setNext(null);
		
		head = next;

		size--;
		
		return h;
	}
	
	public E removeLast(){
		
		E t = tail;
		
		E prev = t.getPrevious();
		if(prev != null){
			prev.setNext(null);
		}else{
			//next and prev are null, list is now empty
			head = null;
		}
		t.setPrevious(null);
		
		tail = prev;
		
		size--;
		
		return t;
	}
	
	public E getFirst(){
		return head;
	}
	
	public E getLast(){
		return tail;
	}
	
	public int size() {
		return size;
	}
	
	public boolean isEmpty(){
		return size == 0;
	}
	
	public String toString() {
		StringBuilder b = new StringBuilder("[");
		E e = head;
		for(int i = 0; i < size; i++){
			b.append(e.toString());
			if(i != size-1){
				b.append(", ");
			}
			e = e.getNext();
		}
		return b.append(']').toString();
	}
	
	public static interface Element<E extends Element>{
		
		public void setNext(E next);
		public void setPrevious(E previous);
		
		public E getNext();
		public E getPrevious();
	}

	public static class SimpleElement<E extends SimpleElement> implements Element<E>{
		
		private E next, previous;

		@Override
		public void setNext(E next) {
			this.next = next;
		}

		@Override
		public void setPrevious(E previous) {
			this.previous = previous;
		}

		@Override
		public E getNext() {
			return next;
		}

		@Override
		public E getPrevious() {
			return previous;
		}
	}
}
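
For reference, the cache side of things is then basically just a HashMap for key lookups plus this list for the LRU order. Here’s a simplified sketch of that logic (NOT my actual cache code, and it assumes it lives next to FastLinkedList; in my real code the cached objects carry the links themselves, but a small wrapper entry keeps the sketch self-contained):

import java.util.HashMap;
import java.util.Map;

public class LruCache<K, V> {
	
	//Each cache slot stores its key and value and carries its own list links.
	private static final class Entry<K, V> extends FastLinkedList.SimpleElement<Entry<K, V>> {
		final K key;
		V value;
		Entry(K key, V value){
			this.key = key;
			this.value = value;
		}
	}
	
	private final int capacity;
	private final Map<K, Entry<K, V>> map = new HashMap<>();
	private final FastLinkedList<Entry<K, V>> list = new FastLinkedList<>();
	
	public LruCache(int capacity){
		this.capacity = capacity;
	}
	
	//Returns the cached value and marks it as most recently used, or null on a miss.
	public V get(K key){
		Entry<K, V> e = map.get(key);
		if(e == null){
			return null;
		}
		list.moveToFirst(e); //O(1) thanks to the links stored in the entry itself
		return e.value;
	}
	
	//Inserts a value, evicting the least recently used entry when the cache is full.
	public void put(K key, V value){
		Entry<K, V> e = map.get(key);
		if(e != null){
			e.value = value;
			list.moveToFirst(e);
			return;
		}
		if(map.size() >= capacity){
			Entry<K, V> oldest = list.removeLast(); //least recently used
			map.remove(oldest.key);                 //unload it here
		}
		e = new Entry<>(key, value);
		map.put(key, e);
		list.addFirst(e);
	}
}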

Wouldn’t a circular array with a head and tail pointer suffice?

The tail pointer would point at the index of the least recently used (i.e. oldest) object.
The head pointer would point at the next available spot (or an occupied spot if it has looped all the way around).

When you remove an element, simply set its slot to null in the array (and make sure all operations have a null check).
To move an element from index N to the front, just save it to a temp var, set index N to null, then put the saved element at the head pointer and increment the head pointer.

Instead of the Element interface containing next and previous pointers, it could contain an index into the array so that lookups are O(1).

How does the performance compare to the magic ring buffer used in the Disruptor software?

As the article suggests, it has a circular buffer; you never remove elements, you just overwrite the oldest elements in the list.

The problem with a simple ring buffer is that it’s essentially a FIFO queue. When an element is accessed, I need to move it back to the start of the queue so it becomes the most recently used again, and I don’t think that’s doable in an array-based ring buffer. Removing an element would leave a null hole in the array. If you want to fill that hole, you need to shift up to half the elements (either the ones before or after it, so 0 to size/2 elements). If you just leave the hole there, you’ve essentially reduced the size of the cache until the head reaches that point and can fill it, right?

I imagine a pathological (but very common) case where the cache is full of objects, and then two elements are accessed repeatedly, causing them to be moved to the head of the list over and over. If each move leaves a null hole, then one element would be evicted from the cache every time an element is moved to the front of the list. =/ If I’ve misunderstood or missed something, please let me know.

Actually, yes, I believe you are correct, theagentd; there would be a hole in the cache until the pointer reaches that slot again.
I suppose the linked list is more suited to this problem.

Nice!
Are you thinking about a concurrency-safe version by chance, or is that definitely not an issue in your situation?

Multi-threading is on my mind as I just used a LinkedBlockingDeque for safely managing concurrent calls to an AudioCue class that allows concurrent playback. I’m only using the offerFirst and pollLast methods. Only a handful of instances per Cue play at one time, so efficiency is less of a concern than reliability.
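
In sketch form (using java.util.concurrent.LinkedBlockingDeque; the CueInstance name is just a placeholder for my own class, not the real AudioCue API):

LinkedBlockingDeque<CueInstance> pool = new LinkedBlockingDeque<>();

pool.offerFirst(instance);          //hand a finished instance back (thread-safe, no explicit locking)
CueInstance next = pool.pollLast(); //take the longest-idle instance, or null if none are free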

Not really, it’s not a requirement for my use case. It’d just get messy if I needed to support concurrency in the system I’m working on, really. You can always just synchronize on the whole class if you really wanted to.

The removeFirst() and removeLast() functions don’t check for empty lists yet. :point: ;D

I like your solution, especially because it’s an intrusive list: the node pointers are members of the list element itself.

I’d be inclined to have a reference in the Element to the List that owns it.

It’d allow fail-fast protection against multiple logic errors (adding an Element to multiple lists or to the same list multiple times, removing/moving an Element from a List it’s not a member of, etc.), making the code far less fragile.

Actually, you could make use of this. Just don’t move the object to the top of the list if it’s already in the upper half (or above some other threshold). If anything, just swap it with an object a couple of indices up in the buffer. This way you would only produce holes for the less used objects. Since you don’t need additional references, you can easily make up for the lost cache entries by increasing the array size.

Or you only move the head pointer for adding new objects and just swap existing objects up in the buffer. This would not produce any holes and while it is technically not an LRU list anymore, it might provide good enough heuristics to function as a cache.

NOTHING in this list checks for errors. I should’ve mentioned that in the original post I guess. xd

Add an already existing element again -> corrupt link references.
Add an element from another list -> corrupt the other list.
moveToFirst() on an object not in the list -> head is set to an object not in the list.
remove() an object not in the list -> corrupts the size counter and may corrupt link references.
removeFirst()/removeLast() on an empty list -> nuppo (NullPointerException).

I keep track of whether an element is in the list outside the list itself (in the cache), so such checking wasn’t necessary (and IMO it’s not the job of the list in the first place).

Yeah, that’d be how you add checking. Add a reference to the list the element is currently on, check this reference in add*(), moveToFirst() and remove(), and null out the reference on removal. Also add an isEmpty() check to the removeFirst()/removeLast() functions.
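
Something like this, roughly (just a sketch; the CheckedLinkedList/owner names are only illustrative, and the remaining methods would get the same treatment):

public class CheckedLinkedList<E extends CheckedLinkedList.CheckedElement<E>> extends FastLinkedList<E> {
	
	public static class CheckedElement<E extends CheckedElement<E>> extends FastLinkedList.SimpleElement<E> {
		CheckedLinkedList<E> owner; //the list this element is currently in, or null
	}
	
	@Override
	public void addFirst(E element){
		if(element.owner != null){
			throw new IllegalStateException("element is already in a list");
		}
		element.owner = this;
		super.addFirst(element);
	}
	
	@Override
	public void remove(E element){
		if(element.owner != this){
			throw new IllegalStateException("element is not in this list");
		}
		element.owner = null;
		super.remove(element);
	}
	
	@Override
	public E removeFirst(){
		if(isEmpty()){
			throw new java.util.NoSuchElementException("list is empty");
		}
		E e = super.removeFirst();
		e.owner = null;
		return e;
	}
	
	//addLast(), moveToFirst() and removeLast() would get the same checks
}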

I’m worried about experimenting too much with my use case. The cache is critical for performance, and if I were to start thrashing, the game would essentially freeze up. I could try some kind of system where you don’t move an accessed element to the head of the list, but just “upgrade” it a bit by swapping it with an element x indices up in the list. If x is randomly chosen for each swap, then the risk of hitting a pathological case would be low. Like you said, this wouldn’t be an LRU cache anymore, but some kind of heuristic priority cache, I guess. It’s an interesting idea, but I’m afraid I don’t have much more time to spend on this right now. =/

That’s why linked list elements should remember their own holder :wink:


	public static class SimpleElement<E extends SimpleElement> implements Element<E>{
+		private FastLinkedList<E> Holder;
 		private E next, previous;
	}

Or


	public static class LinkedListElement<E>{
		private FastLinkedList<E> Holder;
		private LinkedListElement<E> next, previous;
		public E ElementData;
	}

Instead of exposing the internal element class, wouldn’t it be better to keep a static pool of element objects in the linked list class? That’s better for the API and easily fixes the corruption problems. I’d like to see how that performs compared to the standard list.

A long time ago I did a linked list pool (but only for int values).

This was much quicker than object-based linked lists, though I was using it for multiple lists that were often merged/appended. That’s a bit different from the use-case here.

Does your LRU cache need to be perfect? If not, I’d just create buckets of items, and transfer items among buckets, which is cache friendly.

List<List> where the inner list is an ArrayList, and the outer list probably too (as that is still bound to be faster than the LinkedList, even on random insert/remove), given that you probably won’t have more than 100 buckets for 10 000 items.

It’s a good concept, but it’s not as easy to implement as a basic List.

As for the LRU cache: sadly it doesn’t help that much, because Java holds pointers to objects, and the objects themselves are scattered all around random places in memory. So iterating over an array of objects still jumps around in memory; it’s not cache friendly.

Also, this result is most likely because you use function calls instead of changing the fields directly like LinkedList does:


public E removeFirst(){
      
      E h = head;
      
      
      E next = h.getNext();
      if(next != null){
         next.setPrevious(null);
      }else{
         //next and prev are null, list is now empty
         tail = null;
      }

      h.setNext(null);
      
      head = next;

      size--;
      
      return h;
}

And LinkedList’s equivalent, for comparison:


private E unlinkFirst(Node<E> f) {
        // assert f == first && f != null;
        final E element = f.item;
        final Node<E> next = f.next;
        f.item = null;
        f.next = null; // help GC
        first = next;
        if (next == null)
            last = null;
        else
            next.prev = null;
        size--;
        modCount++;
        return element;
}

@SHC
A pooled Node version would probably be slightly worse performance-wise when looping over the list due to the Node object and the element object being in two different places in memory. Also, without exposing the pointers, it becomes a bit annoying to loop through the list, but that’s no real concern I guess.

@jono
Interesting, it’d be cool to see a comparison between that and my implementation.

@Riven
Hmm, how exactly would that work?

I think I understand what you need the list for: pathfinding.
I use a binary heap for this.
My 5-year-old binary heap code is not the best quality,
so I found a link on Google; it looks nice and clean (better than mine ^^).

To make it comparable it has to be turned into a doubly-linked list and be generalised to hold objects instead of just ints.

I tried this out, using elements with a “getID” method to enable the random access. It loses the cache coherency of storing data and references in one array, and it has more overhead from the extra book-keeping of the previous links. In the end it’s approximately 500x faster than java.util.LinkedList on the random-access operations but roughly the same on the removeFirsts; there is a lot of variance between tests on this, though.

In short, no good reason to move from your implementation. It moves the next/prev links out of the elements and into the single pool class, but at the cost of some efficiency.