Working with a custom General Tree... advice needed.

Demonpants · April 18, 2006, 3:55am

I have a game that uses an AI to find the fastest route from one Wikipedia page to another by cycling through links. This StorageTree does all of the logic, for the most part, and works except for one problem. Wikipedia is immense, and so recursive page loading effectively does not end until the goal is found or the timeout is reached. I’m fine with that, it’s all neccessary, of course, but there is one problem:

Currently as elements are added to my tree they are added in preorder - from the leftmost childnode to the parent nodes and finally to the right nodes. The problem is that I want to add pages on a levelorder basis so that I check every link on a site before moving on to sites further away from the original root. Basically I want to add items in the order of their depths in the tree, rather than the usual recursive way to do things.

I’ve thought of a few ways to do this, but this obviously needs to be efficient so it can juggle the most pages at once, and the ways I was thinking of involve a lot of wasted memory. Anyone have any ideas?


/**
 * The StorageTree is used by WikiRipper to save all possible moves
 * in an efficient and easily accessible manner. It is a basic general
 * tree with an array of nodes to each leaf. These nodes represent each
 * article, and their next nodes are all articles they are linked to. Upon
 * creation, the StorageTree constructs itself using timeout, start, and end
 * provided by the WikiRipper.
 * @author Eli Delventhal
 *
 */

import java.util.LinkedList;
import java.util.TreeSet;
import java.util.Iterator;

public class StorageTree implements java.io.Serializable
{
	private static final long serialVersionUID = 1L;
	private Node root;
	private int timeout;
	private long startTime;
	private String goal;
	private TreeSet added; //any article added already is put into this list
	private Node found; //if this node is not null then a goal has been found
	
	public StorageTree(int to, String start, String end)
	{
		//The start time is used to test when to stop loading
		startTime = new java.util.Date().getTime();
		added = new TreeSet();
		timeout = to; goal = end;
		root = new Node(start);
		addNode(root);
		clear();
	}
	
	/**
	 * addNode is a recursive method to add this node to the tree, and then
	 * it calls itself for each of the node's links.
	 * @param n   The node to add
	 */
	private void addNode(Node n)
	{
		//Halt in case of a timeout
		if (new java.util.Date().getTime() - startTime >= timeout*1000)
			return;
		//Halt in case the goal has been found
		if (n.data.equalsIgnoreCase(goal))
		{
			found = n;
			return;
		}
		//Halt if the goal has already been found
		if (found != null)
			return;
		
		//Get the links out and remove any that are already in the tree
		LinkedList l = getLinks(n.data);
		for (Iterator i = l.iterator(); i.hasNext();)
		{
			Object o = i.next();
			if (added != null)
			{
				if (added.contains(o))
					i.remove();
				else
					added.add(o);
			}
		}
		//Create an array to represent all links from this Node
		n.next = new Node[l.size()];
		
		//Cycle through every link and call addNode for each of
		//those Nodes, and also set their parent to this Node
		int index = 0;
		for (Iterator i = l.iterator(); i.hasNext(); index++)
		{
			n.next[index] = new Node((String)i.next());
			n.next[index].parent = n;
			addNode(n.next[index]);
		}
	}
	
	/**
	 * Called after the search is over or the goal has been found,
	 * clearing out all variables used.
	 */
	private void clear()
	{
		added = null; //added is no longer needed, so clear it
		//Although this seems crazy, if a goal was found then there is
		//no reason to preserve any other parts of the tree.
		if (found != null)
			root = null;
		goal = null; //We don't need to know the goal anymore
		startTime = timeout = 0; //Clear startTime and timeout, uneeded
	}
	
	/**
	 * Pull all the links out of an article and return them as a LinkedList.
	 * @param article   The article on Wikipedia to load
	 * @return   Every link found within that article
	 */
	private LinkedList getLinks(String article)
	{
		LinkedList choices = new LinkedList();
		String page = ConnectionManager.loadPage(article);
		String search = "<a href=\"/wiki/";
		for (int i = 0; i < page.length() - search.length(); i++)
		{
			try
			{
				if (page.substring(i,i+search.length()).equals(search))
					choices.add(page.substring(i+search.length(),page.indexOf('\"',i+search.length())));
			}
			catch (Exception e) {}
		}
		return choices;
	}
	
	/**
	 * Returns the absolute best choice to make, but returns null if there
	 * was never any goal found with the provided timeout.
	 * @param location   The location to get the best choice from.
	 * @return   The article of the best choice.
	 */
	public String getBestChoice(String location)
	{
		if (found == null)
			return null;
		Node search = found;
		while (search.parent != null)
		{
			if (search.parent.data.equalsIgnoreCase(location))
				return search.data;
			search = search.parent;
		}
		return null;
	}
	
	/**
	 * Mostly for debug reasons, this method prints the path, from the
	 * found Node upwards, of the best choice.
	 */
	public void printBestChoice()
	{
		if (found == null)
		{
			System.out.println("Wasn't found... will instead print tree.");
			printTree();
			return;
		}
		Node search = found;
		while (search.parent != null)
		{
			System.out.print(search.data + " ");
			search = search.parent;
		}
		System.out.println();
	}
	
	/**
	 * Prints out the entire tree in preorder.
	 */
	private void printTree()
	{
		printNode(root);
	}
	
	private void printNode(Node n)
	{
		if (n == null)
			return;
		System.out.print(n.data + ": ");
		if (n.next == null)
			return;
		for (int i = 0; i < n.next.length; i++)
			printNode(n.next[i]);
		System.out.println();
	}
	
	private class Node
	{
		public Node[] next;
		public String data;
		public Node parent; //parent is neccessary to find the fastest path
		
		public Node(String d)
		{
			data = d;
		}
	}
}

Mr_Light · April 18, 2006, 7:21am

you could just hold a collection branches and iliterate once over it for every level calling foo() where foo puts more branches in the collection or removes itself (and offcourse gets the leafs)

Demonpants · April 18, 2006, 8:18am

Yeah, I was thinking that, but do you think it will get absurd or use up too much RAM? I was thinking also maybe create a Queue of String[]'s and have it enqueue every level and then dequeue them all, and don’t move to a new row until the Queue is empty, but, again, the possibility of memory waste arises…