So I have an ebook reader, and naturally I want to create a Gutenberg book downloader so that first-time users can read something out of the box.
Fine. So I envisioned a library panel built on GlazedLists, like the one I already have for local files, except that instead of showing every possibility it only begins to show results after 3-4 characters have been typed. That should be enough for filtering, right? And it works almost the same way as the local panel. I like consistency.
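The "nothing until 3-4 characters" rule can be sketched in plain Java, independent of GlazedLists. The class name MinLengthFilter, the threshold of 3 and the case-insensitive substring match are all assumptions made for illustration, not what the reader actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: not part of GlazedLists or of the reader itself. */
class MinLengthFilter {
    /** Assumed threshold; the text above suggests 3 to 4 characters. */
    static final int MIN_CHARS = 3;

    /** Returns the titles containing the query, or nothing while the query is too short. */
    static List<String> filter(List<String> titles, String query) {
        List<String> matches = new ArrayList<String>();
        String needle = query.trim().toLowerCase(Locale.ROOT);
        if (needle.length() < MIN_CHARS) {
            return matches; // too short: keep the panel empty instead of listing everything
        }
        for (String title : titles) {
            if (title.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(title);
            }
        }
        return matches;
    }
}
```

With a real index behind it, the loop would become an index query, but the length check in front stays the same.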
So I need a way to search. Looking at the Gutenberg site, there is an RDF file, 5 MB zipped. Wunderbar, I think. Actually it is 100 MB unzipped, so don't unzip it.
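"Don't unzip" can still mean reading the content: java.util.zip.ZipInputStream decompresses on the fly, so the 100 MB never has to land on disk. Here is a minimal sketch, assuming the catalog is the first entry of the archive. The class name and the marker parameter are made up for the example, and a real reader would hand the stream to an RDF parser instead of counting lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.zip.ZipInputStream;

/** Hypothetical helper that reads a zipped catalog without extracting it to disk. */
class ZippedCatalogReader {
    /** Counts the lines of the first zip entry that contain the given marker. */
    static int countLinesContaining(InputStream zipped, String marker) throws IOException {
        ZipInputStream zip = new ZipInputStream(zipped);
        try {
            if (zip.getNextEntry() == null) {
                throw new IOException("empty zip archive");
            }
            // The zip stream now yields the decompressed bytes of the first entry.
            BufferedReader reader = new BufferedReader(new InputStreamReader(zip, "UTF-8"));
            int count = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(marker)) {
                    count++;
                }
            }
            return count;
        } finally {
            zip.close();
        }
    }
}
```

Streaming like this only helps for a single sequential pass. Fast repeated searches still need an index of some kind.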
So then I need an indexing method for rapid searching of the RDF. Lucene comes to mind, and Google finds LuceneSail easily.
First problem: LuceneSail indexes and duplicates the text, so those 100 MB become 210 MB on disk somewhere. But at least searches are fast.
Second problem: indexing takes forever (5-8 minutes) and uses too much memory for the main program. The memory problem can be alleviated if you have a monster machine with lots of memory, by running the indexer in its own JVM. Then I can do this:
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * Forks a new JVM that runs the main method of the given class and
 * waits for it to finish. The child's stdout and stderr are merged
 * and copied to the stdout of the current JVM.
 * The forked JVM uses the same java binary and classpath as the
 * current one.
 * @param klass class with a main method
 * @param args JVM options (for example -Xmx512m or -Dkey=value)
 */
public static void forkJavaAndWait(Class<?> klass, String... args) throws IOException, InterruptedException {
    // "java" without ".exe" works on Windows too and does not break other platforms
    String javaExe = System.getProperty("java.home") + File.separator + "bin" + File.separator + "java";
    String classpath = System.getProperty("java.class.path");
    List<String> command = new ArrayList<String>(4 + args.length);
    command.add(javaExe);
    command.add("-cp");
    command.add(classpath);
    command.addAll(Arrays.asList(args)); // JVM options go before the class name
    command.add(klass.getName()); // getName(), not getCanonicalName(): nested classes need Outer$Inner
    ProcessBuilder pb = new ProcessBuilder(command);
    pb.redirectErrorStream(true);
    final Process p = pb.start();
    // the output must be drained or the child can block on a full pipe
    // (would need 2 threads if redirectErrorStream(false))
    Thread consumer = new Thread(new Runnable() {
        @Override
        public void run() {
            String line;
            BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
            try {
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            } catch (IOException ex) {
                Logger.getLogger(IoUtils.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }, "ProcessBuilderInputStreamConsumer");
    consumer.start();
    int exitCode = p.waitFor();
    consumer.join(); // let the remaining output be printed before returning
    if (exitCode != 0) {
        throw new IllegalStateException("forked java process failed, exit code " + exitCode);
    }
}
Third problem: the files are not deleted, for some stupid reason, if the Java process is killed. The cleanup is in a finally in the main of the given class, and a finally does not run when the JVM is killed. Do I have to use a shutdown hook or a SignalHandler?
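One possible answer is a shutdown hook, sketched here under assumptions: the class IndexCleanup and its method names are made up, and the directory passed in stands for wherever the files live. A hook runs on normal exit, on System.exit and on an interrupt such as Ctrl-C, but not when the process is terminated forcibly (kill -9 on Unix, TerminateProcess on Windows), so it narrows the problem rather than removing it.

```java
import java.io.File;

/** Hypothetical helper that removes a directory when the JVM shuts down. */
class IndexCleanup {
    /** Deletes a file or a directory tree; returns true if nothing is left. */
    static boolean deleteRecursively(File f) {
        File[] children = f.listFiles(); // null for plain files
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return f.delete() || !f.exists();
    }

    /** Registers a shutdown hook that deletes the directory, and returns the hook. */
    static Thread deleteOnShutdown(final File dir) {
        Thread hook = new Thread(new Runnable() {
            @Override
            public void run() {
                deleteRecursively(dir);
            }
        }, "IndexCleanupHook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

The forked main would call IndexCleanup.deleteOnShutdown(dir) once at startup. Since the forcible-kill case stays uncovered, it may also be worth deleting leftovers from a previous run the next time the program starts.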
What would you prefer?
1) A stupid search that scrapes Project Gutenberg web pages and doesn't show possibilities as you type.
2) A smart search that shows possibilities after some typing and eats 210 MB,
2a) and that needs a beastly machine and takes 5 minutes to create (once) or update (more than once),
2b) or that needs you to download a 24 MB zipped index and unzip it (once),
2c) or that is created by the installer (which I don't have yet) and otherwise works like 2a.
