XML as a file-format for games dev

This has been discussed a couple of times, and at the moment I’m in the middle of writing a web-based game which relies heavily on anything that reduces dev time (if I had more time, I’d be working on JGF!). XML has come up several times as an option for file-format, and I’ve finally switched to it for some config files. I’ve used XML before on lots of projects like this, using everything from full custom SAX decoders and schemas, through bastardisations of Sun’s XMLDecoder, and even using XMLDecoder “pure”. I’ve looked back at the old threads on here, and none seemed to come to much in the way of concrete conclusions, so I thought I’d start an article on my experiences and ideas, doing this for the N’th time :wink:

At the moment, I’m still just jotting down ideas and issues as I explore the different options. More sections will appear on this page over the next few weeks, then I’ll write it all up. Of course, keep your fingers crossed that JGF happens to be up at the time (I’m still having to reboot the server once a day :frowning: :frowning: :frowning: ).

http://javagamesfactory.org/views/proposed-article?id=11

Well… I’m using XML for config and discovery stuff in Darkstar.

As many of you are aware, I’m a big hater of the “XML for Everything” movement, but for info like this, which is hierarchically oriented, has to be half-way human readable/writable and/or is going to get pumped across a text-oriented bridge like HTTP, and where write/read performance isn’t of primary importance, it CAN be an (ugly) time-saver.

I still say the real meaning of XML is eXcessively Messy Languages and that a custom data language would almost always be a neater, more elegant solution, but it CAN be a useful short-cut when you “just don’t care enough to send the very best.” :slight_smile:

It seems that the XML examples in your linked article are broken… at least in Firefox and IE.

One very simple way to use XML in your application is XStream http://xstream.codehaus.org/tutorial.html.

…at which point, Sun’s Properties object becomes entirely useless :frowning:

  1. not-yet-an-article :slight_smile:
  2. the “xml examples” are merely copy/pasted XML fragments… there shouldn’t be anything in there to be broken?

I used XML in a previous version of my game (well, one day will be a game, I hope :D) but hated it. Easy to make mistakes if I edit it with a text editor, difficult to read, lack of a fixed DTD (because I keep adding properties and classes to my engine) so I couldn’t validate, etc. What’s the advantage here? So one day I changed it all to properties files and what a difference!

[quote]…at which point, Sun’s Properties object becomes entirely useless :frowning:
[/quote]
Really? I have had no problem with hierarchical properties. Just implemented a subclass of Properties with two modifications:

-The ability to include a properties file within another (so my config files can be well structured and remain small enough to be easy to read and edit manually). This can be done by scanning the file prior to loading it, or simply by establishing a special property name; in my case, every one that ends in “.include”.

-A collapse function that scans all the keys and extracts the ones that start with a given prefix, then constructs a hashtable<String, MyProperties> like this:

Original MyProperties p:

sound.music.file=data/music.xm
component.title.class=Image
component.title.file=data/title.png
component.label.class=Label
component.label.font=anime
component.label.text=Hi there!

p.collapse(“sound”):

music -> file=data/music.xm

p.collapse(“component”):

[i]title ->
class=Image
file=data/title.png

label ->
class=Label
font=anime
text=Hi there![/i]

(Easier to show it than to explain it in English xD)

After collapsing I can iterate over the keyset (names of the “objects”) and get the properties of each one. I find it really easy and intuitive to use. Perhaps the collapse method isn’t very optimized but I only use it at loading time, so…
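The collapse step described above might look roughly like the following. This is only a sketch of the idea, not the poster's actual class: the method body, the use of a `TreeMap` (chosen just to get a stable iteration order) and the `CollapseDemo` class are assumptions, and the “.include” handling is left out.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical reconstruction of the "collapse" idea from the post above.
class MyProperties extends Properties {

    /**
     * Groups every key of the form prefix.name.rest by name and stores
     * rest=value in the MyProperties object belonging to that name.
     */
    Map<String, MyProperties> collapse(String prefix) {
        Map<String, MyProperties> result = new TreeMap<>();
        String start = prefix + ".";
        for (String key : stringPropertyNames()) {
            if (!key.startsWith(start)) {
                continue; // belongs to some other prefix
            }
            String tail = key.substring(start.length());
            int dot = tail.indexOf('.');
            if (dot < 0) {
                continue; // no object name in the key, so nothing to group
            }
            String name = tail.substring(0, dot);
            String rest = tail.substring(dot + 1);
            result.computeIfAbsent(name, n -> new MyProperties())
                  .setProperty(rest, getProperty(key));
        }
        return result;
    }
}

class CollapseDemo {
    public static void main(String[] args) {
        MyProperties p = new MyProperties();
        p.setProperty("sound.music.file", "data/music.xm");
        p.setProperty("component.title.class", "Image");
        p.setProperty("component.title.file", "data/title.png");
        p.setProperty("component.label.class", "Label");
        p.setProperty("component.label.font", "anime");
        p.setProperty("component.label.text", "Hi there!");

        // Iterate over the "objects" and look at the properties of each one.
        for (Map.Entry<String, MyProperties> e : p.collapse("component").entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because the result is an ordinary map of ordinary `Properties` objects, the calling code never needs to know about the dotted key syntax.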

Sounds like the structure of your XML is wrong.

Properties files are quite useful for simple keysets; take internationalisation, for example.

As for XML for configuration, overall I have had good experiences, but I’ve mostly encountered well-structured, well-set-up examples. I have also seen very bad, I’d almost say abused, XML.

[quote]Sounds like the structure of your XML is wrong.
[/quote]
Could be, of course. But I don’t know how a game that extends some components from my 2D engine (e.g., a new Text that looks like a comic book speech balloon) can be validated from the engine without providing a new DTD that includes all the standard components and the new ones (by hand, I assume; I don’t think you can include a DTD inside another, perhaps with schemas… I’m not an XML guru by any means, I fear). And I don’t think it is really worth the extra effort. If someone can show me a different approach to validating, please post.

Besides that (and knowing that a simple and obvious solution may exist), I find it much easier to use my new classes, iterate over arrays and read from hash tables than to navigate between the nodes of a DOM tree (and I have done both ;)); and I think that a properties file is much more human-readable (and human-editable!) than XML, both being valuable advantages from my point of view.

But I’m not trying to persuade anyone, of course, ;D just showing some alternatives that perhaps can spark some ideas…

I like the idea of XML-based levels for games that are moddable by the community. Yes, it may leave things more open than some would like, but I think the freedom it gives the community is worth it.

I love a bit of rapid development myself. For one project I wanted the user to be able to configure the project with XML. But for the system to be neat you really need all the options to have an object representation. Parsing the XML and building an object representation is tedious, especially if the XML structure keeps changing and invalidating your parse code. The solution I came up with was:

  1. Pick a single package for your object representation to reside in. Fill it full of very simple objects that can be connected into a hierarchy, and also give them String attributes for things you would like the user to configure.

  2. Create a reflective parser. Basically this just recurses the DOM or SAX events, and for each Element it sees it instantiates the class of that name from the package in (1). The String attributes of the element are copied into members of the new object of the same name*. The difficult bit is deciding how you connect the hierarchy up. In my non-generic case things were fairly simple, and it was enough to either add a child object to a parent’s attribute of type Collection, if one existed, or assign the child object to a variable of type ChildObject, if one existed.

Obviously this approach is not fully flexible for all possible XML structures, but the ones it does not fit are probably poorly designed. The great thing, though, is that the workflow for adding options is just to add new attributes to your object representations or create new classes in the package.

PS Actually I did have to integrate my parsing strategy with an independently developed parser for some elements of my XML. To make your XML fully configurable I created a ParserConfig. This was a big HashTable of ElementNames -> ParserObjects along with a default. The parser config then gets to decide which parser handles what element types. In my case the default parser was the reflective parser. This allowed me to write my special cases for Element types that were a bit tricky in some respect.

  • PPS actually I also had a bit of intelligent parsing here. If the target attribute was an int, then Integer.parseInt was called on the XML attribute first to assign it.
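The two steps and the int conversion above could be sketched along these lines. Everything here is an assumption made for illustration: the `Level` and `Sprite` model classes, the idea of passing a marker class to identify the model package, and the order of the attachment rules (a field of the child's exact type is tried first, then the first Collection field). The ParserConfig dispatch from the PS is not shown.

```java
import java.io.StringReader;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical model classes; in the post these would all live in one package.
class Level { String name; int width; List<Sprite> children = new ArrayList<>(); }
class Sprite { String image; int x; }

class ReflectiveParser {
    private final String prefix; // everything in front of the simple class name

    /** Any class from the model package serves as a marker for where to look. */
    ReflectiveParser(Class<?> marker) {
        String full = marker.getName();
        prefix = full.substring(0, full.length() - marker.getSimpleName().length());
    }

    Object parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return build(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not map XML onto the model", e);
        }
    }

    private Object build(Element e) throws Exception {
        // The element name doubles as the class name.
        Class<?> type = Class.forName(prefix + e.getTagName());
        Constructor<?> ctor = type.getDeclaredConstructor();
        ctor.setAccessible(true);
        Object obj = ctor.newInstance();

        // Copy each XML attribute into the field of the same name.
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            Field f = type.getDeclaredField(a.getNodeName());
            f.setAccessible(true);
            if (f.getType() == int.class) {
                f.setInt(obj, Integer.parseInt(a.getNodeValue()));
            } else {
                f.set(obj, a.getNodeValue());
            }
        }

        // Recurse into child elements and hook each one onto its parent.
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i) instanceof Element) {
                attach(obj, build((Element) kids.item(i)));
            }
        }
        return obj;
    }

    @SuppressWarnings("unchecked")
    private void attach(Object parent, Object child) throws Exception {
        Field collection = null;
        for (Field f : parent.getClass().getDeclaredFields()) {
            f.setAccessible(true);
            if (f.getType() == child.getClass()) {
                f.set(parent, child); // single-valued slot of the child's type
                return;
            }
            if (collection == null && Collection.class.isAssignableFrom(f.getType())) {
                collection = f;
            }
        }
        if (collection == null) {
            throw new IllegalArgumentException("No place for " + child.getClass().getName());
        }
        // Assumes the model class initialises its collection field.
        ((Collection<Object>) collection.get(parent)).add(child);
    }
}
```

With this in place, `new ReflectiveParser(Level.class).parse("<Level name=\"cave\" width=\"40\"><Sprite image=\"bat.png\" x=\"3\"/></Level>")` yields a `Level` holding one `Sprite`, and adding a new option means adding a field or a class rather than touching the parser.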

uuurgh.
I hate XML with a passion. It all SEEMS like such a good idea. And then you have to trawl through the specs for JBI, Web services, and just to throw a spanner into the works, BPEL (or rather BPEL4WS as we haven’t moved to version 2 yet). Using the PXE BPEL engine you have to write …

  1. an XML WSDL file to expose the actual BPEL service itself,
  2. The BPEL process, which is a state based language with procedural elements and is the ugliest misabuse of XML I’ve ever seen,
  3. The BPEL deployment descriptor, also in XML
  4. An extra XML file for each of the WSDL services you’re planning on using in the BPEL process, to expose their partnerLinks

All of these, needless to say, are completely interdependent, with various element names you have to drag from a variety of different depths in each other’s hierarchies and use correctly in another file. And did I mention the namespaces ? Hundreds of them. HUNDREDS ! and woe betide if you get a single thing wrong …

-ahem- Okay, that’s my little OT rant over,

back ON topic, I’ve used (thankfully) much simpler XML scripts for quite a while for configuration of things like models, shaders, animation effects and so on. Something like the following:

`<?xml version="1.0" encoding="us-ascii"?>`

I find it’s great for stuff like that: simple, didn’t have to bother writing a parser, easily editable.

D.

How did you avoid writing a parser? How did the values in the DOM model or SAX or whatever get into your java objects? At some level you would have needed a parser surely…

The easy way: XStream
A bit more complicated and read only, but very flexible: Digester
Maybe a bit of an overkill, but extremely powerful: Spring Framework

The latter seems odd at first, but you can use the framework just to build object hierarchies out of xml files.

There is also Relaxer.

[i]"Relaxer is a Java class generator that operates on a XML document defined by a RELAX grammer.

By using Relaxer, no tedious DOM programming is required to make a XML aware program.

Relaxer creates a set of Java classes that form an object hierarchy that is logically equivalent to a DOM tree, but is easier to use programmatically."[/i]

Never used it tho… but I know that Kenta Cho used it for some projects.

okay, so minimal code then. What I meant by a parser in the OP was having to parse the file itself, getting tokens in and REing them to get values etc etc. The various XML libraries make this bit redundant.

D.

I think the point is that the parser is a standard component, thanks to all XML docs sharing a common syntax. There are quite a few packages that will produce complete parsers for you from a DTD.

Having said that, I think this is a VASTLY overrated feature.

Automatic parser generation has been a reality, in a much more flexible form, since I was in college (and that was back in the days when we wrote flow charts on cave walls) through YACC and Lex. I have a standard parser I wrote quite a while ago that has “expect…” and “get…” calls, and using it I can write a parser for any basic text data language in a few lines of code.
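For readers who have not seen the “expect…”/“get…” style, here is one way such a helper could look. This is not the poster's parser, which is not shown in the thread; the class names `TextScanner` and `BlockParser`, the method names and the tiny block syntax are all invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical scanner offering "expect" and "get" style calls.
class TextScanner {
    private final String src;
    private int pos;

    TextScanner(String src) { this.src = src; }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    /** Returns true if the literal comes next, without consuming it. */
    boolean peek(String literal) {
        skipWhitespace();
        return src.startsWith(literal, pos);
    }

    /** Consumes the literal or fails with the offset where it was expected. */
    void expect(String literal) {
        if (!peek(literal)) {
            throw new IllegalStateException("Expected '" + literal + "' at offset " + pos);
        }
        pos += literal.length();
    }

    /** Reads a run of letters, digits and underscores. */
    String getWord() {
        skipWhitespace();
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) pos++;
        if (start == pos) {
            throw new IllegalStateException("Expected a word at offset " + pos);
        }
        return src.substring(start, pos);
    }

    /** Reads an optionally negative decimal integer. */
    int getInt() {
        skipWhitespace();
        int start = pos;
        if (pos < src.length() && src.charAt(pos) == '-') pos++;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        return Integer.parseInt(src.substring(start, pos));
    }
}

// A complete parser for blocks such as:  window { width = 640; height = 480; }
class BlockParser {
    static Map<String, Integer> parse(String text, String blockName) {
        TextScanner in = new TextScanner(text);
        Map<String, Integer> fields = new LinkedHashMap<>();
        in.expect(blockName);
        in.expect("{");
        while (!in.peek("}")) {
            String key = in.getWord();
            in.expect("=");
            fields.put(key, in.getInt());
            in.expect(";");
        }
        in.expect("}");
        return fields;
    }
}
```

The grammar lives entirely in the order of the calls inside `parse`, which is why a new format only costs a few lines once the scanner exists.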

[quote]The easy way: XStream
[/quote]
Oh wow. That Xstream is good.
goodbye ReflectiveParser, the same idea but only partially implemented!

[quote]Automatic parser generation has been a reality, in a much more flexible form, since I was in college (and that was back in the days when we wrote flow charts on cave walls) through YACC and Lex. I have a standard parser I wrote quite a while ago that has “expect…” and “get…” calls, and using it I can write a parser for any basic text data language in a few lines of code.
[/quote]
Yeah. I know that. Yacc etc though still need configuration. I wanted a solution that worked with no configuration or programming and that would update itself when you change all your model classes.

Sure, but YACC still isn’t properly documented (at least it wasn’t circa 12 months ago, last time I looked; it still basically just says “if you already know how to write grammars, you can use this, if not, go learn”) and is insanely difficult to use for non-academic purposes :(.

NB: this is speaking as someone who has used all this stuff in the past, written grammars, parsers, and parser-generators, etc. It’s just that when you haven’t done it for years, you know it’s trivial, but the bloody software is written with not a single thought for making it usable - it punishes you for not still having memorized all the domain-specific jargon (which has no purpose being in a tool, which is what yacc etc should be).

So, yeah, YACC and friends can do it all. But … the XML stuff is aimed at real people, so unsurprisingly is a heck of a lot more popular :). SAX, bar knowing two gotchas (to do with the characters() method buffering, and with whitespace reading), is trivial to use and you can use it immediately with no special knowledge. That is how life should be :).
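The two SAX gotchas mentioned above usually come down to this: a parser is allowed to deliver one run of text through several `characters()` calls, and the indentation between elements arrives through the same callback. A common way to cope, sketched here with a made-up `ConfigHandler` class and a flat element-name-to-text map as assumptions, is to append in `characters()` and only look at the text in `endElement()`.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that collects the text content of leaf elements.
class ConfigHandler extends DefaultHandler {
    final Map<String, String> values = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting afresh for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one piece of text, so append, never assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Trimming discards the newlines and indentation between elements.
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.put(qName, value);
        }
        text.setLength(0);
    }

    /** Parses the XML and returns element name -> text for every non-empty leaf. */
    static Map<String, String> read(String xml) {
        try {
            ConfigHandler handler = new ConfigHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read config", e);
        }
    }
}
```

Text containing an entity reference such as `&amp;` is a typical case where a parser may split the run, which is exactly when code that assigns inside `characters()` silently loses part of the value.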

YACC is complex mostly because they chose LALR as their grammar model for its flexibility.

If you instead choose LL(1), then you can describe a full parser with EBNF, which IMHO is a whole lot less arcane than DTDs.

I actually designed an EBNF-driven parser generator a while back, but others have done similar things, ANTLR being a good example.

Having to write a grammar, though, totally gets in the way of productivity. You then have to maintain that grammar in situations where your save format is changing all the time. Plus, grammars in general are normally pretty verbose. Also, YACC is not Java …

Anyway, so I like the whole reflection method of parsing XML. You don’t need to maintain anything apart from your model which you would have to do anyway.