TreeML markup language

I posted this under shared code a while ago, but I have been using this for real stuff for around a year now, so I decided to make it a real project.

TreeML is a YAML-like, JSON-like structured document language. It parses tab-indented files into trees of nodes. It supports lists, integers, floats, booleans, strings, tokens and explicit nulls.

How does it differ from YAML? It has a LOT fewer features and the parser is one class. It also uses tabs and doesn’t support arbitrary indentation levels.

How does it differ from JSON? It uses curly brackets or indentation for nesting, and doesn’t force you to use quotes around string tokens. It also allows repeated keys in maps (which, by extension, are not maps). It has a schema language.

Some notes about TreeML best practices:

  1. Don’t use it for operating a nuclear facility.

  2. It may or may not work. So far it seems to work for me.

WIP posts need a picture, so here is a picture of the Christmas Tree Nebula:

It’s not my picture, credit goes to: http://www.guidescope.net/nebulae/cone.htm

I have decided I need a schema language to keep my huge Vangard asset files sane, so I’m working on adding ultra-lightweight schema support.

Sources:

I always tip my hat to people writing parsers by hand. :slight_smile:

BTW: Here is a working JavaCC grammar for your language, which accepts your example file: https://gist.github.com/httpdigest/ce6479b041dec9efe3b42b54f6836461

One thing to note: It creates a standard Map that contains the keys and the values are simple Strings or Integers or Maps themselves. In the case of multiple values for the same key, the Map then contains an ArrayList under that key containing the values.
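To make the "Map with an ArrayList for repeated keys" idea concrete, here is a minimal sketch of that collapsing step. This is not the code from the gist; the class `RepeatedKeys` and the helper `put` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the gist: stores a single value directly
// and promotes the entry to an ArrayList when the same key shows up again.
class RepeatedKeys {
    @SuppressWarnings("unchecked")
    static void put(Map<String, Object> map, String key, Object value) {
        if (!map.containsKey(key)) {
            map.put(key, value);                 // first occurrence: plain value
            return;
        }
        Object existing = map.get(key);
        if (existing instanceof ArrayList) {
            ((ArrayList<Object>) existing).add(value);   // third and later occurrences
        } else {
            List<Object> values = new ArrayList<>();     // second occurrence: promote
            values.add(existing);
            values.add(value);
            map.put(key, values);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> map = new LinkedHashMap<>();
        put(map, "name", "Falcon");
        put(map, "gun", "laser");
        put(map, "gun", "missile");
        System.out.println(map);
    }
}
```

One caveat with this representation: if a value can itself be a list, a reader of the Map cannot tell "one list value" apart from "several repeated values" without some extra marker.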

Nice! … but, tabs? :-\ Doesn’t everyone else set their IDE and text editor to output spaces instead of tabs? :persecutioncomplex:

What’s wrong with tabs?

Tabs are dreadful if they are mixed with spaces for indentation.

I opted for tabs because 1 tab means 1 level of indentation. The mapping of spaces to indentation is arbitrary.

I use spaces for everything else though. Had to set up an exception in IntelliJ for treeml…
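For anyone wondering what "1 tab means 1 level" buys you in the parser, here is a rough sketch. It is not the actual TreeML parser: `TabTree`, `Node` and `parse` are hypothetical names, each line is treated as opaque text, and the sample input is made up rather than real TreeML syntax. Braces, typed values and schemas are ignored entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: builds a tree from tab-indented lines.
class TabTree {
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();
        Node(String text) { this.text = text; }
    }

    /** The nesting depth of a line is simply its number of leading tab characters. */
    static int depthOf(String line) {
        int depth = 0;
        while (depth < line.length() && line.charAt(depth) == '\t') depth++;
        return depth;
    }

    static Node parse(String source) {
        Node root = new Node("<root>");
        // Holds the root plus the currently open ancestors, deepest on top.
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        for (String line : source.split("\n")) {
            if (line.trim().isEmpty()) continue;          // skip blank lines
            int depth = depthOf(line);
            if (line.charAt(depth) == ' ')
                throw new IllegalArgumentException("space used for indentation: " + line);
            if (depth > stack.size() - 1)
                throw new IllegalArgumentException("indentation skips a level: " + line);
            while (stack.size() - 1 > depth) stack.pop(); // close deeper levels
            Node node = new Node(line.substring(depth).trim());
            stack.peek().children.add(node);              // top of stack is the parent
            stack.push(node);
        }
        return root;
    }
}
```

Calling `TabTree.parse("ship\n\tname Falcon\n\tgun\n\t\tdamage 5\n\tgun\nplanet\n")` would give a root with two children, `ship` and `planet`, where `ship` has three children including two `gun` nodes. Note that no tab-width setting appears anywhere in the code.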

Wow, this markup language is almost exactly like the one I wrote for my game ;D

It’s an official, for-real markup language when it has two different parsers written by two different people! :slight_smile:

Absolutely!
You should create a GitHub repository and deploy to Maven Central. :point:

Negative floats! :point:

Yep. Indenting with spaces is cancer :slight_smile:

Only freaks use spaces for indentation, because proportional fonts.
But then only freaks use proportional fonts.

However: whitespace as a syntactical feature of any language? Just reprehensible. You should be hung.

Cas :slight_smile:

Given that programmers slavishly follow whitespace rules in languages even where whitespace is insignificant, and bad indentation is one of the worst developer thought crimes, why not make indenting the law?

yet, indentation is visual and doesn’t share much with code semantics. using a code formatter is asking for trouble.

Because every project I work on or with uses spaces by convention. Standard Java style is spaces, or spaces and tabs (yuck!). This means my editors and IDE are set to output spaces by default. Even @ags1 mentioned he uses spaces everywhere else and has to put an exception in his IDE to output tabs instead for this. IMO, that’s where this falls down and becomes less useful.

I also agree with @princec that whitespace as a syntactical feature is horrible (although I don’t agree about spaces - who codes in a proportional font? You’ll be using Comic Sans next)

Praxis LIVE has a similar-ish format for its patch files, except the syntax is based on Tcl and so uses braces for structure. I have some similar parser code (though there’s no way @KaiHH is writing a JavaCC grammar for it! :wink: )

Standard Java style is tabs, not spaces.

You can see now why whitespace with semantic meaning is a disaster.

Cas :slight_smile:

er, no it isn’t, unless it’s changed since this http://www.oracle.com/technetwork/java/javase/documentation/codeconventions-136091.html

Indentation is 4 spaces - tabs must be set to 8 spaces. Which means you either have all spaces or a mix, but it’s not possible just with tabs. The default NetBeans formatting is definitely based on 4 spaces.

if there is an option “replace tabs by spaces on save” - i cannot imagine why one wouldn’t check it and forget about tabs altogether.

Since Eclipse has used tabs by default for code formatting since before it was even called Eclipse, that sort of made tabs a de facto standard.

I like tabs because they don’t require as many keypresses to navigate around, or indeed to type in the first place.

I make extensive use of //formatter:off/on too in my code.

Cas :slight_smile:

A disaster ONLY if the language syntax allows a mix of tabs and spaces.

Also, to be totally correct I am using the tab character as semantically significant, not the whitespace it is represented with in the editor. There is a difference. You can configure your editor to represent the tabs with one space or eight (or even zero), but that doesn’t change the meaning of the document.

I could use a different character without a whitespace representation if that would make more sense (but it wouldn’t).

I opted for tabs because spaces were worse. Would I hardcode that a specific number of spaces equals an official indentation level? Or would I follow YAML and allow any change in indentation to be significant? In my opinion the YAML approach is just nasty and dirty, and hard-coding a number of spaces (or allowing each user to configure it somehow) means hard coding a visual preference into every treeml file. Tabs feel wrong, but in this case they’re right.
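To illustrate the difference in parser terms, here is a rough comparison of the two rules. This is a simplified off-side-rule sketch, not an implementation of the YAML spec, and `IndentDepth`, `tabDepth` and `spaceDepths` are hypothetical names. With tabs, the depth of a line can be read off the line alone. With "any change in indentation is significant", the depth of a line depends on the widths of the lines above it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical comparison of the two indentation rules.
class IndentDepth {
    /** Tab rule: depth is the number of leading tabs, independent of other lines. */
    static int tabDepth(String line) {
        int depth = 0;
        while (depth < line.length() && line.charAt(depth) == '\t') depth++;
        return depth;
    }

    /** "Any change is significant" rule: depth depends on the enclosing lines' widths. */
    static int[] spaceDepths(String[] lines) {
        Deque<Integer> widths = new ArrayDeque<>();   // indentation widths of open levels
        widths.push(0);
        int[] depths = new int[lines.length];
        for (int i = 0; i < lines.length; i++) {
            int w = 0;
            while (w < lines[i].length() && lines[i].charAt(w) == ' ') w++;
            if (w > widths.peek()) {
                widths.push(w);                       // any increase opens a new level
            } else {
                while (w < widths.peek()) widths.pop();   // close deeper levels
                if (w != widths.peek())
                    throw new IllegalArgumentException("dedent to an unknown width: " + lines[i]);
            }
            depths[i] = widths.size() - 1;
        }
        return depths;
    }
}
```

In this sketch, the lines `"a"`, `"  b"`, `"       c"`, `"  d"`, `"e"` come out at depths 0, 1, 2, 1, 0: the jump from two to seven spaces is still just one level, which is exactly the kind of context dependence the tab rule avoids.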

Here’s what the YAML spec has to say about indenting:

This was the inspiration for treeml, after reading that paragraph I knew I would never willingly use YAML :slight_smile:

Also, regarding proportional fonts: I don’t use proportional fonts but spaces are fine for indenting a proportional font file. The spaces are at the beginning of the line (i.e. with zero characters before them) so proportionality is not a factor.

The crux of the issue is that most people who work with space-indented projects (and I don’t currently work with anything Java or otherwise that doesn’t use spaces) have their editor set up to output spaces when the tab key is used.

I’m currently working on something with lots of YAML, and I don’t disagree with the “nasty and dirty” comment, but for me using tabs is even more of a PITA because of the above issue.