TreeML markup language

Over the next few years as the language becomes wildly popular I expect the major IDEs will recognize .treeml files and make an appropriate exception for tabs… :wink:

That document is as relevant as the reason it gives for the line-width of 80 characters :point:

Why are we hijacking this innocent little thread?

I’m sorry, I’m so sorry… :slight_smile:

https://scontent.xx.fbcdn.net/v/t34.0-12/13292861_1574562892843747_1976431750_n.png?oh=73ff0a8a4f017449884fdb50be3d0d16&oe=574A8832

I think there is a case for a whole new computer programming language based entirely on the difference between spaces and tabs, and that it should be enforced as a matter of style to use Comic Sans to edit it. Perhaps it should be entirely defined in terms of overloaded ASCII operators as well.

Cas :slight_smile:

You mean defined entirely in trigraphs of invalid Unicode code points.

I just updated the Gist with my first version of my validating parser. Currently it only validates field names and order, with support for optional and repeating fields.

A schema for my careers assets looks like this (the “d” elements are dummy list items):


career: d, d
	id: single, d, d
	name: single, d, d
	description : single, d, d
	minimumLevel: single, d, d
	status: single, d, d
	reputation: single, d, d
	alignment: single, d, d
	wealth : single, d, d
	possessions: single, d, d
		token : d, d
	skills : single, d, d
		token : d, d
	behaviors: single, d, d
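
For anyone curious what "validating field names and order, with optional and repeating fields" could look like in code, here is a rough sketch. It is not the code from the Gist: the `FieldOrderValidator` and `FieldRule` names and the two boolean flags are assumptions made up for illustration, and only the names of the direct children of one node are checked.

```java
import java.util.List;

/** Hypothetical sketch: checks child field names against an ordered list of rules. */
final class FieldOrderValidator {

    /** One schema entry: a field name plus assumed multiplicity flags. */
    static final class FieldRule {
        final String name;
        final boolean optional;  // the field may be absent
        final boolean repeating; // the field may occur several times in a row

        FieldRule(String name, boolean optional, boolean repeating) {
            this.name = name;
            this.optional = optional;
            this.repeating = repeating;
        }
    }

    /** Returns null when the fields satisfy the rules, else a description of the first problem. */
    static String validate(List<FieldRule> rules, List<String> fields) {
        int pos = 0; // index of the next field that has not been matched yet
        for (FieldRule rule : rules) {
            int count = 0;
            // Consume consecutive fields that carry this rule's name.
            while (pos < fields.size() && fields.get(pos).equals(rule.name)) {
                pos++;
                count++;
                if (!rule.repeating) {
                    break; // a non-repeating field matches at most once
                }
            }
            if (count == 0 && !rule.optional) {
                return "missing required field '" + rule.name + "' at position " + pos;
            }
        }
        if (pos < fields.size()) {
            return "unexpected field '" + fields.get(pos) + "' at position " + pos;
        }
        return null;
    }
}
```

Because the rules are walked in schema order, a field that turns up out of order is reported either as a missing required field or as an unexpected field, depending on where it appears.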

The schema language does not support recursion so the schema schema isn’t pretty:


token : single, list, token
	token : optional, list, token
		token : optional, list, token
			token : optional, list, token
				token : optional, list, token
					token : optional, list, token
						token : optional, list, token
							token : optional, list, token
								token : optional, list, token
									token : optional, list, token

I really like that language! :slight_smile:
Tiny suggestion, maybe?


career: 0-n
    id: 1 // <- 1 could be default multiplicity
    name
    description
    possessions: 0-n
    skills: 0-n
    behaviour: 0-n

To model more complex things such as XML Schema’s “choice” element, it would require some “meta” element/hierarchy with special syntax support, such as:


career: 0-n
    id
    name
    description
    #choice: 0-n // <- possessions, skills and behaviours may occur in any order
        possessions
        skills
        behaviours

The simple enumeration of attributes/properties of an object would be syntactic sugar for the “sequence” schema construct, which is implicit, but could be made explicit:


career: 0-n
    #sequence
        id
        name
        description
        #choice: 0-n // <- possessions, skills and behaviours may occur in any order
            possessions
            skills
            behaviours

It’s possible to get these structural assertions with nested name values without overly elaborating the syntax, i.e. keeping to the basic structure of nested name:values.

We add a new keyword like ‘flow’ to the value list. As it’s on the value side it does not force a restriction on node names. Any node with a flow token is treated as a structural assertion, not an element definition…

The cardinality ranges are tougher - as each schema entry value is defined as a list of tokens, numbers are not allowed.


career: min0max20 //it's a valid token now :)
+   sequence : flow
        id
        name
        description
+      choice : flow
            possessions
            skills
            behaviours

I guess I could borrow from schemas and get recursion thusly:


root : ...
    moot : ...
        foot : ...
        soot : ...
        loot : ....
    woot :...
       ref : flow, shoot
+shoot : referenced, ....
+    ref : flow, shoot, optional

EDIT: added a refs node as otherwise the shoot would be interpreted as a valid root node of the document, which is probably not desired.

EDIT: Simpler, added a “referenced” keyword to indicate the node definition is neither optional nor required, and doesn’t in fact form part of the document flow unless it is reffed from elsewhere.
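
A rough sketch of how a validator might follow such refs, under heavy assumptions: the `RefSchema`, `Def` and `Node` classes are invented for illustration, and only node names are checked (no order, no multiplicity). The recursion terminates because it walks the document, which is finite, rather than expanding the schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: name-only validation that resolves refs lazily. */
final class RefSchema {

    /** A schema definition: inline child definitions plus names of referenced definitions. */
    static final class Def {
        final String name;
        final List<Def> children = new ArrayList<>();
        final List<String> refs = new ArrayList<>(); // e.g. "shoot" for "ref : flow, shoot"

        Def(String name) {
            this.name = name;
        }
    }

    /** A parsed document node, reduced to its name and children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** True when the node and all its descendants are allowed by the definition. */
    static boolean matches(Node node, Def def, Map<String, Def> referenced) {
        if (!node.name.equals(def.name)) {
            return false;
        }
        for (Node child : node.children) {
            Def childDef = lookup(child.name, def, referenced);
            if (childDef == null || !matches(child, childDef, referenced)) {
                return false;
            }
        }
        return true;
    }

    /** Finds the definition for a child name: inline children first, then refs. */
    private static Def lookup(String name, Def def, Map<String, Def> referenced) {
        for (Def child : def.children) {
            if (child.name.equals(name)) {
                return child;
            }
        }
        // A ref only resolves if the target was declared as a referenced definition.
        return def.refs.contains(name) ? referenced.get(name) : null;
    }
}
```

Keeping the referenced definitions in a separate map mirrors the intent of the “referenced” keyword: they never match as document roots on their own, only when reached through a ref.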

[quote]We add a new keyword like ‘flow’ to the value list.
[/quote]
Yeah, that is a great idea.

[quote]…numbers are not allowed.
[/quote]
Okay. I thought your schema language was just your actual TreeML language, where you also showed examples of numbers as values.

Yes this is pure treeml. But if you have a list of values like:

field : val1, val2, val3, 4

The values in the list have to be of the same type. So 4 would be a violation, as val1, val2, and val3 are tokens/strings. Tokens are not allowed to start with a number. But I’m not sure I really need cardinality.
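
To make the same-type rule concrete, here is a small sketch. The `ListTypeCheck` class and its two-kind classification are assumptions for illustration only; real TreeML has more value types (strings, decimals, booleans), and this ignores things like negative numbers.

```java
import java.util.List;

/** Hypothetical sketch: a value list is valid only if all values share one kind. */
final class ListTypeCheck {

    enum Kind { TOKEN, NUMBER }

    /** Simplified rule: a value starting with a digit is a NUMBER, anything else a TOKEN. */
    static Kind kindOf(String raw) {
        boolean startsWithDigit = !raw.isEmpty() && Character.isDigit(raw.charAt(0));
        return startsWithDigit ? Kind.NUMBER : Kind.TOKEN;
    }

    /** True when every value has the same kind as the first one (an empty list passes). */
    static boolean isHomogeneous(List<String> values) {
        if (values.isEmpty()) {
            return true;
        }
        Kind first = kindOf(values.get(0));
        for (String value : values) {
            if (kindOf(value) != first) {
                return false;
            }
        }
        return true;
    }
}
```

Under this rule the list val1, val2, val3, 4 is rejected, while min0max20 counts as a token because it does not start with a digit.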

Using the proposed syntax for recursion cleans up the schema schema (also adding in the missed definition of enums):


token : single, list, token, enum, single, optional, string, token, integer, decimal, boolean, empty, list, set, flow
   ref: optional, flow, token
token : referenced, list, token, enum, single, optional, string, token, integer, decimal, boolean, empty, list, set, flow, referenced
    ref: optional, flow, token

Basically everything after enum is an enum value, like a varargs argument.

Cleaned up the syntax for recursive definitions slightly. I hope to add support for enums and recursive definitions shortly to the parser.

The parser does now read data types from the schema if available, which is important as I do not have an explicit list syntax. A single item list is parsed by default as a non-list. I probably need to change the syntax from:


-    oldListField : singleValue
+    newSyntaxListField : [singleValue]

I have also been thinking about adding an abstract parser/DOM interface to keep the parser DOM/datatype-agnostic and therefore to let the client decide how it wants to parse the syntax elements into a document object model.

Something like this:


interface ParserInterface {
  /** 
   * Called whenever a new object is parsed.
   * 
   * The client returns an arbitrary Object identifying the newly parsed object.
   * This will then be used as the 'owner' for a parsed value for {@link #onValue}, a parsed property
   * with {@link #onProperty} or as the 'parent' in a subsequent invocation of {@link #onNewObject}.
   */
  Object onNewObject(Object parent);

  /**
   * Called whenever a new value in the value list of 'owner' was parsed.
   */
  void onValue(Object owner, Object value);

  /**
   * Called whenever a property was parsed belonging to 'owner'.
   */
  void onProperty(Object owner, String name, Object value);
}

The ‘owner’ and ‘parent’ Objects could be anything the client wants them to be. They just need to have identity for the client to associate properties/values with them. It could be a simple java.util.Map or something different. The point is that the parser need not care what it is concretely, as it only communicates with the client via that interface’s methods.
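
As an illustration of the java.util.Map option, a client could look roughly like the sketch below. The interface is repeated so the example compiles on its own. `MapDomBuilder`, the two reserved keys, and the convention that the root object gets a null parent are all assumptions, not part of the proposal.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** The callback interface from the post, repeated here without its Javadoc. */
interface ParserInterface {
    Object onNewObject(Object parent);
    void onValue(Object owner, Object value);
    void onProperty(Object owner, String name, Object value);
}

/** Hypothetical client that builds a tree of plain maps and lists. */
final class MapDomBuilder implements ParserInterface {

    static final String VALUES = "#values";     // assumed key holding a node's value list
    static final String CHILDREN = "#children"; // assumed key holding a node's child objects

    @Override
    public Object onNewObject(Object parent) {
        Map<String, Object> node = new LinkedHashMap<>();
        node.put(VALUES, new ArrayList<Object>());
        node.put(CHILDREN, new ArrayList<Object>());
        if (parent != null) { // assumption: the parser passes null for the root object
            listAt(parent, CHILDREN).add(node);
        }
        return node;
    }

    @Override
    public void onValue(Object owner, Object value) {
        listAt(owner, VALUES).add(value);
    }

    @Override
    public void onProperty(Object owner, String name, Object value) {
        asMap(owner).put(name, value);
    }

    /** The parser only ever hands back objects created by onNewObject, so the cast is safe here. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object owner) {
        return (Map<String, Object>) owner;
    }

    @SuppressWarnings("unchecked")
    private static List<Object> listAt(Object owner, String key) {
        return (List<Object>) asMap(owner).get(key);
    }
}
```

A different client could implement the same three methods to build typed domain objects directly, and the parser would not need to change.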

Yes, parsers using callbacks are a nice way to separate parsing logic from your data-model, even if you only have 1 interface implementation.

This is kind of like SAX for TreeML… :slight_smile:

However, I need to curb my enthusiasm and not implement more features than I strictly need right now.

One thing I would like to have (that you don’t find in schema languages generally) is the ability to define enums as a query on another document. For example, a career lists a number of skills, mapped to the maximum progression possible for that skill in that career. I could define an enum in the career-schema.treeml file defining the expected skills, but that is fragile, too much work, and beyond my mental ability to keep straight. Instead I would like to define the enum as a reference to the skills.treeml file:


    enum : flow, skills, args, "skills.treeml", "skill/id"

Then my career schema is always in sync with the skills list! I’d apply the same thing to the list of career possessions, etc. The above example is not quite well formed as the last two arguments in the value list are not tokens. Also the “filename” should be considered a logical resource name and not a system-dependent file name.

I hear XPath for TreeML coming… :slight_smile:

All this effort just to avoid dem angle brackets.

Too upset after the political turmoil in Europe to do anything heavy, so I just worked on adding schemas to all my TreeML resource files. I also came up with a neat way to specify dependencies between files: rather than baking it into the schema language, which was complicated enough, I will just define a separate simple format to specify the interdependencies. Something like:


dependency :
	id : itemsConsumedByRecipe
	referencesFrom :
		resource : "resource/encyclopedia/recipes.treeml"
		path : recipe, itemsToConsume, token, nodeName
	referencesTo :
		resource : "resource/encyclopedia/items.treeml"
		path : item, id, nodeValue
dependency :
	id : toolsForRecipe
	referencesFrom :
		resource : "resource/encyclopedia/recipes.treeml"
		path : recipe, tools, token, nodeName
	referencesTo :
		resource : "resource/encyclopedia/items.treeml"
		path : item, id, nodeValue
...

Tomorrow I will implement a little dependency checker that runs off this; it should be pretty simple to write. Once it is running I will have a lot of broken references to fix, since I have been less disciplined about adding references lately, knowing that automated dependency checking was in the works :slight_smile:

The dependency checker is up and running and working a treat! I specify a path within a “from” schema and a path within a “to” schema and the links for all documents matching those schemas are validated during document load. It’s a huge productivity boost as I can just define the dependencies one by one and then walk through the errors and add the missing items, creatures etc.

This is really starting to take shape :slight_smile:

Checking dependency: adultForCreature
Checking dependency: partsOfCreature
Checking dependency: productsOfCreature
Checking dependency: skillsOfCreature
Checking dependency: attacksOfCreature
Checking dependency: skillsRequiredByItem
Checking dependency: itemsConsumedByRecipe
Checking dependency: toolsForRecipe
Checking dependency: productsForRecipe
Checking dependency: skillsForRecipe
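
The heart of such a check can be very small. The sketch below is only a guess at its shape, not the actual checker: it assumes the values at the “from” path and the ids at the “to” path have already been collected from the loaded documents, and the `DependencyCheck` name and method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: reports references that have no matching target id. */
final class DependencyCheck {

    /**
     * @param referencedValues values found at the "from" path (e.g. items consumed by recipes)
     * @param targetIds        ids found at the "to" path (e.g. all item ids)
     * @return every referenced value without a matching target id, in encounter order
     */
    static List<String> findBrokenReferences(Collection<String> referencedValues,
                                             Collection<String> targetIds) {
        Set<String> known = new HashSet<>(targetIds); // set lookup keeps the check linear
        List<String> broken = new ArrayList<>();
        for (String reference : referencedValues) {
            if (!known.contains(reference)) {
                broken.add(reference);
            }
        }
        return broken;
    }
}
```

Running this once per dependency entry and printing the broken references gives the kind of error list that can be worked through one missing item at a time.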

I’m sailing close to post-locking this thread, but anyway. I just realized today that with the dependency definition format, I can simplify the schema language because I never need to define enums in schemas. For example, instead of defining an enum in a schema for wealth categories (impoverished, poor, prosperous…), I can define a dependency on a file where I define the allowed values. This has advantages because I can now give numerical attributes to the enum values, add description texts etc.

To go on with the discussion about tabs, spaces, and indentation… nobody mentioned Python.

One pro to making tabs scope rather than {}'s can be found in a comparison between Python and JSON tab formatting. It is the same thing writing a string like…

"example" : {
	"example" : {

	}

}

and something like this

"example":
	"example":

So I think the way he has his scoping set up is really the best way. At least it isn’t like a switch in Java where, in order to have a scope per case, you need to do

case ENUM: {

}
break;