Monthly Archives: July, 2016

Next Generation of the Echelon Reference Series

Echelon Game Design Logo

In my previous post I wrote about the evolution of the Echelon Reference Series. So far there have been four stages:

  1. Raw copy and paste aggregation. Ultimately not useful to me because it threw away so much information without gaining me much.
  2. Aggregating by source document, marking up in Word using styles specific to the problem domain. Ultimately too specific, not abstract enough, and my workflow was prone to error… as illustrated very clearly when a moment’s inattention blew away almost everything. Oops… but it cleared the deck for the next version.
  3. Like the second iteration, but slightly more abstract, making it easier to handle in code. More importantly, started using source control (as I should have been from the start). Also started to parse and gain information from the text itself, allowing automated linking and cross-referencing of content. Ultimately insufficient because it lacked fine control over layout, and the automated data extraction similarly lacked fine control. Released: ERS: Barbarians, ERS: Clerics, and ERS: Sorcerers.
  4. Revised document construction mechanisms. Made workflow more efficient by building a common index file, then linking and cross-referencing information as new content was added. While it is getting closer, I realized there is opportunity for another level of abstraction in content (which I’ll talk about below) and combining the document files (as opposed to data files) so I need only maintain one set for all six versions (see below) of each. Released: ERS: Rogues, ERS: Fighters, ERS: Monks, and ERS: Rangers in RAF (except Monks, which is WIP — Work In Progress, the next stage), and will release the other ERS class books and the ERS spell books (sample ERS: Elemental Wizard Spells, PWYW).

Now to describe the next generation of the Echelon Reference Series.

Next Generation Data Geekery

Grab your pocket protectors, this is going to be a ride.

The biggest limitation I’ve had with the Echelon Reference Series lately is around data selection, and redundancy in document scaffolding. What does that mean? Well…

  • Almost every title of the Echelon Reference Series has two main flavors: PRD-Only, and 3pp+PRD. The first has material only from the PRD (well, mostly… I don’t count my augmentations), the second includes select third-party material.
  • Each book in the Echelon Reference Series has three versions, at different stages of development. I started releasing the early-stage versions (at a discount, see below) so they at least exist, and I can then improve on them.
    • The RAF (‘Rough And Fast’, politely) version has the basic text content, but not much more. No diagrams, no additional useful redundancy (such as applying the archetypes to the associated class to build a ‘new class’ and see what it looks like when the archetype is used). While I do look for places where items are referenced (such as feats and skills), I might not yet have done it exhaustively. This version of the product sells at 50% off (75% off in bundles) because it’s not complete, and buyers will also get the WIP and Final versions at no extra charge when they become available.
    • The WIP (‘Work in Progress’) is more developed. The text has been gone over more thoroughly, and I’ve added many diagrams (but possibly not all). I probably don’t have the archetype classes in place, but I might have started organizing things better. This version of the product sells at 25% off (50% off in bundles) because while it’s getting closer, it’s not actually done yet. Again, buyers will also get the Final version at no extra charge when it is available.
    • The Final version has the diagrams and the archetype classes, and so on. I’m done with this, until I add more content. This version no longer has a discount on its own, but does have a 25% discount in bundles.

The releases that don’t have both PRD-Only and 3pp+PRD versions, and don’t have RAF/WIP/Final versions, are typically sample documents (such as ERS: Elemental Wizard Spells and ERS: Teamwork Feats).

I have found while working on them, though, that even though the earlier versions aren’t as complete as I’m aiming for, they’re still pretty good. The RAF version is, after I finish cleaning the text up, a pretty close copy of the source material, but organized and consistently formatted. I can see many people preferring this version, in fact. Similarly, the WIP has that and the diagrams, without some of the redundant text that provides context: I can see people preferring this version because it doesn’t have excess material, but presents the rest of the content in a more approachable manner.

As a result I’ve pretty much decided to release each document with RAF, WIP, and Final included, when and as available, with discounts for ‘buying early’ (before Final version). This means I’ll be maintaining six versions of each title, and the current framework… does not do that well. What was more or less manageable with two versions (PRD-Only and 3pp+PRD) becomes difficult with six versions of each document.

Document Workflow

A couple years ago I described workflows for extracting data from Word files, so I won’t describe it here again… except as ‘painfully complicated and prone to error’. I’ve largely worked around the problems, but even now I run into problems when a line ends (or starts? I forget) on an HTML element, in which case the space that should follow it gets removed. This leads to cases where I get text like

Bloodline Spells magic missile(3rd)

If you look closely, there is no space character between ‘magic missile’ and ‘(3rd)’. I have not found an acceptable workaround for this that is easy and consistently effective.

Moving ahead, I will instead convert the Word files to ‘WordprocessingML’, an XML representation of Word’s internal structure for the file. This starts in a useful character encoding (UTF-8) rather than the less than useful windows-1252, and more importantly does not need HTML Tidy (which appears to have a lot of influence on the problem). This means that once the content leaves Word it will be in happy XML, where it is easy for me to get at.
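
As a rough illustration of why the XML route avoids the lost-space problem, here is a minimal Python sketch that pulls the style and text of each paragraph out of WordprocessingML. It assumes the .docx package layout (a zip file whose main part is word/document.xml); the function names, the sample paragraph, and the ‘BloodlineSpells’ style ID are made up for the example and are not the actual ERS tooling. Note that style IDs in the XML may differ from the style names shown in Word.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by WordprocessingML elements and attributes.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraphs(document_xml):
    """Return (style_id, text) for each paragraph in a WordprocessingML body."""
    root = ET.fromstring(document_xml)
    result = []
    for p in root.iter(f"{{{W}}}p"):
        style = p.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else None
        # Join every text run in order; whitespace inside <w:t> is kept as-is.
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        result.append((style_id, text))
    return result


def paragraphs_from_docx(path):
    """Read the main document part out of a .docx package (a zip file)."""
    with zipfile.ZipFile(path) as package:
        return paragraphs(package.read("word/document.xml"))


# Hypothetical sample: three runs, the middle one italic.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:pPr><w:pStyle w:val="BloodlineSpells"/></w:pPr>'
    '<w:r><w:t xml:space="preserve">Bloodline Spells </w:t></w:r>'
    '<w:r><w:rPr><w:i/></w:rPr><w:t>magic missile</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> (3rd)</w:t></w:r></w:p>'
    '</w:body></w:document>'
)

print(paragraphs(SAMPLE))
```

Because the spaces live inside the text runs themselves, nothing between the runs needs to be guessed at or tidied.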

Document Creation, Single Sourcing

I realized that with some changes to how I capture the information I can probably get each title down to a single source document (plus my data store, of course). Major sections simply copy content from my data store into the output document. For instance, a chapter containing all the rogue talents contains the chapter title and introductory text, then a long list of object IDs of content to copy from source into this document. Right now the PRD-Only and 3pp+PRD rogue talent chapters are different files, but I realized that if the IDs are properly unique I can get away with a single list and plug into different data stores depending on version. If I plug into the 3pp+PRD data file all the identified objects will be copied, if I plug into the PRD-Only data file then many of the identified objects (the 3pp ones) won’t be copied. I achieve my PRD-Only/3pp+PRD split with no further effort… at least as far as data objects are concerned.
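
The ‘one list, two data stores’ idea can be sketched in a few lines of Python. This is only an illustration under assumed data shapes: the stores are plain dictionaries keyed by object ID, and ‘rogue-talent.hypothetical-3pp-talent’ is an invented ID standing in for third-party content.

```python
def copy_objects(object_ids, data_store):
    """Copy the listed objects that exist in the store, keeping list order."""
    return [data_store[oid] for oid in object_ids if oid in data_store]


# One chapter list, shared by every build of the title.
CHAPTER_IDS = [
    "rogue-talent.bleeding-attack",
    "rogue-talent.hypothetical-3pp-talent",  # invented ID for illustration
    "rogue-talent.combat-trick",
]

# The PRD-Only store simply lacks the third-party objects.
PRD_ONLY = {
    "rogue-talent.bleeding-attack": "Bleeding Attack",
    "rogue-talent.combat-trick": "Combat Trick",
}
PRD_PLUS_3PP = dict(PRD_ONLY)
PRD_PLUS_3PP["rogue-talent.hypothetical-3pp-talent"] = "Hypothetical 3pp Talent"

print(copy_objects(CHAPTER_IDS, PRD_ONLY))
print(copy_objects(CHAPTER_IDS, PRD_PLUS_3PP))
```

The missing IDs are skipped silently here; a real build would probably want to distinguish ‘not in this store’ from ‘misspelled ID’.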

Text selection and formatting is a little more effort:

  • All PRD content ends up in the 3pp+PRD version, but there are sometimes entire chapters that exist only for the 3pp+PRD version (exalted domains are in ERS: Clerics (3pp+PRD), but not in ERS: Clerics (PRD-Only), so an ‘Exalted Domains’ chapter in the PRD-Only version would be empty and out of place). There needs to be a way to turn off certain content based on PRD-Only/3pp+PRD distinctions.
  • Final includes content not present in WIP and RAF (archetype classes, for example) and WIP includes content not present in RAF (diagrams, some expanded text). It’s easy to exclude the diagrams in the RAF version by simply ignoring the diagram instructions, but there needs to be a way to exclude text.
  • Because the content differs from version to version (that is, I might need six different sets of tweaks), there needs to be a way to include or exclude tweaks based on (PRD-Only, 3pp+PRD) and (RAF, WIP, Final) distinctions.

This actually should be easier than it sounds.

Word is very flat (except for tables): a chapter heading, a section heading, and a list nested within two other lists are all at the same level according to Word. The first two steps when processing the XML files created from these Word files are:

  1. Remove stuff I don’t care about. Word has a lot of overhead in the file, defining styles and whatnot. I don’t care about it, so I get rid of it. This step includes mapping the content elements to other elements with attributes to be used later. For instance, a paragraph with the ‘doc 4 Chapter’ style (document level 4, chapter: ‘doc 4’ indicates depth and keeps the styles ordered properly in the style manager, ‘chapter’ reminds me of the semantic intent) gets turned into <section outline-level="4" />. This happens with many elements.
  2. Build the document hierarchy, so each outline-level="1" element contains all following objects of deeper (or no) outline level, repeating until there are no more outline-levels… then do the same for list-level. (Incidentally, the game objects live somewhere around outline-level=10… and For Reasons, stat blocks are considered lists.)
    • Instructions to import files or copy game objects are also given outline-levels, which 1. keeps them from being nested incorrectly in other content, and 2. allows me to append content to them after importing.
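
The second step above can be sketched with a simple stack. This is a minimal Python illustration, not the actual transform: it handles only outline levels (the list-level pass would work the same way), and the tuple and dictionary shapes are assumptions made for the example.

```python
def build_hierarchy(flat_items):
    """Nest a flat sequence of (text, outline_level) pairs into a tree.

    Items with outline_level None (plain paragraphs) attach to the most
    recent element that has an outline level.
    """
    root = {"text": "root", "level": 0, "children": []}
    stack = [root]
    for text, level in flat_items:
        node = {"text": text, "level": level, "children": []}
        if level is not None:
            # Close every open element at the same or a deeper level.
            while stack[-1]["level"] >= level:
                stack.pop()
        stack[-1]["children"].append(node)
        if level is not None:
            stack.append(node)
    return root


FLAT = [
    ("Rogue Talents", 1),
    ("Introductory text", None),
    ("Advanced Talents", 2),
    ("More text", None),
    ("Feats", 1),
]

tree = build_hierarchy(FLAT)
print([child["text"] for child in tree["children"]])
```

Giving import and copy instructions their own outline level means they take part in this nesting like any heading would, which is what keeps them from ending up inside unrelated content.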

This gives me a very easy way to solve my problem. I can assign attributes (exact mechanism not yet determined, I have many options) to the various objects so they are relevant only for certain builds.

  • prd means ‘include only in a PRD-Only build’.
  • 3pp means ‘include only in a 3pp+PRD build’.
  • raf means ‘include only in an RAF build’
  • wip means ‘include only in a WIP build’
  • fin means ‘include only in a final build’
  • !prd means ‘do not include in a PRD-Only build’
  • !3pp means ‘do not include in a 3pp+PRD build’
  • !raf means ‘do not include in an RAF build’
  • !wip means ‘do not include in a WIP build’
  • !fin means ‘do not include in a final build’

When processing, it is very easy for me to know which version I’m working on. The “Exalted Domains” chapter I mentioned earlier would be marked (either on the chapter itself or in the include instruction) as “3pp”, meaning it is only to be included in the 3pp+PRD version, while the “Cleric Archetype Classes” would be marked “fin”. The layout tweaks can also be marked with these values, so “3pp wip” means “do this tweak only if it’s the 3pp+PRD WIP version” (because all the PRD-Only and the RAF and FIN versions don’t need this tweak).
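
A sketch of how such flags might be evaluated, in Python. The exact mechanism is undecided in the text above, so this rests on assumptions: flags are space-separated, a negated flag always excludes, and positive flags are grouped by axis (source: prd/3pp; stage: raf/wip/fin) so that ‘3pp wip’ requires both to match. The function and variable names are invented.

```python
# Which axis each flag belongs to (assumed grouping).
AXES = {"prd": "source", "3pp": "source", "raf": "stage", "wip": "stage", "fin": "stage"}


def included(flags, build):
    """Decide whether an object marked with `flags` belongs in `build`.

    flags: string such as "3pp wip", "!raf", or "" (no restriction).
    build: set with one source and one stage, e.g. {"3pp", "wip"}.
    """
    required = {}
    for flag in flags.split():
        if flag.startswith("!"):
            if flag[1:] in build:
                return False
        else:
            required.setdefault(AXES[flag], set()).add(flag)
    # Every axis that was mentioned must match the current build.
    return all(build & wanted for wanted in required.values())


print(included("3pp", {"3pp", "raf"}))      # Exalted Domains in a 3pp+PRD build
print(included("3pp", {"prd", "fin"}))      # ...and in a PRD-Only build
print(included("3pp wip", {"3pp", "fin"}))  # a tweak meant only for 3pp+PRD WIP
```

Under these assumptions an unmarked object goes into all six builds, which matches the idea that most content needs no flags at all.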

This largely solves my ‘scaffolding’ problem. A single set of input documents should now be transformable into six output documents, depending on flags set. I’ll need to make some changes in the make-the-final-document scripts, but by and large this should greatly reduce the file handling I need to do.

Data Hierarchy

The earliest versions of the Echelon Reference Series data store, at least after I started parsing the data, had very specific styles, from ‘class’ down to individual class subfeature types such as rage powers and bardic performances. The middling versions used some better abstractions and let me get away from being quite so specific, but when rendering the documents I had to examine the context of the object to see what it was. For instance, I would deduce that a particular class-subfeature is a rage power because its parent was a class-feature called ‘rage power’. This worked, as far as it went, but led to my ‘inserting’ parent data objects so the abstractions would work.

This had two unfortunate side effects, one minor and one more significant. The minor was that if I were to render the source document in PDF (as I often do as a data check, to verify the structure is correct) I would have extraneous objects in the document. This is ultimately not a big deal, but I found it jarring. The more significant effect was that it caused me to have many objects in the system with exactly the same ID. This was more troublesome.

It looks like the easiest solution consists of embracing abstraction. Much of the time I need only know that I have an object and what type it is (i.e. a label). It is still necessary to be able to nest the items, but the following seems to work well:

  • Replace all data-specific stat-block styles (spells and monsters are the most-used, but there are others) with ‘d20 Abstract’, ‘d20 Abstract Group’, and ‘d20 Abstract Sub’. These can be applied to all object types. ‘d20 Abstract’ is the most commonly used, ‘d20 Abstract Group’ provides a heading in the stat block (often seen in monster stat blocks), and ‘d20 Abstract Sub’ is a child object of a ‘d20 Abstract’, most commonly used so there can be more than one paragraph in a stat block field (such as a monster special ability that needs more than one paragraph to describe). These are actually identified internally as list items so they can interact and include lists.
  • Replace all game object styles with a combination of nine styles (three sets of three):
    • d20-1-Decl, d20-1-Object, d20-1-Section (Heading levels 1-3)
    • d20-2-Decl, d20-2-Object, d20-2-Section (Heading levels 4-6)
    • d20-3-Decl, d20-3-Object, d20-3-Section (Heading levels 7-9)
  • Add ‘d20 Attribute’, which adds or overrides meta information about the object, that isn’t game information. This is used mostly to provide processing hints, and doesn’t get used much.

The naming scheme seems odd, but is set up that way so they appear in my style manager in a useful order.

Functionally there is no difference between a d20-2-Object and a d20-1-Object, except that the d20-2-Object can be nested within a d20-1-Object… and because I declare the types explicitly now, this does not come up often.

When I encode a character class, I can do something like

[d20-1-Decl] Class

[d20-1-Object] Rogue

(description goes here)

[d20-1-Section] Class Features

Rogues have the following class features

[d20-2-Decl] Class Feature

[d20-2-Object] Weapon and Armor Proficiency

(description)

[d20-2-Object] Sneak Attack

(description)

[d20-2-Object] Trapfinding

(description)

[d20-2-Object] Evasion

(description)

[d20-2-Object] Rogue Talents

(description… not including actual rogue talents)

The headings — all the styled paragraphs in the block quote above — are actually increasingly indented in Word, to make the hierarchy easier to see. They also show up in the navigation pane in tree format, making it easy to navigate.

While processing, I end up with an object (of type ‘class’) called ‘rogue’, with the ID ‘class.rogue’. This has descriptive text and a section containing five objects (second-tier, but still ‘objects’) of type ‘class feature’. These objects each have a description and are called respectively ‘Weapon and Armor Proficiency’, ‘Sneak Attack’, ‘Trapfinding’, ‘Evasion’, and ‘Rogue Talents’. Because they are inside another object, though, their IDs are slightly different: class-feature.weapon-and-armor-proficiency.rogue, class-feature.sneak-attack.rogue, class-feature.trapfinding.rogue, class-feature.evasion.rogue, and class-feature.rogue-talents.rogue. Each also has a ‘group ID’ (gid) that has the ‘.rogue’ suffix removed.
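
To make the ID scheme concrete, here is a small Python sketch that walks (style, text) pairs like the ones above and derives the IDs. It is a simplification built on assumptions: only the nearest enclosing object contributes a suffix, sections and plain paragraphs are ignored, and the function names are invented.

```python
import re


def slug(text):
    """Lower-case the text and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def assign_ids(paragraphs):
    """Return (id, gid, name) for each d20-N-Object paragraph."""
    declared = {}  # tier -> type named by the latest d20-N-Decl
    open_objects = {}  # tier -> name of the latest object at that tier
    objects = []
    for style, text in paragraphs:
        match = re.fullmatch(r"d20-(\d)-(Decl|Object|Section)", style)
        if match is None:
            continue
        tier, kind = int(match.group(1)), match.group(2)
        if kind == "Decl":
            declared[tier] = text
        elif kind == "Object":
            # A new object closes any object at the same or a deeper tier.
            for closed in [t for t in open_objects if t >= tier]:
                del open_objects[closed]
            gid = f"{slug(declared[tier])}.{slug(text)}"
            if open_objects:
                parent = open_objects[max(open_objects)]
                object_id = f"{gid}.{slug(parent)}"
            else:
                object_id = gid
            open_objects[tier] = text
            objects.append((object_id, gid, text))
    return objects


ROGUE = [
    ("d20-1-Decl", "Class"),
    ("d20-1-Object", "Rogue"),
    ("d20-1-Section", "Class Features"),
    ("d20-2-Decl", "Class Feature"),
    ("d20-2-Object", "Weapon and Armor Proficiency"),
    ("d20-2-Object", "Evasion"),
]

for object_id, gid, name in assign_ids(ROGUE):
    print(object_id, gid, name)
```

The gid falls out for free: it is the ID before the parent suffix is added.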

That each instance of the object now has a unique ID is incredibly valuable. It lets me identify and refer to (or copy) a specific data object. In many ways they are equivalent (they do the same thing, and satisfy the same prerequisites… usually), but in some ways they are different (class-feature.evasion.rogue is gained at rogue second level, but class-feature.evasion.ranger is gained at ninth level).

This also makes it feasible for me to define ‘universal class features’ (provide a single standard definition for a class feature such as evasion). I can then change the class-specific definitions to the class-specific application (‘Rogues gain evasion at 2nd level’, ‘Monks gain evasion at 2nd level’, ‘Rangers gain evasion at 9th level’). It will no longer be necessary to define the class feature each time a class gains it, and more importantly it becomes reasonable to get rid of ‘gains evasion, as a 2nd-level rogue’.

Regarding the rogue talents above, the class feature describes the rules for rogues taking rogue talents (gained at 2nd level and every even level after that). The actual rogue talent definitions happen outside the class, mostly because most rogue talents are not defined with the class (they appear in other supplements). Also, rogues aren’t the only class to gain rogue talents, so defining them outside the class makes it easier to ‘share’ them. I might put the following in another chapter (using d20-2-Decl):

[d20-2-Decl] Rogue Talent

[d20-2-Object] Bleeding Attack (Ex, Sneak Attack Exclusive)

(description goes here)

[d20-2-Object] Combat Trick

(description goes here)

[d20-2-Object] Fast Stealth (Ex)

(description goes here)

[d20-2-Object] Finesse Rogue

(description goes here)

This gives me four new objects, named ‘Bleeding Attack’, ‘Combat Trick’, ‘Fast Stealth’, and ‘Finesse Rogue’ (with IDs rogue-talent.bleeding-attack, rogue-talent.combat-trick, rogue-talent.fast-stealth, rogue-talent.finesse-rogue). I used d20-2 styles to show that it isn’t necessary to start at d20-1.
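
Note that the object names above drop the parenthesized type tags. One way that split might be done is sketched below in Python; the function name is invented, and lower-casing the tags is an assumption made for the example.

```python
import re


def split_heading(heading):
    """Split 'Name (Tag, Tag)' into the name and a list of type tags."""
    match = re.fullmatch(r"(.*?)\s*(?:\(([^()]*)\))?", heading.strip())
    name = match.group(1)
    raw_tags = match.group(2)
    tags = [tag.strip().lower() for tag in raw_tags.split(",")] if raw_tags else []
    return name, tags


print(split_heading("Bleeding Attack (Ex, Sneak Attack Exclusive)"))
print(split_heading("Combat Trick"))
```

Keeping the tags as a list rather than as part of the name is what leaves room for turning them into objects of their own later.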

In case you’re curious, Bleeding Attack is given the ‘sneak attack exclusive’ type for clarity. “Talents marked with an asterisk add effects to a rogue’s sneak attack. Only one of these talents can be applied to an individual attack and the decision must be made before the attack roll is made.” is not terribly useful when asterisks are used in many places for different things; I prefer to be explicit.

And with a little bit of forethought, I can even prepare for the type tags to be objects themselves, so if I want I can define them in data and provide textual descriptions for them that I can present when needed.

Taxonomy

I had originally planned to write about object type taxonomy, but this article is already almost 2,900 words long! Next post!

Evolution of the Echelon Reference Series

After I finish releasing the RAF versions of the Echelon Reference Series, I’ll be rebuilding my workflow for data capture. It will be simpler and more abstract, and more powerful. I thought it might be interesting to show how this has evolved.

Origin of the ERS

Echelon Game Design Logo

The ERS started as a set of research documents for Echelon. There have been several iterations.

First Cut

Originally I just created a file for each topic — feats, spells, rage powers, etc. — and dumped all the relevant bits into the file. This worked for a while, but I started running into places I wanted to split the data up or organize it in different ways. The feat document could be split into item creation feats, metamagic feats, combat feats, style feats, and… the last two could be applied to the same feat.

And the prerequisites. The complexity of the prerequisites started to overwhelm me, and there wasn’t a lot I could do about it. First design scrapped, thankfully before I invested too much in it.

Second Cut

I started to use styles to mark data types, with a style for each data type of interest. I didn’t limit it to ‘feats’ and ‘spells’, either: rogue talents, advanced rogue talents, domains, all had their own styles (that mostly looked like Heading 3… I didn’t start changing the style presentation until later). Using heading styles for game elements made it easy to navigate my Word file because they’d show up in the navigation pane, but making them look the same meant it was easy to assign the wrong one.

I was manually saving the files as Word and ‘filtered HTML’ formats so I could do a series of transformations to parse and extract information. Everything was done in a single directory… don’t do that. Hundreds of source files, each combined with about eight intermediate files before hitting the final form made for a huge directory full of stuff I don’t care about, with important stuff inside it.

More intermediate files than that, even. After parsing the information I would spray out the individual items into their own files — each feat, each spell, each rage power, etc. — that would be picked up by a script for inclusion in output. I started building PDFs via LaTeX, but it wasn’t very manageable. It was also very easy to accidentally delete something.

… as I actually did. Got a new computer, was working on it and decided to blow away the ‘copy I’d made’, forgetting I hadn’t actually copied the files but created a shortcut to the folder to ensure I kept both versions in step.

Much to my surprise, after a moment of shock I wasn’t even particularly upset. No swearing or tears involved, even, just a deep breath. I was running into severe limitations and was ready to move on already.

Third Cut

This version saw the release of ERS: Barbarians, ERS: Clerics, and ERS: Sorcerers. This is where I started to abstract the data types. ‘Class feature’ had been around for a while, but I replaced all the individual feature types (rogue talent, etc.) with a more abstract ‘class subfeature’ type. This reduced the mental space needed to keep track of things, and simplified parsing later quite a bit.

Rogue talents and rage powers really parse about the same way, so why differentiate in code? I know from the parent class-feature what this is (class-subfeature with class-feature of ‘Rage Power’ means I’m looking at a rage power, right?) so I could mark it and move on. Lots of heavy encoding in the object markers, mostly for automation reasons.

Runeforger (Su) [1 Forgemaster’s Blessing; 2; 4; 6; 8; 10; 12; 14; 16; 18; 20] <channel energy>

A forgemaster may inscribe mystical runes upon a suit of armor, shield, or weapon as full-round action, using this ability a number of times per day equal to 3 + her Intelligence modifier. These runes last 1 round per cleric level, but inscribing the same rune twice on an item increases this duration to 1 minute per level, three times to 10 minutes per level, and four times to 1 hour per level. Erase affects runes as magical writing. A forgemaster learns forgemaster’s blessing at 1st level and may learn one additional rune at 2nd level and every 2 levels thereafter. Only one type of rune marked with an asterisk (*) may be placed on an item at any given time.

Using the runeforger class feature of the forgemaster cleric archetype as an example, the heading line has:

  • class feature name,
  • class feature type (Su, supernatural ability),
  • [1…20] showing the levels at which it is applied (first level gives ‘Runeforger (Forgemaster’s Blessing)’, then every even level up to 20th you get another runeforger choice), included mostly so I could automatically create the level table correctly (or at least automatically),
  • <replaces class feature> indicating that the forgemaster gets runeforger instead of the ability to channel energy. <<double angle brackets>> indicates that this new feature only modifies an existing feature.
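
A heading line in this format can be pulled apart with a regular expression. The following Python sketch is an illustration only, not the parser actually used; the pattern, the dictionary keys, and the handling of optional parts are assumptions.

```python
import re

# Pattern for a heading of the form:
#   Name (Type) [level entries; ...] <replaced feature>
HEADING = re.compile(
    r"(?P<name>[^(\[<]+?)\s*"
    r"(?:\((?P<type>[^)]*)\))?\s*"
    r"(?:\[(?P<levels>[^\]]*)\])?\s*"
    r"(?:(?P<open><<?)(?P<target>[^<>]+)>>?)?\s*"
)


def parse_heading(line):
    """Split a heading line into name, type, level entries, and target."""
    match = HEADING.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized heading: {line!r}")
    levels = []
    for entry in (match.group("levels") or "").split(";"):
        entry = entry.strip()
        if entry:
            # "1 Forgemaster's Blessing" -> (1, "Forgemaster's Blessing")
            number, _, label = entry.partition(" ")
            levels.append((int(number), label or None))
    return {
        "name": match.group("name"),
        "type": match.group("type"),
        "levels": levels,
        "target": match.group("target"),
        # Double angle brackets mean "modifies" rather than "replaces".
        "modifies": match.group("open") == "<<",
    }


LINE = "Runeforger (Su) [1 Forgemaster's Blessing; 2; 4; 6; 8; 10; 12; 14; 16; 18; 20] <channel energy>"
parsed = parse_heading(LINE)
print(parsed["name"], parsed["type"], parsed["target"], len(parsed["levels"]))
```

Packing this much into one line is compact to type, but every field has to be recovered by pattern matching, which goes some way toward explaining the lack of fine control described next.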

This actually worked, more or less. I could parse and render the items, and even eventually combine them (applying archetype to base class) to get the archetype class, complete with level table. I had almost no control over it, though.

Parsing, Prerequisites, and Pictures

Animal Companion Diagram

This was also the version where I finally had the ability to extract and automate prerequisite links. This gave me the ability to draw pictures showing me the prerequisite relationships between game elements. The first pass would be machine drawn using GraphViz (software that takes a list of nodes and edges and draws the diagram — linked page includes sample diagram for the Improved Sunder feat), which was helpful for understanding but not very pretty when I tried to include them in the PDF.
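
For a sense of what the machine-drawn first pass involves, here is a minimal Python sketch that turns a list of prerequisite edges into GraphViz DOT text. The function name is invented and the edge list is a small illustrative sample rather than extracted data; the output would then be handed to the GraphViz `dot` tool for rendering.

```python
def to_dot(edges):
    """Render (prerequisite, element) pairs as a GraphViz DOT digraph."""
    lines = ["digraph prerequisites {", "  rankdir=TB;"]
    for prerequisite, element in edges:
        lines.append(f'  "{prerequisite}" -> "{element}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative sample of prerequisite links.
EDGES = [
    ("Power Attack", "Improved Sunder"),
    ("Improved Sunder", "Greater Sunder"),
]

print(to_dot(EDGES))
```

GraphViz decides node placement itself, which is convenient for exploration and is also why the results were not pretty enough for the PDF.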

I would then redraw the diagrams using PGF/TikZ, a LaTeX package that, well, draws diagrams. I’d write down where to put each node and how to draw the edges, then the image would be rendered and added to the output PDF. This let me put diagrams with hyperlinks in the documents.

For example, the diagram to the right shows the relationship between Animal Companion (top, second column over) and a bunch of feats (pale brown). In many cases Animal Companion is not sufficient, and the feat has other prerequisites (class features or subfeatures, mostly), and in others Animal Companion is one of several options to meet a prerequisite (one of animal companion or familiar or mount or divine bond… or some subset of these).

The other thing the prerequisite parsing and linking allowed was the automated discovery of ‘class-relevant’ feats. I could take the list of class features for the base class and archetypes, and look for feats that had those class features (or subfeatures) as prerequisites. I could then pull a copy of those feats into the PDF. Some of the choices ended up looking nonsensical because they required features from other classes as well, but since the barbarian’s uncanny dodge did meet the prerequisite I kept them.
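
The discovery step amounts to a set intersection. A minimal Python sketch, assuming prerequisites have already been parsed into sets of class feature names; the feat table below is sample data for illustration (including one invented feat), not extracted rules text.

```python
def class_relevant_feats(class_features, feats):
    """Return names of feats with at least one of the class's features as a prerequisite."""
    return sorted(name for name, prereqs in feats.items() if class_features & prereqs)


BARBARIAN = {"rage", "uncanny dodge", "trap sense"}

FEATS = {
    "Extra Rage": {"rage"},
    "Extra Channel": {"channel energy"},
    # Invented feat that also needs a feature from another class.
    "Hypothetical Dodge Feat": {"uncanny dodge", "evasion"},
}

print(class_relevant_feats(BARBARIAN, FEATS))
```

The invented feat shows the ‘nonsensical’ case: it is picked up because uncanny dodge matches, even though a barbarian alone would not have evasion.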

Subversion Repository

This also is where I started working with an offsite Subversion repository. I had my formatting software on a server offsite and SVN allowed me to send minimal file changes from my workstation to the server, while making it almost impossible for me to accidentally blow the whole thing away again. I’m now creeping up on revision 2000.

Automated Conversion

In Second Cut I was manually saving the files as DOCX and Filtered HTML as I went. This was troublesome and annoying, and made it frightfully easy to get out of step — a couple times I ended up making the same change more than once because I’d saved as Filtered HTML and forgot to save it again as DOCX.

I wrote a program that would go through the ‘docx directory’ and convert all the files to Filtered HTML. Every time it was run. Very time-consuming and wasteful, but after learning of the ‘%’ file designator I switched to making the program a filter (single-input, single-output) and adding a make rule so only the files that were out of date needed to be replaced.
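
The check that make performs for such a rule is essentially a timestamp comparison. As a rough Python equivalent (an illustration of the idea only; the directory layout, file names, and function names are assumptions, and the real setup used a make rule):

```python
import os
import tempfile


def needs_rebuild(source, target):
    """True if the target is missing or older than its source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(source)


def stale_pairs(docx_dir, html_dir):
    """List (source, target) pairs whose HTML output is out of date."""
    pairs = []
    for name in sorted(os.listdir(docx_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".docx":
            continue
        source = os.path.join(docx_dir, name)
        target = os.path.join(html_dir, stem + ".html")
        if needs_rebuild(source, target):
            pairs.append((source, target))
    return pairs


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    docx_dir = os.path.join(root, "docx")
    html_dir = os.path.join(root, "html")
    os.makedirs(docx_dir)
    os.makedirs(html_dir)
    samples = [(docx_dir, "a.docx", 2000), (docx_dir, "b.docx", 1000), (html_dir, "b.html", 2000)]
    for folder, name, mtime in samples:
        path = os.path.join(folder, name)
        open(path, "w").close()
        os.utime(path, (mtime, mtime))  # set access and modification times
    stale = [os.path.basename(source) for source, _ in stale_pairs(docx_dir, html_dir)]

print(stale)
```

Only a.docx is reported, because b.html is already newer than b.docx; the converter then runs once per stale file instead of over the whole directory.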

Limitations

The third cut did a lot, but it didn’t give very good control over the output. There was no facility for being selective: the druid got access to some cleric domains, so if I included domains in the druid book I had to include all the domains. I could add a note to the front of the chapter that said “druids get only these ones, ignore the rest”, but after including subdomains that amounted to about a hundred pages when I needed only a couple dozen (which is why I didn’t release the druids book under the ‘third cut’). It also gave very little control over spacing and the like, the small tweaks that let me fix unfortunate object placement. That’s why this version sees some places where a feat name appears at the bottom of a page and the description on the next.

Fourth Cut

The lack of control over layout was getting to me, so I took steps to clean the data up some more and regain more control over the layout. This led to the release of the ‘RAF’ (‘Rough And Fast’) releases that grabbed all the information, but didn’t add the diagrams or useful redundancy (and have a ticket price of 50% less than the final version will). I no longer capture certain information (such as the detailed level progression of the class feature within the class) and reworked the entire linking process so it could work on individual files as it parsed them, and was more accurate.

This cost me some of my automated data discovery, though. I no longer have the tools in place (with the current data set) to generate the initial version of each of the books. It also does not handle duplicate items well, something that really showed up (to me) with Draconic Bloodlines, when it could not differentiate between the ‘Claws’ bloodline powers. It seems a minor thing, but it bothered me.

So, while I did release ERS: Rogues, ERS: Fighters, ERS: Monks, and ERS: Rangers in RAF (except Monks, which is WIP — Work In Progress, the next stage), and will release the other ERS class books and the ERS spell books (sample ERS: Elemental Wizard Spells, PWYW), I will be revising the entire workflow and data capture process to make it easier to move to the next step.

And I’ll describe that in my next post.