Legitimate Text Processing

Update: Hours after this post went up, Jeff Atwood renamed his fork of Markdown to “Common Markdown”. All the criticism below is still 100% valid. They’re making a project that suits their own needs, but using a (new) name to suggest some level of primacy over other Markdown dialects, and over Markdown itself.

Markdown was made by John Gruber, and it’s a great way to turn easy-to-read text into unreadable HTML. It’s a limited syntax that can be expanded on. It’s become wildly popular, particularly for blogging and for web sites that have comment systems. It’s even influenced Fountain, a similar specification for writing heavily formatted screenplays in plain text.

Some people have to write code that converts Markdown text into HTML. Many hewed closely to what the original tool generated, which makes sense. Then people ran into cases that weren’t precisely explained, or areas where the tool didn’t do what someone wanted. This means that the same input will sometimes produce different HTML code, but even that doesn’t always look wrong in the browser. More often than not, people wanted to add on features.

People called their Markdown implementations something clever. Like “GitHub Flavored Markdown” or “Python Markdown” or “Kramdown” or whatever. So here you have a ton of little tools that do slightly different things — either intentionally or unintentionally.

This drives some people nuts because they want there to be a proper way. They want the correct way. That’s cute, because the people that want a correct version to refer to, and test against, are the same people that make their own syntax features.

Here, let this guy lacking self-awareness explain how he oversaw two different Markdown implementations:

We really struggled with this at Discourse, which is also based on Markdown, but an even more complex dialect than the one we built at Stack Overflow. In Discourse, you can mix three forms of markup interchangeably:

Markdown

HTML (safe subset)

BBCode (subset)

If there was a standard, Jeff would still have ignored the standard if it didn’t fit the products he made. The flexibility to make your own dialect trumps adhering to anything. This is the point of every Markdown service.

Jeff highlights John MacFarlane, creator of Pandoc, and a tool John made called Babelmark. The tool lets a person compare the code generated by the default behavior of a variety of Markdown tools. Example. This is neat, and clearly, you can see that the code is different. If you flip over to the preview, you’ll see there’s not much visual difference here.

I know, horror of horrors, it’s not conforming to one, specific thing that can be tested and verified. Pass or fail. That would be neat and tidy, wouldn’t it? It ignores reality though.
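To make the Babelmark point concrete, here is a rough sketch of "different code, same result". The two HTML strings are made-up stand-ins for what two Markdown tools might emit for the same two-item list. They are not copied from Babelmark or from any real implementation, and `TokenCollector` and `tokens_of` are hypothetical helper names. The sketch uses only Python's standard `html.parser`, and it only ignores whitespace between tags, so treat it as an illustration and not a real conformance check.

```python
from html.parser import HTMLParser


class TokenCollector(HTMLParser):
    """Collects tags and text, ignoring whitespace-only text between tags."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Sort attributes so their order in the source does not matter.
        self.tokens.append(("start", tag, tuple(sorted(attrs))))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        # Collapse runs of whitespace; drop text that is only whitespace.
        text = " ".join(data.split())
        if text:
            self.tokens.append(("text", text))


def tokens_of(html):
    """Return a whitespace-insensitive token list for an HTML fragment."""
    parser = TokenCollector()
    parser.feed(html)
    parser.close()
    return parser.tokens


# Hypothetical outputs from two tools for the same Markdown list.
output_a = "<ul>\n<li>one</li>\n<li>two</li>\n</ul>\n"
output_b = "<ul><li>one</li><li>two</li></ul>"

print(output_a == output_b)                        # the raw strings differ
print(tokens_of(output_a) == tokens_of(output_b))  # the structure matches
```

A strict string comparison calls these two a mismatch. A browser, like the token comparison here, doesn't care.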

In the “Standard Markdown” spec, they include GitHub Flavored Markdown’s “fenced code blocks”. Oh! Well, would you look at that! It’s a feature that serves the needs of one of the “Standard Markdown” contributors. It has nothing to do with the original specification of Markdown. This isn’t solely about removing ambiguity, of course, it’s about making the Markdown someone wants into the correct Markdown.

Where are the tables? Tables are not a “core feature” like GitHub fenced code blocks. Where are the IDs for headers? No one needed them, but Jeff agrees about maybe putting them in. Where’s the metadata storage? Guess no one needed metadata storage. Maybe they’ll come later on, and we’ll have “Standard Markdown 2.0 Compliant” badges we can adorn our blogs with. Maybe we can put a special header in our text files that says what the human-readable text should be processed with? You know, like “#!/usr/bin/StandardMarkdown/Official/2.0.1.a/” Something easy on the eyes.

This blog, which is really simple, and dumb, uses Python Markdown. It offers a series of extensions that can be enabled, disabled, and configured to suit my needs. I use metadata to store things like the title, and publish date. I use a table of contents package to create anchors for the headings. None of this stuff is supported by Markdown or “Standard Markdown”, and Python Markdown doesn’t even do it out-of-the-box.
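For the curious, a setup like the one described above might look roughly like this. It's a minimal sketch, not this blog's actual build script. It assumes the third-party Python Markdown package is installed (`pip install markdown`) and uses its bundled `meta` and `toc` extensions. The `Title` and `Date` keys and the sample text are assumptions made up for the example.

```python
import markdown

# Sample post: metadata lines at the top, a blank line, then the body.
# The key names (Title, Date) are illustrative assumptions.
SOURCE = """Title: Legitimate Text Processing
Date: 2014-09-04 13:20:51

# Legitimate Text Processing

Body text goes here.
"""

# Out of the box: no extensions, so the metadata lines are treated as an
# ordinary paragraph and the heading gets no id attribute.
plain = markdown.markdown(SOURCE)

# With extensions: "meta" strips and parses the leading key/value lines,
# "toc" adds an id attribute to each heading.
md = markdown.Markdown(extensions=["meta", "toc"])
html = md.convert(SOURCE)

# Meta keys are lowercased, and each value is a list of lines.
title = md.Meta["title"][0]
published = md.Meta["date"][0]

print(title, published)
print(html)
```

The same text goes in both times. What comes out depends entirely on which extensions are switched on, which is rather the point.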

Byword is my writing app of choice. It includes certain visual cues for Markdown elements based on the popular MultiMarkdown (MMD) syntax. I don’t get visual hinting for all of the elements I write that Python Markdown will convert. That’s fine. It’s great that neither strictly adheres to Gruber’s original Markdown. There’s enough here to make this all work smoothly, and I’m not surprised by the outcome very often. The alternative is that I have a rigidly enforced system that does not do what I want it to do.

Like Stack Exchange, Discourse, or GitHub, we all have needs. This is here to make things easier for us. If we have to have some cockamamy specification laid down that all must yield to, then I don’t find it appealing. Is the “Standard Markdown” team going to allow all these customizations? They fly in the face of what they deem correct. Will every customization need vetting and approval through some Discourse board where I’ll have very little sway?

I Can’t, With That Name

A petty, vainglorious power-grab of a name. What’s in a name? That which we call a Fork, by any other name would be just as forky.

Standard - This is like telling everyone you’re cool. “Hi everyone, I’m Cool Joe! Come hang out with me!” Congratulations on jinxing yourself? The iPhone is not called “Standard Phone”. Also, as I’ve established above, this is standard in name only. A few guys made this in secret to scratch their own itch.
Markdown - Lots of things use “Markdown” as part of the name of their implementation of Markdown. The Python library I’m using does this. It’s usually not paired with “Standard”, “Official”, “One and Only”, or “Legal” to imply it holds some special place. This is, after all, a fork.

This is about legitimizing their fork over all the others. Not just another fork here, this one is named “Standard Markdown”!

For someone that says he loves Markdown, Jeff doesn’t seem to understand anything about why it is popular. Or why attempts to rein in the wild sprawl are bound to fail.

2014-09-04 13:20:51

Category: text

Unauthoritative Pronouncements
