Wednesday, November 4, 2009

The Bigger Picture

Have you ever stopped to think about what you’re actually doing when you mark up a new Web page? And what’s all this about markup, anyway?

Whenever there is a new technology that is similar to something current, we see a massive adoption of the terminology, even the jargon, of the Olde Ways. Even customs and lore seem to transfer over. When Man learned to fly a hundred and six years ago, the closest thing we had to describe and govern this behavior was shipboard navigation, and so a lot of nautical stuff was quickly adopted and adapted for use in aviation.

The original idea behind the Web was that we would be laying out online documents… hmm… kind of like setting type. Sure, it was much easier to add color or edit words, but the basic ideas seemed to fit almost perfectly. And since the language of typesetters was called markup, we took to marking up our Web pages. We don’t have any of the cool editing symbols, though I still remember a few of them. But the basic idea is the same. You start with a document, sometimes typed but sometimes handwritten, and you start putting little symbols into it to tell the typesetter how you want it to look. The typesetter then loads everything onto a giant plate, from which you print as many copies of the document as you need. Trust me, in Mark Twain’s day, this was heady stuff indeed.

So from all of this we get the basic structure of HTML today. All of our tags begin with “<” and end with “>”, and most sort of describe or remind us of their action. We put a slash in front of the same tag name, right after the “<”, to indicate we are done with its behavior, whatever it was. That is, a “p-tag” (<p>) begins a paragraph, while a “slash-li-tag” (</li>) indicates the ending of a list item. We start at the very top and the very bottom, with <html> and </html>, turning on and turning off the HTML-ness of our document. “Here is where our HTML begins, and here is where our HTML ends.” HTML being, let’s remember, the HyperText Markup Language. So we have a start and an end, and everything in between is (wait for it) HTML.
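Here is a rough sketch of what those opening and closing tags look like in practice. The text inside the tags is made up purely for illustration.

```html
<!-- <p> turns the paragraph on, </p> turns it off -->
<p>This sentence lives inside a paragraph.</p>

<!-- <ul> opens the list, each <li>...</li> pair wraps one item -->
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
```

Notice that every closing tag is just the opening tag again with a slash tucked in after the “<”.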

A section of our document has been set aside to help describe and control the rest of it. <head> begins the head of our page, and this is where we link to any external stylesheets or JavaScript files, put any metadata we want to include, and so on. Only the <title> is actually visible to our page visitors here, unless something has gone horribly wrong. The rest of it is really only useful to Web servers, Web browsers and search engines. But we describe that area, too. <head> and </head>.
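A head section along those lines might look something like the sketch below. The file names (styles.css, site.js), the title and the description text are all hypothetical placeholders, not anything from a real site.

```html
<head>
  <!-- The one thing here a visitor actually sees, up in the browser's title bar -->
  <title>The Bigger Picture</title>

  <!-- Metadata, mostly for search engines (placeholder text) -->
  <meta name="description" content="Some thoughts on where markup came from">

  <!-- External stylesheet and script; both file names are made up -->
  <link rel="stylesheet" type="text/css" href="styles.css">
  <script type="text/javascript" src="site.js"></script>
</head>
```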

The part below the head is the body, so <body> and </body> come next. Every visible thing except for the page title appears here, so the body of your document is crucial. We place a <body> tag under the </head> to indicate the body is starting, and a </body> tag right before the final </html> tag to indicate we are done with the body, in a non-forensic doctor kind of way.
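Put the pieces together and you get a skeleton something like this one. It is a bare-bones sketch with placeholder text, and it leaves out the doctype line that real pages usually put above the <html> tag, since that is a story for another day.

```html
<html>
  <head>
    <title>Page title goes here</title>
  </head>
  <body>
    <!-- Everything your visitors see on the page goes in here -->
    <p>Hello, Web.</p>
  </body>
</html>
```

The nesting is the whole point: the head and the body sit side by side, and both sit inside the html.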

The elements inside the body tags are very close to actual nineteenth-century page markup. We place paragraph tags around the text we want to be paragraphs. We place heading tags around the text we want to be headings. We build complex tag structures to describe tables and lists. In those cases we need more than just an “it-begins-here” tag and an “it-ends-here” tag. We need to indicate the beginning and the ending of a table or a list, sure. But we also need to mark up the various elements within it. With tables, we next describe the start and end of our table rows, and then inside those we even describe the beginning and ending of each individual table data cell. For lists, we must indicate the start and end of each list item.
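To make that layering concrete, here is a small sketch of a heading, a table and a list. The contents are invented for illustration; what matters is the tags-inside-tags structure.

```html
<h1>A heading</h1>

<!-- Table: <table> wraps rows (<tr>), and rows wrap data cells (<td>) -->
<table>
  <tr>
    <td>Row 1, cell 1</td>
    <td>Row 1, cell 2</td>
  </tr>
  <tr>
    <td>Row 2, cell 1</td>
    <td>Row 2, cell 2</td>
  </tr>
</table>

<!-- List: <ol> wraps the whole list, <li> wraps each item -->
<ol>
  <li>Mark up the document</li>
  <li>Hand it to the typesetter</li>
  <li>Print as many copies as you need</li>
</ol>
```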

I’m sure that back in the day there were people renowned for their markup skills, just as there are today. I’m sure there were pages that were structurally awful but looked okay on the page, just as we suffer today from bad markup with good-enough results.

We are probably lucky things turned out this way. The dominant model for computer communication in the Web’s early days was a high-level programming language that translated the ideas of the programmer into the machine language that the computer understands. We could easily have ended up with a markup model that involved either compiling or interpreting machine instructions. That would have led us down a whole different path.
