Safekipedia

HTML

Adapted from Wikipedia · Adventurer experience

HTML5 official logo (official since 1 April 2011)

Hypertext Markup Language (HTML) is the standard markup language for documents shown in a web browser. It tells the browser what the content is and how it should be organized. HTML works with Cascading Style Sheets (CSS) for design and JavaScript for adding interactive features.

When you open a website, your web browser reads an HTML document. This document might come from a web server or be saved on your device. The browser then turns the HTML into a page you can see, with text, pictures, and more.

HTML uses special words called tags, written with angle brackets, to describe the structure of a page. Tags can make text into headings, paragraphs, or lists. They can also add images and links to other pages. You won’t see these tags when you look at the page, but they help the browser understand what to show.

With HTML, you can add programs written in JavaScript to make pages do things like play games or update information automatically. A newer version called HTML5 lets browsers show video and audio directly on pages.
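
As a rough sketch of how these pieces can fit together, the page below uses HTML for structure, one CSS rule for design, a short JavaScript program, and an HTML5 video element. The file name clip.mp4, the id values, and all of the text are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Three languages on one page</title>
    <!-- CSS: controls how the heading looks -->
    <style>
      h1 { color: darkblue; }
    </style>
  </head>
  <body>
    <!-- HTML: says what each piece of content is -->
    <h1>My first page</h1>
    <p id="message">Press the button.</p>
    <button id="hello-button" type="button">Say hello</button>

    <!-- HTML5: shows a video directly on the page.
         "clip.mp4" is a placeholder file name. -->
    <video src="clip.mp4" controls>
      Your browser cannot play this video.
    </video>

    <!-- JavaScript: changes the paragraph when the button is pressed -->
    <script>
      document.getElementById("hello-button").addEventListener("click", function () {
        document.getElementById("message").textContent = "Hello!";
      });
    </script>
  </body>
</html>
```

Saving this as a file ending in .html and opening it in a browser should show the heading, the button, and a video player. The player stays empty unless a real video file is supplied.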

History

Development

In 1980, physicist Tim Berners-Lee made a system called ENQUIRE to help people share documents. In 1989, he wrote about a way to connect documents on the Internet using hypertext. He made HTML and the first web browser and server in 1990. That year, he and Robert Cailliau asked for money for the project, but they did not get support from CERN.

The first public description of HTML was a document called “HTML Tags” in 1991. It talked about 18 basic parts of HTML. Eleven of these parts are still used today.

HTML is a markup language that web browsers use to show text, pictures, and other things on web pages. Designers can change how things look using CSS. HTML focuses on the structure of content.

Berners-Lee thought of HTML as a type of SGML, a standard for document formats. In mid-1993, he and Dan Connolly published the first proposal for an HTML specification as a draft for the Internet Engineering Task Force (IETF). A competing proposal by Dave Raggett, published later in 1993, suggested adding features such as tables and forms.

After these early drafts expired in 1994, the IETF created an HTML Working Group. In 1995, the group completed “HTML 2.0,” the first official HTML standard.

Development slowed after that, but the World Wide Web Consortium (W3C) took over in 1996. HTML 4.01 came out in 1999, and in 2000 HTML also became an international standard (ISO/IEC 15445:2000). In 2004, work began on HTML5, which was finished and released as a standard on October 28, 2014.

HTML version timeline

HTML 2

24 November 1995

HTML 2.0 was released as an official document (RFC 1866). Later add-on documents added features like uploading files, tables, and international characters.

HTML 3

14 January 1997

HTML 3.2 was released. This was the first version made by the World Wide Web Consortium. It included features that popular web browsers already supported but kept things simple.

HTML 4

18 December 1997

HTML 4.0 was released. It came in three versions: Strict, Transitional, and Frameset. It included many features that web browsers already used but encouraged using stylesheets.

Tim Berners-Lee in April 2009

After HTML 4.01, no new HTML versions were released for many years because work shifted to XHTML.

HTML 5

28 October 2014

HTML5 was released as an official standard.

1 November 2016

HTML 5.1 was released as an official standard.

14 December 2017

HTML 5.2 was released as an official standard.

HTML draft version timeline

October 1991

The document “HTML Tags” was first mentioned publicly.

June 1992

The first informal draft of HTML was created.

November 1992

HTML DTD 1.1 was released, the first to include a version number.

June 1993

The Hypertext Markup Language was published as an early proposal.

November 1993

A competing proposal called HTML+ was published.

November 1994

Work began on what would become HTML 2.0.

April 1995

HTML 3.0 was proposed but was not finished.

January 2008

HTML5 was introduced as a work in progress.

2011 HTML5 – Last Call

In February 2011, HTML5 was reviewed for final checks.

2012 HTML5 – Candidate Recommendation

In December 2012, HTML5 was ready to become an official standard.

2014 HTML5 – Proposed Recommendation and Recommendation

In September 2014, HTML5 moved closer to becoming a standard.

On 28 October 2014, HTML5 was released as an official standard.

XHTML versions

XHTML is a version of HTML using XML. It is now called the XML syntax for HTML and is no longer a separate standard.

  • XHTML 1.0 was released on January 26, 2000. It had three versions like HTML 4.0, but written in XML.
  • XHTML 1.1 was released on May 31, 2001. It was based on XHTML 1.0 Strict.
  • XHTML 2.0 was being developed but work stopped in 2009 in favor of HTML5 and XHTML5. XHTML 2.0 was not compatible with earlier XHTML versions.

Transition of HTML publication to WHATWG

In May 2019, the World Wide Web Consortium announced that the WHATWG would be the only group publishing HTML and DOM standards. The two groups had been publishing different versions since 2012. The WHATWG’s “Living Standard” was widely used.

Markup

HTML uses special words called tags to organize web pages. These tags often come in pairs, like <p> and </p>, where the first tag starts something and the second tag ends it. Some tags, like <img>, don’t need a matching end tag.

Here’s a simple example:

  
<!DOCTYPE html>
<html>
  <head>
    <title>This is a title</title>
  </head>
  <body>
    <div>
      Hello world!
    </div>
  </body>
</html>

The text between <title> and </title> shows what appears at the top of a browser tab. The text between <div> and </div> is part of the page that you can see.

HTML documents use tags to arrange content. For example, headings use tags from <h1> to <h6>, with <h1> being the most important heading. Paragraphs are made with <p> tags, and lines can be broken with the <br> tag.
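
A small sketch of headings, paragraphs, and a line break might look like this. The text about animals is invented for the example.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Headings and paragraphs</title>
  </head>
  <body>
    <!-- h1 is the most important heading; h2 is one level below it -->
    <h1>Animals</h1>
    <h2>Cats</h2>
    <!-- p marks a paragraph; br breaks the line inside it -->
    <p>Cats sleep a lot.<br>They also like to climb.</p>
    <h2>Dogs</h2>
    <p>Dogs like to play fetch.</p>
  </body>
</html>
```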

Links are created using the <a> tag, with the href attribute pointing to the link’s address, like this:

<a href="https://www.wikipedia.org/">A link to Wikipedia!</a>

Semantic HTML

Semantic HTML is a way to write web pages that focuses on what the information means, not just how it looks. Since the late 1990s, people have been encouraged to use HTML that shows the meaning of the content.

This helps computers understand web pages better. When search engines look through the web, good use of semantic HTML helps these tools work better. It also makes web pages easier to use for everyone, including people who need special tools to read the web.
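
As an illustration only, the page below uses HTML5 elements whose names describe what the content means, instead of using plain <div> elements everywhere. The school newspaper, the link addresses, and the wording are made up.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>School News</title>
  </head>
  <body>
    <!-- header: introductory content for the page -->
    <header>
      <h1>School News</h1>
    </header>

    <!-- nav: a group of links for getting around the site -->
    <nav>
      <ul>
        <li><a href="/sports">Sports</a></li>
        <li><a href="/clubs">Clubs</a></li>
      </ul>
    </nav>

    <!-- main: the main content; article: a piece that makes sense on its own -->
    <main>
      <article>
        <h2>Science fair winners</h2>
        <p>Three students won prizes at this year's science fair.</p>
      </article>
    </main>

    <!-- footer: closing information about the page -->
    <footer>
      <p>Written by the school newspaper club.</p>
    </footer>
  </body>
</html>
```

A search engine or a screen reader can use these element names to find the navigation links and the main story without guessing.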

Delivery

HTML documents can be sent like any other computer file. They are often sent using HTTP from a web server or through email.

The World Wide Web is made up of HTML documents that travel from web servers to web browsers using HTTP. HTTP is also used for sending images, sounds, and other content. With each document, extra information is sent to help the browser know how to show it. This includes details like the type of document (for example, text/html) and its character encoding.
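
The extra information normally travels in an HTTP header sent by the server. As a hedged illustration that goes slightly beyond the text above, an HTML document can also name its own character encoding with a meta element. The sample words are only there to show characters beyond plain English letters.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- States the character encoding inside the document itself.
         A server would normally also send a header such as:
         Content-Type: text/html; charset=utf-8 -->
    <meta charset="utf-8">
    <title>Encoding example</title>
  </head>
  <body>
    <p>Café, naïve, 日本語</p>
  </body>
</html>
```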

Some web browsers might show HTML documents differently based on the information sent with them. Most email programs let you use a little HTML to make emails look nicer, like adding colors or pictures. But using HTML in emails can sometimes cause problems for people who have trouble seeing the screen, and it can also make emails bigger.

The usual ending for HTML files is .html, though sometimes people use the shorter .htm as well.

HTML4 variations

HTML became popular quickly after it was created, but in the early days there were no clear rules. Although HTML was meant to describe meaning rather than looks, practical needs added many look-related parts, mainly because different web browsers each added their own. Today’s rules aim to bring order to HTML’s growth and create a strong base for building documents that are both meaningful and well presented. To return HTML to its meaning-focused role, the W3C made style languages like CSS and XSL to handle looks. Alongside this, the HTML rules slowly cut down on look-related parts.

HTML rules vary in two main ways: SGML-based HTML versus XML-based HTML (called XHTML), and strict versus transitional (loose) versus frameset.

SGML-based versus XML-based HTML

A key difference in HTML rules is between SGML-based and XML-based HTML, called XHTML. The W3C planned for XHTML 1.0 to match HTML 4.01, except where XML’s rules differed from SGML’s more complex ones. Because XHTML and HTML are close, they are sometimes written together as (X)HTML or X(HTML).

Like HTML 4.01, XHTML 1.0 has three types: strict, transitional, and frameset.

Besides using a different opening declaration, HTML 4.01 and XHTML 1.0 mainly differ in their syntax. HTML allows shortcuts that XHTML does not, such as elements whose opening or closing tags are left out, and empty elements that have no end tag. XHTML requires every element to have both an opening and a closing tag. XHTML also adds a new shortcut: an element can be opened and closed in a single tag by putting a slash before the end of the tag, like <br/>. This new style might confuse older software. A fix is to put a space before the slash, like <br />.

To change a valid XHTML 1.0 document to HTML 4.01, do these steps:

  1. Use a lang attribute for language instead of XHTML’s xml:lang.
  2. Remove the XML namespace (xmlns=URI). HTML has no namespaces.
  3. Change the document type from XHTML 1.0 to HTML 4.01.
  4. If present, remove the XML declaration. It usually looks like <?xml version="1.0" encoding="utf-8"?>.
  5. Make sure the document’s type is set to text/html. This comes from the server’s Content-Type header.
  6. Change XML empty-tag style to HTML style (<br/> to <br>).

These are the main steps to change a document from XHTML 1.0 to HTML 4.01. Changing from HTML to XHTML also needs adding any missing opening or closing tags. It might be best to always include optional tags in HTML rather than remembering which can be left out.
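
As a sketch of what the numbered steps above could produce, here is a small HTML 4.01 Strict document with comments marking where each step applies. The title and text are invented, and the original XHTML file is assumed to have had the same content.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<!-- Steps 3 and 4: HTML 4.01 document type, with no XML declaration before it -->
<!-- Steps 1 and 2: lang instead of xml:lang, and no xmlns attribute -->
<html lang="en">
  <head>
    <!-- Step 5: the server should also send Content-Type: text/html -->
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Converted page</title>
  </head>
  <body>
    <!-- Step 6: the XML empty-tag style becomes a plain <br> -->
    <p>First line<br>Second line</p>
  </body>
</html>
```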

A well-formed XHTML document follows all of XML’s syntax rules. A valid document also follows XHTML’s content rules, which describe how the document may be built up.

The W3C suggests some ways to make moving between HTML and XHTML easier. These steps work for XHTML 1.0 documents only:

  • Include both xml:lang and lang attributes when giving a language to elements.
  • Use the empty-tag style only for tags meant to be empty in HTML.
  • Put a space before the closing slash in empty tags: like <br /> instead of <br/>.
  • Include full closing tags for elements that can hold content but are left empty, like <div></div> not <div />.
  • Leave out the XML declaration.

By following the W3C’s guidelines, a web browser should treat the document the same whether it is HTML or XHTML. For XHTML 1.0 documents made this way, the W3C allows them to be sent as HTML (with the text/html MIME type) or as XHTML (with the application/xhtml+xml or application/xml MIME type). When sent as XHTML, browsers should use an XML parser, which follows XML’s rules strictly.
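
A minimal sketch of an XHTML 1.0 Strict document written with the W3C's compatibility guidelines in mind might look like the following. Empty elements are written with a space before the slash, as those guidelines recommend. The title and text are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- No XML declaration appears above the document type -->
<!-- Both xml:lang and lang give the language -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Compatible page</title>
  </head>
  <body>
    <!-- Empty-tag style is used only for elements that are empty in HTML -->
    <p>First line<br />Second line</p>
    <!-- An element that can hold content gets a full closing tag, even when empty -->
    <div></div>
  </body>
</html>
```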

Transitional versus strict

HTML 4 had three types: Strict, Transitional (once called Loose), and Frameset. The Strict type is for new documents and is best practice. The Transitional and Frameset types help move older documents to HTML 4. They allow look-related parts that Strict leaves out. Instead, cascading style sheets are suggested to improve how HTML documents look. Since XHTML 1 only gives an XML way to write HTML 4, the same rules apply to XHTML 1.

The Transitional type allows these parts not in Strict:

  • A more flexible content order
    • Inline pieces and plain text can go directly in: body, blockquote, form, noscript and noframes
  • Look-related pieces
    • underline (u) (deprecated; can be mistaken for a link)
    • strike-through (s)
    • center (deprecated; use CSS instead)
    • font (deprecated; use CSS instead)
    • basefont (deprecated; use CSS instead)
  • Look-related attributes
    • background and bgcolor (both deprecated; use CSS instead) for the body element, which the W3C requires in every document
    • align (deprecated; use CSS instead) for div, form, paragraph (p) and headings (h1...h6)
    • align, noshade, size and width (all deprecated; use CSS instead) for hr
    • align (deprecated; use CSS instead), border, vspace and hspace for the img and object elements (note: among the major browsers, object is only supported in Internet Explorer)
    • align (deprecated; use CSS instead) for legend and caption
    • align and bgcolor (both deprecated; use CSS instead) on table
    • nowrap (obsolete), bgcolor (deprecated; use CSS instead), width, height on td and th
    • bgcolor (deprecated; use CSS instead) for tr
    • clear (obsolete) for br
    • compact for dl, dir and menu
    • type, compact and start (all deprecated; use CSS instead) for ol and ul
    • type and value for li
    • width for pre
  • Extra pieces in Transitional rules
    • menu (deprecated; use CSS instead) list (no replacement, but an unordered list is suggested)
    • dir (deprecated; use CSS instead) list (no replacement, but an unordered list is suggested)
    • isindex (deprecated; it needs support from the server and is usually added there; form and input can be used instead)
    • applet (deprecated; use object instead)
  • The language attribute (obsolete) on the script element (redundant with the type attribute)
  • Frame-related pieces
    • iframe
    • noframes
    • target (deprecated in the map, link and form elements) for a, image map (map), link, form and base
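
To make the difference concrete, here is a hedged sketch of an HTML 4.01 Strict page that uses CSS for looks. The comments name the Transitional-only pieces that each CSS rule stands in for. The class name warning and the text are invented for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
  <head>
    <title>Strict with CSS</title>
    <style type="text/css">
      /* Stands in for the bgcolor attribute on body */
      body { background-color: white; }
      /* Stands in for the center element or align="center" */
      h1 { text-align: center; }
      /* Stands in for the font element with a color attribute */
      .warning { color: red; }
    </style>
  </head>
  <body>
    <!-- Transitional would also allow:
         <center><font color="red">Wet paint!</font></center> -->
    <h1>Welcome</h1>
    <p class="warning">Wet paint!</p>
  </body>
</html>
```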

The Frameset type includes all Transitional parts, plus the frameset piece (used instead of body) and the frame piece.

Frameset versus transitional

Besides the Transitional differences above, the frameset rules (whether XHTML 1.0 or HTML 4.01) use a different content order: frameset replaces body and holds frame pieces, optionally together with a noframes piece that contains a body.
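
A minimal sketch of an HTML 4.01 Frameset document is shown below. The file names menu.html and content.html are placeholders. Frames belong to these older rules, so this is meant as a historical illustration, not a suggestion for new pages.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
  "http://www.w3.org/TR/html4/frameset.dtd">
<html lang="en">
  <head>
    <title>Frameset example</title>
  </head>
  <!-- frameset takes the place of body and splits the window into two columns -->
  <frameset cols="25%,75%">
    <frame src="menu.html" name="menu">
    <frame src="content.html" name="content">
    <!-- noframes holds a body for browsers that cannot show frames -->
    <noframes>
      <body>
        <p>This page uses frames. <a href="content.html">Go to the content</a>.</p>
      </body>
    </noframes>
  </frameset>
</html>
```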

Summary of specification versions

As this list shows, the loose versions of the rules are kept to support older pages. But, contrary to common belief, moving to XHTML does not mean removing this old support. The X in XML stands for extensible, and the W3C split the whole rule set into modules and opened it to independent additions. This modular design is the main gain in moving from XHTML 1.0 to XHTML 1.1. The Strict type of HTML is provided in XHTML 1.1 through a set of modules added to the base XHTML 1.1 rules. Similarly, those looking for the loose (transitional) or frameset rules will find matching XHTML 1.1 support, mostly in the legacy or frame modules. Modules also let features grow on their own schedule. For example, XHTML 1.1 allows faster moves to newer XML standards such as MathML (an XML-based way to present and describe math) and XForms, a newer and more advanced way to build web forms.

In short, the HTML 4 rules mostly gathered all the different HTML practices into one clear rule book based on SGML. XHTML 1.0 moved this rule book to the newer XML-based rules. Next, XHTML 1.1 used XML’s extensible nature to split the whole rule book into modules. XHTML 2.0 was meant to be the first step in adding new features to the rules through a standards body.

WHATWG HTML versus HTML5

The HTML Living Standard, made by WHATWG, is the main version used now. The W3C no longer publishes a separate HTML5 standard.

WYSIWYG editors

Some editors, called WYSIWYG (what you see is what you get), let users create web pages using a graphical user interface, much like word processors. These tools show how the page will look instead of showing the code, so you don’t need to know much about HTML.

However, these editors can sometimes create messy or unnecessary code. Some developers prefer a different approach called WYSIWYM (what you see is what you mean), which focuses more on the meaning of the content rather than just how it looks.

This article is a child-friendly adaptation of the Wikipedia article on HTML, available under CC BY-SA 4.0.

Images from Wikimedia Commons. Tap any image to view credits and license.