HTML
Adapted from Wikipedia
Hypertext Markup Language (HTML) is the standard markup language for documents shown in a web browser. It tells the browser what the content is and how it should be organized. HTML works with Cascading Style Sheets (CSS) for design and JavaScript for adding interactive features.
When you open a website, your web browser reads an HTML document. This document might come from a web server or be saved on your device. The browser then turns the HTML into a page you can see, with text, pictures, and more.
HTML uses special words called tags, written with angle brackets, to describe the structure of a page. Tags can make text into headings, paragraphs, or lists. They can also add images and links to other pages. You won’t see these tags when you look at the page, but they help the browser understand what to show.
With HTML, you can add programs written in JavaScript to make pages do things like play games or update information automatically. A newer version called HTML5 lets browsers show video and audio directly on pages.
History
Development
In 1980, physicist Tim Berners-Lee made a system called ENQUIRE to help people share documents. In 1989, he wrote a memo proposing a hypertext system based on the Internet. He defined HTML and wrote the first web browser and server software in late 1990. That year, he and Robert Cailliau asked for funding for the project, but CERN did not formally adopt it.
The first public description of HTML was a document called “HTML Tags”, first mentioned on the Internet by Berners-Lee in late 1991. It described 18 elements that made up the first, fairly simple design of HTML. Eleven of these elements still exist in HTML 4.
HTML is a markup language that web browsers use to show text, pictures, and other things on web pages. Designers can change how things look using CSS. HTML focuses on the structure of content.
Berners-Lee thought of HTML as an application of SGML, a standard for document formats. In mid-1993, the Internet Engineering Task Force (IETF) published the first proposal for an HTML specification, written by Berners-Lee and Dan Connolly. A competing proposal by Dave Raggett, called HTML+, followed in late 1993 and suggested standardizing features that browsers already had, such as tables and forms.
After the HTML and HTML+ drafts expired in early 1994, the IETF created an HTML Working Group. In 1995, this group completed “HTML 2.0,” the first HTML specification meant to be treated as a standard.
Further development under the IETF stalled, and since 1996 the World Wide Web Consortium (W3C) has maintained the HTML specifications. HTML 4.01 came out in late 1999. In 2000, HTML became an international standard (ISO/IEC 15445:2000). In 2004, work began on HTML5, which was finished and released on 28 October 2014.
HTML version timeline
HTML 2
24 November 1995
HTML 2.0 was published as RFC 1866. Later supplemental RFCs added features such as file upload, tables, and international characters.
HTML 3
14 January 1997
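HTML 3.2 was released. This was the first version developed and standardized entirely by the World Wide Web Consortium. It adopted most of the visual markup tags that popular browsers, mainly Netscape, already supported, and it dropped math formulas to keep things simple.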
HTML 4
18 December 1997
HTML 4.0 was released. It came in three versions: Strict, Transitional, and Frameset. It included many features that web browsers already used but encouraged using stylesheets.
After HTML 4.01, no new HTML versions were released for many years because work shifted to XHTML.
HTML 5
28 October 2014
HTML5 was released as an official standard.
1 November 2016
HTML 5.1 was released as an official standard.
14 December 2017
HTML 5.2 was released as an official standard.
HTML draft version timeline
October 1991
The document “HTML Tags” was first mentioned publicly.
June 1992
The first informal draft of HTML was created.
November 1992
HTML DTD 1.1 was released, the first to include a version number.
June 1993
The Hypertext Markup Language was published as an early proposal.
November 1993
A competing proposal called HTML+ was published.
November 1994
Work began on what would become HTML 2.0.
April 1995
HTML 3.0 was proposed but was not finished.
January 2008
HTML5 was introduced as a work in progress.
2011 HTML5 – Last Call
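In February 2011, the W3C set clear milestones for HTML5. In May 2011, the working group moved HTML5 to “Last Call,” an invitation for people inside and outside the W3C to check that the specification was technically sound.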
2012 HTML5 – Candidate Recommendation
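In December 2012, the W3C made HTML5 a Candidate Recommendation, the stage at which a specification is tested in real implementations before it can become an official standard.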
2014 HTML5 – Proposed Recommendation and Recommendation
In September 2014, the W3C moved HTML5 to Proposed Recommendation, the last step before becoming a standard.
On 28 October 2014, HTML5 was released as an official standard.
XHTML versions
XHTML is a version of HTML using XML. It is now called the XML syntax for HTML and is no longer a separate standard.
- XHTML 1.0 was released on January 26, 2000. It had three versions like HTML 4.0, but written in XML.
- XHTML 1.1 was released on May 31, 2001. It was based on XHTML 1.0 Strict.
- XHTML 2.0 was being developed but work stopped in 2009 in favor of HTML5 and XHTML5. XHTML 2.0 was not compatible with earlier XHTML versions.
Transition of HTML publication to WHATWG
In May 2019, the World Wide Web Consortium announced that the WHATWG would be the only group publishing HTML and DOM standards. The two groups had been publishing different versions since 2012. The WHATWG’s “Living Standard” was widely used.
Markup
HTML uses special words called tags to organize web pages. These tags often come in pairs, like <h1> and </h1>, where the first tag starts something and the second tag ends it. Some tags, like <img>, don’t need a matching end tag.
Here’s a simple example:
<title>This is a title</title>
<div>
Hello world!
</div>
The text between <title> and </title> shows what appears at the top of a browser tab. The text between <div> and </div> is part of the page that you can see.
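To see how a program can tell the title apart from the visible text, here is a small Python sketch. It is only an illustration: it assumes Python’s built-in html.parser module as a stand-in for a real browser, which does far more. The class name TitleAndBody and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class TitleAndBody(HTMLParser):
    """Collects the <title> text and the remaining page text separately."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip the whitespace between tags
        if self.in_title:
            self.title += text
        else:
            self.body_text.append(text)


# A hypothetical sample page, similar to the example above.
page = """<!DOCTYPE html>
<html>
  <head><title>This is a title</title></head>
  <body>
    <div>
      Hello world!
    </div>
  </body>
</html>"""

reader = TitleAndBody()
reader.feed(page)
reader.close()
print(reader.title)      # This is a title
print(reader.body_text)  # ['Hello world!']
```

The tags themselves never end up in the output. The program only uses them to decide where each piece of text belongs, which is roughly what a browser does too.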
HTML documents use tags to arrange content. For example, headings use tags from <h1> to <h6>, with <h1> being the most important heading. Paragraphs are made with <p> tags, and lines can be broken with the <br> tag.
Links are created using the <a> tag, with the href attribute pointing to the link’s address, like this:
<a href="https://www.wikipedia.org/">A link to Wikipedia!</a>
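A program can read the href attribute to find out where a link points. The sketch below shows one possible way, assuming Python’s built-in html.parser module. The class name LinkFinder is invented for this example, and the address is only a sample.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Records the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


finder = LinkFinder()
finder.feed('<p>Read <a href="https://www.wikipedia.org/">A link to Wikipedia!</a></p>')
finder.close()
print(finder.links)  # ['https://www.wikipedia.org/']
```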
Semantic HTML
Semantic HTML is a way to write web pages that focuses on what the information means, not just how it looks. Since the late 1990s, people have been encouraged to use HTML that shows the meaning of the content.
This helps computers understand web pages better. When search engines look through the web, good use of semantic HTML helps these tools work better. It also makes web pages easier to use for everyone, including people who need special tools to read the web.
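As a rough illustration of why meaning matters to computers, the Python sketch below builds an outline of a page from its heading tags. It assumes the built-in html.parser module; the names OutlineBuilder and outline_of and both sample pages are made up for this example. Real search engines work in much more complicated ways.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineBuilder(HTMLParser):
    """Collects the level and text of every heading in a page."""

    def __init__(self):
        super().__init__()
        self.current = None  # the heading tag we are inside, if any
        self.outline = []    # list of [level, text] entries

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.current = tag
            self.outline.append([int(tag[1]), ""])

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.outline[-1][1] += data


def outline_of(html):
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return [(level, text.strip()) for level, text in builder.outline]


# The first page uses heading tags; the second only styles plain boxes.
semantic = "<h1>Pets</h1><p>Intro</p><h2>Cats</h2><p>...</p><h2>Dogs</h2>"
visual_only = '<div class="big">Pets</div><div class="medium">Cats</div>'

print(outline_of(semantic))     # [(1, 'Pets'), (2, 'Cats'), (2, 'Dogs')]
print(outline_of(visual_only))  # []
```

Both pages might look alike on screen once CSS is applied, but only the first one tells a program which text is a heading.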
Delivery
HTML documents can be sent like any other computer file. They are often sent using HTTP from a web server or through email.
The World Wide Web is made up of HTML documents that travel from web servers to web browsers using HTTP. HTTP is also used for sending images, sounds, and other content. With each document, extra information is sent to help the browser know how to handle it. This includes details like the type of document (its MIME type) and how its characters are encoded.
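This extra information usually arrives in a Content-Type header. The sketch below shows one way to split such a header into the document type and the character encoding, using the Message class from Python’s standard email package. The function name describe_content_type and the sample header values are made up for this example.

```python
from email.message import Message


def describe_content_type(header_value):
    """Split a Content-Type header into its MIME type and character encoding."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


print(describe_content_type("text/html; charset=UTF-8"))
# ('text/html', 'utf-8')
print(describe_content_type("image/png"))
# ('image/png', None)
```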
Some web browsers might show HTML documents differently based on the information sent with them. Most email programs let you use a little HTML to make emails look nicer, like adding colors or pictures. But using HTML in emails can sometimes cause trouble for people who have trouble seeing the screen, and it can also make emails bigger.
The usual ending for HTML files is .html, though sometimes people use the shorter .htm as well.
HTML4 variations
HTML became popular quickly, but in its early days it had no clear rules. Although HTML was designed to describe meaning rather than appearance, practical needs, mainly pressure from competing web browsers, added many presentational elements and attributes. Today’s standards try to bring order to this growth and give authors a solid base for building documents that are both meaningful and well presented. To return HTML to its role as a language about meaning, the W3C developed style languages such as CSS and XSL to handle presentation. Along with this, the HTML specification has slowly cut back on presentational elements.
HTML 4 comes in versions that differ along two lines: SGML-based HTML versus XML-based HTML (called XHTML), and Strict versus Transitional (loose) versus Frameset.
SGML-based versus XML-based HTML
A key difference in HTML rules is between SGML-based and XML-based HTML, called XHTML. The W3C planned for XHTML 1.0 to match HTML 4.01, except where XML’s rules differed from SGML’s more complex ones. Because XHTML and HTML are close, they are sometimes written together as (X)HTML or X(HTML).
Like HTML 4.01, XHTML 1.0 has three types: strict, transitional, and frameset.
Besides starting a document with different declarations, HTML 4.01 and XHTML 1.0 differ mainly in their syntax. HTML allows shortcuts that XHTML does not, such as elements whose opening or closing tags may be left out, and empty elements that must not have an end tag. XHTML requires every element to have both an opening tag and a closing tag. XHTML also adds a new shortcut: an element can be opened and closed in a single tag by putting a slash before the end of the tag, like <br/>. Software that does not know about this shortcut, such as older browsers, may be confused by it. One fix is to put a space before the slash, like <br />.
To change a valid XHTML 1.0 document to HTML 4.01, do these steps:
- Use a lang attribute for language instead of XHTML’s xml:lang.
- Remove the XML namespace (xmlns=URI). HTML has no namespaces.
- Change the document type declaration from XHTML 1.0 to HTML 4.01.
- If present, remove the XML declaration. It usually looks like <?xml version="1.0" encoding="utf-8"?>.
- Make sure the document’s MIME type is set to text/html. This comes from the Content-Type header sent by the server.
- Change XML empty-tag style to HTML style (<br /> to <br>).
These are the main steps to change a document from XHTML 1.0 to HTML 4.01. Changing from HTML to XHTML also needs adding any missing opening or closing tags. It might be best to always include optional tags in HTML rather than remembering which can be left out.
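The text-level steps above can be sketched in Python with a few regular expressions. This is only an illustration under several assumptions: the input is a valid XHTML 1.0 Strict document, its elements do not already have a lang attribute, and the self-closing style is used only for elements that are empty in HTML. The function name xhtml_to_html401 is made up, and setting the text/html MIME type is left to the server.

```python
import re

# Document type declaration for HTML 4.01 Strict.
HTML401_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">'
)


def xhtml_to_html401(source):
    """Apply the text-level conversion steps to an XHTML 1.0 Strict string."""
    # Remove the XML declaration, if present.
    source = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", source)
    # Swap the XHTML document type declaration for the HTML 4.01 one.
    source = re.sub(r"<!DOCTYPE[^>]*>", HTML401_DOCTYPE, source, count=1)
    # Remove the XML namespace.
    source = re.sub(r'\s+xmlns="[^"]*"', "", source)
    # Use lang instead of xml:lang (assumes no lang attribute is present yet).
    source = re.sub(r"\bxml:lang=", "lang=", source)
    # Change the XML empty-tag style to the HTML style: <br /> becomes <br>.
    source = re.sub(r"\s*/>", ">", source)
    return source


xhtml = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head><title>Example</title></head>
<body><p>First line<br />Second line</p></body>
</html>"""

html401 = xhtml_to_html401(xhtml)
print(html401)
```

A real converter would use a proper parser instead of regular expressions, since markup inside comments or scripts could trip up a simple pattern.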
A well-formed XHTML document follows all of XML’s syntax rules. A valid document also follows XHTML’s content rules, which describe how the document may be structured.
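One way to check the first of these two properties is to hand the markup to an XML parser and see whether it complains. The sketch below assumes Python’s built-in xml.etree.ElementTree module; the function name is_well_formed is made up. It checks XML syntax only and does not test a document against XHTML’s content rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup without errors."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<p>One line<br/>Next line</p>"))  # True
print(is_well_formed("<p>One line<br>Next line</p>"))   # False: <br> is never closed
print(is_well_formed("<p>First<p>Second"))              # False: no closing tags at all
```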
The W3C suggests some ways to make moving between HTML and XHTML easier. These steps work for XHTML 1.0 documents only:
- Include both xml:lang and lang attributes when giving a language to elements.
- Use the empty-tag style only for tags meant to be empty in HTML.
- Include an extra space in empty tags: like <br /> instead of <br/>.
- Include full closing tags for elements that can hold content but are left empty, like <div></div>, not <div />.
- Leave out the XML declaration.
By following the W3C’s compatibility guidelines, a web browser should treat the document the same whether it is read as HTML or as XHTML. For XHTML 1.0 documents made this way, the W3C allows them to be sent as HTML (with the text/html MIME type) or as XHTML (with the application/xhtml+xml or application/xml MIME type). When sent as XHTML, browsers should use an XML parser, which follows XML’s rules strictly.
Transitional versus strict
HTML 4 defined three versions of the language: Strict, Transitional (once called Loose), and Frameset. The Strict version is meant for new documents and is considered best practice. The Transitional and Frameset versions were made to help move older documents to HTML 4. They allow presentational markup that Strict leaves out. In Strict, Cascading Style Sheets are encouraged instead to improve how HTML documents look. Since XHTML 1 only defines an XML syntax for the language that HTML 4 defines, the same differences apply to XHTML 1.
The Transitional type allows these parts not in Strict:
- A looser content model
  - Inline elements and plain text can go directly in: body, blockquote, form, noscript and noframes
- Presentation-related elements
  - underline (u) (deprecated; can be confused with a link)
  - strike-through (s)
  - center (deprecated; use CSS instead)
  - font (deprecated; use CSS instead)
  - basefont (deprecated; use CSS instead)
- Presentation-related attributes
  - background and bgcolor (both deprecated; use CSS instead) for the body element (a required element according to the W3C)
  - align (deprecated; use CSS instead) for div, form, paragraph (p) and headings (h1...h6)
  - align, noshade, size and width (all deprecated; use CSS instead) for hr
  - align (deprecated; use CSS instead), border, vspace and hspace for img and object elements (note: of the major browsers, object is only supported in Internet Explorer)
  - align (deprecated; use CSS instead) for legend and caption
  - align and bgcolor (both deprecated; use CSS instead) on table
  - nowrap (obsolete), bgcolor (deprecated; use CSS instead), width and height on td and th
  - bgcolor (deprecated; use CSS instead) for tr
  - clear (obsolete) for br
  - compact for dl, dir and menu
  - type, compact and start (all deprecated; use CSS instead) for ol and ul
  - type and value for li
  - width for pre
- Additional elements in the Transitional specification
  - menu (deprecated) list (no substitute, though the unordered list is recommended)
  - dir (deprecated) list (no substitute, though the unordered list is recommended)
  - isindex (deprecated) (this needs support from the server and is usually added there; form and input can be used instead)
  - applet (deprecated; use object instead)
- The language (obsolete) attribute on the script element (redundant with the type attribute)
- Frame-related elements
  - iframe
  - noframes
  - target (deprecated in the map, link and form elements) on a, client-side image map (map), link, form and base
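A program could use the list above to spot Transitional-only elements in a page. The Python sketch below shows the idea, assuming the built-in html.parser module. The names TRANSITIONAL_ONLY, TransitionalFinder, and transitional_elements are made up for this example, and the sketch looks only at element names, not at the attributes in the list.

```python
from html.parser import HTMLParser

# Elements that the list above places in Transitional but not in Strict.
TRANSITIONAL_ONLY = {
    "u", "s", "center", "font", "basefont",
    "menu", "dir", "isindex", "applet",
    "iframe", "noframes",
}


class TransitionalFinder(HTMLParser):
    """Remembers each Transitional-only element the first time it appears."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in TRANSITIONAL_ONLY and tag not in self.found:
            self.found.append(tag)


def transitional_elements(html):
    finder = TransitionalFinder()
    finder.feed(html)
    finder.close()
    return finder.found


print(transitional_elements("<center><font color='red'>Hi</font></center><p>Bye</p>"))
# ['center', 'font']
print(transitional_elements("<p style='text-align:center'>Hi</p>"))
# []
```

The second page gets the same centered look through CSS, so it contains nothing that Strict leaves out.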
The Frameset type includes everything in Transitional, plus the frameset element (used instead of body) and the frame element.
Frameset versus transitional
Besides the Transitional differences above, the Frameset specifications (whether XHTML 1.0 or HTML 4.01) use a different content model: frameset replaces body and holds either frame elements or, optionally, noframes with a body.
Summary of specification versions
As this list shows, the loose versions of the specification are kept for legacy support. However, contrary to popular belief, moving to XHTML does not mean removing this legacy support. The X in XML stands for extensible, and the W3C is splitting the whole specification into modules and opening it up to independent extensions. The main achievement in moving from XHTML 1.0 to XHTML 1.1 is this modularization. The Strict version of HTML is used in XHTML 1.1 through a set of modular extensions to the base XHTML 1.1 specification. Likewise, those looking for the loose (Transitional) or Frameset specifications will find similar extended XHTML 1.1 support, much of it in the legacy or frame modules. Modularization also lets separate features develop on their own schedules. For example, XHTML 1.1 allows a quicker move to new XML standards such as MathML, an XML-based language for presenting and describing mathematics, and XForms, a new and more advanced web-form technology meant to replace today’s HTML forms.
In short, the HTML 4 specification mainly gathered all the different HTML implementations into one clearly written specification based on SGML. XHTML 1.0 moved this specification, as it was, to the new XML-based rules. Next, XHTML 1.1 used XML’s extensible nature to split the whole specification into modules. XHTML 2.0 was meant to be the first step in adding new features to the specification through a standards body.
WHATWG HTML versus HTML5
The HTML Living Standard, made by the WHATWG, is the main version used now. The W3C no longer publishes a separate HTML5 standard of its own.
WYSIWYG editors
Some editors, called WYSIWYG (what you see is what you get), let users create web pages using a graphical user interface, much like word processors. These tools show how the page will look instead of showing the code, so you don’t need to know much about HTML.
However, these editors can sometimes create messy or unnecessary code. Some developers prefer a different approach called WYSIWYM (what you see is what you mean), which focuses more on the meaning of the content rather than just how it looks.
This article is a child-friendly adaptation of the Wikipedia article on HTML, available under CC BY-SA 4.0.