Extensible Markup Language: A New Technology Tool for the Public Sector.

Author: Smith, Ken
Position: Includes discussion of current Internet technology strengths and limitations

XML is a new Internet technology tool that holds considerable promise for the public sector. This article explains what XML is and how it works, and discusses its use in seven governmental applications.

Although technology changes at a fast pace, the fundamental tasks of government rarely change. The dilemma for public managers is how much time to devote to learning about and implementing new technologies as opposed to addressing the basic problems of government with existing ones. The goal of this article is to examine the technologies that governments use to share information on the Internet and to propose the use of a relatively new language, Extensible Markup Language, or XML.

The article begins with a discussion of the strengths and limitations of current Internet technologies. It then examines the benefits of and the barriers to implementing XML and discusses its relevance to seven different governmental applications: purchasing, tax reporting, financial reporting, budgeting, human resources, grant management, and performance measurement. The article concludes that while XML is a viable technology for some of these applications, it may never be viable for others. Consequently, depending upon the area of concentration, some public managers should spend the time to learn about XML while others probably should continue to use existing technologies.

A Brief History of the Internet

The Internet was developed in the early 1960s by the Department of Defense's Advanced Research Projects Agency. The purpose was to network supercomputers among researchers located throughout the United States. In the late 1960s, four universities were allowed access to the Internet. The Internet was available only to education and government (primarily defense) until 1991, when the National Science Foundation first allowed commercial entities to use it. Since then, the number of Internet hosts has grown to more than 25 million. As the Internet has grown, however, it has become increasingly difficult to find specific pieces of information.

Prior to the development of XML in 1998, most general information on the Internet was presented in Web pages using either HyperText Markup Language (HTML) or the Portable Document Format (PDF). Exhibit 1 provides a brief comparison of HTML, PDF, and the newer XML. Both HTML and PDF are relatively easy to use for both Internet publishers and users. This ease of use has been a major factor in the proliferation of Web sites.

HTML and XML are "open standards" maintained by the World Wide Web Consortium, or W3C (www.w3.org), meaning they are free and available to all Internet publishers and users. PDF is a proprietary product of Adobe Systems, but the Adobe Acrobat Reader is free, so any Web user can access PDF data after downloading the Reader. Publishers must purchase software from Adobe in order to create PDF documents.

In Web pages, HTML provides a layout that includes text, images, and push buttons. PDF is merely a scanned version of text or pictures, while XML creates catalogs of information. The advantage of XML is that it gives end users the flexibility to dynamically access and use data because the catalog specifies the location and description of the various pieces of data used in Web documents.

In comparison, HTML and PDF statically deliver information to end users, much like a fax machine. Once the information is received or accessed, it is very difficult to manipulate. Only slow searches are available in HTML, while PDF cannot be searched until the user locates and opens the PDF file. Unless the user knows the exact location or name of the PDF file, the data within PDF files are virtually hidden.

Cataloging information in XML can be accomplished in multiple languages. HTML, however, is language-specific, making translations between languages very difficult. Since PDF is an image, it does not discriminate between languages. In terms of moving around the Internet, HTML allows hot links from the host Web site to other Web sites and pages. Similarly, XML provides X-links to other Web databases. X-links, however, are more efficient in linking to the exact data desired and reduce missed Web pages. Users can link out from a PDF document, but cannot link into a PDF document from other Web sites.

XML allows dynamic, on-the-spot data analysis. Data from XML Web sites can be seamlessly downloaded onto application programs for spreadsheet or statistical analysis. HTML and PDF also can be used for data analysis, but not without cutting and pasting or retyping, which can require significant time and effort, especially as the number of sources increases.
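The kind of analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the XML document is invented for the example, and the tag names `Employees`, `Employee`, `Name`, and `AnnualSalary` are hypothetical rather than drawn from any published taxonomy. It shows tagged values being read straight into a program for analysis, with no cutting, pasting, or retyping.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical XML data; every tag name here is an illustrative assumption.
SAMPLE_XML = """
<Employees>
  <Employee><Name>Ann</Name><AnnualSalary>120000.00</AnnualSalary></Employee>
  <Employee><Name>Bob</Name><AnnualSalary>90000.00</AnnualSalary></Employee>
  <Employee><Name>Cho</Name><AnnualSalary>60000.00</AnnualSalary></Employee>
</Employees>
"""


def load_salaries(xml_text):
    """Return a list of (name, salary) pairs read from tagged XML data."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for employee in root.findall("Employee"):
        name = employee.findtext("Name")
        salary = float(employee.findtext("AnnualSalary"))
        rows.append((name, salary))
    return rows


rows = load_salaries(SAMPLE_XML)
# Because each value is tagged, the analysis can ask for salaries by name.
average_salary = mean(salary for _, salary in rows)
```

The same rows could be written to a spreadsheet file or passed to a statistical package; the point of the sketch is only that the tags tell the program where each piece of data is.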

In summary, XML provides faster search and data analysis capabilities than either HTML or PDF, which can potentially reduce the workload of Web users. The next section explains in some detail how XML works. The purpose is to introduce those concepts that are important for public managers to understand in order to conceptualize how XML might improve the applications for which they are responsible.

XML Basics

Two unique features of XML enable the cataloging of data described in the previous section: "tags" and "taxonomies." A tag is a defining label attached to data presented on the Internet. The taxonomy lists all of the tags for a specific application and the exact rules for how the tagged data will be presented. The technical term for a taxonomy is a Document Type Definition, or DTD. The more popular term "taxonomy" will be used for this presentation.

To illustrate tags and taxonomies for text, assume the characters "Ken" and "Smith" appeared on a Web page. In XML, tags such as "FirstName" and "LastName" would probably be attached to "Ken" and "Smith" respectively. The taxonomy might require that the first letter of these elements be capitalized while the remaining letters are lowercase. Tags also can be more restrictive such as "AuthorFirstName."
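The text example can be sketched as follows, using Python's standard library. The enclosing `Author` tag is an assumption added so that the fragment forms a complete XML document. The capitalization rule is checked here in plain Python as a stand-in for the taxonomy; it is not meant to show how a real DTD would express such a rule.

```python
import xml.etree.ElementTree as ET

# FirstName and LastName follow the example in the text; the enclosing
# Author tag is a hypothetical wrapper added for this sketch.
AUTHOR_XML = "<Author><FirstName>Ken</FirstName><LastName>Smith</LastName></Author>"


def follows_capitalization_rule(text):
    """True if the first letter is uppercase and the remaining letters are lowercase."""
    return text[:1].isupper() and text == text.capitalize()


def read_author(xml_text):
    """Map each tag under the root element to the data it labels."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


author = read_author(AUTHOR_XML)
# Stand-in for the taxonomy's presentation rule, applied to every tagged value.
all_valid = all(follows_capitalization_rule(value) for value in author.values())
```

Because the data carry their own labels, a program can ask for the value tagged `LastName` rather than guessing which word on the page is the surname.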

To illustrate tags and taxonomies for numbers, assume the characters "$120,000.00" and "10000" appeared on a Web page. The tags might be "AnnualSalary" and "MonthlySalary" respectively. Notice that the format differs in at least three ways: use of dollar sign, use of comma, and number of decimal places. The taxonomy would specify how the information was to be presented on at least these three dimensions so that there would be no confusion among users. The first tag could also be "AnnualSalaryTimes3," "DesiredSalary," or "Parent'sSalary."
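The three formatting dimensions named above (dollar sign, comma, and decimal places) can be illustrated with a short Python sketch. The chosen presentation rule (dollar sign, thousands separators, two decimal places) is an assumption made for this example, not a rule taken from any actual taxonomy, and the function names are hypothetical.

```python
from decimal import Decimal

# Tagged values as they appear in the text, in two different formats.
tagged = {"AnnualSalary": "$120,000.00", "MonthlySalary": "10000"}


def parse_amount(raw):
    """Strip the dollar sign and commas, then parse as an exact decimal."""
    return Decimal(raw.replace("$", "").replace(",", ""))


def format_amount(amount):
    """Render with a dollar sign, thousands separators, and two decimal places."""
    return f"${amount:,.2f}"


# Apply one assumed presentation rule to every tagged value.
normalized = {tag: format_amount(parse_amount(raw)) for tag, raw in tagged.items()}

# With consistent parsing, the two figures can be compared directly.
annual_matches_monthly = (
    parse_amount(tagged["AnnualSalary"]) == 12 * parse_amount(tagged["MonthlySalary"])
)
```

Once both values follow the same rule, a user can see at a glance, or a program can confirm, that the monthly figure is one-twelfth of the annual one.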

How do the relatively simple concepts of tags and taxonomies facilitate a radical change in the Internet that is not possible with HTML and PDF technologies? Recall that...
