Chapter 27 Creating and Managing Dynamic Web Sites: Differentiating Data from Display

The Solution
Constructing an Attribute File
DIDDS Component II: Data Files
Putting It Together: A CGI Script
Technical Issues
Summary

This is the challenge: Build a Web site to serve a group of independent producers interested in marketing their various products under a common umbrella. The producers are pursuing a Web strategy that promotes the individual features of each producer, creates a shared look and feel across the group, and uses a common ordering system. Products, descriptions, and prices change frequently (sometimes daily!), and the producers want to maintain their own pages. Consistency across pages is an absolute requirement: the marketing pages must reflect the same prices as the order pages.

And none of the producers know, or care to learn, HTML.

The Solution

Our solution for this client is a technique we refer to as the Dynamic Information Data Delivery System (DIDDS). The system is designed for clients who want a series of pages that can be changed quickly, easily, and consistently by individuals unfamiliar with HTML. The key to our approach is keeping data separate from display. Using our suite of CGI programs, data requested by a client is passed through a display filter that attaches the HTML tags and generates the page on the fly. Because the two are kept separate, a change to either the data or the display elements does not involve sorting through or working with the other. If a site modification is desired, such as a new background, there is no need to change background tags on a multitude of static pages. Instead, one change to the attribute that defines the background causes all subsequent page requests that pass through that attribute file to inherit the new background scheme.

Our purpose in this chapter is to introduce the concepts behind DIDDS, explain the benefits of this approach, and describe its implementation using a case study. We first introduce a specific client and describe its Web needs. We turn next to a description of the component parts of a DIDDS site, illustrating each part with reference to our case study. We conclude with a brief discussion of technical issues associated with implementing a DIDDS site.

Our Client: The Farm of the Future

The client (a fictitious entity) posing the challenge is the Farm of the Future. The Farm of the Future consists of three independent producers of small-scale, organic agricultural products: chicken, beef, and honey. The cooperative is interested in pursuing a Web strategy that effectively promotes the individual care and attention each of the independent producers gives their products, while at the same time emphasizing the features and qualities shared by all three. The cooperative wants to develop a shared look and feel for the Web site, as well as use a common ordering system. Cooperative members expect their products, descriptions, and prices to change frequently, and the cooperative members want to avoid high maintenance costs by updating their own pages.

Display Characteristics: Attribute Files

An attribute file defines the key characteristics of a Web site. The attribute file itself is basically a list of various page-layout attributes or display characteristics that define the look and feel of the web pages: font colors and heights, various horizontal rules and separators, customized bullets and dingbats, table widths and heights, alignments of all sorts, background patterns and colors, image placements and alignments, and other layout-related elements you might want to specify.

Constructing an Attribute File

It is helpful if you first generate sketches of how you want the finished pages to appear, because this gives you a general idea of what specific elements are involved and how they relate to one another. In the sketches you will define alignments, colors, and other parameters involved with layout. Then, each of the elements is isolated into a single display attribute. Each attribute consists of three parts: a name, a delimiter, and a value. The first part is the name you give the attribute, such as title or text_font_color, followed by the delimiter. We use @@ as the delimiter because it is unlikely that anyone would use an @@ character combination anywhere else, so this combination of characters is reserved to serve this special purpose. Following the delimiter is the attribute's value or definition. The attribute title@@Farm of the Future, for example, specifies Farm of the Future as the web pages' title, and the attribute background_color@@White defines White as the background color.

Elements of an Attribute File

We list about 60 attributes in our example, but the number and variety probably are unlimited. In a later section, "Putting It Together: A CGI Script," we provide a list of typical entries for the attribute file. It is important to keep in mind that because the Web browser does not read the attribute file itself, you have to be certain that every element you specify in the file that affects Web page layout has a corresponding HTML equivalent. You can see that these files will continue to grow as new page elements are needed.

Adding and Creating New Elements in an Attribute File

It should be clear at this point that the only real limits as to what can be included in the attribute file are the constraints of the HTML specifications or browser-specific implementations. We feel that in the beginning of a new dynamic site it is important that the attributes rely as much as possible on conventional HTML tags. As new HTML specifications become implemented or as new and different page design ideas come about, you can go back and add new attributes to the file. Also, for more complex Web sites, it is possible to create a number of separate attribute files so that management does not become unwieldy.

Connection with Layout

When a web page is called by a client, the data or information that is to be sent to the client first passes through the display filter so that the proper display elements can be applied and executed. The information inherits whatever layout qualities you have specified in the attribute file. If you determine that a horizontal rule will come between every separate paragraph, for example, this page-display characteristic is executed whenever one paragraph ends and another begins. Figure 27.1 represents the main points of intersection among the various DIDDS elements.

Figure 27.1: Elements of a dynamic Web site.

DIDDS Component II: Data Files

The second component of a site based on DIDDS software is the data. Whereas display components tell us how to display, data components tell us what to display.

Every site will have its own, unique form of data. In the Farm of the Future site, we decided to offer a "front page" (see Fig. 27.2) and a series of subsequent pages following a standard display format. Here, we describe the structure of the subsequent pages.

Figure 27.2: The Farm of the Future front page.

Each page (other than the front page) consists of three sections: a header, a body, and a footer. The header and footer are stored as separate data files (header.dat and footer.dat; the names of these files are stored in the attribute file as described in the preceding section). The body file is determined by the user action in the CGI script calling the page; clicking the Chicken button, for example, calls the chicken.dat file, whereas clicking the Honey button calls the honey.dat file. In addition, as you see in Listing 27.1, data files also can embed calls to other data files.
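The header, body, and footer assembly described above might be sketched as follows. This is only an illustration under stated assumptions: the %body_for table, the button names used as its keys, and the page_files routine are hypothetical and are not part of the DIDDS library; only the header_file, footer_file, and data_directory attributes come from the attribute file.

```perl
use strict;
use warnings;

# Attribute values as they would be read from the attribute file.
my %attributes = (
    data_directory => 'data',
    header_file    => 'header.dat',
    footer_file    => 'footer.dat',
);

# Hypothetical mapping from the button a visitor clicked to a body file.
my %body_for = (
    Chicken => 'chicken.dat',
    Beef    => 'beef.dat',
    Honey   => 'honey.dat',
);

# Return the three data files that make up a page, in display order.
# Returns an empty list when the requested page is unknown.
sub page_files {
    my ($button) = @_;
    my $body = $body_for{$button};
    return () unless defined $body;
    return map { "$attributes{data_directory}/$_" }
        ($attributes{header_file}, $body, $attributes{footer_file});
}

print join("\n", page_files('Honey')), "\n";
```

Looking the body file up in a fixed table, rather than building the file name from the request, also keeps a request from naming arbitrary files on the server.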

Our data files use SGML tags. These tags allow us to exploit as many opportunities as possible for future uses of the data. SGML is an international standard published in 1986 that manages information as data objects rather than characters. In other words, information is referred to in terms of its type rather than its characteristics. The number of data types is unlimited; each data type, however, must be defined in one of the CGI scripts or libraries responsible for the site. In this example, we use data types to define headings, titles, prices, and graphics. The advantages of using SGML tags are threefold:

They are simpler than HTML.
They are modifiable locally by your own script.
They are infinitely extensible, meaning that you can define as many different SGML tags as necessary.

Let's take a look at a very simple data file to start. Listing 27.1 contains an example of a data file that has been tagged according to data type. In this scheme, it is necessary to place all the SGML metatags flush left. If desired, HTML tags can be inserted for interpretation by the browser as long as they are indented to the right.

Listing 27.1. A data file that loads tabular data.

Contents of honey.dat

<T>
Honey
</T>
<H>
Happiness Ridge Apiaries
</H>
<PIX>
honey.gif
</PIX>
<P>
Our unprocessed honey has not been altered by heat, filters, or additives in any way. We bring you a naturally wonderful honey, far superior to mass-produced commercial brands.
</P>
<TABLE>
honeyprice.dat
</TABLE>
<PINDEX>
display@@beef.dat@@Beef
display@@chicken.dat@@Chicken
</PINDEX>
<S>
</S>

In Listing 27.1, the <T>Honey</T> tags indicate that Honey is the page's title, and that display characteristics for data between the tags are found by referencing the title_font_style, title_font_color, and title_font_size attributes. The construction <PIX>honey.gif</PIX> indicates the inclusion of a graphic element; the style and location of this graphic are determined by the attribute file. The <H> and </H> tags delimit the page's heading. Note that a <BR> HTML tag placed in the heading will cause a line break there. Because it is necessary to delimit individual sections of the data file, the <P> and </P> tags are used to set apart a paragraph of text. The <TABLE>honeyprice.dat</TABLE> lines indicate the use of an embedded data file to be displayed as an HTML table. At the bottom of the file, the <S> and </S> are section tags. These tags refer back to the attribute file for instructions on how to mark the end of a section; in this case, the end is marked with the graphic named by the rule attribute.
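To make the tag conventions concrete, here is a minimal sketch of how a data file of this kind could be broken into blocks before any display attributes are applied. It assumes that each opening and closing tag sits alone on a line, flush left; the parse_blocks routine is hypothetical and is not taken from the DIDDS library.

```perl
use strict;
use warnings;

# Split the lines of a data file into [tag, [content lines]] pairs.
# Assumes each opening and closing tag sits alone on a line, flush left;
# indented lines inside a block are kept as content.
sub parse_blocks {
    my @lines = @_;
    my (@blocks, $tag, @content);
    for my $line (@lines) {
        if (!defined $tag && $line =~ /^<([A-Z]+)>\s*$/) {
            $tag     = $1;              # opening tag starts a new block
            @content = ();
        }
        elsif (defined $tag && $line =~ m{^</\Q$tag\E>\s*$}) {
            push @blocks, [$tag, [@content]];
            undef $tag;                 # closing tag ends the block
        }
        elsif (defined $tag) {
            $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            push @content, $line if length $line;
        }
    }
    return @blocks;
}

my @blocks = parse_blocks(
    '<T>', 'Honey', '</T>',
    '<H>', 'Happiness Ridge Apiaries', '</H>',
    '<PINDEX>', 'display@@beef.dat@@Beef', 'display@@chicken.dat@@Chicken', '</PINDEX>',
    '<S>', '</S>',
);
print "$_->[0]: @{ $_->[1] }\n" for @blocks;
```

Keeping the content as a list of lines, rather than one joined string, preserves the one-entry-per-line layout that a block such as <PINDEX> relies on.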

Listing 27.2 contains an example of a tabular data file, the honeyprice.dat file. The first line of the data file is treated as header information and formatted appropriately. The number of rows is arbitrary, as is the number of elements in a row. Columns are separated by the same @@ delimiter that is used in the attribute file. A site manager may edit the honeyprice.dat file to change the price of orange honey from $2.50 per pound to $2.75 per pound. This change then will be reflected in every page on the site that calls that file. Notice here that the instructions about how to display data are kept separate from the actual data to be displayed. Figure 27.3 shows the page as it is displayed.

Figure 27.3: The Pure Honey page.

Listing 27.2. A tabular data file.

Contents of honeyprice.dat

Flavor@@Quantity@@Price
Clover @@1 lb. @@$2.00
@@2 lbs. @@$3.50
Thyme @@1/2 lb.@@$1.50
@@1 lb. @@$2.50
Raspberry @@1 lb. @@$2.50
Orange @@1 lb. @@$2.50
Buckwheat @@1 lb. @@$2.00
@@2 lbs. @@$3.50
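A tabular data file of this shape could be turned into an HTML table along the following lines. This is a sketch under assumptions: the table_html routine is hypothetical, and the exact HTML that the DIDDS applications emit is not shown in this chapter. The table_border and table_width names match entries in the attribute file.

```perl
use strict;
use warnings;

# Render the lines of a tabular data file as an HTML table.
# The first line becomes the header row; later lines become data rows.
sub table_html {
    my ($att, @lines) = @_;
    my $html = qq{<TABLE BORDER="$att->{table_border}" WIDTH="$att->{table_width}">\n};
    my $cell = 'TH';                    # header cells for the first row only
    for my $line (@lines) {
        # A limit of -1 keeps empty trailing fields.
        my @fields = split /@@/, $line, -1;
        s/^\s+|\s+$//g for @fields;     # trim padding around each field
        $html .= '<TR>' . join('', map { "<$cell>$_</$cell>" } @fields) . "</TR>\n";
        $cell = 'TD';
    }
    return $html . "</TABLE>\n";
}

my %attributes = (table_border => 1, table_width => 300);
print table_html(\%attributes,
    'Flavor@@Quantity@@Price',
    'Clover @@1 lb. @@$2.00',
    '@@2 lbs. @@$3.50',
);
```

A row that begins with @@ simply produces an empty first cell, which is how the second quantity for a flavor lines up under the first.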

Putting It Together: A CGI Script

So far, you have seen how to construct the two main components of a DIDDS system: the attribute file and the data files. In this section, we show how the two are combined to display data types correctly. We will provide examples of typical entries for the attribute file and cover the procedures to access the style and layout information to produce the resulting HTML.

Accessing Display Attributes

For the business case presented here, we decided to keep things simple. Therefore, the display attributes for the entire site are maintained in one file. For much larger sites with a complicated hierarchy of data and DIDDS applications, it might be necessary to maintain several attribute files. A site with many different logical areas would require an attribute file for each one. In addition, several attribute files can be used to effect a seasonal or time-based change of appearance. As discussed previously, the attributes are stored in a file with the following format:
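As an illustration of the time-based case, an application might pick its attribute file from the current month. The sketch below is hypothetical throughout: the .att file names, the season boundaries, and the season_for_month routine are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical attribute files, one per season.
my %file_for_season = (
    winter => 'winter.att',
    spring => 'spring.att',
    summer => 'summer.att',
    autumn => 'autumn.att',
);

# Map a month number (1-12) to a season name (northern hemisphere).
sub season_for_month {
    my ($month) = @_;
    return 'winter' if $month == 12 || $month <= 2;
    return 'spring' if $month <= 5;
    return 'summer' if $month <= 8;
    return 'autumn';
}

# localtime counts months from 0, so add 1.
my $month          = (localtime)[4] + 1;
my $attribute_file = $file_for_season{ season_for_month($month) };
print "Using attribute file: $attribute_file\n";
```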

Attribute_Name@@Value

Here, Attribute_Name can be any of the quantities one typically associates with standard HTML and other DIDDS-specific properties. Listing 27.3 shows a sampling of attributes.

Listing 27.3. An attribute file.

server@@www.cmsgroup.com
account@@Farm of the Future Cooperative
author@@CMS Group, LLC
didds_url@@http://www.cmsgroup.com/fofdidds
pix_url@@http://www.cmsgroup.com/fof/pix
icon_url@@http://www.cmsgroup.com/fof/icons
title@@Farm of the Future
data_directory@@data
front_file@@front.dat
index_file@@index.dat
header_file@@header.dat
footer_file@@footer.dat
bullet@@rd_ball2.gif
rule@@line_black.gif
background_color@@White
link_color@@Firebrick
vlink_color@@Blue
alink_color@@Red
text_font_size@@3
text_font_color@@Midnight Blue
text_font_style@@Bold
title_font_size@@5
title_font_color@@Medium Sea Green
title_font_style@@Italic
header_font_size@@4
header_font_color@@Medium Sea Green
header_font_style@@Bold
bullet_font_size@@3
bullet_font_color@@Sea Green
bullet_font_style@@Bold
link_font_size@@2
link_font_color@@Orange Red
link_font_style@@Normal
index_font_size@@5
index_font_color@@Orange Red
index_font_style@@Normal
table_border@@1
table_cellpad@@1
table_cellspace@@1
table_width@@300
table_font_size@@3
table_font_color@@Red
table_font_style@@Italic
table_header_font_size@@5
table_header_font_color@@Black
table_header_font_style@@Bold
image_border@@0
image_align@@left
image_width@@150
image_height@@
image_hspace@@10
image_vspace@@10

The total list of attributes can be rather long, depending on the complexity of the site. This abbreviated list shown in Listing 27.3 illustrates some of the style attributes that can be defined site-wide. The first few attributes provide site-wide information, such as the location of the data files, the base URL for all associated images and icons, and the title of the site. The next few attributes define some basic data files that are essential throughout the site. The remaining attributes determine font sizes, colors, and styles, as well as table and image attributes.

Each time that information is requested by a web client, the appropriate application fetches the necessary style and display information from the attribute file. In some sense, the DIDDS method implements what can be called a dynamic template library.

In our Perl implementations of the DIDDS applications, the attribute information is read and stored as an associative array using the following subroutine:

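%attributes = ReadAttributeFile($attribute_file);

sub ReadAttributeFile {
    my ($file) = @_;
    my (%att, $param, $value);

    # open() returns a false value on failure, so test it directly.
    if (!open(AFILE, '<', $file)) {
        print "Attribute File Error: $file\n";
        return ();    # empty list, so the caller gets an empty hash
    }
    while (<AFILE>) {
        chomp;    # remove the trailing newline only
        # Split on the first @@ so the value keeps any later text intact.
        ($param, $value) = split(/@@/, $_, 2);
        if (defined($param) && $param ne "") {
            $att{$param} = $value;
        }
    }
    close(AFILE);
    return %att;    # the name/value pairs collected above
}    # end ReadAttributeFile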

The name/value pairs are read in and stored for easy access via references to associative array elements such as this:

$tcolor = $attributes{'text_font_color'};

$tsize = $attributes{'text_font_size'};

Whenever text is to be displayed, the <FONT SIZE="$tsize" COLOR="$tcolor"> </FONT> set of tags is used to set the size and color of the text. The values are quoted because an attribute value such as Midnight Blue contains a space.
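Because the font attributes in the attribute file follow a regular naming pattern (text_font_size, title_font_size, and so on), the lookup and the tag can be wrapped in one small helper. This is a sketch only; font_wrap is a hypothetical routine, not one from the DIDDS library.

```perl
use strict;
use warnings;

# Wrap text in FONT tags built from the "<kind>_font_size" and
# "<kind>_font_color" attributes, where kind is text, title, header, ...
sub font_wrap {
    my ($att, $kind, $text) = @_;
    my $size  = $att->{"${kind}_font_size"};
    my $color = $att->{"${kind}_font_color"};
    # Quote the values: a color such as "Midnight Blue" contains a space.
    return qq{<FONT SIZE="$size" COLOR="$color">$text</FONT>};
}

my %attributes = (
    text_font_size  => 3,
    text_font_color => 'Midnight Blue',
);
print font_wrap(\%attributes, 'text', 'Our unprocessed honey'), "\n";
```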

Processing Data Types

Because the DIDDS approach to site management strives to keep data separate from display, it is important to define the various data types and data formats. Because these might be totally site specific, the data might be stored either in simple flat files or complex relational databases. The DIDDS application need only be aware of how to fetch the desired data. Processing then is dependent on the context type of the data elements. Data display relies on specific attribute values and the style sheets defined within each application.

In the following section, we illustrate a manner in which data can be stored in simple files using basic context rules.

Processing Context Types

As with any data-storage mechanism, it is important to define the meaning of individual data elements within a given record. Therefore, we rely on an SGML-like mechanism to define context types within simple flat data files. As the data file is parsed, different context types are encountered. With each new context type, a reference to the attribute information is made that returns the appropriate parameters for continued processing of the information. By defining information in a context-sensitive manner, display attributes are associated only at runtime.
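One way to picture this run-time association is a table of handlers, one per context type, each of which consults the attributes only when it is called. The sketch below is an assumption-laden illustration: the %handler table, the render routine, the HTML each handler emits, and the choice to serve the rule graphic from icon_url are all hypothetical. The tag names T, P, and S are those used in Listing 27.1.

```perl
use strict;
use warnings;

my %attributes = (
    title_font_size  => 5,
    title_font_color => 'Medium Sea Green',
    rule             => 'line_black.gif',
    icon_url         => 'http://www.cmsgroup.com/fof/icons',
);

# One handler per context type. Each looks up the attributes it needs
# only when it is called, so display is bound to data at run time.
my %handler = (
    T => sub {
        my ($text) = @_;
        return qq{<FONT SIZE="$attributes{title_font_size}" }
             . qq{COLOR="$attributes{title_font_color}">$text</FONT>};
    },
    P => sub { return "<P>$_[0]</P>" },
    # ASSUMPTION: the rule graphic lives under icon_url.
    S => sub { return qq{<IMG SRC="$attributes{icon_url}/$attributes{rule}">} },
);

# Render one block; unknown context types pass through unchanged.
sub render {
    my ($tag, $text) = @_;
    my $h = $handler{$tag};
    return $h ? $h->($text) : $text;
}

print render('T', 'Honey'), "\n";
print render('S', ''), "\n";
```

Changing $attributes{rule} between two calls to render changes the graphic in the second call, without touching the data or the handlers.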

Table 27.1 defines some of the context types.

Table 27.1. Context types.

Tag	Description
Bulleted Item `<BULL>...</BULL>`	Signals that the following body of text should be treated as a bulleted item.
Comment `##`	Signals that the data on this line should be ignored.
Data File `<DATA>...</DATA>`	Allows for the inclusion of another data file into the document. The syntax follows: `<DATA>` `data_file_name` `</DATA>` This tag is especially useful in cases where data must appear in multiple places. In the FoF site, this tag is used to display the pricing information for the various farm products. By keeping this information in a single file, site consistency is maintained.
Header `<HEADER>...</HEADER>`	Signals a body of text that should be treated as a header.
Index `<INDEX>...</INDEX>`	Allows for the inclusion of an index into a document. This command is used for embedding DIDDS-style links into documents. Regular tagged links can be inserted using standard methods. The syntax follows: `<INDEX>` `application@@datafile@@description` `application@@datafile@@description` `application@@datafile@@description` `application@@datafile@@description` `</INDEX>`
Link `<LINK>...</LINK>`	Allows for the inclusion of a link into a document. This command is used for embedding normal-style links into documents. The syntax follows: `<L>` `URL@@linkname` `</L>`
Mail `<MAIL>...</MAIL>`	Allows for the inclusion of a `mailto` tag. The syntax follows: `<MAIL>` `address@@name` `</MAIL>`
Paragraph `<PAR>...</PAR>`	Signals a body of text that should be treated as a paragraph.
Picture `<PIX>...</PIX>`	Signals that the following body of text should be treated as an image with an appropriate tag. The format of this command follows: `<PIX>` `image.gif@@http://www.cmsgroup.com` `</PIX>`
Product Index `<PINDEX>...</PINDEX>`	Allows for the inclusion of an index into a document, but it differs from the `Index` command in that a small icon is placed next to each link.
Section `<SEC>...</SEC>`	Signals the beginning and end of a section.
Table Making `<TABLE>...</TABLE>`	Signals that the information in the following data file should be handled as a table. The format of this command follows: `<TABLE>` `data file name` `</TABLE>`
Title `<TITLE>...</TITLE>`	Signals a body of text that should be treated as a title.
Title Picture `<TPIX>...</TPIX>`	Signals that the following body of text should be treated as a title image with an appropriate tag. The format of this command follows: `<TPIX>` `image.gif@@http://www.ntcnet.com` `</TPIX>`
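The application@@datafile@@description entries used by the Index and Product Index types could be expanded into links roughly as follows. This is a sketch only: the chapter does not show how DIDDS builds its URLs, so the didds_url/application?datafile layout and the index_link routine are invented for illustration.

```perl
use strict;
use warnings;

# Turn one index entry into an HTML link.
# ASSUMPTION: the URL layout didds_url/application?datafile is invented
# for illustration; the chapter does not show how DIDDS builds its URLs.
sub index_link {
    my ($didds_url, $entry) = @_;
    my ($app, $datafile, $description) = split /@@/, $entry, 3;
    return qq{<A HREF="$didds_url/$app?$datafile">$description</A>};
}

my $base = 'http://www.cmsgroup.com/fofdidds';
print index_link($base, $_), "\n"
    for 'display@@beef.dat@@Beef', 'display@@chicken.dat@@Chicken';
```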

Technical Issues

In this section, we present an overview of tools and methodologies used to develop this DIDDS site. While we demonstrate that relatively little access to sophisticated libraries and programming environments is required, an understanding of Perl libraries, data structures and organization, and directory structure is necessary.

Perl Libraries

At present, the entire DIDDS application library is written in Perl 5. For the most part, the DIDDS library consists of routines that simplify the generation of HTML. It is not nearly as sophisticated as some of the HTML modules written for Perl 5, but it has proven to be very functional for both programmers and nonprogrammers. We also make heavy use of the CGI.pm library written by Lincoln Stein for construction of forms used in many of the DIDDS applications.

Data Structure and Organization

In order to manage a large site with many contributing parties, it is important to construct a well-defined directory structure for the Web site.

Because security is of utmost importance, the scripts are maintained in their own directory that is accessible by using the ScriptAlias directive of the Apache and NCSA servers. A similar command exists for CERN-based servers. In fact, the parties contributing to site construction do not have access to the scripts at all because they need to supply only the necessary data and image files, which are kept in separate directories. By keeping all data files in a common file tree, there is no need for the site contributors to worry about the directory structure when referring to other site information. It should be noted that images, pictures, and icons are stored in an appropriate directory so that URLs such as http://www.cmsgroup.com/fofpix/image1.jpg are valid.

As an explicit example, the FoF client can be set up in the following manner:

http://www.cmsgroup.com/fof

is a ScriptAlias for the directory

~fof/didds/bin

wherein reside all DIDDS applications for this client. The appropriate data files are stored in

~fof/didds/data

The images (which must have a valid non-cgi URL) are referenced in the following manner:

http://www.cmsgroup.com/fofpix

This is a link connecting

/web_site_base_directory/fofpix

to the directory

~fof/didds/pix

web_site_base_directory is the top of the file tree referenced by the Web server.

In this manner, access to the images does not allow outside users to access the data or script directories. Of course, this assumes that the server has the security configured correctly. Figure 27.4 provides an illustration of these directories. The solid lines indicate the hierarchy and the dotted lines indicate the directories that are called by corresponding URLs.

Figure 27.4: Directory configuration.

Again, site contributors do not need to understand the site construction in order to maintain their data files as long as all images, pictures, and icons are stored in their proper places.

Directory Structure

One of the methods we use to aid in site management and development is a CGI program called SiteManager. This program provides a form-based interface that enables site contributors to upload data and image files, edit existing data files, and edit site attributes. With this program, contributors to the site have the freedom to alter previously entered data and supply new data without the need to pay for costly updates to their Web site. One example is the price data for the beef. This data is stored in a file supplied by the contributor. The contributor can update the current price list simply by uploading the new data file. Because the site is completely dynamic, the prices are updated across the entire site upon data upload. In addition, new graphics logos can be uploaded at any given time. Placing the capability to do such site modifications into the hands of the site owners reduces maintenance costs and enhances site usefulness as a business tool.

Summary

The dynamic web page procedure can be especially useful for organizations or companies that have large or numerous databases they want to port over to Internet or intranet platforms. For many organizations that have constantly changing data or large databases to maintain, it would not be practical to implement a Web site using static HTML tagging, because making so many frequent minor adjustments would be too complicated, time-consuming, and costly. With a dynamic Web site, changes to the databases become trivial, and site managers are relieved of the worry of conflicting or outdated information because the data is stored in one, and only one, place. Hence, any data item called on a dynamic site will be consistent and accurate wherever it appears on the site, as long as the database itself is maintained regularly.

Another difficulty that is overcome with this approach involves the levels of technical expertise required of people who oversee database modifications. Currently, most other approaches necessitate at least some training in HTML to make changes on Web databases. The user-friendly point-and-click interface of a finished DIDDS site allows data to be added or changed by individuals with little technical expertise. Therefore, by moving some of the mundane Web maintenance responsibilities to individuals closer toward the clerical end of the staff and away from the system-administration side, organizations can save money by avoiding expensive HTML training for staff members or avoiding the use of highly paid staff to do trivial database changes and updates.

Overall, this approach to information management treats text and graphics as part of a much larger information base. That is, it allows for the management of information as data objects rather than as characters typeset on a page. Information is referred to in terms of its type. After the mode of presentation is determined, data type and style are merged to produce an appropriate document. The SGML approach provides an industry-standard document interface that keeps information and presentation separate. This has many advantages, because it is difficult to predict every future view of one's databases. In addition, pursuing industry-standard methodologies prevents site owners and developers from being locked into a proprietary system. In fact, a strength of this approach is its reliance on standard tools and packages that stand in high regard in the UNIX community.
