`cmc
`ERICaN (ol
`development!
`
`Understand the Web development
`process— planning, analysis,
`design, implementation, and
`maintenance
`
`Learn to use tools like HTML,
VRML, and Java to create enticing
`Web content
`
`Page 1 of 37
`
`UNLEASHED
`
`Master advanced CGI gateway
`programming techniques with C,
`Perl, and REXX
`
`MICROSOFT CORP.
`
`EXHIBIT 1013
`
`
`
`
`For my grandparents, Isabelle and Joseph December, and Aili and
`Arthur Hill.
`Copyright © 1995 by Sams.net Publishing
`
`FIRST EDITION
`
`All rights reserved. No part of this book shall be reproduced, stored in a
`retrieval system, or transmitted by any means, electronic, mechanical,
`photocopying, recording, or otherwise, without written permission from the
`publisher. No patent liability is assumed with respect to the use of the
`information contained herein. Although every precaution has been taken in
the preparation of this book, the publisher and author assume no responsibility
for errors or omissions. Neither is any liability assumed for damages
resulting from the use of the information contained herein. For information,
address Sams.net Publishing, 201 W. 103rd St., Indianapolis, IN
46290.
`
`International Standard Book Number: 0-672-30745-6
`
Library of Congress Catalog Card Number: 95-69417

98 97 96 95          4 3 2
`Interpretation of the printing code: the rightmost double-digit number is
`the year of the book's printing; the rightmost single digit, the number of the
book's printing. For example, a printing code of 95-1 shows that the first
`printing of the book occurred in 1995.
`Composed in A Garamond, Futura, and MCPdigital by Macmillan Computer
`Publishing
`Printed in the United States of America
`
`Trademarks
`
`All terms mentioned in this book that are known to be trademarks or service
`marks have been appropriately capitalized. Sams.net Publishing cannot
attest to the accuracy of this information. Use of a term in this book should
`not be regarded as affecting the validity of any trademark or service mark.
`
President, Sams Publishing Richard K. Swadley
Publisher, Sams.net Publishing George Bond
Managing Editor Cindy Morrow
Marketing Manager John Pierce
`
`Acquisitions Editor
`Mark Taber
`Development Editor
`Dean Miller
`Software Development
`Specialist
`Steve Flatt
`Production Editor
`Katharine Stuart Ewing
`Copy Editors
`Susan Christophersen
`Mitzi Foster Gianokos
`Chuck Hutchinson
`Technical Reviewer
`Alan Richmond
`Editorial Coordinator
`Bill Whitmer
`Technical Edit Coordinator
`Lynette Quinn
`Formatter
`Frank Sinclair
`Editorial Assistant
`Carol Ackerman
`Cover Designer
`Jason Grisham
`Book Designer
`Alyssa Yesh
`Production Team Supervisor
`Brad Chinn
`Production
`Carol Bowers, Mona Brown,
`Terrie Deemer, Cheryl
`Dietsch, Michael Dietsch,
`Greg Eldred, Michael Henry,
`Ayanna Lacey, Kevin Laseau,
`Paula Lowell, Steph Mineart,
`Nancy Qt Price, Brian-Kent
`Proffitt, SA Springer, Tim
`Taylor, Mark Walchle
`
`Page 2 of 37
`
`
`
Introduction xx
Part I Introduction to Web Systems and Applications
1 The World Wide Web as a Communication System 3
2 A Developer's Tour of the Web 41
3 Options for Web Connections 71
Part II Web Development Processes
4 Web Development Principles and Methodology Overview 89
5 Web Planning 113
6 Web Analysis 137
7 Web Design 157
8 Web Implementation 183
9 Web Promotion 195
10 Web Innovation 209
Part III Web Implementation and Tools
11 Design and Implementation Style and Techniques 221
12 Basic HTML 251
13 Advanced HTML 279
14 Forms 303
15 Multimedia 317
16 Basics of Imagemaps 333
17 Implementation Tools 341
18 Development and Language Environments 361
Part IV Gateway Programming
19 Principles of Gateway Programming 373
20 Gateway Programming Fundamentals 391
21 Gateway Programming I: Programming Libraries and Databases 425
22 Gateway Programming II: Text Search and Retrieval Tools 485
23 Scripting for the Unknown: The Control of Chaos 517
24 Transaction Security and Security Administration 561
25 Gateway Programming Language Options and a Server Modification Case Study 583
`Page 3 of 37
`
`
`
`Part V Development Case Studies
`26 VRML on the Web 613
`27 C-Based Gateway Scripting 639
`28 Writing CGI Scripts in REXX 663
`29 A Web Coloring Book 679
`30 A Campus-Wide Information System 691
`31 A Hypertext News Interface 711
`32 A Graphical Web Page Counter 725
`Part VI Appendixes
A Sources of Further Information 753
`B HTML Tag and Supporting Information Summary 781
`C Glossary 807
`Index 813
`
`Page 4 of 37
`
`
`
Part I Introduction to Web Systems and Applications 1
1 The World Wide Web as a Communication System 3
    An Overview of the World Wide Web 4
    Origins of the Web 4
    A Definition of the World Wide Web 8
    The Role of the Web Within Cyberspace 18
    The Topology of Cyberspace 19
    The Internet and the Web Within Cyberspace 19
    Gateways Among Networks 21
    The Web Within the Internet 23
    Information Spaces in the Web 24
    Uniform Resource Locators 24
    Information Spaces 26
    Communication Contexts on the Web 27
    Web Navigation Summary 28
    Searching the Web by Subject 28
    Searching the Web by Keyword 31
    Searching the Web by Geographical Space 34
    Searching the Web by Information Space 35
    Searching the Web by People Space 36
    Web Introductory Check 39
2 A Developer's Tour of the Web 41
    An Overview of the Web's Potential 42
    Web Functions 46
    Information Delivery 46
    Information Protocols 46
    Communication 50
    Interaction 59
    Computation 64
    Web Development Phases 64
    Design 65
    Implementation 67
    Developer's Tour Check 69
3 Options for Web Connections 71
    Connecting to the Internet 72
    User Connections 73
    Information Provider Connections 81
    Accessing the Web 84
    Web Server Options 84
    Web Browser Options 84
    Web Access Bootstrap Tutorial 85
    Web Connections Check 86
`
`Page 5 of 37
`
`
`
HTML and CGI
UNLEASHED

Part II Web Development Processes 87
4 Web Development Principles and Methodology Overview 89
    The Web as a Medium for Expression 90
    Web Media Characteristics 91
    Web Media Qualities 93
    Web User Experience 96
    Web Navigator Needs 96
    Web User Experience 103
    Web Communication Processes 105
    A Web Development Methodology 108
    Web Development Overview 108
    Web Principles and Methodology Check 112
5 Web Planning 113
    Principles of Web Planning 114
    The Limits of Web Planning: What a Developer Can't Control 114
    The Opportunities of Web Planning: What a Developer Can Control 120
    Web Planning Processes and Techniques 121
    Strategic and Policy Planning 122
    System Planning 125
    Individual Web Planning Techniques 126
    Web Planner's Check 134
6 Web Analysis 137
    Web Analysis Processes 138
    Web Analysis Principles 139
    Information Analysis Points 140
    Design and Performance Analysis 149
    Semantics 152
    Implementation Analysis 152
    Web Information Analyst's Check 155
7 Web Design 157
    Overview of Web Design 158
    Principles of Web Design 160
    Web Design Methodologies 161
    Top Down 161
    Bottom Up 161
    Incremental/In-Time 162
    Design Techniques 162
    Package Information in the Right-Sized Chunks 162
    Link Pages Together 165
    Specify Overall Look and Feel with a Universal Grid 167
    Use Repeated Icons 168
    Create and Use Web-Wide Navigation Links 169
    Use Information Cues 170
    Consider Media Type, Information Structure, and Connections 171
`
`Page 6 of 37
`
`
`
Contents

    Design Problems 173
    The Page from Outer Space 174
    The Monster Page 175
    Multimedia Overkill 176
    The Uneven Page 176
    Meaningless Links 177
    Clown Pants 178
    KOOL Design 180
    Web Designer's Check 181
8 Web Implementation 183
    Implementation Overview 184
    Implementation Principles 186
    Implementation Processes 187
    Working with People 187
    Choosing the Level of HTML Compliance 190
    Testing 192
    Solving Implementation Problems 192
    Continuous Implementation 193
    Web Implementor's Check 193
9 Web Promotion 195
    Web Promotion Principles 196
    Web Promotion Techniques 199
    Publicity Timing 200
    General Web Releases 201
    Focused Web Releases 203
    Current Web Releases 203
    Web Business Models 204
    Web Presence 205
    Customer Service 205
    Sponsorship 206
    Advertising 206
    Publishing 206
    Web Promoter's Check 207
10 Web Innovation 209
    Innovation Overview 210
    Web Innovation Techniques 211
    Keep All Processes of Web Development Going 211
    Monitor the User's Information Environment 211
    Continuously Improve Quality 212
    Consider Technological Advances for Use in the Web 216
    Web Innovator's Check 217
Part III Web Implementation and Tools 219
11 Design and Implementation Style and Techniques 221
    Overview of Implementation and Design 222
    System-Level Design and Implementation 224
    Web-Level Design and Implementation 224
`
`Page 7 of 37
`
`
`
    Design and Implementation Essentials 225
    Information Organization 226
    Context and Information Cues 230
    Page Length 231
    Page Grid 232
    Typography 233
    Links 233
    Schools of Web Design 235
    Early ASCII: Text 235
    Classic: Tripartite Web Page 236
    Modern: Graphic Slabs 236
    Postmodern: Fragments 238
    Early Virtual: Scene 238
    Language Issues and Style 239
    Focus on Audience Needs 239
    Shape Information to Meet Audience Needs 239
    Use Techniques to Shape Information 240
    Design Techniques 241
    Chunking Information 242
    Creating a Common Look and Feel 245
    Implementation Techniques 246
    Creating a File Management Structure 246
    Maintaining Source Code Control 248
    Using Tools 248
    Creating Web Components 248
    Implementation and Design Check 249
12 Basic HTML 251
    Overview of HTML 252
    HTML and SGML 252
    The Philosophy of HTML 254
    HTML Description (Levels 0 and 1) 255
    Elements 255
    An HTML Document Layout 263
    HTML Tutorial 263
    What HTML Levels 0 and 1 Can't Do 263
    HTML Features That Many Developers Find Tricky 263
    Getting Started: Basics 264
    Some Flairs and Details 268
    A Sample HTML Page 270
    Implementing a Look-and-Feel Template 271
    More HTML Features 273
    Anchors 273
    Nesting 273
    Semantic Versus Physical Tags 274
    Nicks and Cuts 275
    Key HTML Information Sources 276
    Basic HTML Check 276
13 Advanced HTML 279
`
`Page 8 of 37
`
`
`
    Form Elements 280
    A Sample Form 283
    HTML Level 3 287
    HTML Level 3 Elements 287
    A Sample Level 3 Table 291
    Netscape Extensions to HTML 295
    Netscape Extensions to the BODY Element 295
    Netscape Extensions to HEAD and Related Elements 296
    Netscape Extensions to BODY Elements 296
    Netscape Extensions to the TABLE Element 298
    Advanced HTML Check 301
14 Forms 303
    A Forms Tutorial 304
    The FORM Element 305
    Elements in a Form 305
    Having a Form Do Something 308
    Forms Check 315
15 Multimedia 317
    Multimedia Technical Overview 318
    Images 319
    Sound 325
    Movies 327
    Multimedia Use Issues 328
    Multimedia Usability 329
    Multimedia Design 329
    The Future 330
    Multimedia Check 331
16 Basics of Imagemaps 333
    An Overview of Imagemaps 334
    A Complete Imagemap Example 337
    The HTML File 337
    The MAP File 338
    The Results 338
    Imagemap Check 339
17 Implementation Tools 341
    Overview of Tools 342
    HTML Converters 342
    A Sample Converter: LaTeX to HTML 343
    Conversion Program Information Sources 348
    HTML Template Applications 351
    HTML Editors 351
    A Sample HTML Editor: asWedit 352
    Other HTML Editors Available 357
    Tools Check 359
18 Development and Language Environments 361
    Overview of Development Environments 362
    Cyberleaf 362
`
`Page 9 of 37
`
`
`
    Extending the Web Through New Languages 364
    Java 365
    Java: Possibilities 365
    The Java Language 366
    Virtual Reality Modeling Language (VRML) 368
    Environments and Languages Check 368
Part IV Gateway Programming 371
19 Principles of Gateway Programming 373
    Transmission Control Protocol-Internet Protocol (TCP-IP) 374
    Why Do We Need HTTP? 376
    Filling the Collaborative Vacuum 377
    A Closer Look at the Hypertext Transport Protocol (HTTP) 377
    What Is the Common Gateway Interface? 382
    The Flow of Data Using the Common Gateway Interface 382
    A Brief Introduction to Data Passing and Methods 385
    Environment Variables 385
    Standard Input 387
    CGI: An Expanding Horizon 388
    Typical Hardware and Server Software Platforms 388
    Typical CGI Programming Languages 388
    CGI Guidelines and Principles 389
    And Once the Coding Starts 389
    Software and Hardware Platform of the Gateway Programming Section Examples 390
    Principles of Gateway Programming Check 390
20 Gateway Programming Fundamentals 391
    Understanding Multipart Internet Mail Extensions (MIME) in the CGI Environment 392
    Understanding Environment Variables 394
    Variables That Contain HTTP Information about the Server and Do Not Depend on the Client Request 395
    Variables That Are Dependent on the Arrival of a Client Request 395
    Server-Side Includes 397
    Ready to Program: The Initial Steps 400
    From Client to Server to Gateway and Back 401
    How the Client Can Output Data 401
    Before the Server Passes the Data: Encoding 403
    How the Server Passes the Data to the Gateway Program 403
    Code Sample: The "Print Everything" Script 404
    Gateway Output 407
    Manipulating the Client Data with the Bourne Shell 408
    Manipulating the Client Data with Perl 409
    To Imagemap or Not to Imagemap 410
    Code Walk-Through: bimodal.pl 410
    An Integrated E-Mail Gateway Application 413
    Discussion of the Resume Application 419
`
`Page 10 of 37
`
`
`
    CGI Debugging Techniques 421
    Gateway Programming Fundamentals Check 422
21 Gateway Programming I: Programming Libraries and Databases 425
    Rules of Thumb in Gateway Programming 426
    Perl Programming Libraries 427
    form_init 427
    nyu_trailer 428
    html_header 428
    home 428
    An Overview of the Relational Database Model 432
    Binary Searches 432
    Binary Search Example I: Simple Rolodex Application 433
    Binary Search Example II: Corporate Filings Lookup 438
    Observations About formlynx.pl.b 445
    A Double Binary Search with a Little State Thrown in 447
    Comments About zacksnew.pl 450
    Code Discussion: zack_tick.pl 457
    Final Binary Search Example: The Partial Company to Ticker to Filing Application 457
    Company to Ticker to Filing Application Walkthrough 458
    Code Observations: zack3.pl 466
    Perl Version 5 466
    Perl and Relational Database Packages 466
    Illustra-An Unusual RDBMS 468
    Code Discussion: sql_wrapper.pl 471
    Observations About sql_loss.pl 476
    Database Tuning 476
    Pros and Cons of Relational Databases on the Web 477
    Gateway Programming: Libraries and Databases Check 478
    Additional Code 478
22 Gateway Programming II: Text Search and Retrieval Tools 485
    Philosophies of Text Search and Retrieval on the Web 486
    Introduction to WAIS 489
    Using WAIS to Index a Data Archive for the Web 490
    Forms-Based WAIS Query Examples 492
    The Standard wais.pl Interface 493
    Some Observations About wais.pl 495
    Debugging the WAIS Interface 495
    Another Way to Ask a WAIS Query 495
    Observations About waissearch.pl 496
    freeWAIS-sf 498
    Building a freeWAIS-sf WAIS Index: HTML Extensions 499
    wais-Sk.pl 501
    Code Discussion: wais-Sk.pl 502
    Pros and Cons of WAIS and WAIS-Like Packages 503
    Introduction to Glimpse 503
    Glimpse Indexing 503
    A Practical Test of Glimpse 505
`
`Page 11 of 37
`
`
`
    Code Walkthrough: glimpse.pl 510
    Pros and Cons of Glimpse 512
    Harvest 512
    Text Search and Retrieval Tools Check 515
23 Scripting for the Unknown: The Control of Chaos 517
    Bridging the Unknown by Maintaining State 519
    Using the QUERY_STRING Variable 519
    Using PATH_INFO 520
    Form Variables 522
    Combining Methods of Maintaining State 525
    Generating Graphics at Runtime 531
    Image File Formats 532
    Access Counters 534
    Gnuplot and Server Stats 536
    NetPBM 538
    An HTML Form to Make Buttons 540
    gd1.1.1 542
    Using Expect to Interact with Other Servers 546
    Retrieving Web Data from Other Servers 556
    Scripting for the Unknown Check 560
24 Transaction Security and Security Administration 561
    Cryptographic Terminology 563
    Data Encryption Standard (DES) 563
    RSA Public-Key Cryptography 563
    Kerberos 564
    Pretty Good Privacy (PGP) and Privacy-Enhanced Mail (PEM) 565
    NCSA httpd and PGP/PEM 565
    Riordan's Privacy-Enhanced Mail (RIPEM) 565
    Netscape Communication Corporation's Secure Sockets Layer (SSL) 567
    Secure NCSA httpd 568
    Comments on the S-HTTP Protocol and SSL from a Developer's Perspective 569
    Electronic Commerce-Security Considerations 569
    NetCheque and NetCash 569
    First Virtual 570
    Digicash's E-Cash 570
    Comments on Electronic Payment Systems 572
    Security Pitfalls of CGI Programming 572
    Web Administrative Security Overview 574
    NCSA's htpasswd Scheme 575
    NCSA's "Host Filtering" Method 580
    Transaction Security and Security Administration Check 581
25 Gateway Programming Language Options and a Server Modification Case Study 583
    Perl 5 584
    Python 589
    [entry illegible in source]
`
`Page 12 of 37
`
`
`
    Case Study: Modification of the Server Imagemap Software 600
    Technical Discussion of the Code Changes to imagemap.c 608
    Ten Commandments for Web Developers 608
    Programming Language Options and Server Modification Check 610
Part V Development Case Studies 611
26 VRML on the Web 613
    History of VRML 615
    Introduction to Creating VRML Worlds 616
    What You Need (Browsers, Hardware, Internet Access) 618
    Using the Browsers 619
    VRML Site Design 620
    Identify the Goal of the Site 620
    Identify the Participants 620
    Create Wild Ideas for Site Possibilities, Including Multi-User Interactivity, Sound, and Behaviors 621
    Consider Bandwidth and Rendering Concerns of Participants 621
    Plan the Range of Site Configurations 621
    Design the Overall Framework of the Site 621
    Create Actual VRML Objects 621
    Integrate into Completed Site 622
    Perform the Initial Testing and Refinement 622
    Test with Amateur Users and Assess Site Goal Achievability and Refinement 622
    Continue Improving and Redesigning the Site 622
    Authoring Tools and Converters 623
    Optimizing Virtual Reality on the Web 623
    Rendering Speed 625
    CGI and VRML 626
    Hacks for Advanced VRML Within 1.0 628
    The Future of VRML, Version 2.0 and Beyond 630
    Interactivity 630
    Behaviors 630
    Sound 630
    Multi-User 631
    Other VRML 2.0 Issues 631
    Business Models for Virtual Reality on the Web 632
    Go Forth and Create Virtual Worlds 634
    VRML Resources Appendix 634
    General VRML Information 634
    VRML Browsers 635
    Software 636
    Authoring Tools 637
    Converters 637
    Interesting VRML Web Sites 637
    Related Resources of Interest 638
27 C-Based Gateway Scripting 639
    C as a Scripting Language 640
`
`Page 13 of 37
`
`
`
    Implementing C Across Different Environments (UNIX) 641
    Reading Input 642
    A Very Simple C Script 643
    Tips and Techniques for C-Based Scripts 643
    Create Generic Procedures for Common or Repetitive Tasks 644
    Assign #define Definitions for URL/File/Path References 644
    Categorize Major Procedures into Groups 644
    Minimize File I/O Wherever Possible 645
    Always Be Prepared for Invalid User Input 645
    Implement File and Record Locking 646
    Case Study: A "Sign-In Guest Book" Application 649
    The guestbook.c Program 653
    An Outline of How the Guest Book Works 657
    The Guest Book Program Check 661
28 Writing CGI Scripts in REXX 663
    What Is REXX? 664
    A Brief History of REXX 665
    Getting Input to the Script 666
    QUERY_STRING Environment Variable 667
    PATH_INFO Environment Variable 668
    Standard Input 668
    Decoding Input from a Form 670
    Sending the Document Back to the Client 670
    Diagnostics and Reporting Errors 671
    Security Concerns 673
    Beware of the REXX INTERPRET or ADDRESS UNIX Statements 673
    Escaping Dangerous Characters 675
    Restricting Distribution of Information 676
    Testing the Script 677
    A Simple REXX CGI Script 677
    Conclusion 678
29 A Web Coloring Book 679
    Initial User Input and Setup 680
    The User Chooses a Picture to Color 680
    Save a Copy of File for User to Work On 682
    Outputting the HTML Page 683
    The Coloring Iteration 684
    User Submission 684
    The gd Binary 684
    Output Another Coloring Page 686
    Other Options and Housekeeping 687
    Done Coloring 687
    Download GIF 688
    Download PostScript 688
    Housekeeping 689
    Final Advice 689
`
`Page 14 of 37
`
`
`
30 A Campus-Wide Information System 691
    Course Information 692
    Navigational Aids 696
    Finding the Data 696
    Adjusting and Translating the Images 697
    Tidying Up 697
    Walking Through the Building 698
    Correcting the Floorplan 698
    Making Imagemaps 698
    Entering Occupancy Data 699
    where_is 700
    Virtual Walkthrough 703
    Unsolved Problems 707
    Speed Versus Beauty 707
    Terse Versus Verbose 707
    More Shorter Pages Versus Fewer Longer Pages 707
    Netscapisms Versus Standard 708
    Official Versus Unofficial 708
    Campus-Wide Information System Check 709
31 A Hypertext News Interface 711
    Problem Definition 712
    Project History 712
    Design Constraints 712
    The Implementation Process 713
    Overview of the Interface 713
    The Query Page 714
    The Message List Browser 714
    The Article Page 715
    A Sample Archive's Home Page 717
    The Implementation 718
    The Build Process 718
    The Database Format 719
    Executing Queries 720
    Maintaining State Between Pages 720
    Some Advanced Features 721
    Article Filters 721
    URL-Based Queries 722
    Browser-Dependent Customizations 722
    Future Plans 723
    Hypertext News Interface Check 723
32 A Graphical Web Page Counter 725
    The IMG Tag 726
    Counting Each Time a Page Is Viewed 727
    Simple Test Script 727
    Image in X-Bitmap Format 729
    Open Inventor and a 3-D Counter 732
    Open Inventor 733
`
`Page 15 of 37
`
`
`
    Converting to GIF Format 736
    Bibliography 749
    Web Counter Check 749
Part VI Appendixes 751
A Sources of Further Information 753
    General Web Information 754
    FAQs 754
    Access 754
    Software 755
    Developing Information 755
    Navigating the Web 755
    News and Discussion 756
    Internet Searching Information 756
    New or Noteworthy 756
    Resource Lists 758
    Subjects 760
    Keyword 762
    Spaces 763
    People 764
    Internet Technology Information 765
    Computing 765
    Developing 766
    Human Interaction 766
    Multimedia 766
    Virtual 770
    Networks 770
    Internet Commerce 778
    Information 778
    Marketplaces 780
B HTML Tag and Supporting Information Summary 781
    HTML Tag Summaries 782
C Glossary 807
Index 813
`
`Page 16 of 37
`
`
`
`Introduction to Web Systems and Applications
`Part I
`
FIGURE 1.9. The Web within the Internet.
[Diagram. Legible labels: Network access; Hypertext link; Web browser; Text; graphics; FTP server; HTML; Text; Web]
`
The resources shown in Figure 1.9 include a remote login to a host through the Telnet protocol,
a link to a text file on a File Transfer Protocol (FTP) server, a link to a menu on a Gopher
server, and a link to another HTML document on another Web server. Thus, the Web links
disparate resources scattered across the Internet.
`
`Information Spaces in the Web
`
`The Web's "linking" relationship with Internet resources is one of its chief characteristics. The
`Web's scheme for referring to these Internet resources creates a structure of information spaces.
`Uniform Resource Locators
The basis for referring to resources on the Web is the Uniform Resource Locator, or URL. A
URL consists of a string of characters that uniquely identifies a resource. A URL is like a catalog
number for a resource. When a Web browser opens a particular URL, the user gains
access to the resource referred to by that URL.
`
`Page 17 of 37
`
`
`
`The World Wide Web as a Communication System
`Chapter 1
`
`The basic format for many (but not all) URLs is as follows:
`
scheme://host:port/path

in which
■ scheme is one of the rules or protocols to retrieve or send information, such as FTP,
  NNTP, Gopher, Telnet, and others
■ host is the computer host on which the resource resides
■ port is a particular number that identifies the service requested from the server; this
  number is provided if the service is installed on a port different from the standard one
  for that service
■ path is an identification of the location of a resource on a particular computer host
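
The four parts named above can be pulled out of a URL mechanically. The following is a minimal sketch in Python using the standard library's urllib.parse module; the function name describe_url and the host ftp.example.org are assumptions made for illustration and do not come from the book.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL of the form scheme://host:port/path into its named parts."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # protocol used to reach the resource
        "host": parts.hostname,   # computer host on which the resource resides
        "port": parts.port,       # None when the URL gives no explicit port
        "path": parts.path,       # location of the resource on that host
    }


if __name__ == "__main__":
    # Hypothetical URL with a non-standard FTP port.
    for name, value in describe_url("ftp://ftp.example.org:2121/pub/readme.txt").items():
        print(f"{name}: {value}")
```

When the URL carries no port, the port entry is None, which corresponds to the case where the service runs on its standard port.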
There are other variations in format that a Web navigator will encounter. For example, the
URL news:comp.infosystems.www.misc refers to a Usenet newsgroup.
The URL telnet://locis.loc.gov refers to a Telnet connection to the U.S. Library of Congress's
online catalogs and databases. When a Web browser opens this URL, a Telnet session will begin
(a session in which the user can log on to a remote computer host).
The URL http://www.rpi.edu/~decemj/works/wwwu/contents.html#part3 refers to a particular
section of a hypertext page. The page resides on the host www.rpi.edu and has the pathname
of ~decemj/works/wwwu/contents.html. The #part3 at the end of the pathname for the file indicates
that this URL will cause the Web browser software to "go" to a specific place within the
file labeled with the anchor named part3. (How to construct and name these anchors will be
covered in detail in Part II.)
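
The split between the page and the named place within it can be sketched in a few lines of Python. This is an illustration only: the function name split_anchor and the host www.example.org are hypothetical, and urldefrag is the standard-library call that separates the part after the # sign.

```python
from urllib.parse import urldefrag


def split_anchor(url):
    """Return (page_url, anchor_name); anchor_name is '' when the URL has no '#' part."""
    page_url, anchor_name = urldefrag(url)
    return page_url, anchor_name


if __name__ == "__main__":
    # Hypothetical URL that uses the same anchor name as the example in the text.
    page, anchor = split_anchor("http://www.example.org/works/contents.html#part3")
    print("page to retrieve:", page)
    print("anchor to jump to:", anchor)
```

The browser retrieves the page named by the first value and then uses the second value to position the display within that page.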
The URL http://www.ncsa.uiuc.edu/SDG/Experimental/demoweb/marc-global-hyp.au refers to an audio
file (.au extension) located on a server demonstrating the Mosaic browser's capabilities. This
sound file, when accessed by a browser (provided that the user has the appropriate audio player
software and hardware installed in the computer), will produce a voice message.
The URL http://uu-gna.mit.edu:8001/uu-gna/index.html refers to the home page of the
Globewide Network Academy, an organization dedicated to creating a fully accredited online
university. Note that this URL has a port number (8001) specified by the developers of this
page. The standard port number for HTTP access is 80; so, when a port not equal to 80 is set
for HTTP access, a user should use it in the URL. If the user leaves off the port number, the
following error message will be generated:
`
`Requested document (URL http://uu-gna.mit.edu/uu-gna/index.html) could not be
`accessed. The information server either is not accessible or is refusing to serve
`the document to you.
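
The rule that a missing port means "use the standard one" can be sketched as follows. This is a minimal illustration in Python: the table DEFAULT_PORTS and the function effective_port are names invented here, and the table lists only the well-known default ports for a few of the schemes mentioned in this chapter.

```python
from urllib.parse import urlsplit

# Well-known default ports for a few schemes (an illustrative subset).
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70, "telnet": 23, "nntp": 119}


def effective_port(url):
    """Return the port a client would contact: the explicit one, else the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)  # None for schemes not in the table


if __name__ == "__main__":
    print(effective_port("http://uu-gna.mit.edu:8001/uu-gna/index.html"))
    print(effective_port("http://uu-gna.mit.edu/uu-gna/index.html"))
```

With the port left off, the client contacts port 80, where this particular server is not listening, which is why the request fails with the message shown above.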
`
`Page 18 of 37
`
`
`
`
`All URLs, however, share the same purposes. When used in a Web document, a URL refers to
`a resource in hypertext anchors displayed by Web browsers. When "opened" by a user in a
`Web browser, a URL causes the resource to which it refers to be retrieved across the network
`and displayed in the Web browser. In the future, other forms of addressing will play a role on
the Web (see http://info.cern.ch/hypertext/WWW/Addressing/Addressing.html).
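
To make the first purpose concrete, the sketch below collects the URLs that a Web document refers to in its hypertext anchors. It is an illustration only, written in Python with the standard library's html.parser module; the class name AnchorCollector, the function name anchor_urls, and the sample HTML fragment are assumptions made for this example.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the URL given in the HREF attribute of every anchor (A) element."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)


def anchor_urls(html_text):
    """Return the URLs referred to by the anchors in html_text, in document order."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return collector.urls


if __name__ == "__main__":
    sample = '<P>Search the <A HREF="telnet://locis.loc.gov">catalog</A>.</P>'
    print(anchor_urls(sample))
```

Anchors that only name a place in a page (a NAME attribute without HREF) are skipped, because they do not refer to another resource.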
`
`WEB HYPERTEXT TERMINOLOGY
`
Although the concept of hypertext and its actual use in computer systems have been
around a long time, terminology for Web-related hypertext elements is evolving, both
in formal definitions and informal usage. The following terms are often used in talking
about Web-based hypertext:
Page Refers to a single sheet of hypertext (a single file of HTML).
Home page Refers to a designated entry point for access to a local web. Also refers to
a page that a person defines as his or her principal page, often containing personal or
professional information.
Hotspot The region of displayed hypertext that, when selected, links the user to
another point in the hypertext or another resource.
web (lowercase w) A set of hypertext pages considered a single work, often located on
a single server. In popular usage, it is synonymous with home page.
Web (uppercase W) The set of hypertext on Web servers worldwide; in a broader
sense, all information available through a Web browser interface.
`
`KEY RESOURCES
`
For finding out more about URLs, see the following:
"Uniform Resource Locators." http://www.w3.org/hypertext/WWW/Addressing/URL/Overview.html
`Theise, Eric S. (1994 January 7). "Curling Up to Universal Resource Locators."
`gopher://gopher.well.sf.ca.us/00/matrix/internet/curling.up.02
`
`Information Spaces
`URLs "point into" information spaces on the Web based on the information protocol used.
`For example, all FTP URLs can be considered to exist in FTP space, the set of all servers
`publicly available for anonymous FTP. This space is just one region of the Internet's resources,
`but represents a vast repository of knowledge to which the Web can connect. Not only does a
`URL identify the protocol used for the information, but a URL also often identifies the type of
`
`Page 19 of 37
`
`
`
Principles of Gateway Programming

IN THIS CHAPTER

■ Transmission Control Protocol-Internet Protocol (TCP-IP) 374
■ Why Do We Need HTTP? 376
■ A Closer Look at the Hypertext Transport Protocol (HTTP) 377
■ What Is the Common Gateway Interface? 382
■ The Flow of Data Using the Common Gateway Interface 382
■ A Brief Introduction to Data Passing and Methods 385
■ CGI: An Expanding Horizon 388
■ Typical Hardware and Server Software Platforms 388
`
`Page 20 of 37
`
`
`
`Gateway Programming
`Part IV
`
In this chapter, I start with principles, including a brief description of the Internet protocols
that enable the World Wide Web in general and gateway programming in particular: Transmission
Control Protocol-Internet Protocol (TCP-IP) and the Hypertext Transport Protocol
(HTTP).
The Web can be thought of as a distributed information system. It is capable of supporting,
seamlessly and globally, rapid and efficient multimedia information transfer between information
content sites ("servers") and information content requesters ("clients"). The servers are
distributed in the truest sense of the word because there is no geographic constraint whatsoever
on their location. The reader should pay particular attention to three critical properties of
HTTP: its statelessness, its built-in mechanisms for an arbitrarily rich set of data representations
(that is, its extensibility), and its use of the connectionless TCP-IP backbone for data
communication.
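
Statelessness means that the server keeps no memory of earlier requests, so every request has to say everything the server needs. The sketch below, in Python, builds the text of a minimal HTTP/1.0 GET request from a URL. It is an illustration under stated assumptions rather than code from this book: the function name build_get_request and the host www.example.org are hypothetical, and real clients normally add header lines that are omitted here.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a minimal HTTP/1.0 GET request for an http URL."""
    parts = urlsplit(url)
    target = parts.path or "/"          # an empty path means the server's root document
    if parts.query:
        target += "?" + parts.query     # data passed along with the request
    # Request line, then a blank line that ends the (empty) header section.
    return f"GET {target} HTTP/1.0\r\n\r\n"


if __name__ == "__main__":
    first = build_get_request("http://www.example.org/index.html")
    second = build_get_request("http://www.example.org/index.html")
    # The two requests are identical: nothing in the second one refers to the first.
    print(repr(first))
    print(first == second)
```

Because each request is complete in itself, a gateway program that needs to remember something between requests has to carry that information in the request, for example in the URL or in form data.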
The chapter then moves on to the Common Gateway Interface (CGI). Important fundamental
terminology is introduced, such as the "methods" that HTTP supports. The advantages
that the CGI environment affords both information requesters and information providers are
discussed and illustrated with short Perl programs.
Finally, typical hardware and software choices for Web sites are reviewed and the stage is set
for the examples that I present in Chapters 20 to 25.
`
`Transmission Control Protocol-Internet
`Protocol (TCP-IP)
`
It's not necessary to be a "propeller-head" (although it helps!) to grasp the essentials of TCP-IP.
From the standpoint of the Web developer, here's what you really have to know:
■ TCP guarantees end-to-end transmission of data from the Internet sender to the
Internet recipient. Big data streams are broken up into smaller "packets" and reassembled
when they arrive at the recipient's site. Mercifully, this breakdown and
reassembly are transparent to Internet users.
■ IP gives you the familiar addressing scheme of four numbers, separated by periods. For
example, the NYU EDGAR development site has an IP address of 128.122.197.196. If
the user always had to type in these numbers to invoke an Internet service, the world
would be a gloomy place, but of course the Internet provides Domain Name Service
(DNS), and so the EDGAR machine has a friendlier name, edgar.stern.nyu.edu.
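■ TCP-IP is a connectionless protocol in the sense that IP routes each packet independently
(TCP layers a reliable connection on top of it). This means that the route of data from the sender
to the recipient is not predetermined. Along the way, the packets of data may well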
`encounter numerous routing machines that use algorithmic methods for determining
`the next "packet hop">