Copyright 1996-1998 by James Mohr. All rights reserved. Used by permission of the author.
Be sure to visit Jim's great Linux Tutorial web site at http://www.linux-tutorial.info/
In this chapter we're going to talk about (as the title suggests) building your Internet server. Although you could build the server without actually connecting it to the Internet, I am going to make the assumption that you will want to share it with the rest of the world.
In the first part, we are going to talk a little bit about actually getting hooked up. Next, we're going to go through the basics of putting together a useful (but maybe not perfect) Internet site.
Keep in mind that you don’t really have to connect to the Internet to build an “Internet” server. You could use the same basic features to create an Intranet server, in other words, a server for your own internal network. This would allow you to centralize information, while still making it available to the entire company.
So what is the Web? Well, as I just mentioned, it is a network of machines. Not all machines on the Internet are part of the Web, but we can safely say that all machines on the Web are part of the Internet. The Web is the shortened version of World Wide Web, and as its name implies, it connects machines all over the world.
Created in 1989 at the internationally renowned CERN research lab, the Web was originally begun as a means of linking physicists from all over the world. Because it is easy to use and integrate into an existing network, the Web has grown to a community of tens of thousands of sites with millions of users accessing it. With the integration of Web access software, on-line services have opened the Web up to millions of people who couldn't have used it before.
What the Web really is, is a vast network of inter-linked documents, or resources. These resources may be pure text, but can include images, sound and even videos. The links between resources are made through the use of the concept of hypertext. Now, hypertext is not something new. It has been used for years in on-line help systems, for example, like those in MS-Windows' programs. Certain words or phrases are presented in a different format (often a different color or perhaps underlined). These words or phrases are linked to other resources. When we click on them, the resource that is linked is called up.
Resources are loaded from their source by means of the hypertext transfer protocol, http. In principle, this is very much like ftp, in that resources are files that are transferred to the requesting site. It is then up to the requesting application to make use of that resource, such as displaying an image or playing an animation. In many cases, files are actually retrieved using ftp and the application simply saves the file on the local machine.
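To make the idea of a request a little more concrete, here is a minimal sketch in Python of the text a browser might send to ask a server for a resource. It assumes the simple HTTP/1.0 request layout (a request line, headers, then a blank line). The function name build_get_request and the path /index.html are made up for illustration, and a real browser sends more headers than this.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return the text of a simple HTTP/1.0 GET request for the given URL."""
    parts = urlsplit(url)
    # An empty path means the server's top-level document.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        "GET " + path + " HTTP/1.0",  # request line: method, resource, protocol version
        "Host: " + parts.netloc,      # the machine the resource is requested from
        "",                           # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # /index.html is a hypothetical document name.
    print(build_get_request("http://www.jimmo.com/index.html"))
```

The server answers with a status line, its own headers and then the file itself, which the browser displays or saves.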
The application that is used to access the Web is called a Web browser or simply browser. The first commonly used browser was Mosaic from the National Center for Supercomputing Applications (NCSA). Currently Netscape is the most commonly used browser, with the Internet Explorer from Microsoft a distant second. I have worked primarily with Mosaic and Netscape on my SCO machine.
Information is stored on the Web in various formats. This can be text, images, even executable programs. In general, each piece of information is referred to as a document, which may be loaded individually into your browser or together with other documents. Generally, the term used to refer to what is displayed by the browser is a page. The Web keeps all of these documents and pages separate by using a unique identifier called a Uniform Resource Locator or URL.
There are several methods by which documents are transferred across the Internet. These methods, or protocols, become part of the URL. URLs are composed of the protocol and the location of the document. You may find that the URL contains the name of the machine that the document is on, followed by the path to the document. Other times, there is just the path. This means that the new document is on the same machine as the current one. In a generalized format a URL looks like this:
An example of a URL using HTTP might look like this:
If a page that is on the machine www.jimmo.com were to already be loaded, a link could reference the same page like this:
So just what is a link? A link is a connection to another document. You can identify links on the page in that they are either underlined or have a different color than most of the text (or sometimes both). By clicking on the link, your Web browser loads the new document. Also, images can be links and you will see them surrounded by a colored border. Note however, that I have seen many pages with images that are surrounded by a colored border that are not links.
To be honest, it is not quite true that a link will bring you to a new document. Links can point to specific places within the same document. This is useful if the document is fairly long. A table of contents can be created at the beginning with links to the various sections. This is a lot quicker than having to scroll through the entire document.
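The pieces of a URL, and the way a link with only a path or only a place within the document is resolved against the current page, can be sketched with Python's standard urllib.parse module. The machine name comes from the text above. The paths and the section name are hypothetical and are not the examples from the original text.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical document on the machine mentioned in the text.
current_page = "http://www.jimmo.com/docs/index.html"

parts = urlsplit(current_page)
protocol = parts.scheme   # the protocol used to fetch the document
machine = parts.netloc    # the machine the document is on
path = parts.path         # the path to the document on that machine

# A link containing only a path refers to the same machine as the current page.
same_machine = urljoin(current_page, "chapter2.html")

# A link to a place within the same document uses a name after "#".
same_document = urljoin(current_page, "#section3")

if __name__ == "__main__":
    print(protocol, machine, path)
    print(same_machine)
    print(same_document)
```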
Often, while browsing the Web, you may encounter a page that allows you to input some information, press a button and then get some kind of response back. These pages are called forms. The most quickly accessible form (at least that I can think of) would be a Web search engine like Yahoo (www.yahoo.com). Here you input words or phrases and when you click on the accompanying button, something behind the scenes displays a list of all the websites and references it can find that match what you input. Another example of a form is the page on the SCO Web site where you input all of your information to register before you installed the CD.
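One common way a browser hands form input to the "something behind the scenes" is to append the field names and values to the URL as a query string. The following Python sketch shows that encoding and how the receiving program could decode it again. The server name search.example.com and the field name "words" are invented for illustration; real search engines use their own names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base_url, fields):
    """Append form fields to a URL as an encoded query string."""
    return base_url + "?" + urlencode(fields)


def decode_query(url):
    """Recover the form fields from the query part of a URL."""
    return parse_qs(urlsplit(url).query)


# Hypothetical search form with one text field named "words".
url = build_search_url("http://search.example.com/find", {"words": "linux tutorial"})

if __name__ == "__main__":
    print(url)
    print(decode_query(url))
```

Note that the space in the input is encoded, since a URL cannot contain a literal space.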
There are two ways in which you can get connected to the Internet. The first is by connecting yourself directly. You become a point-of-presence on the Internet. Everyone else can reach at least one of your machines. The alternative is to have your site sit on someone else's machine. You create the pages, but someone else manages the connections. Each has its own advantages and disadvantages, which we will get into as we move along.
An Internet Service Provider (ISP) provides Internet services (makes sense). Up until recently this meant providing you with a connection to the Internet, either through a leased line or a dial-up connection.
When you have a leased line, you are normally on the Internet 24 hours a day. Visitors (i.e., potential customers) can reach you any time of the day or night. In fact, they might be connecting to you in the middle of the night your time, but it is the middle of the day their time.
The leased line usually requires a connection through your telephone company, as they are the only ones with the network of connections all over. What it costs depends on the area and the kind of connection you want. Because these vary so much, and because the number of ISPs grows every day, it doesn't make sense to talk about specific prices or the connections themselves here.
When you get a dial-up connection, you have to make the connection to your ISP every time you want to connect to the Internet. Your system may be configured so that every time you send a mail message to the Internet it makes a connection, or it may be configured to only make the connections once an hour or only when you tell it.
If you are going to be providing services on the Internet, whether they are sales or just information, then you should really consider getting a leased line. In today's market, customers will not accept a CLOSED sign. They expect to reach your site any time of the day or night.
If you think that the costs are prohibitive, compare the amount you would spend on some other form of advertising against what you would spend for the leased line. For that price, it is worth it to get a commercial that runs 24 hours a day and is more entertaining and more valuable than anything you have ever seen on TV or in a magazine.
One of the major advantages of this kind of system is the Internet presence. You are accessible through the machine www.your_company.com. No other company has this address. It belongs to you. This is often easier to remember than a phone number as both the www and com parts remain constant. Just the company portion of the domain name needs to be remembered. Additionally, there is a certain amount of prestige in having your own domain as compared to just having a page within someone else’s domain.
If you only want to be a passive participant on the Internet, then getting a leased line is not the best thing. With a passive connection, you get the information you want, but normally don't provide anything yourself. If you are constantly on the Internet looking for information, then maybe it would be worth getting a leased line. Otherwise the costs would be prohibitive.
The kind of physical connection you do decide to get (e.g., modem, ISDN) will depend on your needs and the amount of money you want to spend. To make a determination of what kind of connection to get, take a look at the section on networking.
I would recommend that you consider what kind of connection you need before you start looking for an ISP. Although you may really need something else, make a “guess” about what you think you will need. If you wait until you talk with an ISP, they may say that you need the service that they provide.
For example, they may only offer dial-up connections and therefore “recommend” that you set up your pages on their server. Their reason could be anything from security to cost. However, if you feel that you should have the site on your own server, then that is the service that you should get.
You need to consider both the cost and the bandwidth. You might think that you can only afford an ISDN connection, but what will the effects be if your customers have to wait too long to access the information? This is almost as bad as not having a site at all.
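A little arithmetic shows what "waiting too long" can mean. The Python sketch below estimates the ideal transfer time for one page over a line and what happens when several visitors share that line. The page size of 100 KB and the number of visitors are assumptions chosen only for illustration, and protocol overhead is ignored, so real times will be longer.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_second


PAGE_BYTES = 100 * 1024  # assumed page size: 100 KB of text and images
MODEM_BPS = 28_800       # a 28.8 kbit/s modem
ISDN_BPS = 64_000        # a single 64 kbit/s ISDN channel


def shared_seconds(size_bytes, bits_per_second, visitors):
    """Time per page when the line is split evenly between visitors."""
    return transfer_seconds(size_bytes, bits_per_second / visitors)


if __name__ == "__main__":
    print("modem, one visitor: %.1f s" % transfer_seconds(PAGE_BYTES, MODEM_BPS))
    print("ISDN, one visitor:  %.1f s" % transfer_seconds(PAGE_BYTES, ISDN_BPS))
    print("ISDN, 5 visitors:   %.1f s" % shared_seconds(PAGE_BYTES, ISDN_BPS, 5))
```

Under these assumptions a single visitor waits roughly 13 seconds on the ISDN line, but five simultaneous visitors each wait about a minute.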
Instead of their own site, some businesses set up pages through an ISP. The Internet services that are being offered include providing you space on their Internet server for your pages. They also then provide the connectivity to the Internet.
Some ISPs have many companies that they provide space for. As a result, their Web site becomes a sort of "Internet Mall" where you can buy anything to suit your needs.
Another service that the ISP might provide is managing your site themselves. Your Internet server may be physically located at their site. You may provide the pages, but the ISP provides the hardware, the Internet connection and the administration. As a result, this might be substantially less expensive than administering it yourself.
Other services that ISPs provide include:
General advice on using the WWW as a marketing tool
Assistance in Web services purchases
Consultants who may not offer these services themselves, but advise you about them
My advice in any event is to see for yourself. Look at that company's Web site. If they have a bad Web site that doesn't look professional or is difficult to navigate, you can't expect them to put something worthwhile together for you.
Putting “your” site on someone else’s machine has a couple of advantages. First, it can be cheaper than a leased line, but still provide your customers with 24-hour access. Second, since you are not connected directly to the Internet there is no danger that someone will break into your system.
The disadvantage is the prestige. You are not a presence on the Internet yourself. You are accessed through a URL like:
In my mind this diminishes your value. You are secondary to the ISP. If you have your own Web site, you are an equal to companies like IBM and Microsoft.
One important thing to consider is the fact that the Internet is already the marketing tool. No longer do you see 800 numbers in advertisements, but rather the company's URL.
When you do decide upon which direction to head, there are some things you should consider. First, does the ISP provide any kind of initial and ongoing market research/assessment? This is important in determining what to offer on your site and whether you are achieving your goals. Part of this is also tracking and analysis of site traffic. For instance, how many people are accessing your site, where they are located and what they are interested in.
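As a rough idea of what such traffic analysis involves, here is a minimal Python sketch that counts requests per visiting machine and per page. It assumes the Web server writes its access log in the widely used "common log format". The sample lines are invented for illustration and are not real traffic, and the names summarize and SAMPLE are hypothetical.

```python
import re
from collections import Counter

# Assumed layout: host ident user [timestamp] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+)[^"]*" (\d{3}) \S+$')


def summarize(lines):
    """Count requests per visiting host and per requested page."""
    hosts, pages = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed format
        host, page, _status = match.groups()
        hosts[host] += 1
        pages[page] += 1
    return hosts, pages


# Invented sample entries, for illustration only.
SAMPLE = [
    'pc1.example.com - - [10/Oct/1997:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'pc2.example.org - - [10/Oct/1997:13:56:01 -0700] "GET /products.html HTTP/1.0" 200 5120',
    'pc1.example.com - - [10/Oct/1997:13:57:12 -0700] "GET /products.html HTTP/1.0" 200 5120',
]

if __name__ == "__main__":
    hosts, pages = summarize(SAMPLE)
    print("Visitors:", hosts.most_common())
    print("Pages:   ", pages.most_common())
```

The host names give a hint of where visitors come from, and the page counts show what they are interested in.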
Next is the integration of the Web site information into your overall Internet presence, as well as your corporate image. Do they understand your market well enough to integrate the pages into your market strategy? These cannot appear like disjointed or confused efforts, as your company will then appear confused.
Does the company provide services to develop your pages? Some will simply put your pages on their site, but it is up to you to make sure they work and are connected. Others will create and manage the pages for you.
Will they register your site for you? That is, will they get you an official IP address, or at least an official domain name for you so that you can connect to the Internet? Will they register your new site with the various search engines on the net? This is important so that people know that you are out there. What’s the point of having the greatest Web site if no one knows you are there? There are several places that you can get your Web site registered and the ISP should know the correct procedures.
Another question is whether they will be a mail server for you. If your Web site is actually on a machine from the ISP, this becomes an issue. They might just provide the space for the Web pages and the connection to the Internet and nothing more. This means that you cannot get email through them. On the other extreme, my ISP will create as many users for me as I need and if I want, will even configure sendmail so that each user’s mail is forwarded someplace else. Get the details of what they will provide.
Of course, a very important aspect of all of this is what it will cost. This will obviously depend on what the ISP is providing. If you are doing the management of your site, then you may only need to pay for the connection charges. Some will provide DNS registration services as well as have pointers to your machine from their DNS server. If the pages are on the ISP’s machine, you may need to pay a monthly “rental” fee for the hard disk space. Although some sites will actually charge a fee for each megabyte transferred, this appears to be generally a European phenomenon.
If you have the ISP do more of the work, you will probably be paying for the creation of the pages themselves. This can be on a per-page or per-job basis. If you want your site to be completely interactive, such as with forms, you can expect the fees to be substantially higher.
When deciding on the ISP, you need to consider the hardware they have. If you are just using them as your connection to the Internet, then you are primarily concerned with the type of connection to the Internet that they have. It is unlikely that the ISP has a dial-up line. However, it would not be surprising if they had an ISDN. While this is sufficient for low traffic sites, it can get overburdened quickly.
What kind of security does the ISP have? This means preventing people from getting unauthorized access to their site as well as the protection of the site in case of an emergency. Are their machines connected to an uninterruptible power supply? What kind of backup and restore procedures do they have? Is there a guaranteed recovery time?
If you have any version of SCO OpenServer that has the Graphical Environment then you automatically have a web browser: Mosaic. Mosaic actually comes in two forms on your system, the first being more commonly known as SCOHelp. From the desktop, this is started by clicking on the SCOHelp icon (the life preserver). Although this has the same basic functionality as the Web browser version of Mosaic, there are a few minor differences. Since the subject of this section is “Mosaic and the Web”, we’re going to talk about the Web browser. However, most of the functionality is the same for SCOHelp.
Today, it seems almost unheard of that someone has not surfed the web a time or two. As a result, most people are familiar with the functioning of a Web browser. However, to make sure that we all have a common foundation on which we will build later, I am going to cover some of the basic functionality of Web browsers from the perspective of Mosaic.
Mosaic is started either by double clicking on the Mosaic icon in the Accessories folder or entering /usr/bin/X11/Mosaic from the command line. Be careful with that command line. The ‘M’ in “Mosaic” is capital. Don’t let it give you the same headache it gave me on a few occasions.
Figure 0-1 NCSA Mosaic
When it starts you will see an X client similar to the one in Figure 0-1. Depending on what version of SCO OpenServer you are running, the start-up page may look different. The start-up page is your “home page”. This is a resource of the Mosaic client, so it needs to be defined as such. The default is stored in /usr/lib/X11/app-defaults/Mosaic and looks like this:
At the top of the window are several menus. Just below that are two text fields that provide information about the page currently being displayed. The top one is the title of the page. This is taken from the document itself and is not the filename of the document. The title must be specified by the author when the page is created.
The next line is the URL of the page. As mentioned earlier, a URL is a Uniform Resource Locator and is a means of uniquely identifying a document anywhere on the Internet. This normally contains the protocol used to get the document, where the document is located and the file name of the document.
As you browse the Web you move from page to page. The browser normally keeps track of what pages you have visited and in what order. It is therefore possible to retrace your steps in either direction. That is, from one page move back to previously read documents and then forward again.
At the bottom of the screen are several buttons that represent the more common functions. These functions can also be accessed via the menus. From left to right the buttons are:
Back - Return to the previously loaded document.
Forward - Move forward one document (only applicable if you have moved back one or more documents)
Home - Load the document marked as your “home” document.
Reload - Reload the current document.
Open - Open/load a specific URL.
Save As - Save the current document onto the local machine under the name you specify.
Clone - Open a new Mosaic window containing the current document.
New Window - Open a new Mosaic window containing the home document
Depending on which document is currently loaded in the browser, the Forward button or the Back button may be disabled.
On the right of the window is a vertical scroll bar that you can use if the document is too large to fit in the window. The default width of the window is such that there is no horizontal scroll bar. However, you can get one to appear if you narrow the width of the window.
If you have some places that you visit often, there is no need to trace through the route that brought you there or remember the URL. Instead, Mosaic will remember this for you in the form of a Hot List. A Hot List is nothing more than a list of hot places that you’ve visited. The “Add Current to Hotlist” entry under the Navigate menu will add entries to the Hot List. The “Hotlist” entry will show you your current hotlist and allow you to jump directly to any of the listed documents.
The Hot List is stored in $HOME/.mosaic-hotlist-default. This is a simple text file that you could edit by hand. Each entry is composed of the URL, the date the entry was added and the title of the document. When you call up the list of documents in your hotlist, it is this title that you will see.
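Because the hotlist is plain text, it is easy to read with a small script. The Python sketch below is only an illustration: it assumes a layout of two header lines followed by two lines per entry (the URL and date on one line, the title on the next). That layout, the header text and the sample entries are assumptions, so look at your own file before relying on it.

```python
def parse_hotlist(text, header_lines=2):
    """Parse hotlist text into a list of (url, date_added, title) tuples.

    Assumed layout (check your own file): header_lines lines of header,
    then two lines per entry: "URL date" followed by the document title.
    """
    lines = text.splitlines()[header_lines:]
    entries = []
    # Walk the remaining lines two at a time.
    for first, title in zip(lines[0::2], lines[1::2]):
        url, _, date_added = first.partition(" ")
        entries.append((url, date_added, title))
    return entries


# Invented sample contents, for illustration only.
SAMPLE = """ncsa-xmosaic-hotlist-format-1
Default
http://www.sco.com/ Mon Jun 16 10:15:00 1997
SCO Home Page
http://www.jimmo.com/ Tue Jun 17 09:30:00 1997
Jim's Home Page
"""

if __name__ == "__main__":
    for url, date_added, title in parse_hotlist(SAMPLE):
        print("%-20s %-26s %s" % (title, date_added, url))
```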
Also under the Navigation menu is the entry “Window History”. This is a list of recently visited files. Unlike the hotlist, these entries are not permanent, and the entries are removed when you leave the browser.
In the File menu are several functions related to documents in general. These are:
New Window - Open a new Mosaic window containing the home document
Clone - Open a new Mosaic window containing the current document.
Open URL - Open/load a specific URL.
Open Local - Open a file on the local system.
Reload Current - Reload the current document.
Reload Images - Reloads any images on the current page.
Refresh - Redraw the current page.
Save As - Save the current document onto the local machine under the name you specify.
The Options menu gives you the ability (albeit limited) to configure Mosaic. These are:
Fancy Selections - Toggles whether the X cut-and-paste mechanism will retain the formatting (as much as it can)
Load to Local Disk - When selected, you are prompted to save the file to the disk each time a link is selected.
Delay Loading Images - When selected, inline images are not loaded until requested. This speeds up the transfer of the text portion considerably.
Load Images in Current - Load the images in the current document. Disabled unless “Delay Loading Images” is selected.
Reload Config Files - Reloads the map/configuration files. The files contain information such as which applications to start for which type of document.
Flush Image Cache - Clear all cached images. All images are reloaded from the server. Very useful when images get scrambled in transmission
Clear Global History File - Clears the history file for all sessions. (The default file is .mosaic-global-history.)
Fonts - Determines which font to use.
Anchor Underlines - Determines the type of underline to use for links/anchors.
The Navigation menu, which is used to move around the Web, contains the following:
Back - Like the button, this brings you back to the previous page.
Forward - Also like the button, this moves you forward one document (only applicable if you have moved back one or more documents)
Home Document - Loads your designated home document.
Window History - Displays the history of which sites you have visited since you started this session.
Hotlist - Displays your hotlist.
Add Current to Hotlist - Adds the current document to your hotlist.
Internet Starting Points - Loads a document containing several sites that are good starting points for surfing the web.
Internet Resource Meta-Index - Document with various Internet resources and indices.
The Annotate menu allows you to add comments to a document:
Annotate - Opens the annotate window, where you can save comments about the document.
Edit This Annotation - Allows you to edit any existing annotation for the document.
Delete This Annotation - Delete the annotation being viewed.
Netscape Navigator is currently the most widespread Web browser. Often people will confuse the product with the company and say that the browser they are using is Netscape rather than that it is from Netscape. Netscape is a company that produces a wide range of networking products, from the Web browser Netscape Navigator to Web servers and beyond. However, this term has become so ingrained in the Web culture that it is accepted. Besides that, Netscape Navigator was the first product Netscape produced and it is where they made it big.
Comparing the features of Netscape Navigator to the version of Mosaic you have on the CD is like comparing the Space Shuttle to a pair of roller skates. If you are planning to do anything more than just occasional Web surfing, then get Netscape Navigator. (From here on we’ll just call it Netscape to make my typing easier.)
If you have the SCO FastTrack Server, then you will have a copy of the standard Netscape Navigator. Unfortunately, if you just have the copy of the FreeSCO on the CD, then you will have to get a copy of it yourself from the SCO Web site (www.sco.com). You will find this (as well as Netscape Navigator Gold and several Netscape server products) under the SCO Internet Family of products. Note that you will have to register first. In some cases, you are sent licensing information via email, without which the software will not work.
In the next few sections I will give you a quick introduction to Netscape. If you have downloaded it from the SCO Web site, you will not be provided a manual. In addition, clicking on many of the entries in the Help menu will send you to the Netscape Web site. So rather than having you wade through those pages, I will give you a gentle push to get you going. Because of the wide range of functions that Netscape has and the fact that this is not a book about Netscape, I am going to limit myself to some of the more significant features. If you need more information later, the URL is help.netscape.com.
Figure 0-2 Netscape Navigator
Aside from the functionality, the appearance of Netscape is different from Mosaic (see Figure 0-2). However, the basics are the same as any web browser. Some of the basic functions are:
Title bar, showing you the title of the current page
Pull down menus
A toolbar with the more commonly used functions
A “directory” toolbar that brings you to some pre-defined pages.
Colorbar (with a doorkey icon) indicating the document is secure (blue) or insecure (gray).
Status message field, showing you information about the page or the current transfer.
Progress bar indicating the progress of the current transfer
The Netscape window is divided into several functional areas. At the top is the page title. As with Mosaic, the page title must be specified by the author when the page is created.
The input field labeled Netsite/Location/Go to (depending on what page is currently loaded) shows the location of the current page and can be used to input a specific URL. The label will say “Netsite” after you load a page from a Netscape server. If the page is from a non-Netscape server, this will say “Location”. If you input text into the field, the label will change to “Go to”.
The “Contents area” is the main portion of the window. This is where the current document/page that you just loaded is displayed. This can be an HTML page, image or anything else that Netscape is capable of displaying.
The Background generally has two meanings. In one case, this is the color that you define for the background of the content area. This is only applicable if the image you have loaded does not fill the entire content area. In other cases, the background is what has been specified within the page and it will take up the entire content area automatically.
The status message area is at the bottom of the screen and contains the name of the page or the status of the transfer. If you position the cursor over a link, the status message will display the URL to that page. If the image has an “active area” (it is an image map), positioning the cursor over the active area will show a description for that active area.
Security may or may not be an issue to you depending on what kind of information you are transferring. The document being transferred has one of three security states: secure, insecure or mixed. At the bottom left of the Netscape window is a doorkey, which indicates the state of the document. If the key is broken on a gray background, this indicates an insecure document. If the key is blue, this is a secure document. In addition, the number of teeth on the key will vary depending on the “grade” of encryption. Two teeth means high-grade encryption and one tooth means low-grade.
The functions that the toolbar provides are essentially the same as for Mosaic:
Back - Return to the previously loaded document.
Forward - Move forward one document (only applicable if you have moved back one or more documents)
Home - Load the document marked as your “home” document.
Reload - Reload the current document.
Images - Loads images onto pages (only active when “Auto Load Images” is not checked.)
Open - Open/load a specific URL.
Print - Print the current document
Find - Search on the current page for the specified text
Stop - Stop loading the current document
If you are new to the web, the directory bar is a good place to start. The “What’s New” button takes you to a site at Netscape that lists several new sites on the Web. For example, as I am writing this two sites are “Era of the Spanish Galleons” and “World of Gas Station Memorabilia”.
The “What’s Cool” page is also at Netscape and lists a wide range of sites that are, for lack of a better word, cool. Two sites (as I write this) are Shareware.com, which is a searchable list of shareware, and the CIA. The Handbook button takes you to an online, HTML copy of the handbook for the specific product you have.
If you are looking for a page with a specific kind of information, choose the “Net Search”. This page has links to the more prominent Web search engines such as Yahoo and Alta Vista. You can even customize the page so that you get the appearance you want the next time you visit the page.
The “People” button links to a kind of Internet “White Pages”. What exactly is behind it, I am not sure. However, it seems to be looking through the white pages of telephone books all across the country. I found an address for myself that was four years old, so it is not as up-to-date as it could be. However, I did find an old friend that I hadn’t talked to in over 10 years, so it’s worth a look.
The “Software” button brings you to Netscape’s download area for their various products. This includes add-ins and a lot of other things.
Netscape provides the same basic functions as Mosaic, plus a lot more. Books have been written about the exciting things that you can do with it. Rather than turning this book into another one on Netscape, we’ll just go over enough things to get you on your way.
As with other browsers, clicking on the hyperlinks on a page is one way of getting around. This applies to Mosaic as well as Netscape. If you know the URL that you want, you don’t have to go clicking around the links until you find what you want. Like Mosaic, Netscape allows you to input a specific URL. In fact, both allow you to open up files on your local system.
With Netscape, you have three choices. First, there is the Open button on the toolbar. Next, in the File menu, there is an entry to “Open a location” where you input a URL. There is also an entry to “Open a File” which will open a file on your local hard disk. Finally, there is the input field labeled “Location”. Here you can input the URL as well. This is different from Mosaic.
Netscape also has a menu labeled “Go”. As you might guess, this is used to go to specific pages. Here you can go backward, forward, or home as well as stop loading the current page. In addition, there is a list of the most recently visited pages, so you don’t have to do a lot of back-tracking. Keep in mind that this list only contains a single branch of your search. If you back up and go down a different branch, the previous entries will be lost.
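The "single branch" behavior is easier to see in a small model. The following Python sketch is not how Netscape is actually implemented. It is a minimal illustration, with the hypothetical class name History, of a history list that drops the entries ahead of you as soon as you follow a new link after going back.

```python
class History:
    """Linear page history with a pointer to the current page."""

    def __init__(self):
        self.pages = []
        self.position = -1  # index of the current page; -1 means nothing loaded

    def current(self):
        return self.pages[self.position] if self.pages else None

    def visit(self, url):
        # Following a new link discards everything "ahead" of the current page.
        del self.pages[self.position + 1:]
        self.pages.append(url)
        self.position += 1

    def can_go_back(self):
        return self.position > 0

    def can_go_forward(self):
        return self.position < len(self.pages) - 1

    def back(self):
        if self.can_go_back():
            self.position -= 1
        return self.current()

    def forward(self):
        if self.can_go_forward():
            self.position += 1
        return self.current()


if __name__ == "__main__":
    history = History()
    for page in ("A", "B", "C"):   # hypothetical page names
        history.visit(page)
    history.back()                 # now on B
    history.back()                 # now on A
    history.visit("D")             # new branch: B and C are lost
    print(history.pages)
```

The can_go_back and can_go_forward checks correspond to the Back and Forward functions being unavailable at either end of the list.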
The longer you surf the net the more often you will return to your favorite sites. After a while, the URLs for these sites are burned into your memory and you can repeat them in your sleep. If not, you need some way to keep track of them. You could write down all of the URLs that you want to keep track of, but if you are like me, you’ll end up with dozens of pieces of paper buried in several piles on your desk. Fortunately, Netscape can do the work for you.
One difference between Netscape and Mosaic is the term used for the list of your favorite sites. As mentioned, the Mosaic term is Hotlist. The Netscape term is for a single entry is “bookmarkÓ. Collectively they are simply called bookmarks.
In the Bookmarks menu are two sections. Add Bookmark will add the current bookmark to your list. Now that bookmark will appear at the bottom of the Bookmark menu, making for quick access. The bottom of the Bookmarks menu is your list of Bookmarks.,
It may be that the information on the page is something that you want to keep for posterity. To save time, most browsers keep a copy of recently accessed pages on the system so that they do not need to copy them from the net each time. This “cache” of files is configurable (at least under Netscape), so you could go into the cache directory and make copies of the files you want. The problem is that Netscape saves the files with unintelligible names, which means you have to go through each file to find the one you are looking for.
Fortunately, things are not all that bad, for Netscape allows you to explicitly save the page that is currently loaded. This is done with the “Save As” entry in the File menu. One interesting thing is that when the page is broken down into multiple frames, you can save each frame individually. The menu entry then changes to “Save Frame As”.
Perhaps you want to print out the page. This is also done from the File menu. Instead of just printing the pages, you have a wide range of options to configure the print job. These include setting the margins, defining the header and footer and printing the pages in reverse order (which has saved me a lot of work). You also have the ability to see the page before it is printed using the “Print Preview” entry.
If you look in the Options menu, you will see that there are quite a number of things that you can configure. Many are simply too complicated to address here. In short, these options are:
General Preferences - General configuration parameters, such as the startup page, colors, and so on (discussed in more detail below).
Mail and News Preferences - Settings applicable to mail and news, including your identity and the servers you want to connect to.
Network Preferences - Settings applicable to the network connection including the proxy server you use. Here, also, you configure where and how large your page cache is.
Security Preferences - Settings applicable to security, such as which events generate alerts.
Toolbars - Several menu entries to select which toolbars are displayed.
Autoload images - Determines whether images are automatically loaded or not.
Document encoding - Specifies the character set in which the incoming page should be interpreted. Useful for many foreign languages.
Perhaps the menu that I use most is General Preferences. This is not to say that I use it often. However, there are quite a number of things that you can configure here. There are several tabs that you can use to select the aspect of Netscape that you wish to configure.
The first is the general appearance. Here, you define things like how the toolbars look, what application (browser, news, mail) is started when Netscape is started, and what the start/home page is.
The Font tab is used to define what fonts are used. Here you define both the fixed-width fonts and the proportional fonts. You have a selection of several fonts, but I have found that only a few combinations really look good. Here, too, you can change the base font size.
The Applications tab is used to define what applications are used. I will be frank when I say that I have never changed these and have never had an occasion to worry about them.
The Helper tab defines what applications are to be started with each type of file. There is a list of MIME (Multipurpose Internet Mail Extensions) types and the corresponding applications. In most cases the default is either “Unknown: Prompt User” or “Netscape”. In the first case, the browser will prompt you for a course of action. This can be anything from starting a specific application to simply saving the document to disk. In the other case, Netscape will load and display the document. This is the case for HTML pages or images such as GIF or JPEG.
The Images tab defines the way images are displayed on the screen. One aspect of this is whether to display images as they are being loaded. If you select “After loading”, Netscape will wait until after the page is completely loaded before displaying the image.
If you are planning to use Netscape as your mailer, then you should also look in the Mail and News menu. Here you define which servers to connect to and some basic information about yourself. Much of this information will be passed along in either the mail or news message.
The Network entry brings up a window with several tabs. The Cache tab allows you to define not only where Netscape keeps its cached files, but also the total space, both in memory and on the hard disk, it should use. The Connections tab basically allows you to limit how much bandwidth to use. Here, you can specify the maximum number of connections as well as the size of the buffer to use.
The Proxies tab is only used if your machine is not connected directly to the Internet. If you go through a machine that acts as a filter between your machine and the Internet, such as a firewall, you can define what host to use for different network protocols, including the specific port to access.
In my opinion, the Protocols tab belongs in the security section. Here you define when Netscape should show an alert, such as when a form is submitted by email or when a remote server is sending you a cookie. (A cookie is a file that is stored on the local machine. It can be read later, such as during a subsequent visit to reset values. It could also be used for things like “shopping carts”, where you order items on several pages and this information is stored in a cookie.) Here, too, you tell Netscape whether to use your email address as your password when connecting via anonymous ftp.
Finally, there is the Languages tab. This allows you to enable or disable Java and JavaScript.
The next menu entry is for security. The first tab (General) again defines when alerts are sent. In this case, the secure options deal specifically with secure documents. Here, too, is where you define the encryption for the secure socket layer (SSL).
As one might guess, the Password tab allows you to set a password. This is useful if other people could have access to your computer and you wish to keep them from accessing Netscape.
Certificates are a means of identification across the Internet. Personal certificates allow you to identify yourself to others. For example, if you submit information in a form, the certificate identifies you to the remote site. Servers can even request specific certificates, which they could issue you if necessary. This might then only be valid on their site.
Site certificates identify others to you. For example, when you submit information in a form, the site certificate allows you to identify the recipient, so you know that it is going to the right place. For more details on certificates, see the Netscape Doc.
Unlike other browsers, such as Mosaic, mail and news are built into Netscape. Either can be started from the Window menu. By selecting the appropriate entry from the Window menu a new window will open up containing either the Netscape mail reader or news reader. News is a subject for another book, but mail is something I want to touch on briefly.
When you start a new mail window, you get something similar to Figure 0-3. The left hand box is for the mail folders. You can add folders using the “New Folder” entry in the File menu. This allows you to organize your mail. The right hand box is the contents of the current folder. At the bottom is the contents of the message that you have selected.
Figure 0-3 Netscape Mail
The functions in the menus at the top of the window and on the toolbar are fairly straightforward and are basically the same as in any character-based mail reader. In essence, the menus are the same as for the Netscape browser, with slight changes to make it function as a mail reader.
Netscape mail can be used to read both local mail (that is, mail sent to this machine) and mail that you have to go and get (such as from a POP3 server). By default, it doesn’t work for either as you need to configure it first. The only thing I have found necessary to get local mail working is to configure my email address. This is done through the Identity tab in the “Mail and News Preferences” entry in the Options menu. The default email address is set to your user name with no machine name. I have tried to configure it with just the user name and the machine name, but it wants a fully qualified domain name. Once that is entered, you can send mail. However, you can read your mail anyway.
One of the nice things about the Open Server Web server is that it runs pretty much out of the box. It is set up to serve files to SCOHelp, but you can easily change the configuration files to match the more common setups. Then all you need to do is copy your Web pages into the right directory and you’re up and running. If you are not interested in all the fancy features or security that your Web server can provide, then just a few modifications to the default configuration might be enough. However, there are many interesting and useful features that can improve the efficiency and security of your site.
Web resources are provided by Web Servers. A Web Server is simply a machine running the HTTP daemon, scohttpd. Like other network daemons, scohttpd receives requests from a Web client (such as Mosaic or Netscape) and sends it the requested resource.
Like the FTP daemon, scohttpd is a secure means of allowing anonymous access to our system. We can define a root directory, which, like FTP, prevents users from going "above" the defined root directory. Access to files or directories can be defined on a machine-by-machine basis, and we can even provide password control.
When scohttpd starts, it reads its configuration files and begins listening for requests from a document viewer (one that uses the HTTP protocol). By default, the configuration file is /var/scohttp/conf/scohttpd.conf, but it can be changed as you need. When a document is requested, scohttpd checks for the file relative to the DocumentRoot (defined in /var/scohttp/conf/srm.conf). By default this is set to /var/scohttp/htdocs. If we want to refer to documents stored elsewhere, we can use symbolic links to point to places other than the DocumentRoot.
The configuration files are normally located under /var/scohttp, which is also where we find the default directories that the scohttpd daemon accesses. This is the ServerRoot, which we can define in the scohttpd configuration file. Note that the daemon looks for the configuration file relative to this root. If we change it, we have to specify the configuration file when we start scohttpd.
There are two primary options to scohttpd that we can use. The -d option specifies a different ServerRoot directory than the default. The -f option specifies a different server configuration file than the default. The default file is normally the scohttpd.conf file in the conf subdirectory under your server root directory.
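As a hedged illustration of the two options, the sketch below assembles a command line that names both the server root and the configuration file explicitly. The paths are the defaults mentioned above; that the scohttpd binary can be found through your PATH is an assumption on my part. The command is only printed, so nothing is started until you run it yourself.

```shell
# Defaults described in the text; change these for a relocated server.
SERVER_ROOT=/var/scohttp
CONF_FILE="$SERVER_ROOT/conf/scohttpd.conf"

# Assumption: the scohttpd binary is found through PATH.
start_cmd="scohttpd -d $SERVER_ROOT -f $CONF_FILE"

# Print the command for review instead of running it.
echo "$start_cmd"
```

Keeping the two values in variables makes it harder to move the server root and forget to point -f at the matching configuration file.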
The directory conf contains the configuration information for various aspects of the scohttpd daemon. The files here are:
access.conf - Access control file.
scohttpd.conf - Main server configuration file.
mime.types - MIME types description file
srm.conf - Server resource management file.
The general behavior of scohttpd is defined by the scohttpd.conf file. Configuration parameters are set through directives. Each directive has a specific name and is followed by the value for that directive. In a general form, this would look like this:
Directive value
Note that in contrast to many configuration files, there is no equal sign between the directive and its value. Also, some configuration files (which we will get to later) allow multiple values. However, the ones that you find in the scohttpd.conf are just single values.
One of the first things that needs to be done is to decide whether scohttpd runs stand-alone or through inetd. The big difference is one of performance. If we run scohttpd through inetd, each time we make a request inetd needs to start a new process, which means loading the scohttpd daemon binary. If there is already a copy running, the system should be able to copy the pages in memory without having to load the binary from the hard disk.
If scohttpd is running as standalone, a copy is always in memory. When a connection request is made, scohttpd can easily make a copy of itself, without the need to go to the hard disk. Which way you run scohttpd depends on your needs and the traffic on your site. The more traffic, the more it makes sense to have the daemon always in memory.
How we set the server type is done with the ServerType directive. For example, to set it up as standalone, the entry would look like this:
ServerType standalone
By default, scohttpd listens on port 457 if it is run in stand-alone mode. This is the port that has been reserved for use by SCOhelp. Normally, the HTTP daemon is listening on port 80. You can configure the scohttpd.conf file so that it is listening on port 80, but SCOhelp won’t find it correctly. Instead, I suggest that you run two copies as standalone and specify different ports.
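One way to prepare the second copy is to derive a second configuration file from the first and change only the port. The sketch below works on scratch files so that it can be tried safely. The file name scohttpd-www.conf is made up for this example, and Port is the directive name used by NCSA httpd, which I am assuming scohttpd shares.

```shell
work=$(mktemp -d)

# Stand-in for the shipped configuration (the real file has more entries).
cat > "$work/scohttpd.conf" <<'EOF'
ServerType standalone
Port 457
EOF

# Copy it, changing only the port the second server listens on.
sed 's/^Port[[:space:]]*457$/Port 80/' "$work/scohttpd.conf" > "$work/scohttpd-www.conf"

cat "$work/scohttpd-www.conf"
# The second copy would then be started with something like:
#   scohttpd -f /path/to/scohttpd-www.conf
```

If you try this for real, give each copy its own PidFile (and preferably its own log files) so that the two servers do not get in each other's way.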
The next entries to look at are the User and Group entries. These determine under what user and group the server runs when answering requests. This in turn determines what access the server has. The default user is nouser, so this user must have access to the files. The default group is nogroup.
We might want to consider creating a user and a group specifically for our Web server. I find it easier to monitor and control access when we specifically assign access in this way. Since the only place we give the HTTP user access is in the documents directory, there is less risk of giving them more access than necessary. The same applies to the group.
The next directive to look at does not really need to be changed to get the system working. This is the ServerAdmin directive that specifies the email address of where problem reports should go. Often, when the server detects a problem, it will send this address to the browser. This allows our visitors to send us email, whenever problems occur.
I would suggest that we create a user called “webmaster,” which seems to be a convention on other sites. By convention, this user is the contact point for people visiting your site. You then define the ServerAdmin to be webmaster.
Next we have the ServerRoot, from which most path references are based. The scohttpd daemon has a “preconception” of where the server root is, and unless told otherwise it will look in /var/scohttp. As I mentioned earlier, you can pass a configuration file to scohttpd when you start it. So, if the ServerRoot is redefined in the configuration file, all references will be relative to that directory.
You can also specify a particular log file for reporting errors. This is done with the ErrorLog directive. The default points to logs/error_log. Note that this is a relative path. Relative to what? The server root. So, if your server root is /var/scohttp, then this file is /var/scohttp/logs/error_log. Following this is the TransferLog directive. This is log/transfer_log, which keeps a record of transfers from your system.
The PidFile directive gives the path name of the text file containing the scohttpd PID. Rather than searching through the process table directly, this file can be read when signals need to be sent to scohttpd, such as when the system is being shut down.
Lastly is the ServerName directive. This is the hostname that the server will send back to the clients. The most common thing to put in here is www.domain, as www is the most common name for the Web server. This name doesn't have to exist, but it is advisable that even if the machine is not really called www, we have a DNS alias.
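Pulling the directives discussed so far together, a configuration might look like the sketch below. It is written to a scratch file so that it can be inspected without touching a live server. The values are either the defaults named above or placeholders: our.domain, the webmaster address, the PidFile location and the Port directive (borrowed from NCSA httpd) are assumptions of mine, so check each against your own scohttpd.conf.

```shell
sample=$(mktemp)

cat > "$sample" <<'EOF'
ServerType standalone
Port 80
User nouser
Group nogroup
ServerAdmin webmaster@our.domain
ServerRoot /var/scohttp
ErrorLog logs/error_log
PidFile logs/httpd.pid
ServerName www.our.domain
EOF

# Look up the value of a single-valued directive in the sample.
directive() {
    awk -v name="$1" '$1 == name { print $2 }' "$sample"
}

# Relative paths such as ErrorLog are taken relative to ServerRoot.
error_log="$(directive ServerRoot)/$(directive ErrorLog)"
echo "$error_log"
```

The small directive function also shows why the format is easy to work with: the first word on a line is the name and the rest is the value, with no equal sign in between.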
If you make changes to this file, you need to force scohttpd to re-read this file. This can be done by sending signal 1 to the scohttpd process. You can quickly do this with:
kill -1 `cat PidFile`
where PidFile is the scohttpd PidFile that we defined earlier.
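A slightly more defensive version of the same idea checks that the PID file is readable and that the process is still there before sending the signal. This is only a sketch: reload_daemon is a name invented here, and the path you pass must be whatever your PidFile directive points to.

```shell
# reload_daemon PIDFILE: send a hangup signal to the process named in PIDFILE.
reload_daemon() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists
    # and whether we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot signal process $pid" >&2
        return 1
    fi
    kill -HUP "$pid"
}

# Example (path assumed; use the value of your PidFile directive):
#   reload_daemon /var/scohttp/logs/httpd.pid
```

The extra checks matter mostly after a crash, when a stale PID file may name a process that no longer exists.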
If we are setting up our server to allow access to all the documents, then we are pretty much set. However, if we want to restrict access to specific directories, we can do this through the access.conf file.
When we look in the access.conf file, we will see that it is broken down into several sections. Each section refers to a particular directory and is delimited by <Directory> (to indicate the beginning of the section) and </Directory> (to indicate the end of the section). This is very similar to the formatting in HTML. The opening <Directory> directive specifies the directory to which the set of permissions applies. For example, on many systems we would have an entry that looks like this:
<Directory /var/scohttp/cgi-bin>
Options Indexes FollowSymLinks
</Directory>
This section applies to the directory /var/scohttp/cgi-bin. In this case we are saying that from within the /var/scohttp/cgi-bin directory, the Options are Indexes (the user can retrieve indices created by the server) and FollowSymLinks (scohttpd will follow symbolic links).
A sub-section within each <Directory> section is the <Limit> directive, which essentially describes the limit of the access. It needs a specific access method after it. For example:
<Limit GET>
(Note that as of this writing, the only access method supported is GET.)
Since the <Limit> directive needs to be defined for a specific directory, we end up with something similar to this:
<Directory directory_path>
list of options
<Limit GET>
list of access restrictions
</Limit>
</Directory>
In its simplest form, the <Limit> directive can allow or deny access to specific directories. This might be useful if we want to allow users from within our own domain to have access to specific directories, but not make them world accessible. Let's assume that we have already defined a directory and want to restrict access to just our domain. We might have an entry that looks like this:
allow from our.domain
deny from all
The order directive, which comes first inside the <Limit> section, determines in what order the access rights will be evaluated. Here we said that we first check who is allowed, before we check who is denied. If our computer is from the domain that is specified with our.domain, we get in. Otherwise, we are denied access.
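For reference, here is a sketch of a complete section that limits a directory to one domain, written to a scratch file rather than to the live access.conf. The directory name is a placeholder. Note that the sketch names deny first in the order line: in NCSA httpd, on which my description is based, the rule type that is evaluated last has the final word, so the allow line has to be evaluated after the blanket deny in order to override it. That scohttpd behaves the same way is an assumption, so test it from a machine outside your domain before relying on it.

```shell
sample=$(mktemp)

# Sample section only; merge it into access.conf by hand after review.
cat > "$sample" <<'EOF'
<Directory /var/scohttp/htdocs/internal>
<Limit GET>
order deny,allow
deny from all
allow from our.domain
</Limit>
</Directory>
EOF

cat "$sample"
```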
It may be possible that our company is broken down into smaller domains, for example into individual departments. It may be necessary to restrict access to everyone except those departments. For example, finance documents may be restricted to the finance and admin domains. We could then have something that looks like this:
allow from finance.our.domain admin.our.domain
deny from all
In this configuration, someone from the marketing sub-domain would not have any access to this directory. Note that the names we specify on the allow and deny lines need only be separated by white space. Therefore, it could have looked like this:
allow from finance.our.domain
There are also many sites on the Internet that restrict access to everyone, except those that have an account. Having an account also means that we have a password. Although we can take the system passwd file, we have to edit it a bit in order to get it to work. In addition, the user names and passwords that scohttpd has are completely independent of the system accounts. Therefore, we can have users that do not have system accounts that still access files via scohttpd. In the same way, a user can have a system account but no access to the Web pages.
To add this functionality, we have to make some changes to the access.conf file. The new entry might look like this:
AuthName Secret Stuff
At the top and bottom of this section, we will see the directive that defines what directory this is valid for, in this case /var/scohttp/html/secret. At the top of the section we define the access we want to allow. The first line:
AllowOverride None
says that we do not want access control lists in specific directories to override the accesses we define here. If we did want to, we would create an access control list (ACL) in the file .htaccess in each directory we wanted to change. This has a similar format, which we will get to later.
The AuthName entry is a label for the directory we are defining access to. Although this does not affect the access, it is required. When we try to access one of the restricted directories, this label will appear along with the password prompt.
The AuthType entry specifies the type of authorization used for this directory. As of this writing, only Basic is supported.
The AuthUserFile is the full path to the user password file. This is not the system password file (/etc/passwd), but rather the password file specifying access to directories. The AuthGroupFile lists the users that are part of specific groups. One thing we can do is to restrict access to files either on a per user basis or a per group basis as we will see in a moment.
Inside the Limit section we have the require directive. Generically, this is the word require followed by the entity that is to be given access.
In this case, the entity is "valid-user" which means any user listed in AuthUserFile. If we wanted to specify just a single user, the entry would look like this:
require user <user_name>
Where <user_name> is the name of the user as specified in AuthUserFile to whom we want to give access. If we want to specify a group, the entry would look like this:
require group <group_name>
where <group_name> is the group name defined in AuthGroupFile. Keep in mind that just because a user is listed in AuthGroupFile does not mean that they have access. The system has no way of identifying them, so they still need a user name and password in AuthUserFile. If they are listed in AuthUserFile but not in AuthGroupFile, they will still be denied access.
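Putting the pieces together, a password-protected section and its group file might look like the sketch below, written to scratch files for inspection. I am following the NCSA httpd conventions here: the locations of the two files, the group finance and the user names are invented for the example, and a group file line has the form "group: user1 user2".

```shell
work=$(mktemp -d)

# Section for access.conf (file locations are assumptions).
cat > "$work/access.conf.sample" <<'EOF'
<Directory /var/scohttp/html/secret>
AllowOverride None
AuthName Secret Stuff
AuthType Basic
AuthUserFile /var/scohttp/conf/htpasswd
AuthGroupFile /var/scohttp/conf/htgroup
<Limit GET>
require group finance
</Limit>
</Directory>
EOF

# Group file in NCSA format: group name, colon, member names.
cat > "$work/htgroup" <<'EOF'
finance: jimmo sandra
EOF

# List the members of a group, one per line.
members() {
    awk -F: -v g="$1" '$1 == g { n = split($2, m, " "); for (i = 1; i <= n; i++) print m[i] }' "$work/htgroup"
}

members finance
```

Listing the members this way is a quick check when someone in a group is refused: each name printed still needs its own entry in the AuthUserFile.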
Note that as of this writing, most of these options are not listed in the SCO documentation. (At least I cannot find them anywhere.) I am basing my description on the NCSA HTTP doc and my own experience with configuring these files. Therefore, I apologize in advance if I got something wrong.
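When we test the behavior of scohttpd by making changes to these files, we have to tell it to re-read the files. Despite what some books say, we don't have to reboot our machine. Instead, all we need to do is send a hangup signal (SIGHUP) to the running scohttpd process, as described earlier (kill -HUP <pid_of_scohttpd>). If scohttpd is started through inetd, a new copy is started for each request, so there is no long-running daemon to signal.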
I mentioned a moment ago a case where we might want to have a set of permissions on a specific directory tree, but want to have different accesses for specific sub-directories. This is accomplished with the .htaccess file. However, to enable it we have to set the AllowOverride directive to All in access.conf.
There are other ways to configure the .htaccess file. For example, setting AllowOverride to AuthConfig allows us to change the AuthName, AuthType, AuthUserFile and AuthGroupFile directives. With this we can then change the label for this directory or the paths to the password and group files. Note that all the other values, including what entity is required (user, group), are still in effect.
We can also set AllowOverride to Limit, so that we can change whatever we had in the access.conf file. For example, if we wanted to restrict access to this directory to a single person we could change .htaccess to:
require user UserName
Where UserName is the name of the user that should have access. Needless to say we could have also changed the requirement to be a specific group. At this point, I'll mention that the AllowOverride directive can take more than one argument, so if we wanted to specify a new AuthName as well as a new Limit, it might look like this:
AllowOverride AuthConfig Limit
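To make the override concrete, here is a sketch of the two pieces involved: the AllowOverride line in the parent section of access.conf and an .htaccess file placed in the sub-directory. Both are written to a scratch directory. The directory names, the label and the user name jimmo are placeholders, the parent section is abbreviated to the one line that matters here, and wrapping require in a <Limit GET> section follows NCSA httpd usage, which I assume applies here as well.

```shell
work=$(mktemp -d)
mkdir -p "$work/html/secret/private"

# Parent section in access.conf (abbreviated): allow sub-directories
# to override the authorization settings and the limits.
cat > "$work/access.conf.sample" <<'EOF'
<Directory /var/scohttp/html/secret>
AllowOverride AuthConfig Limit
</Directory>
EOF

# Per-directory override: a new label and access for one user only.
cat > "$work/html/secret/private/.htaccess" <<'EOF'
AuthName Private Stuff
<Limit GET>
require user jimmo
</Limit>
EOF

cat "$work/html/secret/private/.htaccess"
```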
A little while ago we talked about the server resource management file (srm.conf). Although the configuration that we currently have would be sufficient to run a decent web server, the srm.conf file has a few things that warrant taking a quick look at.
The first is the AccessFileName directive. This defines the name of the access file that we would copy into a directory to override the accesses defined in access.conf (assuming this is allowed at all). By default, this is the file .htaccess that we have been talking about all along.
The Alias directive is used to assign an alias to a specific path. This is useful in keeping a specific file structure, but still being able to move the tree with limited problems. By default there is often one Alias defined:
Alias /icons/ /var/scohttp/icons/
Another alias directive is ScriptAlias. This defines where script files are located. The default is:
ScriptAlias /cgi-bin/ /var/scohttp/cgi-bin/
This is used when we have CGI scripts (more on those later) that we want to execute. We can specify an alias here, so we don’t need to specify the whole path.
The DocumentRoot specifies the root directory for our documents. For example, let's assume DocumentRoot is set to /var/scohttp/html. When we would specify a URL as http://www.our.domain/file.html, the file file.html would be physically located in the directory /var/scohttp/html. If we specified the URL http://www.our.domain/data/file.html, this file would be physically in /var/scohttp/html/data.
There are several indexing directives, which we will get to next.
The DirectoryIndex directive in the srm.conf file determines the default file the server should deliver if none is specified by the client. For example, if we input http://www.our.domain in the browser, the server would try to deliver a file in the DocumentRoot (since no path was specified). The file delivered is what is specified in srm.conf as the DirectoryIndex directive, which is index.html, by default. So, if there was a file called index.html the URL http://www.our.domain would be the same as http://www.our.domain/index.html.
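The mapping from URL to file described in the last few paragraphs can be sketched as a small shell function. This is only an illustration of the rule, not how scohttpd is implemented: url_to_file is a name made up here, the DocumentRoot is the one assumed in the text, and Alias entries and access checks are ignored.

```shell
DOCUMENT_ROOT=/var/scohttp/html   # value assumed in the text
DIRECTORY_INDEX=index.html        # default named in the text

# url_to_file URL: print the file the server would look for.
url_to_file() {
    # Drop the scheme and host name, leaving only the path part.
    path=$(printf '%s\n' "$1" | sed 's|^[a-z]*://[^/]*||')
    # No path, or a path ending in "/", refers to a directory:
    # the DirectoryIndex file name is appended.
    case $path in
        ''|*/) path="${path%/}/$DIRECTORY_INDEX" ;;
    esac
    printf '%s%s\n' "$DOCUMENT_ROOT" "$path"
}

url_to_file http://www.our.domain
url_to_file http://www.our.domain/data/file.html
```

The first call prints /var/scohttp/html/index.html and the second /var/scohttp/html/data/file.html, matching the two cases described above.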
If the file specified by DirectoryIndex (index.html, in our case) does not exist, what the server does is dependent on whether indexing is allowed or not. Remember that in the access.conf file we can specify configurations on a per directory basis. One of the options we can specify is indexing. If indexing is turned on, the server will deliver an index (directory listing) of that directory. For example, assume we have an entry in access.conf that looks like this:
<Directory /var/scohttp/html>
Options Indexes
</Directory>
This means that for the directory /var/scohttp/html, indexing is enabled. So, if there was no index.html file in /var/scohttp/html, we would just get a listing of the files in that directory. Note that this is a dynamic list. We do not need to create it ourselves. When a visitor inputs the appropriate URL or clicks on a link, the directory listing is created for them on-the-fly.
We can also automatically include files in the listing that might provide some additional information about the directory. We can use either the HeaderName or ReadmeName directives (or both) which would be displayed before and after the index listing, respectively.
Supplemental to this is the FancyIndexing directive. This adds icons, file names and other "fancy" things to the directory listing (Index). This is a boolean value, so it is turned on with FancyIndexing On and off with FancyIndexing Off.
When FancyIndexing is on, we get a header at the top of the list. In addition, each entry will be preceded by an icon. My recommendation is that we only turn this on if we have CPU cycles to burn. For each file in the directory, an icon needs to be sent as well as the information such as modification time. Without FancyIndexing, we get just a listing of the files, which should be sufficient.
Fancy indexing can also be turned on using the IndexOptions directive. This has the syntax:
IndexOptions option1 option2
Some useful options are:
FancyIndexing - Turns FancyIndexing on.
IconsAreLinks - Makes the icon part of the anchor for the link. Normally the icon just sits there and does nothing. (Only valid when FancyIndexing is on.)
ScanHTMLTitles - Scans the information in the HTML <TITLE> section to include as a description. (Only valid for those files without a pre-defined description. See below.)
SuppressLastModified - The last modified date is not printed. (Only valid when FancyIndexing is on.)
SuppressSize - The size of the file is not printed. (Only valid when FancyIndexing is on.)
SuppressDescription - File description is not printed. (Only valid when FancyIndexing is on.)
Descriptions are added using the AddDescription directive. The syntax is:
AddDescription "description" filename
Here we can specify full names for files or use wildcards. For example, if we wanted to provide the description “WWW Pages” for all HTML documents, the entry would look like this:
AddDescription "WWW Pages" *.html
What icons are used for what kind of documents is also configured through the srm.conf file. There are two ways of doing this. The easiest is via the AddIcon directive. Here we have a one-to-one mapping between a file extension and the icon that is used. The syntax for this directive is:
AddIcon /virtual/path .ext1 .ext2 ...
The /virtual/path is the path relative to the server root. So if we have a sub-directory of our server root named icons, the reference to the icon text.gif would be /icons/text.gif. In most distributions this is also aliased with:
Alias /icons/ /var/scohttp/icons/
This alias then replaces /icons/ wherever it appears.
Following the path to the icons are the extensions for which the icons should be used. For example, if I wanted to use this icon for all files ending in .txt and .doc, the whole line would look like this:
AddIcon /icons/text.gif .txt .doc
There are three additional types that we can use icons for:
.. - For the parent directory
^^DIRECTORY^^ - For any directory
^^BLANKICON^^ - For blank lines (used to format the list correctly)
The next way is not directly dependent on the file types. This is when we use the MIME types to determine what icons to use. Granted, the MIME types are dependent on the file types, but using this mechanism allows us to have a consistent configuration. The syntax for this is:
AddIconByType icon type1 type2 ...
An example of the simplest form would be:
AddIconByType /icons/text.gif text/*
Here the type is defined as text/*, so this is valid for any MIME type of text (e.g., html, txt), which would then have the text.gif icon. We can make this a little more complicated by defining a text string to be displayed for non-graphic clients. An example would then look like this:
AddIconByType (TXT,/icons/text.gif) text/*
In some cases, we don't want to display anything. This can be accomplished using the IndexIgnore directive. The general syntax of this directive is:
IndexIgnore pattern1 pattern2
where pattern1 and pattern2 are the patterns of the files that we don’t want to list. For example, the default says to ignore the HEADER and README files, so the entry might look like this:
IndexIgnore */HEADER* */README*
This says to ignore any file in any directory that starts with HEADER or README. Therefore, files like HEADER.TXT or README.NOW will be ignored, but 0README.NOW will be displayed.
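The pattern matching can be tried out with the shell's own wildcards, which behave closely enough to illustrate the point. This is a sketch of the matching rule only; is_ignored is a made-up name, and I am assuming the server's matching works like shell globbing for these simple patterns.

```shell
# is_ignored PATH: succeed if PATH matches one of the IndexIgnore patterns.
is_ignored() {
    case $1 in
        */HEADER*|*/README*) return 0 ;;
        *) return 1 ;;
    esac
}

for f in docs/HEADER.TXT docs/README.NOW docs/0README.NOW; do
    if is_ignored "$f"; then
        echo "$f: ignored"
    else
        echo "$f: listed"
    fi
done
```

The last file is listed because the pattern requires README to follow the slash directly, and here a 0 comes in between.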
If we are going to set up an Internet server, then we should definitely consider allowing access via ftp. If we are providing specific documents, such as white papers, source code and even compiled programs, our visitors may not want to deal with clicking through our web site or waiting for the graphics to download. Having the documents available through ftp means that our visitors can go right to where the documents are stored. The only things sent across the line are the commands, their output and then the document, once they have found it.
This doesn't mean that these documents are unavailable via the Web. We could have links on our Web pages that point to them, as well. We could either have links connecting with HTTP or with FTP. If we have links that access the file via FTP, a lot of browsers will be able to handle this and actually show us the same directory structure we would have if we logged in directly with ftp. If the browser can't handle it, the visitor can still access the file using ftp directly.
What is happening during all this is that the browser itself is simply logging us in as the user anonymous. We don't have any special privileges, and if anonymous ftp isn't working correctly, we can't get in using a browser.
Setting up anonymous ftp is fairly straightforward. However, setting it up correctly is not. There are a lot of security issues involved, which we talked about in the chapter on networking. Here we'll get into what we need to do to address those security issues. Fortunately, the later versions of SCO UNIX have made all of this simpler. The system is already preconfigured (in most cases) to allow anonymous ftp and there are configuration files that allow us to specifically define what access can and cannot be made. First we'll talk a little bit about doing this all by hand and then we'll talk about how this problem has been solved in the new versions.
The first step is to create the ftp user, just like you would any other user, using scoadmin. All functions carried out will be done as if by this user. So the access permissions on the files should allow the user ftp the appropriate access.
The ftp user's home directory does not need to be any place special. We could simply create it where all the other home directories are (e.g., /usr) or put it in its own filesystem (which could then be mounted under /usr/ftp). This would keep the system from having problems if someone were to send large files to us and fill up our root filesystem. Make the directory owned by root and not by ftp. This prevents the ftp user from changing the permissions. However, we make the group of the directory the same as ftp so that we can still give ftp access. Set the permissions to read and execute, but no write (555). This means that anonymous ftp users can read files and access directories, but cannot write anything.
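As a rough sketch of the ownership and permission steps just described, something like the following could be used. The scratch path /tmp/ftp-sketch is only a stand-in for the real home directory (such as /usr/ftp), and the ownership change is skipped unless we are root and an ftp group already exists; both of these are assumptions made so the example can be tried safely.

```shell
#!/bin/sh
# Sketch only: FTP_HOME is a hypothetical scratch location, not the real ~ftp.
FTP_HOME="${FTP_HOME:-/tmp/ftp-sketch}"

mkdir -p "$FTP_HOME"

# Owner root, group ftp. This needs root privileges and an existing ftp group,
# so it is skipped when either one is missing.
if [ "$(id -u)" -eq 0 ] && grep -q '^ftp:' /etc/group; then
    chown root "$FTP_HOME"
    chgrp ftp "$FTP_HOME"
fi

# Read and execute for everyone, write for no one (555).
chmod 555 "$FTP_HOME"

ls -ld "$FTP_HOME"
```

The listing should start with dr-xr-xr-x, which is the 555 mode on a directory.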
In the Network Guide in on-line doc, there is a section in Chapter 3 (Administering TCP/IP) on setting up anonymous ftp. This contains a list of the necessary steps. Rather than simply repeat these steps here, I would suggest that you start up X and then SCO help. You can then cut and paste these lines into a file to create a script.
You should then edit the file to remove all of the pound-signs (#). These are used in the documentation to indicate a prompt, but in the shell script, they become comments and the script essentially does nothing. (The command in vi would be :%s/#//g) Also, change the first line to #!/bin/csh instead of just starting csh. Otherwise the script does not work right.
One thing I would like to say about the instructions concerns the passwd file. The instructions call for you to copy /etc/passwd into the etc directory under ftp. This is dangerous if you have low security. At the higher levels, your encrypted password is not stored in /etc/passwd, but rather in /etc/shadow. When your system is set to low security, the encrypted passwords are in /etc/passwd. Since this file is readable by anyone using anonymous ftp, they could copy it to their local system and try to crack the password. Therefore, you should either increase security or remove the entries in the copy of the password file for all real users.
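One way to build such a stripped-down copy is sketched below. It works on a made-up sample file rather than the real /etc/passwd, and the choice to keep only the root and ftp entries is an assumption; keep whichever names you want to appear in directory listings. The password field is replaced with an asterisk so that no encrypted password ends up in the copy.

```shell
#!/bin/sh
# Sketch: produce a sanitized passwd file for ~ftp/etc from a sample input.
# Both paths and all three accounts are made up for illustration.
SRC=/tmp/passwd.sample
DEST=/tmp/ftp-passwd.sample

cat > "$SRC" <<'EOF'
root:Xq1fakehash:0:0:Superuser:/:/bin/sh
ftp:*:200:50:Anonymous FTP:/usr/ftp:/bin/false
jimmo:Ab2fakehash:201:50:Jim Mohr:/usr/jimmo:/bin/ksh
EOF

# Keep only the accounts needed for name lookups (assumed: root and ftp)
# and overwrite the password field so nothing crackable is published.
awk -F: 'BEGIN { OFS = ":" }
         $1 == "root" || $1 == "ftp" { $2 = "*"; print }' "$SRC" > "$DEST"

cat "$DEST"
```

The same idea applies to the copy of the group file, since only the names and numeric IDs are needed there.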
The reason that you need the passwd file in the first place is not for the password, but for the user id. The same applies for the group file, since it is only used for the group ID. When you do a listing of a file, without these two files, all you see are the numbers of the user ID and group ID. By including them, they are translated into the names.
We may want to allow visitors to send files to us. Therefore we need to have at least one directory writeable. Consider creating an incoming directory where the ftp user has write but not read permission. That is, they can copy files in, but not out. This prevents our site from becoming a repository for undesired files (pirated software or objectionable content, for example). The best thing to do is set the permissions as 1733. This makes the directory writeable, but not readable by the ftp user. The first '1' is the same as chmod +t and for a directory this prevents other people from deleting files that they don’t own.
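A minimal sketch of creating such a directory follows. The scratch path is hypothetical and stands in for something like ~ftp/incoming.

```shell
#!/bin/sh
# Sketch only: INCOMING is a hypothetical scratch path standing in for ~ftp/incoming.
INCOMING="${INCOMING:-/tmp/ftp-incoming-sketch/incoming}"

mkdir -p "$INCOMING"

# 1733: sticky bit set, full access for the owner,
# write and search (but no read) for group and others.
chmod 1733 "$INCOMING"

ls -ld "$INCOMING"
```

The mode shown by ls should be drwx-wx-wt. The trailing t is the sticky bit, and the missing r for group and others is what keeps visitors from listing what other people have dropped off.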
Access with ftp is controlled by several files in the /etc directory. The primary file is ftpaccess. As its name implies, it controls access with ftp. Here we not only define who has access, but also the kind of access they have and even how many people can log in and when they can log in. In addition, this file allows us to define the behavior of the system. For example, when a user logs in for the first time or changes to a particular directory, we can have the system display messages.
Through this file, each user is assigned to a specific group. Groups do not necessarily mean individual users, but can mean complete sites. That is, we can assign an entire site (based on their IP address) to a group and control their access this way. For example, we might have documents that only people from within our company can access. Rather than having to give everyone their own accounts, we define their group, and therefore their access, based on their IP address.
This can be even more precisely defined, for example, by saying that if a user has an account on the local machine and comes from a particular IP address he belongs to one class. If the same user comes from a different IP address he belongs to a different class. In this way, we need define only a single account and password and still limit the user's access. Note that in the case of the IP addresses, we don't have to define each individual host address, but can define networks as well.
Before we go into the other capabilities, let's go over some examples of class. All definitions take the form:
One of the keywords is class and a class entry takes the form:
class <class_name> <typelist> <address> [<address> ...]
Where <class_name> is the name we have given for this class of machines, <typelist> is the type of users, and <address> is the IP addresses this class should be applied to. By default, the first line of our ftpaccess file probably looks like this:
class all real,guest,anonymous *
This defines the group all for the user types real, guest, and anonymous (all there is). The address can also be a domain name. However, it's easier to spoof. Therefore, it is recommended that you use the IP address. Since the IP address is "*" this will be valid for all IP addresses. One might think that an anonymous user is a guest, but there is a difference that we will get into in a moment. Let's now define a group of anonymous users that come from our local network: 192.168.42. It might look like this:
class local anonymous 192.168.42.*
If we simply added this line to the ftpaccess file after the first class definition, then things would not behave as we might think. What we need to keep in mind is that once a match has been made, it sticks. Since every user from every machine is a member of the all group, no matter who logs in, that's the group they are assigned to. Since this is the broadest reaching group, I would suggest putting it last (assuming we keep it at all). That way, if a user doesn't fit into any of the other groups, he will fit into this one.
So what can we do with this? Let's make things easy and define a message file that will be different for members of our local domain. (A message file is a file that is displayed when a particular action occurs.) Maybe we want to tell them more about what our site has to offer and where things reside. Our completed ftpaccess file might look like this:
class local real,guest,anonymous 192.168.42.*
class all real,guest,anonymous *
message /local.msg login local
message /welcome.msg login all
We've defined the local group and then defined the group for anyone that does not come from our local domain. Next we define the message capability. This has the general form:
message <message_path> [<action> <class>,<class>,...].
The first entry says to show the message local.msg that is in the root directory (relative to the FTP root directory). Note that if a user logs in as a real user, their home directory is the same as when they log in with telnet or anything else. Therefore, no message will be shown in this case unless there really is a file called local.msg in the / directory. The solution I came up with is to have all system messages in a single directory. (Note that some sources refer to the "action" as "when".)
For example, we create a directory in ftp's home directory called usr/messages. This is then symbolically linked to /usr/messages. All messages are then placed in this directory. If we log in as a real user, the messages are read out of /usr/messages. If we login as an anonymous user the messages are read from ~ftp/usr/messages. Any directory related messages (more on them shortly) we just keep in that directory.
One entry would then look like this:
message /usr/messages/local.msg login local
If I were to login as a user from the local domain who actually did have an account on this machine, what would be read is /usr/messages/local.msg. If I logged in as an anonymous user from within the domain, the file that would be read would be ~ftp/usr/messages/local.msg. Since the directories are symbolically linked the same file is read whether we are logged in as a real user, anonymous or guest.
We could also configure the system so that whenever we changed directories to somewhere specific a message is displayed. For example, we might have software that hasn't been thoroughly tested, but we still want to make it available. Let's say this software is located in ~ftp/pub/software/beta. We would then have a line that looked like this:
message /usr/messages/beta.msg cwd=/pub/software/beta
In this case, the action was that we changed directories into /pub/software/beta. Therefore, the message /usr/messages/beta.msg was displayed. If we logged in as anonymous user, this would be ~ftp/usr/messages/beta.msg.
There may be some cases where we want to restrict the access of real users. To do this, we use the guestgroup capability. This has the syntax:
guestgroup <groupname> [ <groupname> ... ]
where <groupname> is the name of a system group as defined in /etc/group. So, if we had a system group called noftp, the entry would look like this:
guestgroup noftp
Any time a real user logged in that was a member of this group, the system would behave just as if this were anonymous ftp. However, we need to configure this account. This might look like this:
We next have the concept of a "readme" file. This is a file that we should read (makes sense). With software, this is usually the file that most people ignore that has vital information about how to install the software and prevent problems. On ftp sites these files usually have specific information about what's in the directories. If the directory contains downloadable software, this would probably contain information about how to install the software.
The difference between a readme file and a message file is that a readme file simply exists in the directory, whereas a message file is actually displayed. Otherwise the principle is the same. Each time an event occurs (login or changing directories) the file is accessed. For the readme file, that access means that the fact of its existence is displayed to the visitor along with the date that the file was last changed. This serves as a marker to a visitor. If he sees that the README file hasn't been changed since his last visit, there's no need for him to download it again.
When a visitor logs in, he might see something like this:
230-Please read the file README
230- it was last modified on Sat Oct 12 18:21:35 1996 - 1 day ago
This says that the file was last modified 1 day ago. If the visitor hadn't been to the site for a month, then this file is one he hasn't seen before, so it might be worth it to download the file.
Note that a readme file doesn't need to be called "readme." In fact, we can call it anything we want. The general format for the readme file capability is:
readme <file_path> [<action> [<class>]]
As we can see, since we have both action and class fields, readme files can be set up to be shown each time a user logs in or when he changes directories. In addition, we can have different readme files for different classes of users. I could have an entry that looks like this:
readme /usr/messages/README* login
The file that is referenced is in /usr/messages and the appropriate information is displayed whenever a visitor logs in. Note that since we did not include a class of users, this is valid for everyone. Also note the asterisk following the file name. This does not mean that the file is really called "README*", but rather that any file that starts with README will be shown.
For example, another way this file can be used is to have a list of changes. We could have a file called /usr/messages/README.general which contains general information, and one called README.changes which lists the changes to the site. The information about both of these files will be displayed when a user logs in.
Another entry I created looks like this:
readme README* login
This displays any file starting with "README" that is in the current directory when the user logs in. If it's an anonymous user this would be ~ftp/. If this were a real user (that wasn't mapped to the guest group) this would be their home directory. If we create a file ~jimmo/README.jimmo, then every time I log in with ftp, I see references to this file. This is a useful way of sending messages to users without giving them an interactive account.
There is something special that we can do with the contents of our message files (not the READMEs), and that is we can display certain kinds of information dynamically. This is done with special sequences ("magic cookies") in the text, which are replaced when the message is displayed:
%T    local time (form Thu Nov 15 17:12:42 1990)
%F    free space in partition of CWD (kbytes) [not supported on all systems]
%C    current working directory
%E    the maintainer's email address as defined in ftpaccess
%R    remote host name
%L    local host name
%u    username as determined via RFC931 authentication
%U    username given at login time
%M    maximum allowed number of users in this class
%N    current number of users in this class
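To see how this might be used, here is a sketch that writes a sample login message to a scratch file. It assumes the cookie names used by wu-ftpd style servers (%U, %R, %L, %T, %N, %M and %E); check the ftpaccess manual page on your system before relying on them. The file name is hypothetical as well.

```shell
#!/bin/sh
# Sketch: create a sample message file containing dynamic cookies.
# MSG is a hypothetical scratch path; a real file would live in /usr/messages.
MSG="${MSG:-/tmp/welcome.msg.sample}"

# The quoted 'EOF' keeps the shell from touching the % sequences.
cat > "$MSG" <<'EOF'
Welcome, %U, connecting from %R.
Local time on %L is %T.
You are user %N of a maximum of %M in your class.
Problems? Send mail to %E.
EOF

cat "$MSG"
```

The server fills in the values each time the message is shown, so the same file serves every visitor.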
So far, all we have done is grouped users into classes and decided what messages they see, but we haven't done anything to control their access. The files and directories that are created for us are configured in such a way as to limit access greatly. There are several directories which contain the necessary files and programs. However, there are a few programs in the bin directory that we might want to limit access to.
The first capability that we'll talk about to limit access is deny. The general syntax for it is:
deny <address> <message_file>
where <address> is the IP address or hostname/domain that we want to deny access to. The <message_file> is any file that we might want to display instead of simply saying that access is denied. One way this capability can be used is if we have a site that is overburdening our server. Maybe they are downloading all of our files and the strain on the server and network is making life miserable for the other users.
To figure out who is taking up all the bandwidth we can enable logging. We can log the commands issued, the transfers that are done, or both. The syntax to log commands is:
log commands <typelist>
where <typelist> is any of our standard user types: real, anonymous and guest. To log transfers the syntax is:
log transfers <typelist> <directions>
where <typelist> is again our user types and <directions> is the direction we want to log, using the keywords "inbound" and "outbound."
The shortcoming of this scheme is that we cannot enable logging for classes of users (at least, I have tried on several different versions without success). In addition, transfers are logged in /var/log/xferlog, whereas the commands are logged in /var/log/messages (or the equivalent). Since every command is logged, our messages file can grow rather quickly. Logging a login takes four lines, and each time a directory listing is shown, it takes another two.
Even commands that cannot be executed are logged. For example, if I want to change to the pub directory and make a typo and say cd pbu, the command is logged, although no action was taken. If transfer logging is enabled as well, an entry is made for the command and the actual transfer is logged in xferlog. If we have a busy site, we need to be careful that our messages file doesn't grow too large.
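Once transfers are being logged, finding the site that is eating the bandwidth is a matter of adding up the bytes per remote host. The sketch below does this on a few made-up entries. It assumes the common xferlog layout, in which the first five fields are the date, the sixth the transfer time, the seventh the remote host and the eighth the file size in bytes; check a few lines of your own log before trusting the field numbers.

```shell
#!/bin/sh
# Sketch: total the bytes transferred per remote host from an xferlog-style file.
# The sample entries and both scratch paths are made up for illustration.
LOG=/tmp/xferlog.sample
SUMMARY=/tmp/xferlog-summary.txt

cat > "$LOG" <<'EOF'
Sat Oct 12 18:21:35 1996 4 host1.example.com 1000 /pub/a.tar b _ o a user@host1.example.com ftp 0 *
Sat Oct 12 18:25:02 1996 9 host2.example.com 5000 /pub/b.tar b _ o a user@host2.example.com ftp 0 *
Sat Oct 12 18:30:10 1996 2 host1.example.com 500 /pub/c.txt a _ o a user@host1.example.com ftp 0 *
EOF

# Field 7 is the remote host and field 8 the size in bytes (assumed layout).
awk '{ bytes[$7] += $8 } END { for (h in bytes) print bytes[h], h }' "$LOG" |
    sort -rn > "$SUMMARY"

cat "$SUMMARY"
```

The busiest host ends up on the first line of the summary, which gives us a candidate for a deny entry.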
Part of the logging process is the ability to know just who is accessing our system. Most UNIX-like ftp programs know the user login name of whoever initiated the connection as well as the machine's name. This can be used to enable more detailed logging. Using the passwd-check capability we can force the user to input his logname and machine name as a password in order to get in.
Actually this can be set at several levels, depending on our needs. For example, if we have a very trusted site, we don't even need to enable the passwd-check capability and everyone can get in, even if the login name/machine name they enter is completely bogus (e.g., just typing random keys). If we have a little less trusting site, we can require that the password contain at least a "@". At the highest level, we can require that the password is a fully RFC 822 compliant address, like user@company.domain.
So what do we do if the password doesn't match our requirements? We can simply issue a warning and let them in anyway. Or we can issue the warning and toss them out. The syntax of the passwd-check capability is:
passwd-check <none|trivial|rfc822> (<enforce|warn>)
An example of this would be:
passwd-check trivial warn
With this the visitor is required to have a @ in the password and if it isn't a valid password, a warning is generated, but they are still admitted. If the entry looked like this:
passwd-check rfc822 enforce
the visitor would be required to input an RFC 822 compliant address and if not, they would be tossed out. Remember that I mentioned that the system can tell who we are and where we come from. Therefore, it's not sufficient to make up a name and use it as a password.
We can also deny access to specific files. This is done with the noretrieve capability. The syntax is:
noretrieve <filename> <filename> ....
where <filename> is either a path or a filename. For example, if we wanted to keep users from downloading the /etc/passwd file, the line would look like this:
noretrieve /etc/passwd
However, if we wanted to deny access to core files, no matter where they were, the entry would look like this:
noretrieve core
Aside from setting the permissions, we can control where files can be uploaded by using the upload capability. The syntax for this command is:
upload <root-dir> <directory> <yes|no> <owner> <group> <mode> ["dirs"|"nodirs"]
Where <root-dir> is the root directory to which the following conditions are applied. This is relative to the system root and not the root of the ftp directory. The <directory> is the path (relative to the root we just gave) and <yes|no> determines specifically whether upload is allowed or not. Permissions on the file are set by <owner>, <group>, and <mode>. These have the same meaning as for other kinds of files. For example, we might have something like this:
upload /home/ftp /incoming yes root root 0600
This says that uploads are allowed into the incoming directory under the ftp user's home directory (/home/ftp) and that all uploaded files will have an owner and group of root. The permissions for these files will be 0600 (read and write for the owner only).
I recommend that if you do have an incoming directory, the files not be readable by others. If the files were readable, it might happen that your site becomes a drop for pirated software or objectionable content. There are cases where people have copied such files into an incoming directory and the files were later picked up by someone else. This maintains the anonymity of both the sender and the receiver, and it could make you legally responsible. If no one can read these files, visitors can copy new ones to us, but will only be able to read the files that we specify.
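Pulling the capabilities discussed so far together, a complete ftpaccess might look something like the file written by the sketch below. It goes to a scratch file rather than /etc/ftpaccess so that nothing live is touched. The addresses, paths and option choices are simply the examples used in this chapter, combined as an illustration, and the exact set of keywords your ftp server accepts may differ.

```shell
#!/bin/sh
# Sketch: assemble a sample ftpaccess in a scratch file for review.
# SAMPLE is a hypothetical path; copy the result to /etc/ftpaccess only after checking it.
SAMPLE="${SAMPLE:-/tmp/ftpaccess.sample}"

# Narrow classes come first, because the first matching class wins.
cat > "$SAMPLE" <<'EOF'
class local real,guest,anonymous 192.168.42.*
class all real,guest,anonymous *
message /usr/messages/local.msg login local
message /usr/messages/welcome.msg login all
message /usr/messages/beta.msg cwd=/pub/software/beta
readme README* login
log transfers anonymous,guest inbound,outbound
passwd-check rfc822 warn
noretrieve /etc/passwd core
upload /home/ftp /incoming yes root root 0600 nodirs
EOF

cat "$SAMPLE"
```

Keeping the file in one place like this also makes it easy to compare against the running configuration before making a change.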
Access is also controlled by the /etc/ftpusers file. This contains a list of local users who are not allowed to use FTP no matter what other things may say. All the doc I have ever seen suggests putting both root and uucp in here. I would add all of the system accounts such as bin, sys, daemon, and so on. Normally, they should not be able to log in anyway. However, it is better to err on the side of security.
One thing you should think about is that the Netscape FastTrack Server from SCO configures anonymous FTP with the click of a button. As one SCO engineer put it “No need to go through that painful list of steps to set it up.”
SCO offers a range of products to not only ease your move into the Internet, but also to expand your existing Internet connection. Although all of the SCO operating system products provide a built-in web server, configuring it is not as easy as many people would like. In addition, beyond simple Web page publishing, there is little you can do without a great deal of effort.
One solution that SCO provides is the Netscape FastTrack Server. Although it has many features, the FastTrack Server is referred to as an “entry level” server as it does not have all of the advanced features that something like the Netscape Enterprise Server has. In addition to publishing on the Internet, the FastTrack Server makes publishing on your company’s “Intranet” just as easy.
Since security is a primary concern whenever you connect to the Internet, the security features of the FastTrack Server give you fine control over who has what access to your site. This even includes encryption of the data between the server and the client. Access can be granted on a per user basis, or by groups, domains and even IP address. The FastTrack Server also incorporates Secure Sockets Layer 3.0 (perhaps the most widely accepted Internet channel security standard).
Installation and management of the server are both very easy. During installation, the FastTrack Server will recognize the various aspects of your system and adjust itself accordingly. Once your system is installed, management is accomplished through an interface you should already be familiar with: the Netscape Navigator. Even if you are not familiar with it, providing you with a single interface allows you to move from browsing the Web to administering it with a touch of a button.
Because you are accessing the management functions through a Web browser, there is nothing that is preventing you from accessing FastTrack servers on other machines. In fact, that is what it was intended to do. In addition, you can use the Netscape browser to access a number of reports to give you statistics about the activity on your site.
If the functionality provided by the FastTrack Server is not enough, SCO provides several other Netscape Servers. The Netscape Enterprise Server starts off by providing all of the features that the FastTrack Server does.
One of the major shortcomings of most HTML editors is that there is no way to keep track of the changes you make to your pages. While you can separate your system into a development server and a production server, keeping track of the versions of individual pages is something that must be done by hand. Because HTML pages are text, you could implement a revision control system, like RCS or SCCS. However, the Enterprise Server makes things easier by including the MKS Integrity Engine from Mortice Kern Systems (makers of the MKS Toolkit, www.mks.com). The features provided include:
Check in/out capabilities
The Netscape Enterprise Server also enhances the management functionality of the server. One very useful feature is the support of multiple domains. This means that a single machine can host a large number of unique domains. For ISPs, the benefits of this are obvious, since you do not need a separate machine for each domain. However, even within a single company there are benefits, for example, if you have individual domains based on departments, such as www.company.finance.com or www.company.research.com. These would appear as individual domains, although they would all be on one server.
Also provided is what is called “configuration rollback.” Just like the rollback in a database, you can easily return to a previous state of the configuration. This allows for testing, but still enables you to quickly return to a previous state. Administration is also performed using the Secure Sockets Layer so you can easily manage systems across the Internet without the risk of someone else gaining access.
In addition, there are several enhancements to the server that improve its performance considerably. One enhancement is optimized caching, which is designed to increase the capacity and to significantly reduce access time. There are also HTTP “keep-alives”, which reduce network traffic by using a single TCP connection for multiple HTTP requests. This lowers the overhead and performance loss of establishing a new connection for each request. In addition, there is even symmetric multi-processor (SMP) support, as well as platform-specific optimizations.
To me the most exciting feature is Netscape LiveWire. This provides “Visual site management”, which enables you to graphically view the organization of your entire web. The features of the LiveWire Site Manager make running your web site a breeze. You can re-organize easily with LiveWire drag-n-drop capabilities. All links are then automatically updated to ensure that all of the links are consistent. You can also run the “External Link Checker” to ensure that links at other sites are also valid.
For larger operations, LiveWire allows you to import entire Webs. These can then be integrated into an existing site. This allows different departments to be working on their own piece of the web and then integrate it all into a single company-wide web. When the web is ready to be published, it can easily be deployed from your development site (or wherever) to the specific Web server.
The Netscape Mail Server is another add-on product that will enable your SCO machine to provide even more services for your users. The key advantage is that communications between a wide range of systems (such as Zmail, Eudora, Pegasus Mail, Microsoft Exchange and, of course, SCO) is now done on a central server. The Netscape Mail Server also overcomes the problems associated with the proprietary mail solutions so common in businesses.
Netscape Mail Server is a native SMTP/MIME system, which is the de facto standard for Internet Mail. This allows it to interoperate with any other SMTP-compliant email system. The Netscape Mail Server also provides support for “sometimes” sites, such as home offices or employees who travel a lot. Such users have most of the same functionality as local users do, including the management of the mailboxes (creating, deleting, renaming, and so on), searching, selective retrieval of messages, and the like.
One of the key features of the Netscape Mail Server is its scalability. By using a client-server architecture, it can support several thousand users on a single server. In addition, performance is increased considerably over file-based email systems like traditional UNIX or Microsoft Mail.
Using much of the same technology as the Web servers, the Netscape Mail Server can ensure the security of your messages, even across the Internet. This includes the Secure Sockets Layer and the S/KEY single-session key. Administrative functions can also be separated based on passwords. For example, one password may be needed to simply retrieve mail, while another is necessary to change the configuration. In addition, users can have access to the mail functionality without the need to have their own accounts on any server. My favorite feature is the ability to limit which domains and hosts can send or receive mail. I am constantly being inundated with junk mail. The Netscape Mail Server allows me to block mail from those companies I know are just sending junk. Wouldn’t it be nice to do this with phone calls?
Then there is the Netscape News Server. In addition to allowing access to Internet newsgroups, the Netscape News Server makes managing company-internal newsgroups easy. Access to newsgroups is based on an Access Control List (ACL), which allows you to keep departmental newsgroups within the department, for example. As with other Netscape servers, the news server also employs the same security techniques, such as SSL. News can therefore be transferred across the Internet, which keeps company-internal information private while still making it available to remote sites.
The Netscape News Server supports the Network News Transfer Protocol (NNTP), which allows it to exchange news with any Usenet-compatible news server. It also supports rich-content postings, which means you can embed graphics or MIME documents directly in the news posting.
As traffic across the Internet gateway and the wide area network grows at exponential rates, so does network congestion. This situation presents numerous challenges to network administrators, who need to manage network traffic, prevent access to inappropriate content, protect their network from viruses, and selectively allow encrypted traffic in and out.
If you are bothered with slow response times due to high network traffic, you may not need to get a faster connection. The Netscape Proxy Server may provide a solution. By replicating and filtering Web content, you can speed up access considerably. Using the “Replication-on-demand” feature, documents or entire sites can be downloaded ahead of time, thereby speeding up access. One example implementation of this would be a corporate Web that is managed at a single location. The entire contents of the Web (or selected pieces) could be replicated to the remote offices.
Sites on the Internet can also be filtered out as needed. Therefore, sites with “questionable” content can be restricted. Access can be controlled through username/password combinations, IP address or wildcards.
Now that we have gone over the basics of setting up our Internet server, we can address the other issues necessary to make a really first rate server. That is the subject of the next chapter.