This manual covers the main aspects of TCP/IP and Ethernet in detail, including practical implementation, troubleshooting and maintenance.
Revision 6
Website: www.idc-online.com
E-mail: idc@idc-online.com
IDC Technologies Pty Ltd
PO Box 1093, West Perth, Western Australia 6872
Offices in Australia, New Zealand, Singapore, United Kingdom, Ireland, Malaysia, Poland, United States of America, Canada, South Africa and India
Copyright © IDC Technologies 2013. All rights reserved.
First published 2001
ISBN: 978-1-922062-07-9
All rights to this publication, associated software and workshop are reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher. All enquiries should be made to the publisher at the address above.
Whilst all reasonable care has been taken to ensure that the descriptions, opinions, programs, listings, software and diagrams are accurate and workable, IDC Technologies do not accept any legal responsibility or liability to any person, organization or other entity for any direct loss, consequential loss or damage, however caused, that may be suffered as a result of the use of this publication or the associated workshop and software.
In case of any uncertainty, we recommend that you contact IDC Technologies for clarification or assistance.
All logos and trademarks belong to, and are copyrighted to, their companies respectively.
IDC Technologies expresses its sincere thanks to all those engineers and technicians on our training workshops who freely made available their expertise in preparing this manual.
One of the great protocols that has been inherited from the Internet is TCP/IP and this is being used as the open standard today for all network and communications systems. The reasons for this popularity are not hard to find. TCP/IP and Ethernet are truly open standards available to competing manufacturers and provide the user with a common standard for a variety of products from different vendors. In addition, the cost of TCP/IP and Ethernet is relatively low. Initially TCP/IP was used extensively in military applications and the purely commercial world such as banking, finance, and general business. But of great interest has been the strong movement to universal usage by the hitherto disinterested industrial and manufacturing spheres of activity who has traditionally used their own proprietary protocols and standards. These proprietary standards have been almost entirely replaced by the usage of the TCP/IP suite of protocols.
This is a hands-on book that has been structured to cover the main areas of TCP/IP and Ethernet in detail, while covering the practical implementation of TCP/IP in computer and industrial applications. Troubleshooting and maintenance of TCP/IP networks and communications systems in industrial environment will also be covered.
After reading this manual we would hope you would be able to:
Typical people who will find this book useful include:
You should have a modicum of computer knowledge and know how to use the Microsoft Windows operating system in order to derive maximum benefit from this book.
The structure of the book is as follows.
Chapter 1: Introduction to Communications. This chapter gives a brief overview of what is covered in the book with an outline of the essentials of communications systems.
Chapter 2: Networking Fundamentals. An overview of network communication, types of networks, the OSI model, network topologies and media access methods.
Chapter 3: Half-duplex (CSMA/CD) Ethernet Networks. A description of the operation and performance of the older 10 Mbps Ethernet networks commencing with the basic principles.
Chapter 4: Fast and Gigabit Ethernet Systems. A minimum speed of 100 Mbps is becoming de rigeur on most Ethernet networks and this chapter examines the design and installation issues for Fast Ethernet and Gigabit Ethernet systems, which go well beyond the traditional 10 Mbps speed of operation.
Chapter 5: Introduction to TCP/IP. A brief review of the origins of TCP/IP to lay the foundation for the following chapters.
Chapter 6: Internet layer protocols. This chapter fleshes out the Internet Protocol (both IPv4 and IPv6) – perhaps the workhorses of the TCP/IP suite of protocols – and also examines the operation of ARP, RARP and ICMP.
Chapter 7: Host-to-Host (Transport) layer protocols. The TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are both covered in this chapter.
Chapter 8: Application layer protocols. A thorough coverage of the most important Application layer protocols such as FTP, TFTP, TELNET, DNS, WINS, SNMP, SMTP, POP, BOOTP and DHCP.
Chapter 9: TCP/IP utilities. A coverage focusing on the practical application of the main utilities such as PING, ARP, NETSTAT NBTSTAT, IPCONFIG, WINIPCFG, TRACERT, ROUTE and the HOSTS file.
Chapter 10: LAN system components. A discussion on the key components for interconnecting networks such as repeaters, bridges, switches and routers.
Chapter 11: VLANs. An overview of Virtual LANS; what they are used for, how they are set up, and how they operate.
Chapter 12: VPNs. This chapter discusses the rationale behind VPN deployment and takes a look at the various protocols and security mechanisms employed.
Chapter 13: The Internet for communication. The various TCP/IP protocols used for VoIP (Voice over IP) are discussed here, as well as the H.323 protocols and terminology.
Chapter 14: Security considerations. The security problem and methods of controlling access to a network will be examined in this chapter. This is a growing area of importance due to the proliferation attacks on computer networks by external parties.
Chapter 15: Process automation. The legacy architectures and the factory of the future will be examined here together with an outline of the key elements of the modern Ethernet and TCP/IP architecture.
Chapter 16: Troubleshooting Ethernet. Various troubleshooting techniques as well as the required equipment used will be described here, focusing on the medium as well as layers 1 and 2 of the OSI model.
Chapter 17: Troubleshooting TCP/IP. This chapter covers the troubleshooting and maintenance of a TCP/IP network, focusing on layers 3 and 4 of the OSI model and dealing with the use of protocol analyzers.
Chapter 18: Satellites and TCP/IP. An overview of satellites and the problems/solutions associated with running TCP/IP over a satellite link.
When you have completed study of this chapter you should:
Communications systems transfer messages from one location to another. The information component of a message is usually known as data (derived from the Latin word for items of information). All data is made up of unique code symbols or other entities on which the sender and receiver of the messages have agreed. For example, binary data is represented by two states viz. ‘0’ and ‘1’. These are referred to as binary digits or ‘bits’ and are represented inside computers by the level of the electrical signals within storage elements; a high level could represent a ‘1’, and a low-level could represent a ‘0’. Alternatively, the data may be represented by the presence or absence of light in an optical fiber cable.
A communications process requires the following components:
This process is shown in Figure 1.1.
The transmitter encodes the information into a suitable form to be transmitted over the communications channel. The communications channel moves this signal from the source to one or more destination receivers. The channel may convert this energy from one form to another, such as electrical to optical signals, whilst maintaining the integrity of the information so the recipient can understand the message sent by the transmitter.
For the communications to be successful the source and destination must use a mutually agreed method of conveying the data.
The main factors to be considered are:
The form of the physical connections is defined by interface standards. Some agreed-upon coding is applied to the message and the rules controlling the data flow and the detection and correction of errors are known as the protocol.
An interface standard defines the electrical and mechanical aspects of the interface to allow the communications equipment from different manufacturers to interoperate.
A typical example is the TIA-232-F interface standard (commonly known as RS-232). This specifies the following three components:
It should be emphasized that the interface standard only defines the electrical and mechanical aspects of the interface between devices and does not define how data is transferred between them.
A wide variety of codes have been used for communications purposes. Early telegraph communications used Morse code with human operators as transmitter and receiver. The Baudot code introduced a constant 5-bit code length for use with mechanical telegraph transmitters and receivers. The commonly used codes for data communications today are the Extended Binary Coded Decimal Interchange Code (EBCIDIC) and the American Standard Code for Information Interchange (ASCII).
A protocol is essential for defining the common message format and procedures for transferring data between all devices on the network. It includes the following important features:
An analog communications channel conveys signals that change continuously in both frequency and amplitude. A typical example is a sine wave as illustrated in Figure 1.2. On the other hand, digital transmission employs a signal of which the amplitude varies between a few discrete states. An example is shown in Figure 1.3.
As the signal travels along a communications channel its amplitude decreases as the physical medium resists the flow of the signal energy. This effect is known as signal attenuation. With electrical signaling some materials such as copper are very efficient conductors of electrical energy. However, all conductors contain impurities that resist the movement of the electrons that constitute the electric current. The resistance of the conductors causes some of the electrical energy of the signal to be converted to heat as the signal progresses along the cable resulting in a continuous decrease in the electrical signal. The signal attenuation is measured in terms of signal loss per unit length of the cable, typically dB/km.
To allow for attenuation, a limit is set for the maximum length of the communications channel. This is to ensure that the attenuated signal arriving at the receiver is of sufficient amplitude to be reliably detected and correctly interpreted. If the channel is longer than this maximum specified length, repeaters must be used at intervals along the channel to restore the signal to acceptable levels.
Signal attenuation increases as the frequency increases. This causes distortion to practical signals containing a range of frequencies. This problem can be overcome by the use of amplifiers that amplify the higher frequencies by greater amounts.
The quantity of information a channel can convey over a given period is determined by its ability to handle the rate of change of the signal, i.e. the signal frequency. The bandwidth of an analog channel is the difference between the highest and lowest frequencies that can be reliably transmitted over the channel. These frequencies are often defined as those at which the signal at the receiving end has fallen to half the power relative to the mid-band frequencies (referred to as the -3 dB points), in which case the bandwidth is known as the -3 dB bandwidth.
Digital signals are made up of a large number of frequency components, but only those within the bandwidth of the channel will be able to be received. It follows that the larger the bandwidth of the channel, the higher the data transfer rate can be and more high frequency components of the digital signal can be transported, and so a more accurate reproduction of the transmitted signal can be received.
The maximum data transfer rate (C) of the transmission channel can be determined from its bandwidth, by use of the following formula derived by Shannon.
C= 2B log2 M bps
Where:
B = bandwidth in hertz and M levels are used for each signaling element.
In the special case where only two levels, ‘ON’ and ‘OFF’ are used (binary), M = 2 and C = 2 B. For example, the maximum data transfer rate for a PSTN channel with 3200 hertz bandwidth carrying a binary signal would be 2 × 3200 = 6400 bps. The achievable data transfer rate would then be reduced by half because of the Nyquist rate. It is further reduced in practical situations because of the presence of noise on the channel to approximately 2400 bps unless some modulation system is used.
As the signals pass through a communications channel the atomic particles and molecules in the transmission medium vibrate and emit random electromagnetic signals as noise. The strength of the transmitted signal is normally large relative to the noise signal. However, as the signal travels through the channel and is attenuated, its level can approach that of the noise. When the wanted signal is not significantly higher than the background noise, the receiver cannot separate the data from the noise and communication errors occur.
An important parameter of the channel is the ratio of the power of the received signal (S) to the power of the noise signal (N). The ratio S/N is called the Signal to Noise ratio, normally expressed in decibels (dB).
S/N = 10 log10 (S/N) dB
Where signal and noise levels are expressed in watts, or
S/N = 20 log10 (S/N) dB
Where signal and noise levels are expressed in volts
A high signal to noise ratio means that the wanted signal power is high compared to the noise level, resulting in good quality signal reception
The theoretical maximum data transfer rate for a practical channel can be calculated using the Shannon-Hartley law, which states that:
C = B log2 (1+S/N) bps
Where:
C = data rate in bps
B = bandwidth of the channel in hertz
S = signal power in watts and
N = noise power in watts
It can be seen from this formula that increasing the bandwidth or increasing the S/N ratio will allow increases to the data rate, and that a relatively small increase in bandwidth is equivalent to a much greater increase in S/N ratio.
Digital transmission channels make use of higher bandwidths and digital repeaters or regenerators to regenerate the signals at regular intervals and maintain acceptable signal to noise ratios. The degraded signals received at the regenerator are detected, then retimed and retransmitted as nearly perfect replicas of the original digital signals, as shown in Figure 1.7. Provided the signal to noise ratios is maintained in each link, there is no accumulated noise on the signal, even when transmitted over thousands of kilometers.
Simplex
A simplex channel is unidirectional and allows data to flow in one direction only, as shown in Figure 1.8. Public radio broadcasting is an example of a simplex transmission. The radio station transmits the broadcast program, but does not receive any signals back from the receiver(s).
This has limited use for data transfer purposes, as we invariably require the flow of data in both directions to control the transfer process, acknowledge data etc.
Half-duplex
Half-duplex transmission allows simplex communication in both directions over a single channel, as shown in Figure 1.9. Here the transmitter at ‘A’ sends data to a receiver at ‘B’. A line turnaround procedure takes place whenever transmission is required in the opposite direction. Transmitter ‘B’ is then enabled and communicates with receiver ‘A’. The delay in the line turnaround reduces the available data throughput of the channel.
Full-duplex
A full-duplex channel gives simultaneous communications in both directions, as shown in Figure 1.10.
Data communications depends on the timing of the signal transmission and reception being kept correct throughout the message transmission. The receiver needs to look at the incoming data at correct instants before determining whether a ‘1’ or ‘0’ was transmitted. The process of selecting and maintaining these sampling times is called synchronization.
In order to synchronize their transmissions, the transmitting and receiving devices need to agree on the length of the code elements to be used, known as the bit time. The receiver also needs to synchronize its clock with that of the sender in order to determine the right times at which to sample the data bits in the message. A device at the receiving end of a digital channel can synchronize itself using either asynchronous or synchronous means as outlined below.
Here the transmitter and receiver operate independently, and the receiver synchronizes its clock with that of the transmitter (only) at the start of each message frame. Transmissions are typically around one byte in length, but can also be longer. Often (but not necessarily) there is no fixed relationship between one message frame and the next, such as a computer keyboard input with potentially long random pauses between keystrokes.
At the receiver the channel is monitored for a change in voltage level. The leading edge of the start bit is used to set a bistable multivibrator (flip-flop), and half a bit period later the output of the flip-flop is compared (ANDed) with the incoming signal. If the start bit is still present, i.e. the flip-flop was set by a genuine start bit and not by a random noise pulse, the bit clocking mechanism is set in motion. In fixed-rate systems this half-bit delay can be generated with a monostable (one-shot) multivibrator, but in variable-rate systems it is easier to feed a 4-bit (binary or BCD) counter with a frequency equal to 16 times the data frequency and observing the ‘D’ (most significant bit) output, which changes from 0 to 1 at a count of eight, half-way through the bit period.
The input signal is fed into a serial-in parallel-out shift register. The data bits are then captured by clocking the shift register at the data rate, in the middle of each bit. For an eight-bit serial transmission (data plus parity), this sampling is repeated for each of the eight data bits and a final sample is made during the ninth time interval to identify the stop bit and to confirm that the synchronization has been maintained to the end of the frame. Figure 1.13 illustrates the asynchronous data reception process.
The receiver here is initially synchronized with the transmitter, and then maintains this synchronization throughout the continuous transmission of multiple bytes. This is achieved by special data coding schemes, such as Manchester, which ensure that the transmitted clock is encoded into the transmitted data stream. This enables the synchronization to be maintained at the receiver right to the last bit of the message, and thus allows larger frames of data (up to several thousand bytes) to be efficiently transferred at high data rates.
The synchronous system packs many bytes together and sends them as a continuous stream, called a frame. Each frame consists of a header, data field and checksum. Clock synchronization is provided by a predefined sequence of bits called a preamble.
In some cases the preamble is terminated with a ‘Start of Frame Delimiter’ or SFD. Some systems may append a post-amble if the receiver is not otherwise able to detect the end of the message. An example of a synchronous frame is shown in Figure 1.14. Understandably all high-speed data transfer systems utilize synchronous transmission systems to achieve fast, accurate transfers of large amounts of data.
Manchester is a bi-phase signal-encoding scheme, and is used in the older 10 Mbps Ethernet LANs. The direction of the transition in mid-interval (negative to positive or positive to negative) indicates the value (‘1’ or ‘0’, respectively) and provides the clocking.
The Manchester codes have the advantage that they are self-clocking. Even a sequence of one thousand ‘0s’ will have a transition in every bit; hence the receiver will not lose synchronization. The price paid for this is a bandwidth requirement double that which is required by the RZ-type methods.
The Manchester scheme follows these rules:
In Manchester encoding, the beginning of a bit interval is used merely to set the stage. The activity in the middle of each bit interval determines the bit value: upward transition for a ‘1’ bit, downward for a ‘0’ bit.
Differential Manchester is a bi-phase signal-encoding scheme used in Token Ring LANs. The presence or absence of a transition at the beginning of a bit interval indicates the value; the transition in mid-interval just provides the clocking.
For electrical signals, bit values will generally be represented by one of three possible voltage levels: positive (+V), zero (0 V), or negative (–V). Any two of these levels are needed – for example, + V and –V.
There is a transition in the middle of each bit interval. This makes the encoding method self-clocking, and helps avoid signal distortion due to DC signal components.
For one of the possible bit values but not the other, there will be a transition at the start of any given bit interval. For example, in a particular implementation, there may be a signal transition for a ‘1’ bit.
In differential Manchester encoding, the presence or absence of a transition at the beginning of the bit interval determines the bit value. In effect, ‘1’ bits produce vertical signal patterns; ‘0’ bits produce horizontal patterns. The transition in the middle of the interval is just for timing.
The RZ-type codes consume only half the bandwidth taken up by the Manchester codes. However, they are not self-clocking since a sequence of a thousand ‘0s’ will result in no movement on the transmission medium at all.
RZ is a bipolar signal-encoding scheme that uses transition coding to return the signal to a zero voltage during part of each bit interval. It is self-clocking.
In the differential version, the defining voltage (the voltage associated with the first half of the bit interval) changes for each ‘1’ bit, and remains unchanged for each ‘0’ bit.
In the non-differential version, the defining voltage changes only when the bit value changes, so that the same defining voltages are always associated with ‘0’ and ‘1’. For example, +5 volts may define a ‘1’, and –5 volts may define a ‘0’.
NRZ is a bipolar encoding scheme. In the non-differential version it associates, for example, +5 V with ‘1’ and –5 V with ‘0’.
In the differential version, it changes voltages between bit intervals for ‘1’ values but not for ‘0’ values. This means that the encoding changes during a transmission. For example, ‘0’ may be a positive voltage during one part and a negative voltage during another part, depending on the last occurrence of a ‘1’. The presence or absence of a transition indicates a bit value, not the voltage level.
MLT-3 is a three-level encoding scheme that can also scramble data. This scheme is one proposed for use in FDDI networks. The MLT-3 signal-encoding scheme uses three voltage levels (including a zero level) and changes levels only when a ‘1’ occurs.
It follows these rules:
MLT-3 is not self-clocking, so that a synchronization sequence is needed to make sure the sender and receiver are using the same timing.
The Manchester codes, as used for 10 Mbps Ethernet, are self-clocking but consume unnecessary bandwidth (at 10 Mbps it introduces a 20 MHz frequency component on the medium). For this reason it is not possible to use it for 100 Mbps Ethernet, even over Cat5 cable. A solution is to revert back to one of the more bandwidth efficient methods such as NRZ or RZ. The problem with these, however, is that they are not self-clocking and hence the receiver loses synchronization if several zeros are transmitted sequentially. This problem, in turn, is overcome by using the 4B/5B technique.
The 4B/5B technique codes each group of four bits into a five-bit code. For example, the binary pattern 0110 is coded into the five-bit pattern 01110. This code table has been designed in such a way that no combination of data can ever be encoded with more than 3 zeros on a row. This allows the carriage of 100 Mbps data by transmitting at 125 MHz, as opposed to the 200 Mbps required by Manchester encoding.
All practical data communications channels are subject to noise, particularly where equipment is situated in industrial environments with high electrical noise, such as electromagnetic radiation from adjacent equipment or electromagnetic induction from adjacent cables. As a consequence the received data may contain errors. To ensure reliable data communication we need to check the accuracy of each message.
Asynchronous systems often use a single bit checksum, the parity bit, for each message, calculated from the seven or eight data bits in the message. Longer messages require more complex checksum calculations to be effective. For example the Longitudinal Redundancy Check (LRC) calculates an additional byte covering the content of the message (up to 15 bytes) while a two-byte arithmetic checksum can be used for messages up to 50 bytes in length. Most high-speed LANs use a 32-bit Cyclic Redundancy Check (CRC)
The CRC method detects errors with very high degree of accuracy in messages of any length and can, for example, detect the presence of a single bit error in a frame containing tens of thousands of bits of data. The CRC treats all the bits of the message block as one binary number that is then divided by a known polynomial. For a 32-bit CRC this is a specific 32-bit number, specially chosen to detect very high percentages of errors, including all error sequences of less than 32 bits. The remainder found after this division process is the CRC. Calculation of the CRC is carried out by the hardware in the transmission interface of LAN adapter cards.
When you have completed study of this chapter you should be able to:
Linking computers and other devices together to share information is nothing new. The technology for Local Area Networks (LANs) was developed in the 1970s by minicomputer manufacturers to link widely separated user terminals to computers. This allowed the sharing of expensive peripheral equipment as well as data that may have previously existed in only one physical location.
A LAN is a communications path between two or more computers, file servers, terminals, workstations and various other intelligent peripheral equipment, generally referred to as devices or hosts. A LAN allows access to devices to be shared by several users, with full connectivity between all stations on the network. It is usually owned and administered by a private owner and is located within a localized group of buildings.
The connection of a device such as a PC or printer to a LAN is generally made through a Network Interface Card or NIC. The networked device is then referred to as a node. Each node is allocated a unique address and every message sent on the LAN to a specific node must contain the address of that node in its header. All nodes continuously watch for any messages sent to their own addresses on the network. LANs operate at relatively high speeds (Mbps range and upwards) with a shared transmission medium over a fairly small geographical (i.e. local) area.
On a LAN the software controlling the transfer of messages among the devices on the network must deal with the problem of sharing the common resources of the network without conflict or corruption of data. Since many users can access the network at the same time, some rules must be established by which devices can access the network, when, and under what conditions. These rules are covered under the general heading of Media Access Control.
When a node has access to the channel to transmit data, it sends the data within a packet (or frame) that generally includes, in its header, the addresses of both the source and the destination nodes. This allows each node to either receive or ignore data on the network.
There are two basic types of communications processes for transferring data across networks, viz. circuit switching and packet switching. These are illustrated in Figure 2.1
In a circuit switched process a continuous connection is made across the network between the two different points. This is a temporary connection that remains in place as long as both parties wish to communicate, i.e. until the connection is terminated. All the network resources are available for the exclusive use of these two parties whether they are sending data or not. When the connection is terminated the network resources are released for other users. A call in an older (non-digital) telephone system is an example of a circuit switched connection.
The advantage of circuit switching is that the users have an exclusive channel available for the transfer of their data at any time while the connection is made. The obvious disadvantage is the cost of maintaining the connection when there is little or no data being transferred. Such connections can be very inefficient for the bursts of data that are typical of many computer applications.
Packet switching systems improve the efficiency of the transfer of bursts of data, by sharing the one communications channel with other similar users. This is analogous to the efficiencies of the mail system as discussed in the following paragraph.
When you send a letter by mail you post the stamped, addressed envelope containing the letter in your local mailbox. At regular intervals the mail company collects all the letters from your locality and takes them to a central sorting facility where the letters are sorted in accordance with the addresses of their destinations. All the letters for each destination are sent off in common mailbags to those locations, and are subsequently delivered in accordance with their addresses. Here we have economies of scale where many letters are carried at one time and are delivered by the one visit to your street/locality. Efficiency is more important than speed, and some delay is normal – within acceptable limits.
Packet switched messages are broken into a series of packets of certain maximum size, each containing the destination and source addresses and a packet sequence number. The packets are sent over a common communications channel, interleaved with those of other users. The ‘switches’ (routers) in the system forward the messages based on their destination address. Messages sent in multiple packets are reassembled in the correct order by the destination node.
All packets do not necessarily follow the same path. As they travel through the network they may get separated and handled independently from each other, but eventually arrive at their correct destination albeit out of their transmitted sequence. Some packets may even be held up temporarily (stored) at a node, due to unavailable lines or technical problems that might arise on the network. When the time is right, the node then allows the packet to pass or be ‘forwarded’.
The Internet is an example of a global packet switching network.
Packet switched services generally support two types of services viz. datagram services and virtual circuits. In a self-contained LAN all packets will eventually reach their destination. However, if packets are to be switched ACROSS networks i.e. on an internetwork such as a Wide Area Network (WAN), then a routing decision must be made.
There are two possible approaches. The first is referred to as a ‘datagram’ service. The destination address incorporated in the data header will allow the routing to be performed. There is no guarantee when any packet will arrive at its destination, and sequential packets may well arrive out of sequence. The principle is similar to the mail service. You may send four postcards from your holiday in the South of France, but there is no guarantee that they will arrive in the same order that you posted them. If the recipient does not have a telephone, there is no easy method of determining that they have, in fact, been delivered.
Such a service is called an ’unreliable’ service. The term ‘unreliable’ is here not used in its everyday context, but instead refers to the fact that there is no mechanism for informing the sender whether the packet had been delivered or not. The service is also called ‘connectionless’ since there is no logical connection between sender and recipient.
The second approach is to set up a logical connection between transmitter and receiver, and to send packets of data along this connection or ‘virtual circuit’. Whilst this might seem to be in conflict with the earlier statements on circuit switching, it should be quite clear that this does NOT imply a permanent physical circuit being dedicated to the one packet stream of data. Instead, the circuit shares its capacity with other traffic. The important point to note is that the route for the data packets to follow is taken up-front when all the routing decisions are taken. The data packets just follow that pre-established route. This service is known as ‘reliable’ and is also referred to as a connection oriented service.
LANs are characterized by high-speed transmission over a restricted geographical area. Gigabit Ethernet (1000BaseX), for example, operates at 1000 Mbps. The restriction on size, though, is a deployment and cost issue and not a technical one, as current Ethernet technology allows switches to be interconnected with fiber into networks covering distances of thousands of kilometers.
While LANs operate where distances are relatively small, Wide Area Networks (WANs) are used to link LANs that are separated by large distances that range from a few tens of meters to thousands of kilometers. WANs normally use the public telecommunication system to provide cost-effective connection between LANs. Since these links are supplied by independent telecommunications utilities, they are commonly referred to (and illustrated as) a ‘communications cloud’. Routers perform this type of activity. They store the message at LAN speed and re-transmit it across the communications cloud at a different speed. When the entire message has been received at the access router of the remote LAN, it is once again forwarded at LAN speed. A typical speed at which a WAN interconnects varies between 9600 bps for a leased line to 40 Gbps for SDH/SONET. This concept is shown in Figure 2.3.
If reliability is needed for a time critical application, WANs can be considered quite unreliable, as delay in the information transmission is varied and wide. For this reason, WANs can only be used if the necessary error detection/correction software is in place, and if propagation delays can be tolerated within certain limits.
An intermediate type of network – a Metropolitan Area Network (MAN) – operates typically at 100Mbps. MANs use fiber optic technology to communicate over distances of up to several hundred kilometers. They are normally used by telecommunication service providers or utilities within or around cities. The distinction between MANs and LANs is, however, becoming rather academic since current Ethernet LAN technology (100 Mbps switches with 120 km fiber links) are also used to implement MANs.
The coupling ratio provides an academic yardstick for comparing the performance of these different kinds of networks. It is useful to give us an insight into the way these networks operate.
Coupling ratio α = τ / T
Where:
τ Propagation delay for packet
T Average packet transmission time
α = <<1 indicates a LAN
α = 1 indicates a MAN
α = >>1 indicates a WAN
This is illustrated in the following examples and Figure 2.4.
A cheaper alternative to a WAN, which uses dedicated packet switched links (such as X.25) to interconnect two or more LANs, is the Virtual Private Network (VPN), which interconnects LANs by utilizing the existing Internet infrastructure.
A potential problem is the fact that the traffic between the networks shares all the other Internet traffic and hence all communications between the LANs are visible to the outside world. This problem is solved by utilizing encryption techniques to make all communications between the LANs transparent (i.e. illegible) to other Internet users.
A communication framework that has had a tremendous impact on the design of LANs is the Open Systems Interconnection (OSI) model. The objective of this model is to provide a framework for the coordination of standards development and allows both existing and evolving standards activities to be set within that common framework.
The wiring together of two or more devices with digital communication is the first step towards establishing a network. In addition to the hardware requirements as discussed above, the software problems of communication must also be overcome. Where all the devices on a network are from the same manufacturer, the hardware and software problems are usually easily overcome because all the system components have usually been designed within the same guidelines and specifications.
When devices from several manufacturers are used on the same application, the problems seem to multiply. Networks that are specific to one manufacturer and work with proprietary hardware connections and protocols are called closed systems. Usually these systems were developed at a time before standardization became popular, or when it was considered unlikely that equipment from other manufacturers would be included in the network.
In contrast, ‘open’ systems conform to specifications and guidelines that are ‘open’ to all. This allows equipment from any manufacturer that complies with that standard to be used interchangeably on the network. The benefits of open systems include wider availability of equipment, lower prices and easier integration with other components.
Faced with the proliferation of closed network systems, the International Organization for Standardization (ISO) defined a ‘Reference Model for Communication between Open Systems’ (ISO 7498) in 1978. This has since become known as the OSI model. The OSI model is essentially a data communications management structure, which breaks data communications down into a manageable hierarchy (‘stack’) of seven layers. Each layer has a defined purpose and interfaces with the layers above it and below it. By laying down functions and services for each layer, some flexibility is allowed so that the system designers can develop protocols for each layer independently of each other. By conforming to the OSI standards, a system is able to communicate with any other compliant system, anywhere in the world.
The OSI model supports a client/server model and since there must be at least two nodes to communicate, each layer also appears to converse with its peer layer at the other end of the communication channel in a virtual (‘logical’) communication. The concept of isolation of the process of each layer, together with standardized interfaces and peer-to-peer virtual communication, are fundamental to the concepts developed in a layered model such as the OSI model. This concept is shown in Figure 2.5.
The actual functions within each layer are provided by entities (abstract devices such as programs, functions, or protocols) that implement the services for a particular layer on a single machine. A layer may have more than one entity – for example a protocol entity and a management entity. Entities in adjacent layers interact through the common upper and lower boundaries by passing physical information through Service Access Points (SAPs). A SAP could be compared to a predefined ‘postbox’ where one layer would collect data from the previous layer. The relationship between layers, entities, functions and SAPs is shown in Figure 2.6.
In the OSI model, the entity in the next higher layer is referred to as the N+1 entity and the entity in the next lower layer as N–1. The services available to the higher layers are the result of the services provided by all the lower layers.
The functions and capabilities expected at each layer are specified in the model. However, the model does not prescribe how this functionality should be implemented. The focus in the model is on the ‘interconnection’ and on the information that can be passed over this connection. The OSI model does not concern itself with the internal operations of the systems involved.
When the OSI model was being developed, a number of principles were used to determine exactly how many layers this communication model should encompass. These principles are:
The use of these principles led to seven layers being defined, each of which has been given a name in accordance with its process purpose. The diagram below shows the seven layers of the OSI model.
The service provided by any layer is expressed in the form of a service primitive with the data to be transferred as a parameter. A service primitive is a fundamental service request made between protocols. For example, layer W may sit on top of layer X. If W wishes to invoke a service from X, it may issue a service primitive in the form of X.Connect.request to X. An example of a service primitive is shown in Figure 2.8. Service primitives are normally used to transfer data between processes within a node.
Typically, each layer in the transmitting site, with the exception of the lowest, adds header information, or Protocol Control Information (PCI), to the data before passing it through the interface between adjacent layers. This interface defines which primitive operations and services the lower layer offers to the upper one. The headers are used to establish the peer-to-peer sessions across the sites and some layer implementations use the headers to invoke functions and services at the N+1 or N–1 adjacent layers.
At the transmitter, the user application (e.g. the client) invokes the system by passing data, primitive names and control information to the highest layer of the protocol stack. The stack then passes the data down through the seven layers, adding headers (and possibly trailers), and invoking functions in accordance with the rules of the protocol at each layer. At each level, this combined data and header is called a Protocol Data Unit or PDU. At the receiving site, the opposite occurs with the headers being stripped from the data as it is passed up through the layers. These header and control messages invoke services and a peer-to-peer logical interaction of entities across the sites. Generally speaking, layers in the same stack communicate with parameters passed through primitives, and peer layers communicate with the use of the PCI (headers) across the network.
At this stage it should be quite clear that there is no physical connection or direct communication between the peer layers of the communicating applications. Instead, all physical communication is across the Physical, or lowest layer of the stack. Communication is down through the protocol stack on the transmitting node and up through the stack on the receiving node. Figure 2.9 shows the full architecture of the OSI model, whilst Figure 2.10 shows the effects of the addition of headers to the respective PDUs at each layer. The net effect of this extra information is to reduce the overall bandwidth of the communications channel, since some of the available bandwidth is used to pass control information.
Briefly, the services provided at each layer of the stack are:
The Application layer is the topmost layer in the OSI reference model. This layer is responsible for giving applications access to the network. Examples of application-layer tasks include file transfer, electronic mail (e-mail) services, and network management. Application-layer services are much more varied than the services in lower layers, because the entire range of application and task possibilities is available here. The specific details depend on the framework or model being used. For example, there are several network management applications. Each of these provides services and functions specified in a different framework for network management. Programs can get access to the application-layer services through Application Service Elements (ASEs). There are a variety of such application service elements; each designed for a class of tasks. To accomplish its tasks, the application layer passes program requests and data to the presentation layer, which is responsible for encoding the Application layer’s data in the appropriate form.
The Presentation layer is responsible for presenting information in a manner suitable for the applications or users dealing with the information. Functions such as data conversion from EBCDIC to ASCII (or vice versa), the use of special graphics or character sets, data compression or expansion, and data encryption or decryption are carried out at this layer. The Presentation layer provides services for the Application layer above it, and uses the Session layer below it. In practice, the presentation layer rarely appears in pure form, and it is the least well defined of the OSI layers. Application- or Session-layer programs will often encompass some or all of the Presentation layer functions.
The Session layer is responsible for synchronizing and sequencing the dialog and packets in a network connection. This layer is also responsible for ensuring that the connection is maintained until the transmission is complete and that the appropriate security measures are taken during a ‘session’ (i.e., a connection). The Session layer is used by the Presentation layer above it, and uses the Transport layer below it.
Transport layer
In the OSI reference model, the Transport layer is responsible for providing data transfer at an agreed-upon level of quality, such as at specified transmission speeds and error rates. To ensure delivery, some Transport layer protocols assign sequence numbers to outgoing packets. The Transport layer at the receiving end checks the packet numbers to make sure all have been delivered and to put the packet contents into the proper sequence for the recipient. The Transport layer provides services for the session layer above it, and uses the Network layer below it to find a route between source and destination. The Transport layer is crucial in many ways, because it sits between the upper layers (which are strongly application-dependent) and the lower ones (which are network-based).
The layers below the Transport layer are collectively known as the ‘subnet’ layers. Depending on how well (or not) they perform their functions, the Transport layer has to interfere less (or more) in order to maintain a reliable connection.
The Network layer is the third layer from the bottom up, or the uppermost ‘subnet layer’. It is responsible for the following tasks:
The Data Link layer is responsible for creating, transmitting, and receiving data packets. It provides services for the various protocols at the Network layer, and uses the Physical layer to transmit or receive material. The Data Link layer creates packets appropriate for the network architecture being used. Requests and data from the Network layer are part of the data in these packets (or frames, as they are often called at this layer). These packets are passed down to the Physical layer and from there they are transmitted to the Physical layer on the destination host via the medium. Network architectures (such as Ethernet, ARCnet, Token Ring, and FDDI) encompass the Data Link and Physical layers.
The IEEE 802 networking working groups have refined the Data Link layer into two sub-layers viz, the Logical Link Control (LLC) sub-layer at the top and the Media Access Control (MAC) sub-layer at the bottom. The LLC sub-layer provides an interface for the Network layer protocols, and control the logical communication with its peer at the receiving side. The MAC sub-layer controls physical access to the medium.
The Physical layer is the lowest layer in the OSI reference model. This layer gets data packets from the Data Link layer above it, and converts the contents of these packets into a series of electrical signals that represent ‘0’ and ‘1’ values in a digital transmission. These signals are sent across a transmission medium to the Physical layer at the receiving end. At the destination, the physical layer converts the electrical signals into a series of bit values. These values are grouped into packets and passed up to the Data Link layer.
The required mechanical and electrical properties of the transmission medium are defined at this level. These include:
The medium itself is, however, not specific here. For example, Fast Ethernet dictates that Cat5 cable should be used, but the cable itself is specified in TIA/EIA-568B.
Interoperability is the ability of network users to transfer information between different communications systems; irrespective of the way those systems are supported. One definition of interoperability is:
‘The capability of using similar devices from different manufacturers as effective replacements for each other without losing functionality or sacrificing the degree of integration with the host system. In other words, it is the capability of software and hardware systems on different devices to communicate together. This results in the user being able to choose the right devices for an application independent of the supplier, control system and the protocol.’
Internetworking is a term that is used to describe the interconnection of differing networks so that they retain their own status as a network. What is important in these concepts is that internetworking devices be made available so that the exclusivity of each of the linked networks is retained, but that the ability to share information, and physical resources if necessary, becomes both seamless and transparent to the end user.
At the plant floor level the requirement for all seven layers of the OSI model is often not required or appropriate. Hence a simplified OSI model is often preferred for industrial applications where time critical communications is more important than full communications functionality provided by the full seven layer model. Such a protocol stack is acceptable since there will be no internetworking at this level. A well-known stack is that of the S-50 Fieldbus standard, which is shown in Figure 2.11.
Generally most industrial protocols are written around the Data Link layer (to send/receive frames) and the Application layer (to deal with clients and servers). The Physical layer is required for access to the bus.
When the reduced OSI model is implemented the following limitations exist:
One of the challenges with the use of the OSI model is the concept of interoperability and the need for definition of another layer above the Application layer, called the ‘User’ layer. The user layer is not formally part of the OSI model but is found in systems such as DeviceNet, ProfiBus and Foundation Fieldbus to accommodate issues such as device profiles and software building blocks (‘function blocks’) for control.
Most modern field buses no longer adhere to the 3 layer concept, and implement a full stack (including TCP/IP) in order to provide routing and hence allow large-scale deployment. Examples are ProfiNet, IDA and Ethernet/IP.
From the point of view of internetworking, TCP/IP operates as a set of programs that interacts at the transport and network layer levels without needing to know the details of the technologies used in the underlying layers. As a consequence this has developed as a de facto industrial internetworking standard. Many manufacturers of proprietary equipment are using TCP/IP to facilitate internetworking.
A protocol has already been defined as the rules for exchanging data in a manner that is understandable to both the transmitter and the receiver. There must be a formal and agreed set of rules if the communication is to be successful. The rules generally relate to such responsibilities as error detection and correction methods, flow control methods, and voltage and current standards. However, there are other properties such as the size of the data packet that are important in the protocols that are used on LANs.
Another important responsibility is the method of routing the packet, once it has been assembled. In a self-contained LAN i.e. intranetwork, this is not a problem since all packets will eventually reach their destination by virtue of design. However, if the packet is to be switched across networks i.e. on an internetwork – such as a WAN – then a routing decision must be made. In this regard we have already examined the use of a datagram service vis à vis a virtual circuit.
In summary, there are many different types of protocols, but they can be classified in terms of their functional emphasis. One scheme of classification is:
The Institute of Electrical and Electronic Engineers in the USA has been given the task of developing standards for local area networking under the auspices of the IEEE 802 LAN/MAN Standards Committee (LMSC). Some IEEE LAN/MAN standards (but not all) are also ratified by the International Organization for Standardization and published as an ISO-IEC standard with an ‘8’ in front of the 802 designation. To date, the following LAN/MAN standards have been published by the ISO: ISO/IEC 8802.1, ISO/IEC 8802.2 ISO.IEC 8802.3, ISO/IEC 8802.5 and ISO/IEC 8802.11.
The LMSC assigns various topics to working groups and Technical Advisory Groups (TAGs). Some of them are listed below, but it must be kept in mind that it is an ever-changing situation.
This sub-committee is concerned with issues such as high level interfaces, internetworking and addressing.
There are a series of sub-committees (thirteen at present), such as:
This is the interface between the Network layer and the specific network environments at the Physical layer. The IEEE has divided the Data Link layer in the OSI model into two sub-layers viz. the Media Access Control (MAC) sub-layer, and the Logical Link Control (LLC) sub-layer. The LLC protocol is common for most IEEE 802 standard network types. This provides a common interface to the network layer of the protocol stack. The IEEE 802.2 protocol used at this sub-layer is based on IBM’s HDLC protocol, and can be used in three modes.
These are:
The Carrier Sense, Multiple Accesses with Collision Detection type LAN is commonly, but strictly speaking incorrectly, known as Ethernet. Ethernet actually refers to the original DEC/INTEL/XEROX product known as Version II (Bluebook) Ethernet.
Subsequent to ratification this system has been known as IEEE 802.3. IEEE 802.3 is virtually, but not entirely, identical to Bluebook Ethernet. The frame contents differ marginally and the following chapter will deal with this anomaly.
Subsequently, several additional specifications have been approved such as IEEE 802.3u (100 Mbps or Fast Ethernet), IEEE 8023z (1000 Mbps or Gigabit Ethernet) and IEEE 802.3ae (10000 Mbps or Ten Gigabit Ethernet).
This standard is the ratified version of the original IBM Token Ring LAN. In IEEE 802.5, data transmission can only occur when a station holds a token. The logical structure of the network wiring is in the form of a ring, and each message must cycle through each station connected to the ring.
The original specification called for a single ring, which creates a problem if the ring gets broken. A subsequent enhancement of the specification, IEEE 802.5u, introduced the concept of a dual redundant ring, which enables the system to continue operating in case of a cable break.
The original IBM Token Ring specification supported speeds of 4 and 16 Mbps. IEEE 802.5 at first supported only 1 and 4 Mbps, but currently includes 100 Mbps and 1 Gbps versions.
The physical media for Token Ring can be unshielded twisted pair, shielded twisted pair, coaxial cable or optical fiber.
The IEEE 802.11 Wireless LAN standard uses the 2.4 GHz band and allows operation to 1 or 2 Mbps. The 802.11b standard also uses the 2.4 GHz band, but allows operation at 11 Mbps. The IEEE 802.11a specification uses the 5.7 GHz band instead and allows operation at 54 Mbps. The IEEE 802.11g specification allows operation at 54 Mbps using the 2.4 Ghz band. The next generation of products (IEEE 802.11n) will support 100 Mbps.
This specification covers the system known as 100VG AnyLAN. Developed by Hewlett-Packard, this system operates on voice grade (Cat3) cable – hence the VG in the name. The AnyLAN indicates that the system can interface with both IEEE 802.3 and IEEE 802.5 networks (by means of a special speed adaptation bridge).
Other working groups and TAGs include:
There is no ‘dot 13’ working group. Many of the older working groups and TAGS (IEEE 802.4,6,7,8,9,10 and 14) have been disbanded while new ones are added as and when required.
The way in which the nodes are connected to form a network is known as its topology. There are many topologies available but they can be categorized as either broadcast or point-to-point.
Broadcast topologies are those where the message ripples out from the transmitter to reach all nodes. There is no active regeneration of the signal by the nodes and so signal propagation is independent of the operation of the network electronics. This then limits the size of such networks.
Figure 2.12 shows an example of a broadcast topology.
In a point-to-point communications network, however, each node is communicating directly with only one node. That node may actively regenerate the signal and pass it on to its nearest neighbor. Such networks have the capability of being made much larger. Figure 2.13 shows some examples of point-to-point topologies.
A logical topology defines how the elements in the network communicate with each other and how information is transmitted through a network. The different types of media access methods determine how a node gets to transmit information along the network. In a bus topology information is broadcast, and every node gets the same information within the amount of time it actually takes a signal to propagate down the entire length of cable. This time interval limits the maximum speed and size for the network. In a ring topology each node hears from exactly one node and talks to exactly one other node. Information is passed sequentially, in a predefined order. A polling or token mechanism is used to determine who has transmission rights, and a node can transmit only when it has this right.
A physical topology defines the wiring layout for a network. This specifies how the elements in the network are connected to each other electrically. This arrangement will determine what happens if a node on the network fails. Physical topologies fall into three main categories viz. bus, star, and ring. Combinations of these can be used to form hybrid topologies in order to overcome weaknesses or restrictions in one or other of these three component topologies.
A bus refers to both a physical and a logical topology. As a physical topology a bus describes a network in which each node is connected to a common single communication channel or ‘bus’. This bus is sometimes called a backbone, as it provides the spine for the network. Every node can hear each message packet as it goes past.
Logically, a passive bus is distinguished by the fact that packets are broadcast and every node gets the message at the same time. Transmitted packets travel in both directions along the bus, and need not go through the individual nodes, as in a point-to-point system. Instead, each node checks the destination address included in the frame header to determine whether that packet is intended for it or not. When the signal reaches the end of the bus, an electrical terminator absorbs the packet energy to keep it from reflecting back again along the bus cable, possibly interfering with other frames on the bus. Both ends of the bus must be terminated, so that signals are removed from the bus when they reach the end.
If the bus is too long, it may be necessary to boost the signal strength using some form of amplification, or repeater. The maximum length of the bus is primarily limited by signal attenuation issues. Figure 2.14 illustrates the bus topology.
Advantages of a bus topology
Bus topologies offer the following advantages:
Disadvantages of a bus topology
These include:
In a physical star topology multiple nodes are connected to a central component, generally known as a hub. The hub of a star is often just a wiring center; i.e. a common termination point for the node cables. In some cases the hub may actually be a file server (a central computer that contains a centralized file and control system), with all the nodes attached to it with point-to-point links. When used as a wiring center a hub may, in turn, be connected to the file server or to another hub.
All frames going to and from each node must pass through the hub to which the node is connected. The telephone system is the best known example of a star topology, with lines to individual subscribers coming from a central telephone exchange location.
There are not many LAN implementations that use a logical star topology. The low impedance ARCNet networks are probably the best example. However, the physical layout of many other LANs physically resemble a star topology even though they are logically interconnected in a different way. An example of a star topology is shown in Figure 2.15.
Advantages of a star topology
Disadvantages of a star topology
In a physical ring topology the frames are transmitted sequentially from node to node, in a predefined order. Nodes are arranged in a closed loop, so that the initiating node is the last one to receive a frame. As such it is an example of a point-to-point system.
Information traverses a one-way path, so that a node receives frames from exactly one node and transmits them to exactly one other node. A message frame travels around the ring until it returns to the node that originally sent it. Each node checks whether the message frame’s destination node matches its address. When the frame reaches its destination, the destination node accepts the message and then sends it back to the sender in order to acknowledge receipt.
Since ring topologies use token passing to control access to the network, the token is returned to sender together with the acknowledgment. The sender then releases the token to the next node on the network. If this node has nothing to say, it passes the token on to the next node, and so on. When the token reaches a node with a frame to send, that node sends its frame. Physical ring networks are rare, because this topology has considerable disadvantages compared to a more practical star-wired ring hybrid, which is described later.
Advantages of a ring topology
Disadvantages of a ring topology
As well as these three main topologies, some of the more important variations will now be considered. These are just variations, and should not be considered as topologies in their own right.
Star-wired ring topology
IBM Token Ring networks are the best-known example of a star-wired ring topology. A star-wired ring topology is a hybrid physical topology that combines features of the star and ring topologies. Individual nodes are connected to a central hub, as in a star network. Within the hub, however, the connections are arranged into an internal ring. Thus, the hub constitutes the ring, which must remain intact for the network to function. The hubs, known as Multistation Access Units (MAUs), may be connected to other hubs. In this arrangement, each internal ring is opened and connected to the attached hubs, to create a larger, multi-hub ring.
The advantage of using star wiring instead of simple ring wiring is that it is easy to disconnect a faulty node from the internal ring. The IBM data connector is specially designed to close a circuit if an attached node is disconnected physically or electrically. By closing the circuit, the ring remains intact, but with one less node. In Token Ring networks a secondary ring path can be established and used if part of the primary path goes down. The star-wired ring is illustrated in Figure 2.17.
The advantages of a star-wired ring topology include:
The disadvantages of a star-wired ring topology include:
Distributed star topology
A distributed star topology is a physical topology that consists of two or more hubs, each of which is the center of a star arrangement. A good example of such a topology is an ARCNet network with at least one active hub and one or more active or passive hubs. The 100VG ANYLAN uses a similar topology.
Mesh topology
A mesh topology is a physical topology in which there are at least two paths to and from every node. This type of topology is advantageous in hostile environments in which connections are easily broken. If a connection is broken, at least one substitute path is always available. A more restrictive definition requires each node to be connected directly to every other node. Because of the severe connection requirements, such restrictive mesh topologies are feasible only for small networks.
Tree topology
A tree topology, also known as a distributed bus or a branching tree topology, is a hybrid physical topology that combines features of star and bus topologies. Several buses may be daisy-chained together, and there may be branching at the connections (which will be hubs). The starting end of the tree is known as the root or head end. This type of topology is used in delivering cable television services.
The advantages of a tree topology are:
The disadvantages include:
A common and important method of differentiating between different LAN types is to consider their media access methods. Since there must be some method of determining which node can send a message, this is a critical area that determines the efficiency of the LAN. There are a number of methods that can be considered, of which the two most common in current LANs are the ‘contention’ and ‘token passing’ methods. .
Contention is the basis for a first-come-first-served media accesses control method. This operates in a similar manner to polite human communication. We listen before we speak, deferring to anyone who already is speaking. If two of us start to speak at the same time, we recognize that fact and both stop, before starting our messages again a little later. In a contention-based access method, the first node to seek access when the network is idle will be able to transmit. Contention is at the heart of the Carrier Sense Multiple Access/Collision Detection (CSMA/CD) access method used by Ethernet.
The ‘carrier sense’ component requires a node to listen out for a ‘carrier’ before it transmits. There is no actual carrier present but the name relates back to the original Aloha project of the University of Hawaii. ’Carrier sense’ in the Ethernet context simply means to listen out for any current transmission on the medium. The length of the channel and the finite signal propagation delay means that there is still a distinct probability that more than one transmitter will attempt to transmit at the same time, as they both will have heard ‘no carrier’. The collision detection logic ensures that more than one simultaneous transmission on the channel will be detected. The system is a probabilistic system, since the exact timing of access to the channel cannot be ascertained in advance.
Token passing is a ‘deterministic’ media access method in which a token is passed from node to node, according to a predefined sequence. A token is a special frame consisting of a few bytes that cannot be mistaken for a message. At any given time the token can be available or in use. When an available token reaches a node, that node can access the network for a maximum predetermined time, before passing the token on.
This deterministic access method guarantees that every node will get access to the network within a given length of time, usually in the order of a few milliseconds. This is in contrast to a probabilistic access method (such as CSMA/CD), in which nodes check for network activity when they want to access the network, and the first node to claim the idle network gets access to it. Because each node gets its turn within a fixed period, deterministic access methods are more efficient on networks that have heavy traffic. With such networks, nodes using probabilistic access methods spend much of their time competing to gain access and relatively little time actually transmitting data over the network. Network architectures that support the token passing access method include ARCNet, FDDI, and Token Ring.
To transmit, the node first marks the token as ‘in use’, and then transmits a data packet, with the token attached. In a ring topology network, the packet is passed from node to node, until the packet reaches its destination. The recipient acknowledges the packet by sending the message back to the sender, who then sends the token on to the next node in the network.
In a bus topology network, the next recipient of a token is not necessarily the node that is nearest to the current token passing node. Instead, the next node is determined by some predefined rule. The actual message is broadcast on to the bus for all nodes to ‘hear’. For example, in a Modbus Plus network the token is passed on to the node with the next higher network address. Networks that use token passing generally have some provision for setting the priority with which a node gets the token. Higher-level protocols can specify that a message is important and should receive higher priority.
A Token Ring network requires an Active Monitor (AM) and one or more Standby Monitors (SMs). The AM keeps track of the token to make sure it has not been corrupted, lost, or sent to a node that has been disconnected from the network. If any of these things happens, the AM generates a new token, and the network is back in business. The SM makes sure the AM is doing its job and does not break down and get disconnected from the network. If the AM is lost, one of the SMs becomes the new AM, and the network resumes operation. These monitoring capabilities result in complex circuitry on the NICs.
The deterministic nature of token passing does not necessarily mean fast access to the bus, but merely predictable access to the bus. A heavily loaded Modbus Plus network can have a token rotation time of 500 milliseconds!
Polling refers to a process of checking elements such as computers or queues, in some defined order, to see whether the polled element needs attention (wants to transmit, contains jobs, and so on). In roll call polling, the polling sequence is based on a list of elements available to the controller. In contrast, in hub polling, each element simply polls the next element in the sequence.
In LANs, polling provides a deterministic media access method in which the controller polls each node in succession to determine whether that node wants to access the network. In some systems the polling is done by means of software messages being passed to and fro, which could slow down the process. In order to overcome this problem, systems such as 100VG AnyLAN employ a hardware-polling mechanism that uses voltage levels to determine whether a node needs to be serviced or not.
When you have completed study of this chapter you should be able to:
The Ethernet network concept was developed by Xerox at its Palo Alto Research Center (PARC) in the mid-seventies. It was based on the work done by researchers at the University of Hawaii where campus sites on the various islands were interconnected with the ALOHA network, using radio as the medium. This network was colloquially known as ‘Ethernet’ since it used the ‘ether’ as the transmission medium.
The philosophy was quite straightforward. Any station that wanted to broadcast to another station would do so without deferring to any transmission in progress. The receiving station then had the responsibility of acknowledging the message, advising the transmitting station of a successful reception. This primitive system did not rely on any detection of collisions (two radio stations transmitting at the same time) but, instead, expected acknowledgment within a predefined time.
The initial system was so successful that Xerox soon applied it to their other sites, typically interconnecting office equipment and shared resources such as printers and computers acting as repositories of large databases.
In 1980 the Ethernet Consortium consisting of Xerox, Digital Equipment Corporation and Intel (the DIX consortium) issued a joint specification based on the Ethernet concept, known as Ethernet Version 1. This was later superseded by the Ethernet Version 2 (Blue Book) specification. Version 2 was offered to the IEEE for ratification as a formal standard and in 1983 the IEEE issued the IEEE 802.3 CSMA/CD (Ethernet) standard.
Although IEEE 802.3 effectively replaced the old standard, the Blue Book legacy still remains in the form of a slightly different data encapsulation in the frame. This is, however, a software issue.
Later versions of Ethernet (from 100 Mbps upwards) also support full-duplex, although they have to support CSMA/CD for the sake of backward compatibility. Industrial versions of Ethernet typically operate at 100 Mbps and above in full-duplex mode, and often support the IEEE 802.1p/Q modified (tagged) frame structure. This allows a reasonable degree of deterministic operation. Many modern Ethernet based field buses make use of additional measures such as the IEEE 1588 clock synchronizing standard to achieve real-time operation with deterministic behavior in the microsecond region.
The original IEEE 802.3 standard defines a range of cable types. They include coaxial, twisted pair and fiber optic cable. Despite 10 Mbps Ethernet being a legacy technology, it is still used in older installations and therefore a brief overview of the technology is in order. In addition, many of the design issues in the newer versions can only be understood against the backdrop of the original technology.
The IEEE 802.3 standard has several variants. Many of them died an early death, but the following versions are still used in older installations:
Thick wire coaxial cable (RG-8), single cable bus
Thin wire coaxial cable (RG-8), single cable bus
Unscreened Twisted Pair cable (TIA/IEA 568B Cat3) star topology
Optical fiber, 10 Mbps, twin fiber point to point
10Base5 (Thicknet) is a legacy technology but some systems are still in use in industrial applications. We will therefore deal with it rather briefly. It uses a coaxial cable as a bus (also referred to as a ‘trunk’). The RG-8 cable has a 50-ohm characteristic impedance and is yellow or orange in color. The naming convention ‘10Base5’ indicates a 10 Mbps data rate, baseband signaling, and 500-meter segment lengths. The cable is difficult to work with, and so cannot normally be taken to the node directly. Instead, it is laid in a cabling tray and the transceiver electronics (the Medium Attachment Unit or MAU) is installed directly on the cable. From there an intermediate cable, known as an Attachment Unit Interface (AUI) cable is used to connect to the NIC. The AUI cable can be up to 50 meters long. and consists of 5 individually shielded pairs viz. two pairs each for transmit and receive (control plus data) plus one for power.
The MAU connection to the trunk is made by cutting the cable and inserting an N-connector and a coaxial T-piece, or by using a ‘vampire’ tap. The latter is a mechanical connection that clamps directly over the cable. Electrical connection is made via a screw-in probe that connects to the center conductor, and sharp teeth that puncture the cable sheath to connect to the braid. These hardware components are shown in Figure 3.1.
The location of the connection is important to avoid multiple electrical reflections on the cable and the trunk cable is marked every 2.5 meters with a black or brown ring to indicate where a tap should be placed. Fan-out boxes can be used if there are a number of nodes to be connected, allowing a single tap to feed multiple nodes. Connection at either end of the AUI cable is made through a 25-pin D-connector, with a slide latch, often called a DIX connector. This has been shown in Figure 3.2.
There are certain rules to be followed. These include:
The physical layout of a 10Base5 segment is shown in Figure 3.3
With 10Base5 there is no on-board transceiver on the NIC. Instead, the transceiver is located in the MAU and this is fed with power from the NIC via the AUI cable. Since the transceiver is now remote from the NIC, the node needs to be aware that the termination can detect collisions if they do occur. This confirmation is performed by a Signal Quality Error (SQE), or heartbeat, test function in the MAU. The SQE signal is sent from the MAU to the node on detecting a collision on the bus. However, on completion of every frame transmission by the MAU, the SQE signal is also sent to ensure that the circuitry remains active, and that collisions can be detected. Mixing devices that support SQE with those that don’t could cause problems, as non-SQE-enabled NICs could perceive ‘normal’ SQE pulses as collisions, leading to jam signals and repetitive re-broadcasting of messages.
The other type of coaxial cable Ethernet network is 10Base2 and this is sometimes referred to as ‘Thinnet’ or ‘thinwire Ethernet’. It uses the thinner (5 mm diameter) RG-58 A/U or C/U coax cable with a 50-ohm characteristic impedance. The cable is connected to the 10Base2 NICs or 10Base5 MAUs by means of BNC T-piece connectors.
Connectivity requirements stipulate that:
The physical layout of a 10Base2 Ethernet segment is shown in Figure 3.4.
10BaseT uses Cat3 (or better) AWG24 UTP cable for connection to the node. The physical topology is a star, with nodes connected to hub. Logically it forms a (chained) bus, since when one station transmits all others can ‘hear’ it. The four-pair cable from hub to node has a maximum length of 100 meters. One pair is used for receive and another is used for transmit. The connectors specified are RJ-45. Figure 3.5 shows schematically how the 10BaseT nodes are interconnected by the hub. This is also known as a ‘chained bus’ configuration, as opposed to 10Base5 and 10Base2 that are ‘branched bus’ configurations.
Collisions are detected by the NIC and so a signal received by the hub must be retransmitted on all ports. The electronics in the hub must also ensure that the stronger retransmitted signal does not interfere with the weaker input signal. The effect is known as far end crosstalk (FEXT), and is handled by special adaptive crosstalk echo cancellation circuits.
The 10BaseT star topology became very popular but has been largely superseded by faster versions such as Fast and Gigabit Ethernet. The shared hubs have also been replaced with switching hubs, and the preferred mode of operation is full-duplex instead of CSMA/CD. This will be discussed in the next chapter.
The 10BaseF was a star topology using wiring hubs, and comprised three variants namely 10BaseFL, 10BaseFP and 10BaseFB. Of these, only 10BaseFL survived. 10BaseFL is a point-to-point technology using two fibers, with a range of 2000m.
10 Mbps Ethernet signals are encoded using Manchester encoding as described in chapter 1, which allows the clock to be extracted at the receiver end in order to synchronize the transmission/reception process
The voltage swings were from –0.225 to –1.825 volts in the original Bluebook Ethernet specification. In IEEE 802.3 voltages on coax cables are specified to swing between 0 and –2.05 volts with a rise and fall time of 25 ns at 10 Mbps. IEEE 802.3 voltages on UTP swing between -0.7V and +0.7V .
The method used is one of contention. Each node has a connection via a transceiver to the common bus. The transceiver can transmit and receive simultaneously. It can, therefore, be in any one of three states namely idle (listen only), transmit, or be in contention. In the idle state the node merely listens to the bus, monitoring all traffic. If it then wishes to transmit information, it will defer whilst there is any activity on the bus since this is the ‘carrier sense’ component of the access method. At some stage the bus will become silent. After a period of 96 bit times, known as the inter-frame gap (to allow the passing frame to be received and processed by the destination node) the transmission process commences. The node is now in the transmit mode, and will transmit and listen at the same time. This is because there is no guarantee that another node at some other point on the bus has not also started transmitting having recognized the absence of traffic.
After a short delay, as the two signals propagate towards each other on the cable, there will be a collision of signals. Obviously the two transmissions cannot coexist on the common bus, since this is a baseband system. The transceiver detects this collision, since it is monitoring both its input and output, and recognizes the difference. The node now goes into a state of contention. It will continue to transmit for a short time (the jam signal) to ensure the other transmitting node detects the contention, and then perform a back-off algorithm to determine when it should again attempt to transmit its waiting frames.
Since there is a finite time for the transmission to propagate to the ends of the bus, and thus ensure that all nodes recognize that the medium is busy, the transceiver turns on a collision detection circuit whilst the transmission takes place. Once a certain number of bits (576) have been transmitted, provided that the network cable segment specifications have been complied with, the collision detection circuitry can be disabled. If a collision should take place after this, it will be the responsibility of higher level protocols to request retransmission. This is a far slower process than the hardware collision detection process.
Here is a good reason to comply with cable segment specifications. The initial ‘danger’ period is known as the collision window and is effectively twice the time interval for the first bit of a transmission to propagate to all parts of the network. The ‘slot time’ for the network is then defined as the worst-case time delay that a node must wait before it can reliably know that a collision has occurred. This is defined as:
Slot time = 2 * (transmission path delay) + safety margin
The slot time is fixed at 512 bits or 64 bytes, i.e. 51.2 microseconds for 10 Mbps Ethernet.
The transceiver of each node is constantly monitoring the bus for a signal. As soon as one is recognized, the NIC activates a carrier sense signal to indicate that transmissions cannot be made. The first bits of the MAC frame are a preamble and consist of 56 bits of 1010 etc. On recognizing these, the receiver synchronizes its clock and converts the Manchester encoded signal back into binary form. The eighth byte is a start of frame delimiter, and is used to indicate to the receiver that it should strip off the first eight bytes and commence determining whether this frame is for its node by reading the destination address in the frame header. If the address is recognized, the data is loaded into a frame buffer within the NIC.
Further processing then takes place, including the calculation and comparison of the frame CRC, and checking it against with the transmitted CRC. The NIC also checks that the frame contains an integral number of bytes and is neither too short nor too long. Provided all is correct, the frame is passed to the Data Link layer for further processing.
Collisions are a normal part of a CSMA/CD network since the monitoring and detection of collisions is the method by which a node ensures unique access to the shared medium. It is only a problem when there are excessive collisions. This reduces the available bandwidth of the cable and slows the system down while retransmission attempts occur. There are many reasons for excessive collisions and this will be addressed shortly.
The principle of collision cause and detection is shown in the following diagram.
Assume that both node 1 and node 2 are in listen mode and node 1 has frames queued to transmit. All previous traffic on the medium has ceased, and the inter-frame gap from the last transmission has expired. Node 1 now commences to transmit its preamble signal, which propagates both left and right across the cable. At the left end the termination resistance absorbs the transmission, but the signal continues to propagate to the right. However, the Data Link layer in node 2 also has a frame to transmit and since the NIC ‘sees’ a free cable, it also commences to transmit its preamble. Again, the signals propagate down the cable, and some short time later they ‘collide’. Almost immediately, node 2’s transceiver recognizes that the signals on the cable are corrupted, and the logic incorporated in the NIC asserts a collision detect signal. This causes node 2 to send a jam signal of 32 bits, and then stop transmitting. The standard allows any data to be sent as long as, by design, it is not the value of the CRC field of the frame. It appears that most nodes will send the next 32 bits of the data frame as a jam, since that is instantly available.
This jam signal continues to propagate along the cable, superimposed on the signal still being transmitted from node 1. Eventually node 1 recognizes the collision, and goes through the same jam process as node 2. The frame from node 1 must therefore be at least twice the end-to-end propagation delay of the network, or else the collision detection will not work correctly. The jam signal from node 1 will continue to propagate across the network until absorbed at the far end terminator, meaning that the system vulnerable period is three times the end to end propagation delay.
After the jam sequence has been sent, the transmission is halted. The node then schedules a retransmission attempt after a random delay controlled by a process known as the truncated binary exponential back off algorithm. The length of the delay is chosen so that it is a compromise between reducing the probability of another collision and delaying the retransmission for an unacceptable length of time. The delay is always an integer multiple of the slot time. In the first attempt the node will choose, at random, either one or zero slot times delay. If another collision occurs, the delay will be chosen at random from 0, 1, 2 or 3 slot times, thus reducing the probability that a further collision will occur. This process can continue for up to 10 attempts, with a doubling of the range of slot times available for the node to delay transmission at each attempt. After ten attempts the node will attempt 6 more retries, but the slot times available for the delay period will remain as they were at the tenth attempt. After sixteen attempts it is likely that there is a problem on the network and the node will cease to retransmit.
The basic Ethernet/IEEE 802.3 frame format is shown below. Each field in the frame will now be described in detail.
This field is used for the synchronization of the receiver clocks.
This consists of 7 bytes containing 10101010, resulting in a square wave being transmitted. The preamble is used by the receiver to synchronize its clock to the transmitter.
This single byte field consists of 10101011. It enables the receiver to recognize the commencement of the address fields. Technically speaking this is just another byte of the preamble and, in fact, in the Ethernet version 2 specification it was simply viewed as part of an 8-byte preamble.
This field contains three fields, allocated as follows:
These are the physical addresses of both the source and destination nodes and are usually embedded in the firmware of the NICs. The fields are 6 bytes long and made up of two three-byte blocks. The first three bytes are known as the Organizationally Unique Identifier (OUI) and identifies the manufacturer. This is administered by the IEEE. The second block is the device identifier, and each card will have a unique address under the terms of the license to manufacture. This means there are 224 unique addresses for each OUI. These addresses are known as MAC addresses, media addresses or hardware addresses.
There are three addressing modes:
The destination address is set to all ‘1’s or FF-FF-FF-FF-FF-FF
The first bit of the destination address is set to a 1. It provides group restricted communications
First bit of the address is 0. A typical unicast MAC address is 00-06-5B-12-45-FC
This two-byte field contains the length of the data field. If the frame is transmitted in Blue Book format, then this field contains an IEEE registered ‘type’ number, identifying the higher level protocol that is sending the data. An example is 0x800 (800 Hex) for IP. It is easy for the software to take care of this, since any number larger than 1500 (maximum number of data bytes) denotes a type number.
The information that has been handed down to the Data Link layer for transmission. This varies from 0 to 1500 bytes.
The padding is conceptually part of the data field. Since the minimum frame length is 64 bytes (512 bits), the pad field will pad out any frame that does not meet this minimum specification. This pad, if incorporated, is normally random data. The CRC is calculated over the data in the pad field. Once the CRC checks OK, the receiving node discards the pad data. Pad data therefore varies between 0 and 46 bytes.
A 32-bit CRC value that is computed in hardware at the transmitter and appended to the frame. It is the same algorithm used in the IEEE 802.5 standard.
The main reasons for collision rates on a half-duplex Ethernet network are:
A few suggestions on reducing collisions are:
Legacy (half duplex) Ethernet systems had several ‘design rules’, that are irrelevant in the context of modern high-performance switched full duplex systems. However, an awareness of these rules may be necessary in the case of maintenance work, so they are briefly addressed here.
It is important to maintain the overall Ethernet requirements as far as length of the cable is concerned. Each variant (e.g. 10Base5) has a particular maximum segment length allowable. The recommended maximum length is 80% of this figure.
Coaxial cable segments need not be made from a single homogenous length of cable, and may comprise multiple lengths joined by coaxial connectors. Although 10Base5 and 10Base2 cables have the same nominal 50-ohm impedance they can only be mixed within the same 10Base2 cable segment (not within a 10Base5 segment) to achieve greater segment length.
On 10Base5 cable segments it is preferable that the total segment be made from one length of cable or from sections off the same drum of cable. If multiple sections of cable from different manufacturers are used, then these should be standard lengths of 23.4 m, 70.2 m or 117 m (± 0.5 m), which are odd multiples of 23.4 m (half wavelength in the cable at 5 MHz). These lengths ensure that reflections from the cable-to-cable impedance discontinuities are unlikely to add in phase. Using these lengths exclusively a mix of cable sections should be able to be made up to the full 500 m segment length.
If the cable is from different manufacturers and there are potential mismatch problems, it should be confirmed that signal reflections due to impedance mismatches do not exceed 7% of the incident wave.
In 10Base5 systems the maximum length of the transceiver cables is 50 m but this only applies to specified IEEE 802.3 compliant cables. Other AUI cables using ribbon or office grade cables can only be used for short distances (less than 12.5 m).
Connection of MAUs to the cable causes signal reflections due to their bridging impedance. Their placement must therefore be controlled to ensure that these reflections do not significantly add in phase. In 10Base5 systems the MAUs are spaced at 2.5 m multiples, coinciding with the cable markings. In 10Base2 systems the minimum MAU spacing is 0.5 m.
The maximum bus length is made up of five segments, connected by four repeaters. The total number of segments can be made up of a maximum of three coax segments, containing nodes, and two 10BaseFL link segments (see Figure 3.8). This is referred to as the ‘5-4-3-2 rule’ and the number of repeaters is referred to as the ‘repeater count’. It is important to verify that the above transmission rules are met by all paths between any two nodes on the network.
The maximum network size is, therefore, as follows:
IEEE 802.3 states that the shield conductor of the RG-8 coaxial cable should make electrical contact with an effective ground reference at one point only. This is normally done at a terminator, since most RG-8 terminators have a screw terminal to which a ground lug can be attached, preferably using a braided cable, to ensure good grounding.
All other splices, taps or terminators should be jacketed so that no contact can be made with any metal objects. Insulating boots or sleeves should be used on all in-line coaxial connectors to avoid unintentional ground contact.
When you have studied this chapter you should be able to:
Although 10 Mbps Ethernet with over 200 million installed nodes world-wide became the most popular method of linking computers on networks, its speed was too slow for data- intensive or real-time applications.
From a philosophical point of view there are several ways to increase speed on a network. The easiest, conceptually, is to increase the bandwidth and allow faster changes of the data signal. This requires a high bandwidth medium and generates a considerable amount of high frequency electrical noise on copper cables, which is difficult to suppress. The second approach is to move away from the serial transmission of data on one circuit to a parallel transmission over multiple circuits. A third approach is to use data compression techniques in order to transfer multiple bits for each electrical transition. A fourth approach (used with Gigabit Ethernet) is to operate circuits in full-duplex mode, allowing simultaneous transmission in both directions. All these approaches are used to implement 100 Mbps Fast Ethernet and 1000 Mbps Gigabit Ethernet transmission on fiber optic as well as copper cables.
Typically most LAN systems use coaxial cable, Shielded Twisted Pair (STP), Unshielded Twisted Pair (UTP) or fiber optic cables. The capacitance of the coaxial cable imposes a serious limit on the distance over which the higher frequencies can be handled and also prohibits the use of full duplex. Consequently 100 Mbps systems do not use coaxial cable.
The UTP is obviously popular because of ease of installation and low cost. This is the basis of the 10BaseT Ethernet standard. The Cat3 cable allows only 10 Mbps over 100m while Cat5 allows 100 Mbps data rates over 100m. The four pairs in the standard cable allow several parallel data streams to be handled.
Fiber optic cables have enormous bandwidths and excellent noise immunity, thus they are the obvious choice for long-haul connections and installation in areas with a high degree of noise.
The 100Base-T variants use the existing Ethernet MAC layer with various enhanced Physical Media Dependent (PMD) sub-layers to improve the speed. They are described in the IEEE 802.3u and 802.3y standards as follows.
IEEE 802.3u defines three variants, namely:
100Base-TX and 100Base-FX are collectively known as 100Base-X. Another specification, IEEE 802.3y, defines a single variant, 100Base-T2, which uses two pairs of wires of Cat3 or better. 100Base-T2 was, unfortunately, released much later than its counterparts and did not ‘make it’ to the marketplace, with the result that most documents and articles on Fast Ethernet do not even refer to it. 100Base-BX and 100Base-LX10 are the new versions of Fast Ethernet.
The approach illustrated in Figure 4.1 is possible because the original IEEE 802.3 specifications defined the MAC sub-layer independently of the various physical PMD layers it supports. The MAC sub-layer defines the format of the Ethernet frame as well as the operation of the CSMA/CD mechanism. The time dependent parameters are defined in IEEE 802.3 in terms of bit-time intervals, so they are speed dependent. The 10 Mbps Ethernet inter-frame gap is actually defined as a time interval of 96 bit times, which translates to 9.6 microseconds for 10 Mbps Ethernet and 960 nanoseconds for Fast Ethernet.
One of the limitations of 100Base-T systems operating in half-duplex mode is the size of the collision domain, which is 250 m. This is the maximum size network in which collisions can be detected; ten times smaller than the maximum size of a 10 Mbps network. Since the distance between workstation and hub is limited to 100 m, the same as with 10BaseT, and usually only one hub is allowed in a collision domain, networks larger than 200 m must be logically interconnected by store-and-forward type devices such as bridges, routers or switches. This is not necessarily a bad thing, since it segregates the traffic within each collision domain, reducing the number of collisions on the network.
Since all modern Fast Ethernet systems (especially industrial implementations) are full duplex in any case, this is not really a problem.
The dominant 100Base-T systems are 100Base-TX and 100Base-FX. The other two variants are only included here from an academic viewpoint.
100Base-BX was developed to provide service over a single strand of optical fiber (unlike 100Base-FX, which uses a pair of fibers). Single-mode fiber along with a special multiplexer splits the signal into transmit and receive wavelengths (1310/1550nm). The terminals on each side of the fiber are not equal, as downstream transmission uses a wavelength of 1550nm, and upstream transmission uses 1310 nm wavelength. It has a nominal reach of 10, 20 or 40 km.
The version 100Base-LX10 provides service over two single-mode optical fibers up to 10 km using 1310 nm wavelength.
The IEEE 802.3u standard fits into the OSI model as shown in Figure 4.2. Note that the unchanged IEEE 802.3 MAC layer sits beneath the LLC sub-layer as the lower half of the Data Link layer of the OSI model.
Its Physical layer is divided into the following sub-layers and their associated interfaces:
A convergence sub-layer is added for the 100Base-TX and FX systems, which use the ANSI X3T9.5 PMD layer developed for the twisted pair version of FDDI. The FDDI PMD layer operates as a continuous full-duplex 125 Mbps transmission system, so a convergence layer is needed to translate this into the 100 Mbps half-duplex data bursts expected by the IEEE 802.3 MAC layer.
The PHY layer specifies 4B/5B coding of the data, data scrambling and the non return to zero – inverted (NRZI) data coding together with the clocking, data and clock extraction processes. The 4B/5B technique is described in Chapter 1.
The PMD sub-layer uses the ANSI TP-X3T9.5 PMD layer and operates on two pairs of Cat5 cable. It uses stream cipher scrambling for data security and MLT-3 bit encoding as described in Chapter 1. Hence for a 31.25 MHz baseband signal this allows for a 125 Mbps signaling bit stream providing a 100 Mbps throughput (4B/5B encoder). The MAC outputs a NRZ code. This code is then passed to a scrambler, which ensures that there are no invalid groups in its NRZI output. The NRZI converted data is passed to the three level code block and the output is then sent to the transceiver.
The three-level code results in a lower frequency signal. Noise tolerance is not as high as in the case of 10BaseT because of the multi-level coding system, hence Cat5 cable is required.
Cat5 wire, RJ-45 connectors and a hub or switch are requirements for 100Base-TX. These factors and a maximum distance of 100 m between the nodes and hubs result in an architecture identical to that of 10BaseT.
The IEEE 802.3u specification defines two classes of 100Base-T hubs, also called ‘repeaters’:
Class I hubs have a greater delay (0.7 μs maximum) and so only permit one hub in a collision domain. The class I hub fully decodes each incoming TX or T4 packet into its digital form at the MII; then sends the packet out as an analog signal from each of the other ports in the hub. Hubs are available with all T4 ports, all TX ports or combinations of TX and T4 ports. The latter are called translational hubs. Their layout is shown in Figure 4.4.
Class II hubs operate like 10BaseT hubs, connecting the ports (all of the same type) at the analog level. These then have lower inter-repeater delays (0.46 μs maximum) and so two repeaters are permitted in the same collision domain, but only 5 m apart. Alternatively, in an all-fiber network, the total length of all the fiber segments is 228 meters. This allows two 100 m segments to the nodes with 28 m between the hubs or any other combination. Most Fast Ethernet hubs available today are class II.
100Base-T adapter cards (NICs) are readily available as standard 100Mbps and as 10/100Mbps. These cards are interoperable at the hub at both speeds.
As in the case of 10BaseT, the maximum distance between hub and NIC is 100 meters, made up as follows:
The following maximum cable distances are in accordance with the 100Base-T bit budget (see next section). These are only applicable to 100Base-FX hubs. With switches there is no distance limitation, and distances between 3km (multimode fiber) and 120 km (single mode fiber) are common.
The cable distance and the number of hubs that can be used in a 100Base-T collision domain depend on the delay in the cable, the time delay in the hubs and NIC delays. The maximum round-trip delay for 100Base-T systems is the time to transmit 64 bytes or 512 bits, and equals 5.12 μs. A frame has to go from the transmitter to the most remote node and back to the transmitter for collision detection within this round trip time. Therefore the one-way time delay will be half this.
The maximum-sized collision domain can then be determined by the following calculation:
Repeater delays + Cable delays + NIC delays + Safety factor (5 bits min.) < 2.56μs
Table 4.1 gives typical maximum one-way delays for various components. Hub and NIC delays for specific components can be obtained from the manufacturers.
Notes
If the desired distance is too great it is possible to create a new collision domain by using a switch instead of a hub.
Most 100Base-T hubs are stackable, which means that multiple units can be placed on top of each other and interconnected by means of a fast backplane bus. Such connections do not count as repeater hops and make the ensemble function as a single repeater.
Can two Fast Ethernet nodes be interconnected via two class II hubs that are connected by 50 m fiber? One node is connected to the first hub with 50 m UTP while the other has a 100 m fiber connection to the second hub.
Calculation: Using the time delays in Table 4.2:
The total one-way delay of 2.445 μs is within the required interval (2.56 μs) and allows a safety factor of at least 5 bits, so this connection is permissible.
Gigabit Ethernet uses the same IEEE 802.3 frame format as 10 Mbps and 100 Mbps Ethernet systems. It operates at ten times the clock speed of Fast Ethernet, i.e. at 1Gbps. By retaining the same frame format as the earlier versions of Ethernet, backward compatibility is assured.
Gigabit Ethernet is defined in the IEEE 802.3z and IEEE 802.3ab standards. IEEE 802.3z defines the Gigabit Ethernet Media Access Control (MAC) layer functionality as well as three different physical layers viz. 1000Base-LX and 1000Base-SX using fiber, and 1000Base-CX using copper. These physical layers were originally developed by IBM for the ANSI Fiber channel systems and use 8B/10B encoding to reduce the bandwidth required to send high-speed signals. The IEEE merged the fiber channel with the Ethernet MAC using a Gigabit Media Independent Interface (GMII), which defines an electrical interface, allowing existing fiber channel PHY chips to be used and future physical layers to be easily added. The newer versions of Gigabit Ethernet include 1000BASE-LX10, 1000BASE-BX10, 1000BASE-ZX, and 1000BASE-TX.
1000Base-T was developed to provide service over four pairs of Cat5 or better copper cable. This development is defined by the IEEE 802.3ab standard.
The Gigabit Ethernet versions are summarized in Figure 4.4.
Gigabit Ethernet retains the standard IEEE 802.3 frame format. However, the CSMA/CD algorithm has had to undergo a small change to enable it to function effectively at 1 Gbps. The slot time of 64 bytes used with both 10 Mbps and 100 Mbps systems had to be increased to 512 bytes. Without this increased slot time the network would have been impractically small at one tenth of the size of Fast Ethernet – only 20 meters! The irony of the matter is that all practical Gigabit Ethernet systems run in full duplex mode and this restriction is therefore irrelevant. However, CSMA/CD compatibility has to be built in for adherence to the IEEE 802.3 standard.
The slot time defines the time during which the transmitting node retains control of the medium, and is an important aspect of the collision detection mechanism. With Gigabit Ethernet it was necessary to increase this time by a factor of eight to 4.096 μs in order to compensate for the tenfold speed increase. This results in a collision domain of about 200m.
If the transmitted frame is less than 512 bytes, the transmitter continues transmitting to fill the 512-byte window. Carrier extension symbols are used to fill the remainder of those frames that are shorter than 512 bytes. This is shown in Figure 4.5.
While this is a simple technique to overcome the network size problem, it could result in very low network utilization if a lot of short frames are sent, typical of some industrial control systems. For example, a 64-byte frame would have 448 carrier extension symbols attached, and result in a network utilization of less than 10%. This is unavoidable, but its effect can be mitigated by a technique called packet bursting. Once the first frame in a burst has successfully passed through the 512-byte collision window, using carrier extension if necessary, transmission continues with additional frames being added to the burst until the burst limit of 8192 bytes is reached. This process averages the time wasted sending carrier extension symbols over a number of frames.
The IEEE 802.3z Gigabit Ethernet standard uses the three PHY sub-layers from the ANSI X3T11 fiber channel standard for the 1000Base-SX and 1000Base-LX versions using fiber optic cable and 1000Base-CX using shielded 150-ohm twin-ax copper cable.
The Fiber Channel PMD sub-layer runs at 1 Gbaud and uses the 8B/10B coding of the data, data scrambling and NRZI data coding together with the clocking, data and clock extraction processes. This results in a data rate of 800 Mbps. The IEEE then had to increase the speed of the fiber channel PHY layer to 1250 Mbaud to obtain the required throughput of 1 Gbps.
The 8B/10B technique selectively codes each group of eight bits into a ten-bit symbol. Each symbol is chosen so that there are at least two transitions from ‘1’ to ‘0’ in each symbol. This ensures there will be sufficient signal transitions to allow the decoding device to maintain clock synchronization from the incoming data stream. The coding scheme allows unique symbols to be defined for control purposes, such as denoting the start and end of packets and frames as well as instructions to devices.
The coding also balances the number of ‘1’s and ‘0’s in each symbol, called DC balancing. This is done so that the voltage swings in the data stream will always average to zero, and not develop any residual DC charge, which could result in a distortion of the signal (‘baseline wander’).
This Gigabit Ethernet version was developed for the short backbone connections of the horizontal network wiring. The SX systems operate full-duplex with multimode fiber only, using the cheaper 850 nm wavelength laser diodes. The maximum distance supported varies between 200 and 550 meters depending on the bandwidth and attenuation of the fiber optic cable used. The standard 1000Base-SX NICs available today are full-duplex and incorporate SC fiber connectors.
This version was developed for use in the longer backbone connections of the vertical network wiring. The LX systems can use single mode or multimode fiber with the more expensive 1300 nm laser diodes. The maximum distance recommended by the IEEE for these systems operating in full-duplex is 5 km for single mode cable and 550 meters for multimode fiber cable. Many 1000Base-LX vendors guarantee their products over a much greater distance; typically 10 km. Fiber extenders are available to allow service over as much as 80 km. The standard 1000Base-LX NICs available today are full-duplex and incorporate SC fiber connectors.
This version of Gigabit Ethernet was developed as ‘short haul copper jumpers’ for the interconnection of switches, hubs or routers within a wiring closet. It is designed for 150-ohm ‘twin-ax’ STP cable similar to that used for IBM Token Ring systems. The IEEE specified two types of connectors: the high-speed serial data connector (HSSDC) known as the Fiber Channel style 2 connector, and the 9-pin D-subminiature connector from the IBM Token Ring systems. The maximum cable length is 25 meters for both full- and half-duplex systems.
This version of gigabit Ethernet is similar to 1000BASE-LX which is specified to work up to 10 km, over a pair of single-mode fiber with higher quality optics. Before 1000BASE-LX10 was standardized, it was widely used with either 1000BASE-LX/LH or 1000BASE-LH extensions.
This version is capable of up to 10 km over a single strand of single-mode fiber. It is designed to transmit with different wavelength in each direction. The fiber terminals on each side are not equal, as the one transmitting “downstream” uses 1,490 nm wavelength and the one transmitting “upstream” uses 1,310 nm wavelength.
1000BaseZX operates on ordinary single-mode fiber-optic link, spans up to 70 km using a wavelength of 1,550 nm. It is used as a Physical Medium Dependent (PMD) component for Gigabit Ethernet interfaces found on various switch and router. It operates at a signaling rate of 1250 Mbaud, transmitting and receiving 8B/10B encoded data.
This version of Gigabit Ethernet was developed under the IEEE 802.3ab standard for transmission over four pairs of Cat5 cable. This is achieved by simultaneously sending and receiving over each of the four pairs, as compared to the existing 100Base-TX system which has individual pairs for transmitting and receiving. This is shown in Figure 4.7.
This system uses the PAM5 data encoding scheme originally developed for 100Base-T2. This uses five voltage levels so it has less noise immunity. However the digital signal processors associated with each pair overcomes any problems in this area. The system achieves its tenfold speed improvement over 100Base-T2 by transmitting on twice as many pairs (4) and operating at five times the clock frequency (i.e.125 MHz). Figure 4.8 shows a 1000Base-T receiver using DSP technology.
This version transmits over four pairs of cable, two pairs in each direction of transmission (as opposed to all the four, for 1000Base-T over Cat5). Its simplified design has reduced the cost of required electronics. Since it does not carry out two-way transmission, crosstalk between the cables is significantly reduced, and encoding is relatively simple. 1000Base-TX will only operate over Cat6.
Gigabit Ethernet nodes can be connected to switches, or to less expensive full-duplex repeaters, also referred to as Buffered Distributors. As shown in Figure 4.10, these devices have a basic MAC function in each port, which enables them to verify that a complete frame is received and compute its frame check sequence (FCS) to verify the frame validity. Then the frame is buffered in the internal memory of the port before being forwarded to the other ports of the switch. Thus it combines the functions of a repeater with some features of a switch.
All ports on the device operate at the same speed of 1 Gbps, and in full-duplex mode so it can simultaneously send and receive from any port. The repeater uses IEEE802.3x flow control to ensure that the small internal buffers associated with each port do not overflow. When the buffers are filled to a critical level, it tells the transmitting node to stop sending until the buffers have been sufficiently emptied.
The maximum cable distances that can be used between the node and a full-duplex 1000Base-SX and LX switch depend mainly on the chosen wavelength, the type of cable, and its bandwidth. The maximum transmission distances on multimode cable are limited by the Differential Mode Delay (DMD). The very narrow beam of laser light injected into the multimode fiber results in a relatively small number of rays going through the fiber core. These rays each have different propagation times because they are going through differing lengths of glass by zigzagging through the core to a greater or lesser extent. This can cause jitter and interference at the receiver. The problem is overcome by using a conditioned launch of the laser into the multimode fiber. This spreads the laser light evenly over the core of the multimode fiber so the laser source looks more like a Light Emitting Diode (LED) source, resulting in smoother spreading of the pulses and less interference. This conditioned launch is done in the 1000Base-SX transceivers.
The following table gives the maximum distances for full-duplex 1000Base-X switches.
When you have completed study of this chapter you should be able to:
In the early 1960s The US Department of Defense (DoD) indicated the need for a wide-area communication system, covering the United States and allowing the interconnection of heterogeneous hardware and software systems.
In 1969 The Advanced Research Projects Agency (ARPA) contacted a small private company by the name of Bolt, Barenek and Newman (BBN) to assist with the development of the protocols. Other participants in the project included the University of Berkeley (California) Development work commenced in 1970 and by 1972 approximately 40 sites were connected via TCP/IP. In 1973 the first international connection was made and in 1974 TCP/IP was released to the public.
Initially the network was used to interconnect US government, military and educational sites together. Slowly, as time progressed, commercial companies were allowed access and by 1990 the backbone of the Internet, as it was now known, was being extended into one country after the other.
One of the major reasons why TCP/IP has become the de facto standard world-wide for industrial and telecommunications applications is the fact that the Internet was designed around it in the first place and that, without it, no Internet access is possible.
Whereas the OSI model was developed in Europe by the International Organization for Standardization (ISO), the ARPA model (also known as the DoD or Department of Defense model) was developed in the USA by ARPA. Although they were developed by different bodies and at different points in time, both serve as models for a communications infrastructure and hence provide ‘abstractions’ of the same reality. The remarkable degree of similarity is therefore not surprising.
Whereas the OSI model has 7 layers, the ARPA model has 4 layers. The OSI layers map onto the ARPA model as follows:
The relationship between the two models is depicted in Figure 5.1.
TCP/IP (or rather; the TCP/IP protocol suite) is not limited to the TCP and IP protocols, but consist of a multitude of interrelated protocols that occupy the upper three layers of the ARPA model. With the exception of the Point-to-Point Protocol (PPP) which resides in the upper half of the Network Interface Layer, the TCP/IP suite generally does not include the Network Interface layer, but merely depends on it for access to the medium.
The Network Interface layer is responsible for transporting frames between hosts on the same physical network. It is implemented in the Network Interface Card or NIC, using hardware and firmware (i.e. software resident in Read Only Memory).
The NIC employs the appropriate medium access control methodology, such as CSMA/CA, CMSA/CD, token passing or polling, and is responsible for placing the data received from the upper layers within a frame before transmitting it. The frame format is dependent on the system being used, for example Ethernet or Frame Relay, and holds the hardware address of the source and destination hosts as well as a checksum for data integrity.
RFCs that apply to the Network Interface layer include:
Generally speaking, the TCP/IP suite does not cover the Network Interface layer since there is a multitude of technologies (such a those listed above) that ‘can do the job’. Notable exceptions in this regard are:
Both SLIP and PPP are essentially (OSI) Data Link layer protocols, thus they occupy only the upper half of the TCP/IP Network Interface layer. Both are used to carry IP datagrams over telephone lines, but SLIP has largely been phased out in favor of PPP.
Note: Any Internet-related specification is originally submitted as a request for comments or RFC. As time progresses an RFC may become a standard, or a recommended practice, and so on. Regardless of the status of an RFC, it can be obtained from various sources on the Internet such as http://www.rfc-editor.org.
This layer is primarily responsible for the routing of packets from one host to another. The emphasis is on ‘packets’ as opposed to frames, since at this level the data has not yet been placed in a frame for transmission. Each packet contains the address information needed for its routing through the Internet work to the receiving host.
The dominant protocol at this level is the Internet Protocol (IP).
There are, however, several other protocols required at this level. These protocols include:
This layer is primarily responsible for data integrity between the sender host and receiver host regardless of the path or distance used to convey the message. Communications errors are detected and corrected at this level.
It has two protocols associated with it, these being:
This layer provides the user or application programs with interfaces to the TCP/IP stack. At this level there are many protocols used, some of the more common ones being:
Other protocols at this layer include POP3, RPC, RLOGIN, IMAP, HTTP and NTP. Users can also develop their own Application layer protocols by means of developers’ toolkits.
The TCP/IP suite is shown in Figure 5.2 on the following page:
When you have completed the study of this chapter, you should be able to:
As pointed out in the previous chapter, the Internet layer is not populated by a single protocol, but rather by a collection of protocols.
They include:
IP is at the core of the TCP/IP suite. It is primarily responsible for routing packets towards their destination, from sending host to receiving host via multiple routers. This task is performed on the basis of the IP addresses embedded in the header of each IP datagram.
The most prevalent version of IP in use today is version 4 (IPv4), which uses a 32-bit address. However, because of an ever increasing demand for IP addresses, IPv4 is being replaced by version 6 (IPv6 or IPng), which uses a 128-bit address.
This chapter will focus primarily on version 4 as a vehicle of explaining the fundamental processes involved, but will also provide an introduction to version 6.
The ultimate control over IP addresses is vested in ICANN, the Internet Corporation for Assigned Names and Numbers. This was previously the task of IANA, the Internet Assigned Numbers Authority. This responsibility is, in turn, delegated to the five Regional Internet Registries (RIRs).
They are:
The RIRs allocate blocks of IP addresses to Internet Service Providers (ISPs) under their jurisdiction, for subsequent issuing to users or sub-ISPs.
The use of ‘legitimate’ IP addresses is a prerequisite for connecting to the Internet. For systems not connected to the Internet, any IP addressing scheme may be used. It is, however, recommended that so-called ‘private’ Internet addresses are used for this purpose, as outlined in this chapter.
The MAC or hardware address discussed earlier is unique for each node, and has been allocated to that particular node e.g. NIC at the time of its manufacture. The equivalent for a human being would be an ID or Social Security number. As with a human ID number, the MAC address belongs to that node and follows it wherever it goes. This number works fine for identifying hosts on a LAN where all nodes can ‘see’ (or rather, ‘hear’) each other.
With human beings the problem arises when the intended recipient is living in another city, or worse, in another country. In this case the ID number is still relevant for final identification, but the message (e.g. a letter) first has to be routed to the destination by the postal system. For the postal system, a name on the envelope has little meaning. It requires a postal address.
The TCP/IP equivalent of this postal address is the IP address. As with the human postal address, this IP address does not belong to the node, but rather indicates its place of residence. For example, if an employee has a fixed IP address at work and he resigns, he will leave his IP address behind and his successor will ‘inherit’ it.
Since each host (which already has a MAC or hardware address) needs an IP address in order to communicate across the Internet, resolving host MAC addresses versus IP addresses is a mandatory function. This is performed by the Address Resolution Protocol (ARP), which is to be discussed later on in this chapter.
The IPv4 address consists of 32 bits, e.g.
11000000011001000110010000000001
Since this number is fine for computers but a little difficult for human beings, it is divided into four octets, which for ease of reference could be called w, x, y and z. Each octet is then converted to its decimal equivalent.
The result of the conversion is written as 192.100.100.1 and is known as the ‘dotted decimal’ or ‘dotted quad’ notation.
Consider the following postal address:
The first line, 4 Kingsmill Street, enables the local postal deliveryman at the Australia Post office in Claremont (Perth), Western Australia (postal code 6010) to deliver a letter to that specific residence. This assumes that the letter has already found its way to the Claremont post office from wherever it was posted. The second part of the address enables the postal system to route the letter towards its destination post office from anywhere in the world.
In similar fashion, an IP address has two distinct parts. The first part, the Network ID (NetID) is a unique number identifying a specific network and allows IP routers to forward a packet towards its destination network from anywhere in the world, provided there is a network connection between sender and recipient. The second part, the HostID (HostID) is a number allocated to a specific machine (host) on the destination network and allows the router servicing that host to deliver the packet directly to the host.
For example, in the IP address 192.100.100.5, the computer number or HostID could be 5, and it would be connected to NetID 192.100.100. The number of the network itself would be written as 192.100.100.0.
Originally the intention was to allocate IP addresses in so-called address classes. Although the system proved to be problematic and was consequently abandoned with the result that IP addresses are currently issued ‘classless’, the legacy of IP address classes remains and has to be understood.
To provide for flexibility in assigning addresses to networks, the interpretation of the address field was coded to specify either:
In addition, there was provision for extended addressing modes: class D was reserved for multicasting whilst E was reserved for future use.
The NetID should normally not be all ‘0’s as this indicates a local network. With this in mind, analyze the first octet (‘w’).
For class A, the first bit is fixed at 0. The binary values for ‘w’ can therefore only vary between 000000002 (010) and 011111112 (12710). 0 is not allowed. However, 127 is also a reserved number, with 127.x.y.z reserved for loop-back testing whereby a host sends messages to itself, but bypassing layers 1 and 2 of the stack. 127.0.0.1 is the IP address most commonly used for this. The values for ‘w’ can therefore only vary between 1 and 126, which allows for 126 possible class A NetIDs.
For class B, the first two bits are fixed at 10. The binary values for ‘w’ can therefore only vary between 100000002 (12810) and 101111112 (19110).
For class C, the first three bits are fixed at 110. The binary values for ‘w’ can therefore only vary between 110000002 (19210) and 110111112 (22310).
The relationship between ‘w’ and the address class can therefore be summarized as follows.
There are always two reserved host numbers, irrespective of class. These are ‘all zeros’ or ‘all ones’ for HostID. An IP address with a host number of zero is used as the address of the whole network. For example, on a class C network with the NetID = 200.100.100, the IP address 200.100.100.0 indicates the whole network. If all the HostID bits are set to 1, for example 200.100.100.255, then it means ‘all hosts on network 200.100.100.0’.
For class A, the number of NetIDs is determined by octet ‘w’. Unfortunately, the first bit (fixed at 0) is used to indicate class A and hence cannot be used. This leaves seven usable bits. Seven bits allow 27 = 128 combinations, from 0 to 127. 0 and 127 are reserved; hence only 126 NetIDs are possible. The number of HostIDs, on the other hand, is determined by octets ‘x’, ‘y’ and ‘z’. From these 24 bits, 224 = 16 777 218 combinations are available. All zeros and all ones are not permissible, which leaves 16 777 216 usable combinations.
For class B, the number of NetIDs is determined by octets ‘w’ and ‘x’. The first bits (10) are used to indicate class B and hence cannot be used. This leaves fourteen usable bits. Fourteen bits allow 214 = 16 384 combinations. The number of HostIDs is determined by octets ‘y’ and ‘z’. From these 16 bits, 216 = 65 536 combinations are available. All zeros and all ones are not permissible, which leaves 65,534 usable combinations.
For class C, the number of NetIDs is determined by octets ‘w’, ‘x’ and ‘y’. The first three bits (110) are used to indicate class C and hence cannot be used. This leaves twenty-two usable bits. Twenty-two bits allow 222 = 2 097 152 combinations. The number of HostIDs is determined by octet ‘z’. From these 8 bits, 28 = 256 combinations are available. Once again, all zeros and all ones are not permissible which leaves 254 usable combinations.
Strictly speaking, one should be referring to ‘netmask’ but the term ‘subnet mask’ is commonly used.
For routing purposes it is necessary for a device to strip the HostID off a destination IP address in order to ascertain whether or not the remaining NetID portion of the IP address matches its own network address.
Whilst it is easy for human beings, it is not the case for a computer and the latter has to be ‘shown’ which portion is NetID, and which is HostID. This is done by defining a subnet mask in which a ‘1’ is entered for each bit that is part of NetID, and a ‘0’ for each bit that is part of HostID. The computer takes care of the rest. The ‘1’s start from the left and run in a contiguous block.
For example: A conventional class C IP address, 192.100.100.5, written in binary, would be represented as 11000000 01100100 01100100 00000101. Since it is a class C address, the first 24 bits represent NetID and would therefore be indicated by ‘1’s. The subnet mask would therefore be:
11111111 11111111 1111111 00000000.
To summarize:
The mask, written in decimal dotted notation, becomes 255.255.255.0. This is the so-called default subnet mask for class C. Default subnet masks for classes A and B can be configured in the same manner.
At present IP addresses are issued classless, which means that it is not possible to determine the boundary between NetID and HostID by analyzing the IP address itself. This makes the use of a subnet mask even more necessary.
Subnet masks can also be represented with ‘forward slash’ (/) after the IP address, followed by number of ‘1’s in the mask. A default class C address would therefore be written as 192.100.100.36/24. The /24 is called the prefix.
Assume that an organization based in Sydney has a block of 254 IP addresses, originally issued as a block of Class C addresses (192.100.100.0). It now wishes to open branch offices in Brisbane, Cairns, Perth, Melbourne and Adelaide. For this it would need a total of 6 NetIDs alone. For certain types of ‘cloud’ connections it would also require NetIDs for each point-to-point connection e.g. leased line. For the time being, we will ignore the ‘cloud’ issue and focus on the branch offices only.
The problem could be solved by creating subnetworks under the 192.100.100.0 network address and assigning a different subnetwork number to each LAN. To create a subnetwork, take some of the bits assigned to the HostID and use them for a subnetwork number, leaving fewer bits for HostID. Instead of NetID (24 bits) + HostID (8 bits), the IP address will now represent NetID (24 bits) + SubnetID (3 bits) + HostID (5 bits). To calculate the number of bits to be reassigned from HostID to SubnetID, choose a number of bits ‘n’ so that (2n) is bigger than or equal to the number of subnets required.
For most new routers all subnet numbers (including those consisting of all ‘0’s and all ‘1’s) can be used although it might have to be specifically enabled on the router. For older routers this is generally not possible, so two subnets are lost in the process and hence the number above needs to be modified to (2n)-2. In our case 6 subnets are required; so 3 bits have to be taken from the HostID since (23) = 8.
Note that the first subnet consisting of all zeros is called “subnet zero” and the last subnet made up of all 1s is called the “all-ones subnet”. The use of these subnets was originally discouraged due to possible confusion over having a network and subnet with the same address. This is only relevant today for legacy equipment which does not implement CIDR. All CIDR-compliant routing protocols transmit both length and suffix avoiding this problem.
Since only 5 bits are now available for HostID, each subnet can only have 30 HostIDs numbered 00001 (110) through 11110 (3010), since neither 00000 nor 11111 is allowed. To be technically correct, each subnetwork will only have 29 hosts since one HostID will be allocated to the router on that subnetwork. Note that the total available number of HostIDs has dropped from 254 to 240 (with 8 subnets having 30 hosts each).
The ‘z’ of the IP address is calculated by concatenating the SubnetID and the HostID.
For example, the network number for subnet 7 would have octet ‘z’ as 111 (SubnetID) concatenated with 00000 (HostID = ‘this network’), resulting in 224. The lowest assignable value for octet ‘z’ in subnet 7 would be 111 (SubnetID) concatenated with 00001 (HostID), resulting in 11100001 or 22510. In similar fashion the biggest assignable value would be 11111110 or 25410. The IP addresses on subnet 7 (192.100.100.224) would therefore range between 192.100.100.225 and 192.100.100.254 as shown in Figure 6.7.
In the preceding example, the first 3 bits of the HostID have been allocated as SubnetID, and have therefore effectively become part of the NetID. A default class C subnet mask would unfortunately obliterate these 3 bits, with the result that the local routers would not be able to route messages between the subnets. For this reason the subnet mask has to be extended by another 3 bits to the right, so that it becomes 11111111 11111111 11111111 11100000. The extra bits have been typed in italics, for clarity. The subnet mask for all hosts is now 255.255.255.224 or /27.
In the preceding example the available HostIDs were divided into equally sized chunks. Although it is the simplest approach, it is not always ideal since some subnets require more IP addresses than others. In our example there are 6 terrestrial links (leased lines) between the offices, which would require 7 subnets for the actual LANs and 6 for the links. However, the links only require 2 IP addresses each!
Provided that the routers support Variable Length Subnet Masking or VLSM, the problem could be solved by taking only one of the subnets (say subnet 7 i.e. 192.100.100.224/27) and subnetting it even further, using the same approach as before. For 6 links we need to take another 3 bits from subnet 7’s HostID and tack it onto the existing SubnetID. This will give 8 additional options to add on to the SubnetID, ranging from 000 to 111. The resulting HostID would now be only 2 bits long, but even though the ‘00’ and ‘11’ values cannot be used, there are still 2 usable values, which is sufficient.
Let us do some calculations for the first link (say Sydney-Melbourne). For octet ‘z’ in the network number we will use 000 to designate the link (one of the 8 options) and concatenate that with the existing 111, which results in 111000. Now concatenate that with a further 00 for HostID and the final result is 11100000 or 224. The network number for the Sydney-Melbourne link is thus 192.100.100.224/30. Note the extended subnet mask.
On this particular subnet, only two HostIDs are possible viz. 110 (012) and 210 (102) Octet ‘z’ would therefore be either 11100001 (225) or 11100010 (226). The IP addresses of the router ports attached to this link would therefore be 192.100.100.225/30 and 192.100.100.226/30.
If it is certain that a network will never be connected to the Internet, any IP address can be used as long as the IP addressing rules are followed. If there are fewer than 254 hosts on the network, it is easiest to use class C addresses. Assign each LAN segment its own class C NetID. Then assign each host a complete IP address simply by appending the dotted decimal HostID to the NetID.
If there is a possibility of connecting a network to the Internet, one should not use IP addresses that might result in address conflicts. In order to prevent such conflicts, obtain a set of Internet-unique IP addresses from an ISP or use IP addresses reserved for private works as described in RFC 1918.
ICANN has reserved several blocks of Private IP addresses for this purpose as shown below:
Reserved IP addresses are not routed on the Internet because Internet routers are programmed not to forward messages to or from these addresses.
A private network can easily be connected to the Internet by means of a router capable of performing Network Address Translation (NAT). The NAT router then translates the private addresses to one or more legitimate Internet addresses. An added advantage of this approach is that the private addresses are invisible to the outside world. This, plus the fact that most NAT routers have built-in firewalls, makes it a safe approach.
Initially, the IPv4 Internet addresses were only assigned in classes A, B and C. This approach turned out to be extremely wasteful, as large amounts of allocated addresses were not being used. Not only was the class D and E address space underutilized, but a company with 500 employees that was assigned a class B address would have 65,034 addresses that no-one else could use.
At present IPv4 addresses are considered classless. The issuing authorities (ICANN, through the five RIRs) simply hand down a block of contiguous addresses to ISPs, who can then issue them one by one, or break the large block up into smaller blocks for distribution to sub-ISPs, who will then repeat the process. Because of the fact that the 32 bit IPv4 addresses are no longer considered ‘classful’, the traditional distinction between class A, B and C addresses and the implied boundaries between the NetID and HostID can be ignored. Instead, whenever an IPv4 network address is assigned to an organization, it is done in the form of a 32-bit network address and a corresponding 32-bit mask, usually expressed as a prefix.
In the early days of the Internet, IP addresses were allocated by the Network Information Center (NIC). At that stage this was done more or less at random and each address had to be advertised individually in the routing tables of the Internet routers.
Consider, for example, the case of following 4 private (‘traditional’ class C) networks, each one with its own contiguous block of 256 addresses, being serviced by one ISP:
If we ignore the reserved HostIDs (all ‘0’s and all ‘1’s), then the concentrating router at the ISP would have to advertise 4 × 256 = 1024 separate network addresses. In a real life situation, the ISP’s router would have to advertise tens of thousands of addresses. It would also be seeing hundreds of thousands, if not millions, of addresses advertised by the routers of other ISPs across the globe. In the early nineties the situation was so serious it was expected that, by 1994, the routers on the Internet would no longer be able to cope with the multitude of routing table entries.
To alleviate this problem, the concept of Classless Inter-Domain Routing (CIDR) was introduced. Basically, CIDR removes the imposition of the class A, B and C address masks and allows the owner of a network to ‘supernet’ multiple addresses together. It then allows the concentrating router to aggregate (or ‘combine’) these multiple contiguous network addresses into a single route advertisement on the Internet.
Let us take the same example as before, but this time we allocate contiguous addresses. Note that ‘w’ can have any value between 1 and 255 since the address classes are no longer relevant.
w x y z | |
Network A: | 220.100.0. 0 |
Network B: | 220.100.1. 0 |
Network C: | 220.100.2. 0 |
Network D: | 220.100.3. 0 |
CIDR now allows the router to advertise all 1024 network addresses under one advertisement, using the starting address of the block (220.100.0.0) and a CIDR (supernet mask) of 255.255.252.0. This is achieved as follows.
CIDR uses a mask similar to the one used for subnet masking, but it has fewer ‘1’s than the subnet mask. Whereas the 1s in the subnet mask indicate the bits that comprise the network ID, the ‘1’s in the CIDR (supernet) mask indicates the bits in the IP address that do not change. In the following example we will, for the sake of simplicity, include the HostIDs with all ‘1’s and all ‘0’s.
The total number of computers in this ‘supernet’ can be calculated as follows:
Number of ‘1’s in the subnet mask = 24
Number of hosts per network = (2(32-24) ) = 28 = 256
Number of ‘1’s in the CIDR mask = 22
X= (Number of ‘1’s in the subnet mask minus number of ‘1’s in the CIDR mask) = 2
Number of networks aggregated = 2X = 22 = 4
Total number of hosts = 4 × 256 = 1024
The route advertisement of 220.100.100.0 mask 255.255.252.0 implies a supernet comprising 4 networks, each with 256 possible hosts. The lowest IP address is 220.100.100.1 and the highest is 220.100.3.254. The first mask in the following table (255.255.255.0) is the subnet mask while the second mask (255.255.252.0) is the CIDR mask.
CIDR and the concept of classless addressing go hand in hand since it is obvious that the concept can only work if the ISPs are allowed to exercise strict control over the issuing and allocation of IP addresses. Before the advent of CIDR, clients could obtain IP addresses and regard it as their ‘property’. Under the new dispensation, the ISP needs to keep control over its allocated block(s) of IP addresses. A client can therefore only ‘rent’ IP addresses from an ISP and the latter may insist on its return, should the client decide to change to another ISP.
The IP header is appended to the PDU that IP accepts from higher-level protocols, before routing it across the network. The IP header consists of six 32-bit ‘long words’ and is made up as follows:
Ver: 4 bits
The version field indicates the version of the IP protocol in use, hence the format of the header. In this case it is 4.
IHL: 4 bits
IHL is the length of the IP header in 32 bit ‘long words’, and thus points to the beginning of the data. This is necessary since the IP header can contain options and therefore has a variable length. The minimum value is 5, representing 5 × 4 = 20 bytes.
Type of Service: 8 bits
The Type of Service (ToS) field is intended to provide an indication of the parameters of the quality of service desired. These parameters are used to guide the selection of the actual service parameters when transmitting a datagram through a particular network.
Some networks offer service precedence, which treats high precedence traffic as more important than other traffic (generally by accepting only traffic above a certain precedence at times of high load). The choice involved is a three-way trade-off between low delay, high reliability, and high throughput.
The ToS field is composed of a 3-bit precedence field (which is often ignored) and an unused (LSB) bit that must be 0. The remaining 4 bits may only be turned on one at a time, and are allocated as follows:
RFC 1340 (corrected by RFC 1349) specifies how all these bits should be set for standard applications. Applications such as TELNET and RLOGIN need minimum delay since they transfer small amounts of data. FTP needs maximum throughput since it transfers large amounts of data. Network management (SNMP) requires maximum reliability and Usenet news (NNTP) needs to minimize monetary cost.
Most TCP/IP implementations do not support the ToS feature, although some newer implementations of BSD and routing protocols such as OSPF and IS-IS can make routing decisions on it.
Total Length: 16 bits
This is the length of the datagram, measured in bytes, including the header and data. By using this field and the header length, it can be determined where the data starts and ends. This field allows the length of a datagram to be up to 216 = 65,536 bytes, the maximum size of the segment handed down to IP from the protocol above it.
Such long datagrams are, however, impractical for most hosts and networks. All hosts must at least be prepared to accept datagrams of up to 576 octets (whether they arrive whole or in fragments). It is recommended that hosts only send datagrams larger than 576 octets if they have the assurance that the destination is prepared to accept the larger datagrams.
The number 576 is selected to allow a reasonably-sized data block to be transmitted in addition to the required header information. For example, this size allows a data block of 512 octets plus 64 header octets to fit in a datagram, which is the maximum size permitted by X.25. A typical IP header is 20 octets, allowing some space for headers of higher-level protocols.
Identification: 16 bits
This number uniquely identifies each datagram sent by a host and is normally incremented by one for each datagram sent. In the case of fragmentation it is appended to all fragments of the same datagram for the sake of reconstructing the datagram at the receiving end. It can be compared to the ‘tracking’ number of an item delivered by registered mail or by courier.
Flags: 3 bits
There are two flags:
Fragment Offset: 13 bits
This field indicates where in the original datagram this fragment belongs. The fragment offset is measured in units of 8 bytes (64 bits). In other words, the transmitted offset value is equal to the actual offset divided by eight. This constraint necessitates fragmentation in such a way that the offset is always exactly divisible by eight. It had to be done this way since for large chunks of data (close to 64K) the offset value for the last fragment would be almost 64K, requiring a 16 bit field. However, three bits have been used for the Flags field and therefore the actual offset (in bytes) is divided by 23 = 8 as a precaution.
Time to Live: 8 bits
The purpose of this field is to cause undeliverable datagrams to be discarded. Every router that processes an IP datagram must decrease the TTL by one and if this field contains the value zero, then the datagram must be destroyed.
The original design called for TTL to be a timer function, but that is difficult to implement and currently all routers simply decrement TTL every time they pass a datagram.
Protocol: 8 bits
This field indicates the next (higher) level protocol used in the data portion of the IP datagram, in other words the protocol that resides above IP in the protocol stack and which has passed the datagram on to IP.
Typical values are 0x0806 for ARP and 0x8035 for RARP (0x meaning ‘hexadecimal’).
Header Checksum: 16 bits
This is a checksum on the header only, referred to as a ‘standard Internet checksum’. Since some header fields change (e.g. TTL), this is recomputed and verified at each point that the IP header is processed. It is not necessary to cover the data portion of the datagram, as the protocols making use of IP, such as ICMP, IGMP, UDP and TCP, all have a checksum in their headers to cover their own header and data.
To calculate it, the header is divided up into 16-bit words. These words are then added together (normal binary addition with carry) one by one, and the interim sum stored in a 32-bit accumulator. When done, the upper 16 bits of the result is stripped off and added to the lower 16 bits. If, after this, there is a carry over to the 17th bit, it is carried back and added to bit 0. The result is then truncated to 16 bits.
Source and Destination Address: 32 bits each
These are the 32-bit IP addresses of both the origin and the destination of the datagram.
It should be clear by now that IP might often have difficulty in sending packets across a network since, for example, Ethernet can only accommodate 1500 octets at a time and X.25 is limited to 576. This is where the fragmentation process comes into play. The relevant field here is ‘fragment offset’ (13 bits) while the relevant flags are DF (don’t fragment) and MF (more fragments).
Consider a datagram consisting of an IP header followed by 3500 bytes of data. This cannot be transported over an Ethernet network, so it has to be fragmented in order to ‘fit’. The datagram will be broken up into three separate datagrams, each with their own IP header, with the first two frames around 1500 bytes and the last fragment around 500 bytes. The three frames will travel to their destination independently, and will be recognized as fragments of the original datagram by virtue of the number in the identifier field. However, there is no guarantee that they will arrive in the correct order, and the receiver needs to reassemble them.
For this reason the Fragment Offset field indicates the distance or offset between the start of this particular fragment of data, and the starting point of the original frame. One problem though – since only 13 bits are available in the header for the fragment offset (instead of 16), this offset is divided by 8 before transmission, and again multiplied by 8 after reception, requiring the data size (i.e. the offset) to be a multiple of 8 – so an offset of 1500 won’t do. 1480 will be OK since it is divisible by 8. The data will be transmitted as fragments of 1480, 1480 and finally the remainder of 540 bytes. The fragment offsets will be 0, 1480 and 2960 bytes respectively, or 0, 185 and 370 – after division by 8.
Incidentally, another reason why the data per fragment cannot exceed 1480 bytes for Ethernet, is that the IP header has to be included for each datagram (otherwise individual datagrams will not be routable) and hence 20 of the 1500 bytes have to be forfeited to the IP header.
The first frame will be transmitted with 1480 bytes of data, Fragment Offset = 0, and MF (more flag) = 1.
The second frame will be transmitted with the next 1480 bytes of data, Fragment Offset = 185, and MF = 1.
The last third frame will be transmitted with 540 bytes of data, Fragment Offset = 370, MF = 0.
Some protocol analyzers will indicate the offset in hexadecimal; hence it will be displayed as 0xb9 and 0x172, respectively.
For any given type of network the packet size cannot exceed the so-called MTU (Maximum Transmission Unit) for that type of network. The following are some typical values:
The fragmentation mechanism can be checked by doing a ‘ping’ across a network, and setting the data (–l) parameter in the ping command to exceed the MTU value for the network.
IPng (‘IP new generation’), as documented in RFC 1752, was approved by the Internet Engineering Steering Group in November 1994 and made a Proposed Standard. The formal name of this protocol is IPv6 (‘IP version 6’). After extensive testing, IANA gave permission for its deployment in mid-1999.
IPv6 is an update of IPv4, to be installed as a ‘backwards compatible’ software upgrade, with no scheduled implementation dates. It runs well on high performance networks such as ATM, and at the same time remains efficient enough for low bandwidth networks such as wireless LANs. It also makes provision for Internet functions such as audio broadcasting and encryption.
Upgrading to and deployment of IPv6 can be achieved in stages. Individual IPv4 hosts and routers may be upgraded to IPv6 one at a time without affecting any other hosts or routers. New IPv6 hosts and routers can be installed one by one. There are no pre-requisites to upgrading routers, but in the case of upgrading hosts to IPv6 the DNS server must first be upgraded to handle IPv6 address records.
When existing IPv4 hosts or routers are upgraded to IPv6, they may continue to use their existing address. They do not need to be assigned new IPv6 addresses; neither do administrators have to draft new addressing plans.
The simplicity of the upgrade to IPv6 is brought about through the transition mechanisms built into IPv6. They include the following:
The IPv6 transition mechanisms ensure that IPv6 hosts can inter-operate with IPv4 hosts anywhere in the Internet up until the time when IPv4 addresses run out, and allows IPv6 and IPv4 hosts within a limited scope to inter-operate indefinitely after that. This feature protects the huge investment users have made in IPv4 and ensures that IPv6 does not render IPv4 obsolete. Hosts that need only a limited connectivity range (e.g., printers) need never be upgraded to IPv6.
The changes from IPv4 to IPv6 fall primarily into the following categories:
The header contains the following fields:
Ver: 4 bits
The IP version number, viz. 6.
Traffic Class: 8 bits
This is used in conjunction with the Flow Label and indicates an Internet traffic priority delivery value.
Flow Label: 20 bits
A flow is a sequence of packets sent from a particular source to a particular (unicast or multicast) destination for which the source desires special handling by the intervening routers. This is an optional field to be used if specific non-standard (‘non-default’) handling is required to support applications that require some degree of consistent throughput in order to minimize delay and/or jitter. These types of applications are commonly described as ‘multi-media’ or ‘real-time’ applications.
The flow label will affect the way the packets are handled but will not influence the routing decisions.
Payload Length: 16 bits
The payload is the rest of the packet following the IPv6 header, in bytes. The maximum payload that can be carried behind a standard IPv6 header cannot exceed 65,536 bytes. With an extension header this is possible, and the datagram is then referred to as a Jumbo datagram. Payload Length differs slightly from the IPv4 ‘total length’ field in that it does not include the header length.
Next Header: 8 bits
This identifies the type of header immediately following the IPv6 header, using the same values as the IPv4 protocol field. Unlike IPv4, where this would typically point to TCP or UDP, this field could either point to the next protocol header (TCP) or to the next IPv6 extension header.
Hop Limit: 8 bits
This is an unsigned integer, similar to TTL in IPv4. It is decremented by 1 by each node that forwards the packet. The packet is discarded if hop limit is decremented to zero.
Source IP Address: 128 bits
This is the IPv6 address of the initial sender of the packet.
Destination IP address: 128 bits
This is IPv6 Address of the intended recipient of the packet, which is not necessarily the ultimate recipient, if an optional routing header is present.
IPv6 includes an improved option mechanism. Instead of placing extra option bytes within the main header, IPv6 options are placed in separate extension headers that are located between the IPv6 header and the Transport layer header in a packet.
Most IPv6 extension headers are not examined or processed by routers along a packet’s path until it arrives at its final destination. This leads to a major improvement in router performance for packets containing options. In IPv4 the presence of any options requires the router to examine all options.
IPv6 extension headers can be of arbitrary length and the total amount of options carried in a packet is not limited to 40 bytes as with IPv4. They are also not carried within the main header, as with IPv4, but are only used when needed, and are carried behind the main header. This feature, plus the manner in which they are processed, permits IPv6 options to be used for functions that were not practical in IPv4. Examples of this are the IPv6 authentication and security encapsulation options.
In order to improve the performance when handling subsequent option headers and the transport protocol which follows, IPv6 options are always an integer multiple of 8 bytes long in order to retain this alignment for subsequent headers.
The recommended sequence of headers in an IPv6 packet per RFC 2460 is:
IPv6 addresses are 128 bits long and are identifiers for individual interfaces or sets of interfaces. IPv6 Addresses of all types are assigned to interfaces (i.e. NICs) and NOT to nodes i.e. hosts. Since each interface belongs to a single node, any of that node’s interfaces’ unicast addresses may be used as an identifier for the node. A single interface may be assigned multiple IPv6 addresses of any type.
There are three types of IPv6 addresses. These are unicast, anycast, and multicast.
The IPv6 address is four times the length of IPv4 addresses (128 vs 32). This is 4 billion times 4 billion (296) times the size of the IPv4 address space. This works out to be 340,282,366,920,938,463,463,374,607,431,768,211,456. Theoretically this is approximately 665,570,793,348,866,943,898,599 addresses per square meter of the surface of the Earth (assuming the Earth’s surface is 511,263,971,197,990 square meters). In more practical terms, considering that the creation of addressing hierarchies will reduce the efficiency of the usage of the address space, IPv6 is still expected to support between 8×1017 to 2×1033 nodes. Even the most pessimistic estimate provides around 1500 addresses per square meter of the surface of the Earth.
IPv6 addresses are written in hexadecimal rather than the IPv4 dotted decimal notation. Each cluster of 4 characters is truncated with a colon (‘:’) sign. This results in an address such as 1234:5678:DEF0:1234:5678:0000:0000:9ABC. The ‘0000’ can be expressed as a single ‘0’, i.e. 1234:5678:DEF0:1234:5678:0:0:9ABC. Alternatively the entire string of ‘0’s can be written as ::, which results in 1234:5678:DEF0:1234:5678::9ABC.
IPv6 addresses are allocated to ‘groups’ of addresses used for specific purposes. For example, all addresses starting with the bits 001 (a total of 1/8 of the overall address space) have been allocated as Global Unicast addresses and those starting with 1111 1110 10 (a total of 1/1024 of the total address space) have been allocated as Link-local Unicast addresses. These leading bits were initially called the Format Prefix but the concept of formal Format Prefixes was deprecated in RFC3513. The bits are still there, though they are just not called anything. Global Unicast Addresses therefore start with the binary sequence 001x xxxx xxxx xxxx etc. This is written as 2000::/3, meaning that the first three bits (001) designate a Global Unicast Address and the rest can be anything.
There are several forms of unicast address assignment in IPv6. These are:
Site-local addresses were initially included in this list, but deprecated in September 2004.
Global Unicast addresses
These addresses are used for global communication. They are similar in function to IPv4 addresses under CIDR. Their format is:
The original format included a TLA (Top Level Aggregator), NLA (Next Level Aggregator) and SLA (Site Level Aggregator) but this approach has been deprecated in 1994.
Prefix: 48 bits
The first three bits are always 001, i.e. 2000:/3. The Prefix is the network ID used for routing to a particular site and is similar to the IPv4 NetID, although the composition of the number is more complex and indicates the hierarchy of service providers from the Internet backbone down to the user’s site.
Subnet ID: 16bits
This identifies a subnet within a site.
Interface ID: 64 bits
This is the unique identifier for a particular interface (e.g. a host). It is unique within the specific prefix and subnet and is the equivalent of the ‘HostID’ in IPv4. However, instead of an arbitrary number it would consist of the hardware address of the interface, e.g. the Ethernet MAC address in the IEEE EUI-64 format
Existing 48-bit MAC addresses are converted to the EUI-64 format by splitting them in the middle and inserting the string FF-FE in between the two halves.
Unspecified addresses
This can be written as 0:0:0:0:0:0:0:0, or simply :: (double colon). This address can be used as a source address by a station that has not yet been configured with an IP address. It can never be used as a destination address. This is similar to 0.0.0.0 in IPv4.
Loopback addresses
The loopback address 0:0:0:0:0:0:0:1 can be used by a node to send a datagram to itself. It is similar to the 127.0.0.1 of IPv4.
IPv4-based addresses
It is possible to construct an IPv6 address out of an existing IPv4 address. This is done by prepending 96 zero bits to an IPv4 address. The result is written as 0:0:0:0:0:0:192.100.100.3, or simply ::192.100.100.3.
Link-local unicast addresses
Stations that are not yet configured with either a provider-based address or a site local address may use link local addresses. Link-local addresses start with 1111 1110 10 i.e. FE80::/10. These addresses can only be used by stations connected to the same local network and packets addressed in this way cannot traverse a router, since there is no routing information contained within the header.
Anycast addresses
An IPv6 Anycast address is an address that is assigned to more than one interface (typically belonging to different nodes), with the property that a packet sent to an Anycast address is routed to the ‘nearest’ interface having that address, according to the routing protocols’ measure of distance. Anycast addresses, when used as part of a route sequence, permits a node to select which of several internet service providers it wants to carry its traffic. This capability is sometimes called ‘source selected policies’. This would be implemented by configuring Anycast addresses to identify the set of routers belonging to ISPs (e.g. one Anycast address per ISP). These Anycast addresses can be used as intermediate addresses in an IPv6 routing header, to cause a packet to be delivered via a particular provider or sequence of providers.
Other possible uses of Anycast addresses are to identify the set of routers attached to a particular subnet, or the set of routers providing entry into a particular routing domain. Anycast addresses are allocated from the unicast address space, using any of the defined unicast address formats. Thus, Anycast addresses are syntactically indistinguishable from Unicast addresses. When a Unicast address is assigned to more than one interface, it ‘automatically’ turns it into an Anycast address. The nodes to which the address is assigned must be explicitly configured to know that it is an Anycast address.
Multicast addresses
IPv6 multicast addresses (FF00:/8) are identifiers for groups of interfaces. An interface may belong to any number of multicast groups. Multicast addresses have the following format:
The values are:
0 | Reserved | |
1 | Node-local | scope |
2 | Link-local | scope |
5 | Site-local | scope |
8 | Organization-local | scope |
14 | Global scope | |
15 | Reserved |
The following example shows how it all fits together. The multicast address FF:08::43 points to all NTP servers in a given organization, in the following way:
The Address Resolution Protocol (ARP) is used with IPv4. Initially the designers of IPv6 assumed that it would use ARP as well, but subsequent work by the SIP, SIPP and IPv6 working groups led to the development of the IPv6 ‘neighbor discovery’ procedures that encompass ARP, as well as those of router discovery.
Some network technologies make address resolution difficult. Ethernet NICs, for example, come with built-in 48-bit hardware (MAC) addresses. This creates several difficulties:
To overcome these problems in an efficient manner, and eliminate the need for applications to know about MAC addresses, the Address Resolution Protocol (ARP) (RFC 826) resolves addresses dynamically.
When a host wishes to communicate with another one on the same physical network, it needs the destination MAC address in order to compose the basic layer 2 frame. If it does not know what the destination MAC address is, but has the IP address, it invokes ARP, which broadcasts a special type of datagram in order to resolve the problem. This is called an ARP request. This datagram requests the owner of the unresolved IP address to reply with its MAC address. All hosts on the local network will receive the broadcast, but only the one that recognizes its own IP address will respond.
While the sender could, of course, just broadcast the original message to all hosts on the network, this would impose an unnecessary load on the network, especially if the datagram was large. A small ARP request, followed by a small ARP reply, followed by a unicast transmission of the original datagram, is a much more efficient way of resolving the problem.
Because communication between two computers usually involves transfer of a succession of datagrams, it is prudent for the sender to ‘remember’ the MAC information it receives, at least for a while. Thus, when the sender receives an ARP reply, it stores the MAC address as well as the corresponding IP address in its ARP cache. Before sending any message to a specific IP address it checks first to see if the relevant address binding is in the cache. This saves it from repeatedly broadcasting identical ARP requests.
The ARP cache holds 4 fields of information for each device:
The layout of an ARP datagram is as follows:
Hardware Type: 16 bits
Specifies the hardware interface type of the target, e.g.:
1 = Ethernet
3 = X.25
4 = Token ring
6 = IEEE 802.x
7 = ARCnet
Protocol Type: 16 bits
Specifies the type of high-level protocol address the sending device is using. For example,
204810 (0x800): IP
205410 (0x0806): ARP
328210 (0x8035): RARP
HA Length: 8 bits
The length, in bytes, of the hardware (MAC) address. For Ethernet it is 6.
PA Length: 8 bits
The length, in bytes, of the internetwork protocol address. For IP it is 4.
Operation: 8 bits
Indicates the type of ARP datagram:
1 = ARP request
2 = ARP reply
3 = RARP request
4 = RARP reply
Sender HA: 48 bits
The hardware (MAC) address of the sender, expressed in hexadecimal.
Sender PA: 32 bits
The (internetwork) protocol address of the sender. In the case of TCP/IP this will be the IP address, expressed in hexadecimal (not dotted decimal notation).
Target HA: 48 bits
The hardware (MAC) address of the target host, expressed in hexadecimal.
Target PA: 32 bits
The (internetwork) protocol (IP) address of the target host, expressed in hexadecimal.
Because of the use of fields to indicate the lengths of the hardware and protocol addresses, the address fields can be used to carry a variety of address types, making ARP applicable to a number of different network standards.
The broadcasting of ARP requests presents some potential problems. Networks such as Ethernet employ connectionless delivery systems i.e. the sender does not receive any feedback as to whether datagrams it has transmitted were received by the target device. If the target is not available, the ARP request destined for it will be lost without trace and no ARP response will be generated. Thus the sender must be programmed to retransmit its ARP request after a certain time period, and must be able to store the datagram it is attempting to transmit in the interim. It must also remember what requests it has sent out so that it does not send out multiple ARP requests for the same address. If it does not receive an ARP reply it will eventually have to discard the outgoing datagrams.
Because it is possible for a machine’s hardware address to change, as happens when an Ethernet NIC fails and has to be replaced, entries in an ARP cache have a limited life span after which they are deleted. Every time a machine with an ARP cache receives an ARP message, it uses the information to update its own ARP cache. If the incoming address binding already exists it overwrites the existing entry with the fresh information and resets the timer for that entry.
The host trying to determine another machine’s MAC address will send out an ARP request to that machine. In the datagram it will set operation = 1 (ARP request), and insert its own IP and MAC addresses as well as the destination machine’s IP address in the header. The field for the destination machine’s MAC address will be left zero.
It will then broadcast this message using all ‘ones’ in the destination address of the LLC frame so that all hosts on that subnet will ‘see’ the request.
If a machine is the target of an incoming ARP request, its own ARP software will reply. It swaps the target and sender address pairs in the ARP datagram (both HA and PA), inserts its own MAC address into the relevant field, changes the operation code to 2 (ARP reply), and sends it back to the requesting host.
Proxy ARP enables a router to answer ARP requests made to a destination node that is not on the same subnet as the requesting node. Assume that a router connects two subnets, subnet A and subnet B. Subnet A is 192.168.0.0/16 and subnet B is 192.168.1.0./24. If host A1 (192.168.0.1) on subnet A tries to ping host B1 (192.168.1.1) on subnet B, it will believe B1 resides on the same subnet and will therefore send an ARP request to host B1. This would normally not work as an ARP can only be performed between hosts on the same subnet (where all hosts can ‘see’ and respond to the FF:FF:FF:FF:FF:FF broadcast MAC address). The requesting host, A1, would therefore not get a response.
If proxy ARP has been enabled on the router, it will recognize this request and issue its own ARP request, on behalf of A1, to B1. Upon obtaining a response from B1, it would report back to A1 on behalf of B1. The MAC address returned to A1 will not be that of B1, but rather that of the router NIC connected to subnet A, as this is the physical address where A1 will send data destined for B1.
Gratuitous ARP occurs when a host sends out an ARP request looking for its own address. This is normally done at the time of boot-up (when the TCP/IP stack is being initialized) in order to detect any duplicate IP addresses. The initializing host would not expect a response to the request. If a response does appear, it means that another host with a duplicate IP address exists on the network.
As its name suggests the Reverse Address Resolution Protocol (RARP), as described in RFC 903, does the opposite to ARP. It is used to obtain an IP address when the physical address is known.
Usually a machine holds its own IP address on its hard drive, where the operating system can find it on startup. However, a diskless workstation is only aware of its own hardware address and has to recover its IP address from an address file on a remote server at startup. It could use RARP to retrieve its IP address.
A diskless workstation broadcasts an RARP request on the local network using the same datagram format as an ARP request. It has, however, an opcode of 3 (RARP request), and identifies itself as both the sender and the target by placing its own physical address in both the sender hardware address field and the target hardware address field. Although the RARP request is broadcast, only a RARP server (i.e. a machine holding a table of addresses and programmed to provide RARP services) can generate a reply. There should be at least one RARP server on a network.
The RARP server changes the opcode to 4 (RARP reply). It then inserts the missing address in the target IP address field, and sends the reply directly back to the requesting machine. The requesting machine then stores it in memory until next time it reboots.
All RARP servers on a network will reply to a RARP request, even though only one reply is required. The RARP software on the requesting machine sets a timer when sending a request and retransmits the request if the timer expires before a reply has been received.
On a best-effort LAN such as Ethernet, the provision of more than one RARP server reduces the likelihood of RARP replies being lost or dropped because the server is down or overloaded. This is important because a diskless workstation often requires its own IP address before it can complete its bootstrap procedure. To avoid multiple and unnecessary RARP responses on a broadcast-type network such as Ethernet, each machine on the network is assigned a particular server, called its primary RARP server. When a machine broadcasts a RARP request, all servers will receive it and record its time of arrival, but only the primary server for that machine will reply. If the primary server is unable to reply for any reason, the sender’s timer will expire, it will rebroadcast its request and all non-primary servers receiving the rebroadcast so soon after the initial broadcast will respond.
Alternatively, all RARP servers can be programmed to respond to the initial broadcast, with the primary server set to reply immediately, and all other servers set to respond after a random time delay. The retransmission of a request should be delayed long enough for these delayed RARP replies to arrive.
RARP has several drawbacks. Firstly, very little information (only an IP address) is returned. Protocols such as BootP provide significantly more information, such as the name and location of required boot-up files. Secondly, RARP is a layer 2 protocol and uses a MAC address to obtain an IP address, hence it cannot be routed.
Errors occur in all networks. These arise when destination nodes fail, or become temporarily unavailable, or when certain routes become overloaded with traffic. A message mechanism called the Internet Control Message Protocol (ICMP) is incorporated into the TCP/IP protocol suite to report errors and other useful information about the performance and operation of the network.
ICMP communicates between the Internet layers on two nodes and is used by routers as well as individual hosts. Although ICMP is viewed as residing within the Internet layer, its messages travel across the network encapsulated in IP datagrams in the same way as higher layer protocol (such as TCP or UDP) datagrams. This is done with the protocol field in the IP header set to 0x01, indicating that an ICMP datagram is being carried. The reason for this approach is that, due to its simplicity, the ICMP header does not include any IP address information and is therefore in itself not routable. It therefore has little choice but to rely on IP for delivery. The ICMP message, consisting of an ICMP header and ICMP data, is encapsulated as ‘data’ within an IP datagram with the resultant structure indicated in the figure below.
The complete IP datagram, in turn, has to depend on the lower network interface layer (for example, Ethernet) and is thus contained as payload within the Ethernet data area.
The various uses for ICMP include:
There are a variety of ICMP messages, each with a different format, yet the first 3 fields as contained in the first 4 bytes (‘long word’) are the same for all, as shown in Figure 6.27.
The three common fields are:
ICMP messages can be further subdivided into two broad groups viz. ICMP error messages and ICMP query messages as follows.
ICMP error messages
ICMP query messages
Too many ICMP error messages in the case of a network experiencing errors due to heavy traffic can exacerbate the problem, hence the following conditions apply:
Here follows a few examples of ICMP error messages.
If a router receives a high rate of datagrams from a particular source it will issue a source quench ICMP message for every datagram it discards. The source node will then slow down its rate of transmission until the source quench messages stop; at which stage it will gradually increase the rate again.
Apart from the first 3 fields, already discussed, the header contains the following additional fields:
When a router detects that a source node is not using the best route in which to transmit its datagram, it sends a message to the node advising it of the better route.
Apart from the first 3 fields, already discussed, the header contains the following additional fields:
The code values for redirect messages are as follows.
If an IP datagram has traversed too many routers, its TTL counter will eventually reach a count of zero. The ICMP Time Exceeded message is then sent back to the source node. The Time Exceeded message will also be generated if one of the fragments of a fragmented datagram fails to arrive at the destination node within a given time period and as a result the datagram cannot be reconstructed.
The code field value is then as follows.
Code 1 refers to the situation where a gateway waits to reassemble a few fragments and a fragment of the datagram never arrives at the gateway.
When there are problems with a particular datagram’s contents, a parameter problem message is sent to the original source. The pointer field points to the problem bytes. Code 1 is used to indicate that a required option is missing but that the pointer field is not being used.
When a gateway is unable to deliver a datagram, it responds with this message. The datagram is then deleted.
The values relating the code values in the above unreachable message are as follows.
In addition to the reports on errors and exceptional conditions, there is a set of ICMP messages to request information, and to reply to such requests.
Echo Request and Reply
An Echo Request message is sent to the destination node to essentially enquire: ‘Are you alive?’ A reply indicates that the pathway (i.e. the network links in-between as well as the routers) and the destination node are all operating correctly. The structure of the Request and Reply is indicated below.
The first three fields have already been discussed. The additional fields are:
Time-stamp Request and Reply
This can be used to synchronize the clock of a host with that of a time server.
The ICMP Time-stamp Request and Reply messages enable a client to adjust its clock against that of an accurate server. The times referred to hereunder are 32-bit integers, measured in milliseconds since midnight, Universal Co-ordinated Time (UCT). This was previously known as Greenwich Mean Time or GMT.
The adjustment is initiated by the client inserting its current time in the ‘originate’ field, and sending the ICMP datagram off to the server. The server, upon receiving the message, then inserts the ‘received’ time in the appropriate field.
The server then inserts its current time in the ‘transmit’ field and returns the message. In practice, the ‘received’ and ‘transmit’ fields for the server are set to the same value.
The client, upon receiving the message back, records the ‘present’ time (albeit not within the header structure). It then deducts the ‘originate’ time from the ‘present’ time. Assuming negligible delays at the server, this is the time that the datagram took to travel to the server and back, or the Round Trip Time (RTT). The time to the server is then one-half of this value.
The correct time at the moment of originating the message at the client is now calculated by subtracting the RTT from the ‘transmit’ time-stamp created by the server. The client calculates its error by the relationship between the ‘originate’ time-stamp and the actual time, and adjust its clock accordingly. By repeated application of this procedure all hosts on a LAN can maintain their clocks to within less than a millisecond of each other.
Unlike the Host-to-host layer protocols (e.g. TCP), which control end-to-end communications, IP is rather ‘short-sighted’. Any given IP node (host or router) is only concerned with routing the datagram to the next node, where the process is repeated. Very few routers have knowledge about the entire internetwork, and often the datagrams are forwarded based on default information without any knowledge of where the destination actually is.
Before discussing the individual routing protocols in any depth, the basic concepts of IP routing have to be clarified. This section will discuss the concepts and protocols involved in routing, while the routers themselves will be discussed in Chapter 10.
Refer to Figure 6.39. When the source host prepares to send a message to the destination host, a fundamental decision has to be made, namely: is the destination host also resident on the local network or not? If the NetID portions of the IP address match, the source host will assume that the destination host is resident on the same network, and will attempt to forward it locally. This is called direct delivery.
If not, the message will be forwarded to the ‘default gateway’, i.e. the IP address of a router on the local network. If the router can deliver it directly i.e. the host resides on a network directly connected to the router, it will. If not, it will consult its routing tables and forward it to the next appropriate router.
This process will repeat itself until the packet is delivered to its final destination, and is referred to as indirect delivery.
Each router has a routing table that indicates the next IP address where the packet has to be sent in order to eventually arrive at its ultimate destination. These routing tables can be maintained in two ways. In most cases, the routing protocols will do this automatically. The routing protocols are implemented in software that runs on the routers, enabling them to communicate with each other on a regular basis, allowing them to share their knowledge of the network with each other. In this way they continuously ‘learn’ about the topology of the system, and upgrade their routing tables accordingly. This process is called dynamic routing.
If, for example, a particular router is removed from the system, the routing tables of all routers containing a reference to that router will change. However, because of the interdependence of the routing tables, a change in any given table will initiate a change in many other routers and it will be a while before the tables stabilize. This process is known as convergence.
Dynamic routing can be further sub-classified as distance vector, link-state or hybrid; depending on the method by which the routers calculate the optimum path.
In distance vector dynamic routing, the ‘metric’ or yardstick used for calculating the optimum routes is simply based on distance, i.e. which route results in the least number of ‘hops’ to the destination. Each router constructs a table that indicates the number of hops to each known network. It then periodically passes copies of its tables to its immediate neighbors. Each recipient of the message simply adjusts its own tables based on the information received from its neighbor.
The major problem with the distance vector algorithm is that it takes some time to converge to a new understanding of the network. The bandwidth and traffic requirements of this algorithm can also affect the performance of the network. Its major advantage is that it is simple to configure and maintain as it only uses the distance to calculate the optimum route.
Link state routing protocols are also known as ‘shortest path first’ protocols. This is based on the routers exchanging link state advertisements to the other routers. Link state advertisement messages contain information about error rates and traffic densities and are triggered by events rather than running periodically as with the distance routing algorithms.
Hybridized routing protocols use both the methods described above and are more accurate than the conventional distance vector protocols. They converge more rapidly to an understanding of the network than distance vector protocols and avoid the overheads of the link state updates. The best example of this one is the Enhanced Interior Gateway Routing Protocol (EIGRP).
It is also possible for a network administrator to make static entries into routing tables. These entries will not change, even if a router that they point to is not operational.
For the purpose of routing, a TCP/IP-based internetwork can be divided into several Autonomous Systems (ASs) or domains. An AS consists of hosts, routers and data links that form several physical networks but is administered by a single authority such as a service provider, university, corporation, or government agency.
ASs can be classified under one of three categories:
Routing decisions that are made within an AS are totally under the control of the administering organization. Any routing protocol, using any type of routing algorithm, can be used within an AS since the routing between two hosts in the system is completely isolated from any routing that occurs in other ASs. Only if a host within one AS communicates with a host outside the AS, will another AS (or ASs) be involved.
There are two main categories of TCP/IP routing protocols, namely Interior Gateway Protocols (IGPs) and Exterior Gateway Protocols (EGPs).
Two routers that communicate directly with one another and are both part of the same AS are said to be interior neighbors and are called Interior Gateways. They communicate with each other using IGPs.
In a simple AS consisting of only a few physical networks, the routing function provided by IP may be sufficient. In larger ASs, however, sophisticated routers using adaptive routing algorithms may be needed. These routers will communicate with each other using IGPs such as RIP or OSPF.
Routers in different ASs, however, cannot use IGPs for communication for more than one reason. Firstly, IGPs are not optimized for long-distance path determination. Secondly, the owners of ASs (particularly ISPs) would find it unacceptable for their routing metrics (which include sensitive information such as error rates and network traffic) to be visible to their competitors. For this reason routers that communicate with each other and are resident in different ASs communicate with each other using Exterior Gateway Protocols or EGPs. The Border routers on the periphery, connected to other ASs, must be capable of handling both the appropriate IGPs and EGPs.
The most common EGP (in fact, the only) currently used in the TCP/IP environment is Cisco’s Border Gateway Protocol (BGP), the current version being BGPv4.
The protocols that will be discussed are RIP (Routing Information Protocol), EIGRP (Enhanced Interior Gateway Routing Protocol), and OSPF (Open Shortest Path First).
RIP
RIP originally saw the light as RIP (RFC 1058, 1388) and is one of the oldest routing protocols. The original RIP had a shortcoming in that it could not handle variable-length subnet masks, and hence could not support CIDR. This capability was included with RIPv2. Other versions of RIP include RIPv3 for UNIX platforms and RIPng for IPv6.
RIP is a distance vector routing protocol that uses hop counts as a metric (i.e. form of measurement). Each router, using a special packet to collect and share information about distances, keeps a routing table of its perspective of the network showing the number of hops required to reach each network. In order to maintain their individual perspective of the network, routers periodically pass copies of their routing tables to their immediate neighbors. Each recipient adds a distance vector to the table and forwards the table to its immediate neighbors. The hop count is incremented by one every time the packet passes through a router. RIP only records one route per destination (even if there are more).
Figure 6.41 shows a sample network and the relevant routing tables.
The RIP routers have fixed update intervals and each router broadcasts its entire routing table to other routers at 30-second intervals (60 seconds for Netware RIP). Each router takes the routing information from its neighbor, adds or subtracts one hop to the various routes to account for itself, and then broadcasts its updated table.
Every time a router entry is updated, the timeout value for the entry is reset. If an entry has not been updated within 180 seconds it is assumed suspect and the hop field set to 16 to mark the route as unreachable and it is later removed from the routing table.
One of the major problems with distance vector protocols like RIP is the ‘convergence time’, which is the time it takes for the routing information on all routers to settle in response to some change to the network. For a large network the convergence time can be long and there is a greater chance of frames being misrouted.
RIPv2 (RFC1723) also supports:
RIPng has features similar to that of RIPv2, although it does not support authentication as it uses the standard IPSec security features defined for IPv6.
EIGRP
EIGRP is an enhancement of the original IGRP, a proprietary routing protocol developed by Cisco Systems for use on the Internet. IGRP is outdated since it cannot handle CIDR and VLSM.
EIGRP is a distance vector routing protocol that uses a composite metric for route calculations. It allows for multipath routing, load balancing across 2, 3 or 4 links, and automatic recovery from a failed link. Since it does not only take hop count into consideration, it has better real-time appreciation of the link status between routers and is more flexible than RIP. Like RIP it broadcasts whole routing table updates, but at 90 second intervals.
Each of the metrics used in the calculation of the distance vectors has a weighting factor. The metrics used in the calculation are as follows:
The metric used is:
Metric = K1 * bandwidth + (K2 * bandwidth)/ (256 – Load) + K3 * Delay
Where K1, K2 and K3 are weighting factors.
Reliability is also added by using the modified metric, which modifies the metric calculated in the first equation above.
Metricmodified = Metric * K5/(reliability + K4)
One of the key design parameters of EIGRP is complete independence from routed protocols. Hence EIGRP has implemented a modular approach to supporting routed protocols and can easily be retrofitted to support any other routed protocol.
OSPF
This was designed specifically as an IP routing protocol, hence it cannot transport any other routing protocols such as IPX. It is encapsulated directly in the IP protocol. OSPF can quickly detect topological changes by flooding link state advertisements to all the other neighbors with reasonably quick convergence.
OSPF is a link state routing or Shortest Path First (SPF) protocol, with the first version detailed in RFC 1131. Version 2 (OSPFv2) was released in 1991 and detailed in RFC 1247. Version 2 was subsequently revised in RFCs 1583, 2178 and 2328 with the last being the current version. With OSPF, each router periodically uses a broadcast mechanism to transmit information to all other routers, about its own directly connected routers and the status of the data links to them. Based on the information received from all the other routers each router then constructs its own network routing tree using the shortest path algorithm.
These routers continually monitor the status of their links by sending packets to neighboring routers. When the status of a router or link changes this information is broadcast to the other routers, which then update their routing tables. This process is known as flooding and the packets sent are very small representing only the link state changes.
Using cost as the metric OSPF can support a much larger network than RIP, which is limited to 15 routers. A problem area can be in mixed RIP and OSPF environments if routers go from RIP to OSPF and back when hop counts are not incremented correctly.
The first Exterior Gateway Protocol was, in fact called EGP! The current de facto Internet standard for inter-domain (AS) routing is BGPv4.
BGPv4, as detailed in RFC 1771, performs intelligent route selection based on the shortest autonomous system path. In other words, whereas IGPs such as RIP make decisions on the number of routers to a specific destination, BGPv4 bases its decisions on the number of ASs to a specific destination. It is a so-called path vector protocol, and runs over TCP (port 179).
BGP routers in one AS speak BGP to routers in other ASs, where the ‘other’ AS might be that of an ISP, or another company. Companies with an international presence and a large, global WAN may also opt to have a separate AS on each continent (running OSPF internally) and run BGP between them in order to create a clean separation.
GGP comes in two ‘flavors’ namely ‘internal’ BGP (iBGP) and ‘external BGP’ (eBGP). IBGP is used within an AS and eBGP between ASs. In order to ascertain which one is used between two adjacent routers, one should look at the AS number for each router. BGP uses a formally registered AS number for entities that will advertise their presence in the Internet. Therefore, if two routers share the same AS number, they are probably using iBGP and if they differ, the routers speak eBGP. Incidentally, BGP routers are referred to as ‘BGP speakers’, all BGP routers are ‘peers’, and two adjacent BGP speakers are ‘neighbors.’
The range of non-registered (i.e. private) AS numbers is 64512–65535 and ISP typically issues these to stub ASs i.e. those that do not carry third-party traffic.
As mentioned earlier, iBGP is the form of BGP that exchanges BGP updates within an AS. Before information is exchanged with an external AS, iBGP ensures that networks within the AS are reachable. This is done by a combination of ‘peering’ between BGP routers within the AS and by distributing BGP routing information to IGPs that run within the AS, such as EIGRP, IS-IS, RIP or OSPF. Note that, within the AS, BGP peers do not have to be directly connected as long as there is an IGP running between them. The routing information exchanged consists of a series of AS numbers that describe the full path to the destination network. This information is used by BGP to construct a loop-free map of the network.
In contrast with iBGP, eBGP handles traffic between routers located on different ASs. It can do load balancing in the case of multiple paths between two routers. It also has a synchronization function that, if enabled, will prevent a BGP router from forwarding remote traffic to a transit AS before it has been established that all internal non-BGP routers within that AS are aware of the correct routing information. This is to ensure that packets are not dropped in transit through the AS.
When you have completed this chapter you should be able to:
The Host-to-Host communications layer (also referred to as the ‘Service’ layer, or as the ‘Transport’ layer in terms of the OSI model) is primarily responsible for ensuring end-to-end delivery of packets transmitted by IP. This additional reliability is needed to compensate for the lack of reliability in IP.
There are only two relevant protocols residing in this layer, namely TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). In addition to this, the Host-to-host layer includes the APIs (application programming interfaces) used by programmers to gain access to these protocols.
TCP is a connection-oriented protocol and is therefore reliable, although this word is used in a data communications context and not in the everyday sense. TCP establishes a connection between two hosts before any data is transmitted. Because a connection is set up beforehand, it is possible to verify that all packets are received on the other end and to arrange re-transmission in the case of lost packets. Because of all these built-in functions, TCP involves significant additional overhead in terms of processing time and header size.
TCP includes the following functions:
In order to achieve its intended goals, TCP makes use of ports and sockets, connection oriented communication, sliding windows, and sequence numbers/acknowledgments.
Whereas IP can route the message to a particular machine on the basis of its IP address, TCP has to know for which process (i.e. software program) on that particular machine it is destined. This is done by means of port numbers ranging from 1 to 65,535.
Port numbers are controlled by ICANN, the Internet Corporation for Assigned Names and Numbers. This task has previously been handled by its predecessor, IANA, the Internet Assigned Numbers Authority. These numbers can be divided into three groups.
In order to identify both the location and application to which a particular packet is to be sent, the IP address (location) and port number (process) is combined into a functional address called a socket. The IP address is contained in the IP header and the port number is contained in the TCP or UDP header.
In order for any data to be transferred under TCP, a socket must exist both at the source and at the destination. TCP is also capable of creating multiple sockets for the same port.
A fundamental notion in the TCP design is that every byte of data sent over the TCP connection has a unique 32-bit number. Of course this number cannot be sent along with every byte, yet it is nevertheless implied. However, the number of the first byte in each segment is included in the accompanying TCP header as a ‘Sequence Number’, and for each subsequent byte that number is simply incremented by the receiver in order to keep track of the bytes.
Before any data transmission takes place, both sender and receiver (e.g. client and server) have to agree on the Initial Sequence Numbers (ISNs) to be used. This process is described under ‘establishing a connection’. Since TCP supports bi-directional operation, both client and server will decide on their individual ISNs for the connection, even though data may only flow in one direction for that specific connection.
The sequence number cannot start at zero every time, as it will create serious problems in the case of short-lived multiple sequential connections between two machines. A packet with a Sequence Number from an earlier connection could easily arrive late, during a subsequent connection. The receiver would have difficulty in deciding whether the packet belongs to a former or to the current connection. It is easy to visualize a similar problem in real life. Imagine tracking a parcel carried by UPS if all agents started issuing tracking numbers beginning with zero every morning.
The Sequence Number is generated by means of a 32-bit software counter that starts at 0 during boot-up and increments at a rate of about once every 4 microseconds (although this varies depending on the operating system being used). When TCP establishes a connection, the value of the counter is read and used as the proposed ISN. This creates an apparently random choice of the ISN.
At some point during a connection the counter could roll over from (232 – 1) and start counting from 0 again. The TCP software takes care of this.
TCP acknowledges data received on a per segment basis, although several consecutive segments may be acknowledged at the same time. Since TCP chooses the segment size in such a way that it fits into the data field of the Data Link layer protocol, each segment is associated with a different frame.
The Acknowledgment Number returned to the sender to indicate successful delivery equals the number of the last byte received plus 1; hence it points to the next expected sequence number. For example: 10 bytes are sent, with Sequence Number equal to 33. This means that the first byte is numbered 33 and the last byte is numbered 42. If received successfully, an Acknowledgment Number of 43 will be returned. The sender now knows that the data has been received properly, as it agrees with that number.
The original version of TCP does not issue selective acknowledgments; so if a specific segment contains errors, the Acknowledgement Number returned to the sender will point to the first byte in the defective segment. This implies that the segment starting with that Sequence Number, and all subsequent segments (even though they may have been transmitted successfully) have to be retransmitted. This creates significant problems for long-delay (satellite) transmissions and in this regard a Selective Acknowledgement (SACK) option has been proposed in RFC 2018.
From the previous paragraph it should be clear that a duplicate acknowledgement received by the sender means that there was an error in the transmission of one or more bytes following that particular Sequence Number.
The Sequence Number and the Acknowledgment Number in a specific header are not related at all. The Sequence Number relates to outgoing data; the Acknowledgement Number refers to incoming data. During the connection establishment phase the Sequence Numbers for both hosts are set up independently; hence these two numbers will never bear any resemblance to each other.
Obviously there is a need to get some sort of acknowledgment back to ensure that there is a guaranteed delivery. This technique, called positive acknowledgment with retransmission, requires the receiver to send back an acknowledgment message within a given time. The transmitter starts a timer so that if no response is received from the destination node within a given time, another copy of the message will be transmitted. An example of this situation is given in Figure 7.2.
The sliding window form of positive acknowledgment is used by TCP, as it is very time consuming waiting for individual acknowledgments to be returned for each packet transmitted. Hence the idea is that a number of packets (with cumulative number of bytes not exceeding the window size) are transmitted before the source may receive an acknowledgment to the first message (due to time delays, etc). As long as acknowledgments are received, the window slides along and the next packet is transmitted.
During the TCP connection phase each host will inform the other side of its permissible window size. The Window size typically defaults to between 8192 and 32767 bytes. The number is contained in the registry and can be altered manually or by means of ‘tweaking’ software. For example, a Window size of 8192 bytes means that, using Ethernet, five full data frames comprising 5 × 1460 = 7300 bytes can be sent without acknowledgment. At this stage the Window size has shrunk to less than 1000 bytes, which means that unless an ACK is generated, the sender will have to suspend its transmission or send a smaller packet.
The Window field in the TCP header is 16 bits long, giving a maximum window size of 65535 bytes. This is insufficient for satellite links and a solution to this problem will be discussed in chapter 18.
A three-way SYN/ SYN_ACK/ACK handshake (as indicated in Figure 7.3) is used to establish a TCP connection. As this is a full duplex protocol it is possible (and necessary) for a connection to be established in both directions at the same time.
As mentioned before, TCP generates pseudo-random sequence numbers by means of a 32-bit software counter. The host establishing the connection reads a value ‘x’ from the counter where x can vary between 0 and (232 –1) and inserts it in the Sequence Number field. It then sets the SYN flag to ‘1’ and transmits the header (no data yet) to the appropriate IP address and port number. Assuming that the chosen sequence number was 132, this action would then be abbreviated as SYN 132.
The receiving host (e.g. the server) acknowledges this by incrementing the received Sequence Number by one, and sending it back to the originator as an Acknowledgment Number. It also sets the ACK flag to ‘1’ in order to indicate that this is an acknowledgment. This results in an ACK 133. The first byte sent by the client would therefore be numbered 133. Before acknowledging, the server obtains its own Sequence Number (y), inserts it in the header, and sets the SYN flag in order to establish a connection in the opposite direction. The header is then sent off to the originator (the client), conveying this message e.g. SYN 567. The composite ‘message’ contained within the header would thus be ACK 133, SYN 567.
The client, upon receiving this, notes that its own request for a connection has been complied with, and also proceeds to acknowledge the server’s request with an ACK 568. Two-way communication is now established.
An existing connection can be terminated in several ways.
One of the hosts can request to close the connection by setting the FIN flag. The other host can acknowledge this with an ACK, but does not have to close immediately as it may need to transmit more data. This is known as a half-close. When the second host is also ready to close, it will send a FIN that is acknowledged with an ACK. The resulting situation is known as a full close.
Alternatively, either of the nodes can terminate its connection with the issue of RST. However, RST is generally used when either node suspects a faulty connection, for example when it receives a frame with a Sequence Number that clearly does not belong to the current connection.
Both situations are depicted in Figure 7.4.
TCP normally breaks the data stream into what it regards are appropriately sized segments, based on some definition of efficiency. However, this may not be swift enough for an interactive keyboard application. Hence the Push instruction (PSH bit in the flag field) used by the application program forces delivery of bytes currently in the stream and the data will be immediately delivered to the process at the receiving end.
Both the transmitting and receiving nodes need to agree on the maximum size segments they will transfer. This is specified in the options field.
On the one hand TCP ‘prefers’ IP not to perform any fragmentation as this leads to a reduction in transmission speed due to the fragmentation process, and a higher probability of loss of a packet and the resultant retransmission of the entire packet.
On the other hand, there is an improvement in overall efficiency if the data packets are not too small and a maximum segment size is selected that fills the physical packets that are transmitted across the network. The current specification recommends a maximum segment size of 536 (this is the 576 byte default size of an X.25 frame minus 20 bytes each for the IP and TCP headers). If the size is not correctly specified, for example too small, the framing bytes (headers etc) consume most of the packet size resulting in considerable overhead. Refer to RFC 879 for a detailed discussion on this issue.
Generally speaking, the MSS is always 40 bytes less than the Maximum Transmission Unit (MTU) specified for a link, the difference being the combined size of the default-size IP and TCP headers.
The TCP frame consists of a header plus data and is structured as follows:
The various fields within the header are as follows:
Source Port: 16 bits
The source port number.
Destination Port: 16 bits
The destination port number.
Sequence Number: 32 bits
The number of the first data byte in the current segment, except when the SYN flag is set. If the SYN flag is set, a connection is still being established and the sequence number in the header is the initial sequence number (ISN). The first data byte to be sent will then be numbered ISN+1.
Refer to the discussion on Sequence Numbers.
Acknowledgment Number: 32 bits
If the ACK flag is set, this field contains the value of the next sequence number the sender of this message is expecting to receive. Once a connection has been established this is always sent.
Refer to the discussion on Acknowledgment Numbers.
Data Offset: 4 bits
The number of 32- bit words in the TCP header. (Similar to IHL in the IP header.) This indicates where the data begins. The size of the TCP header (even if it includes options) is always a multiple of 32 bits.
Reserved: 6 bits
Reserved for future use. Must be zero.
Control bits (flags): 6 bits
(From left to right)
Checksum: 16 bits
The checksum field is the 16-bit one’s complement of the one’s complement sum of all 16-bit words in the header and data. If a segment contains an odd number of header and data bytes to be check-summed, the last byte is padded on the right with zeros to form a 16-bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros. This is known as the ‘standard Internet checksum’, and is the same as the one used for the IP header.
The checksum also covers a 96-bit ‘pseudo header’ conceptually appended to the TCP header. This pseudo header contains the source IP address, the destination IP address, the protocol number for TCP (0x06), and the TCP PDU length. This pseudo header is only used for computational purposes and is NOT transmitted. This gives TCP protection against misrouted segments.
Window: 16 bits
The number of data bytes, beginning with the one indicated in the acknowledgement field, that the sender of this segment is willing or able to accept.
Refer to the discussion on Sliding Windows.
Urgent Pointer: 16 bits
Urgent data is placed in the beginning of a frame, and the urgent pointer points at the last byte of urgent data (relative to the sequence number i.e. the number of the first byte in the frame). This field is only being interpreted in segments with the URG control bit set.
Options: variable length
Options may occupy space at the end of the TCP header and are a multiple of 8 bits in length. All options are included in the checksum.
The second protocol that occupies the Host-to-Host layer is UDP. As in the case of TCP, it makes use of the underlying IP to deliver its PDUs.
UDP is a connectionless or non-connection-oriented protocol and does not require a connection to be established between two machines prior to data transmission. It is therefore said to be an ‘unreliable’ protocol – the word ‘unreliable’ used here as opposed to ‘reliable’ in the case of TCP.
As in the case of TCP, packets are still delivered to sockets or ports. However, since no connection is established beforehand, UDP cannot guarantee that packets are retransmitted if faulty, received in the correct sequence, or even received at all. In view of this one might doubt the desirability of such an unreliable protocol. There are, however, some good reasons for its existence.
Sending a UDP datagram involves very little overhead in that there are no synchronization parameters, no priority options, no sequence numbers, no retransmit timers, no delayed acknowledgement timers, and no retransmission of packets. The header is small; the protocol is fast and streamlined functionally. The only major drawback is that delivery is not guaranteed. UDP is therefore used for communications that involve broadcasts, for general network announcements, or for real-time data. A particularly good application is with streaming video and streaming audio where low transmission overheads are a prerequisite, and where retransmission of lost packets is not only unnecessary but also definitely undesirable.
The format of the UDP frame as well as the interpretation of its fields are described RFC 768. The frame consists of a header plus data and contains the following fields:
Source Port: 16 bits
This is an optional field. When used, it indicates the port of the sending process, and may be assumed to be the port to which a reply should be addressed in the absence of any other information. If not used, a value of zero is inserted.
Destination Port: 16 bits
As for source port
Message Length: 16 bits
This is the length, in bytes, of the datagram and includes the header as well as the data. (This means that the minimum value of the length is eight.)
Checksum: 16 bits
This is the 16-bit one’s complement of the one’s complement sum of all the bytes in the UDP Pseudo Header, the UDP header and the data, all strung together sequentially. The bytes are added together in chunks of two bytes at a time and, if necessary, padding in the form of ‘0’s are added to the end to make the whole string a multiple of two bytes.
The Pseudo Header is used for checksum generation only and contains the source IP address from the IP header, the destination IP address from the IP header, the protocol number from the IP header, and the Length from the UDP header. As in the case of TCP, this header is used for computational purposes only, and discarded after the computation. This information gives protection against misrouted datagrams. This checksum procedure is the same as is used in TCP.
If the computed checksum is zero, it is transmitted as all ‘1’s (the equivalent of ‘0’ in one’s complement arithmetic). An all-zero transmitted checksum value means that the transmitter generated no checksum. This is done for debugging or for higher level protocols that do not care.
UDP is designated protocol number 0x17 (23 decimal or 00010001 binary- see PROTO field in the UDP pseudo header) when used in conjunction with IP.
When you have completed study of this chapter you should have a basic understanding of the application and operation of the following application layer protocols:
This chapter examines the uppermost (Application) layer of the TCP/IP stack. Protocols at this layer act as intermediaries between user applications such as clients or servers (external to the stack) and the Host-to-Host layer protocols such as TCP or UDP. An example is HTTP, which acts as an interface between a web server (e.g. Apache) or a web client (e.g. IE6) and TCP.
The list of protocols supplied here is by no means complete, as new protocols are developed all the time. Using a developer’s toolkit such as WinSock, software developers can interface their own Application layer protocols to the TCP/IP protocol stack.
FTP (File Transfer Protocol) is specified in RFC 765. File transfer requires a reliable transport mechanism, and therefore TCP connections are used. The FTP process running making the file transfer request is called the FTP client, while the FTP process receiving the request is called the FTP server.
The process involved in requesting a file is as follows:
The control connection can now be used for another data transfer, or it can be closed
These commands are exchanged between the FTP client and FTP server. Each internal protocol command comprises a four-character ASCII sequence terminated by a new-line (<CRLF>) character. Some commands also require parameters. The use of ASCII character sequences for commands allows the user to observe and understand the command flow, and aids the debugging process. The user can communicate directly with the server program by using these codes, but in general it is much easier to use dedicated client software (e.g. CuteFTP) for this purpose.
FTP commands can be divided into three categories namely service commands, transfer parameter commands and access control commands. There is also a series of reply codes. Here follows a brief summary of the commands and reply codes.
Service commands
These commands define the operation required by the requester. The format of the pathname depends on the specific FTP server being used.
RETR<SP><pathname><CRLF> | Retrieve a copy of the file from the server |
STOR<SP><pathname><CRLF> | Store data at the server |
STOU<CRLF> | Store unique |
APPE<SP><pathname><CRLF> | Append |
ALLO<SP><decimal integer> | |
[<SP>R<SP><decimal integer>]<CRLF> | Allocate storage |
REST<SP><marker><SP> | Restart transfer at checkpoint |
RNFR<SP><pathname><CRLF> | Rename from |
RNTO<SP><pathname><CRLF> | Rename to |
ABOR<CRLF> | Abort previous service command |
DELE<SP><pathname><CRLF> | Delete file at server |
RMD<SP><pathname><CRLF> | Remove directory |
MKD<SP><pathname><CRLF> | Make directory |
PWD<CRLF> | Print working directory |
LIST<SP><pathname><CRLF> | List files or text |
NLST<SP><pathname><CRLF> | Name list |
SITE<SP><string><CRLF> | Site parameters |
SYST<CRLF> | Determine operating system |
STAT<SP><pathname><CRLF> | Status |
HELP[<SP><string>]CRLF | Help information |
NOOP<CRLF> | No operation |
Transfer parameter commands
These commands are used to alter the default parameters used to transfer data on an FTP connection.
PORT<SP><host-port><CRLF> | Specify the data port to be used. |
PASV<CRLF> | Request server DTP to listen on a data port |
TYPE<SP><type code><CRLF> | Representation type: ASCII, EBCDIC, image, or local. |
STRU<SP><structure code><CRLF> | File structure: file, record or page. |
MODE<SP><mode code><CRLF> | Transmission mode: stream, block or compressed |
Access control commands
These commands are invoked by the server and determine which users may access a particular file.
USER<SP><username> <CRLF> | User name |
PASS<SP><password><CRLF> | User password |
ACCT<SP><acc. information><CRLF> | User account |
CWD<SP><pathname><CRLF> | Change working directory |
CDUP<CRLF> | Change to parent directory |
SMNT<SP><pathname><CRLF> | Structure mount |
REIN<CRLF> | Terminate user and re-initialize |
QUIT<CRLF> | Logout |
<SP> | Space character |
<CRLF> | Carriage return, line feed characters |
Reply codes
FTP uses a three-digit return code ‘xyz’ followed by a space to indicate transfer conditions. The first digit (value 1–5) indicates whether a response is good, bad or incomplete. The second and third digits are encoded to provide additional information about the reply. The values for the first digit are:
Value | Description |
1yz | Action initiated. Expect another reply before sending a new command. |
2yz | Action completed. Can send a new command. |
3yz | Command accepted but on hold due to lack of information. |
4yz | Command not accepted or completed. Temporary error condition exists. Command can be reissued. |
5yz | Command not accepted or completed. Don’t reissue – reissuing the command will result in the same error. |
The second digit provides more detail about the condition indicated by the first digit:
Value | Description |
x0z | Syntax error or illegal command |
x1z | Reply to request for information |
x2z | Reply that refers to connection management |
x3z | Reply for authentication command |
x5z | Reply for status of server |
The third digit of the reply code also provides further information about the condition, but the meanings vary between implementations.
Although designed for use by applications, FTP software usually also provides interactive access to the user, with a range of commands that can be used to control the FTP session. There are several dozen commands available to the user, but for normal file transfer purposes very few of them ever need to be used.
Command | Description |
ASCII | Switch to ASCII transfer mode |
binary | Switch to binary transfer mode |
cd | Change directory on the server |
cdup | Change remote working directory to parent directory |
close | Terminate the data connection |
del | Delete a file on the server |
dir | Display the server directory |
get | Get a file from the server |
help | Display help |
ls | List contents of remote directory |
lcd | Change directory on the client |
mget | Get several files from the server |
mput | Send several files to the server |
open | Connect to a server |
put | Send a file to the server |
pwd | Display the current server directory |
quote | Supply a file transfer protocol (FTP) command directly |
quit | Terminate the file transfer protocol (FTP) session |
trace | Display protocol codes |
verbose | Display all information |
To execute a command, the user types the commands at the ftp prompt, e.g.
ftp>close
A list of available user commands can be viewed by typing help at the ftp prompt, e.g.
ftp> help close
After logging into another machine using FTP, the user is still logically connected to the (local) client machine. This is different to TELNET, where the user is logically connected to the (remote) server machine. References to directories and movements of files are therefore relative to the client machine. For example, getting a file involves moving it from the server to the client; putting a file involves moving it from the client to the server.
It may be wise to create a special directory on the client computer just for the transfer of files into and out of the client’s system. This helps guard against accidental file deletion, and allows easier screening of incoming files for viruses.
Most FTP clients have a GUI-based interface that displays the file systems of the local and the remote machines in two separate windows and allows file transfers from one machine to another by mouse movements on the screen.
Most UNIX machines act as FTP servers by default. A daemon process watches the TCP command port (21) continuously for the arrival of a request for a connection and calls the necessary FTP processes when one arrives. Windows does not include FTP server software, but Internet Explorer has a built-in FTP client.
Anonymous FTP access allows a client to access publicly available files using the login name ‘anonymous’ and the password ‘guest’. Alternatively the password may be required to resemble an e-mail address. Public files are often placed in separate directories on FTP servers.
Trivial File Transfer Protocol (RFC 1350) is less sophisticated than FTP, and caters for situations where the complexity of FTP and the reliability of TCP is neither desired nor required. TFTP does not log on to the remote machine; so it does not provide user access and file permission controls.
TFTP is used for simple file transfers and typically resides in the ROM of diskless machines such as PLCs that use it for bootstrapping or to load applications.
The absence of authorization controls can be overcome by diligent system administration. For example, on a UNIX system, a file may only be transferred if it is accessible to all users on the remote machine (i.e. both read and write permissions are set).
TFTP does not monitor the progress of the file transfer so does not need the reliable stream transport service of TCP. Instead, it uses UDP, with time-out and retransmission mechanisms to ensure data delivery. The UDP source and destination port fields are used to create the socket at each end, and TFTP Transfer IDentifiers (TIDs) ranging between 0 and 65,535 are created by TFTP and passed to UDP to be placed in the UDP header field as a source port number. The destination (server) port number is the well-known port 69, reserved for TFTP.
Data is relayed in consecutively numbered blocks of 512 bytes. Each block must be acknowledged, using the block number in the message header, before the next block is transmitted. This system is known as a ‘flip-flop’ protocol. A block of less than 512 bytes indicates the end of the file. A block is assumed lost and re-sent if an acknowledgment is not received within a certain time period. The receiving end of the connection also sets a timer and, if the last block to be received was not the end of file block, the receiver will re-send the last acknowledgment message once the time-out has elapsed.
TFTP can fail for many reasons and almost any kind of error encountered during the transfer will cause complete failure of the operation. An error message sent either in place of a block of data or as an acknowledgment terminates the interaction between the client and the server.
There are five TFTP package types, distinguished by an operation code (‘opcode’) field. They are:
Opcode | Operation |
1 | Read request (RRQ) |
2 | Write request (WRQ) |
3 | Data (DATA) |
4 | Acknowledgment (ACK) |
5 | Error (ERROR) |
The frames for the respective operations are constructed as follows:
RRQ/WRQ frames
The various fields are as follows:
DATA frames
The file name does not need to be included as the IP address and UDP protocol port number of the client are used as identification.
The fields are as follows:
ACK frames
These frames are sent to acknowledge each block that arrives. TFTP uses a ‘lock-step’ method of acknowledgment, which requires each data packet to be acknowledged before the next can be sent.
The fields are as follows:
Error frames
An error message causes termination of the operation.
The fields are:
TELNET (TELecommunications NETwork) is a simple remote terminal protocol that enables virtual terminal capability across a network. That is, a user on machine A can log in to a server on machine B across a network. If the remote machine has a TELNET server installed, user A can log into it and execute keyboard commands on machine B as if he is operating on his own machine.
The procedure for connecting to a remote computer depends on how the user access permissions are set up. The process is generally menu driven. Some remote machines require the user to have an account on the machine and will request a username and password. However, many information resources are available to the user without an account and password. TELNET is also often used to gain access to managed devices such as switches and routers.
TELNET achieves a connection to the remote server via its well-known port number, using either the server’s domain name or its IP address, and then passes keystrokes to the remote server and receives output back from it.
TELNET treats both ends of the connection similarly, so that software at either end of a connection can negotiate the parameters that will control their interaction. It provides a set of options such as type of character set to be used (7-bit or 8-bit), type of carriage-return character to be recognized (e.g. CR or LF) etc, which can be negotiated to suit the client and the server. It is possible for a machine to act as both client and server simultaneously, enabling the user to log into other machines while other users log into his machine.
In the case of a server capable of managing multiple concurrent connections, TELNET will listen for new requests and then create a new instantiation (or ‘slave’) to deal with each new connection.
The TELNET protocol uses the concept of a Network Virtual Terminal (NVT) to define each end of a connection. NVT uses standard 7-bit US ASCII codes to represent printable characters and control codes such as ‘move right one character’, ‘move down one line’, etc. 8-bit bytes with the high order bit set are used for command sequences. Each end has a virtual keyboard that can generate characters (it could represent the user’s keyboard or some other input stream such as a file) and a logical printer that can display characters (usually a terminal screen). The TELNET programs at either end handle the translation from virtual terminal to physical device. As long as this translation is possible, TELNET can interconnect any type of device. When the connection is first established and the virtual terminals are set up, they are provided with codes that indicate which operations the relevant physical devices can support.
An operating system usually reserves certain ASCII keystroke sequences for use as control functions. For example, an application running on UNIX operating systems will not receive the Ctrl-C keystroke sequence as input if it has been reserved for interrupting the currently executing program. TELNET must therefore define such control functions so that they are interpreted correctly at both ends of the connection. In this case, Ctrl-C would be translated into the TELNET IP command code.
TELNET does not use ASCII sequences to represent command codes. Instead, it encodes them using an escape sequence. This uses a reserved byte, called the ‘Interpret As Command’ (IAC) byte, to indicate that the following byte contains a control code. The actual control code can be represented as a decimal number, as follows:
Command | Decimal Value | Meaning |
---|---|---|
EOR | 239 | End of record |
SE | 240 | End of option sub-negotiation |
NOP | 241 | No operation |
DMARK | 242 | Data mark – the data stream part of a SYNCH (always marked by TCP as urgent) |
BRK | 243 | Break |
IP | 244 | Interrupt process – interrupts or terminates the active process |
AO | 245 | Abort output – allows the process to run until completion, but does not send the end of record command |
AYT | 246 | Are you there – used to check that an application is functioning at the other end |
EC | 247 | Erases a character in the output stream |
EL | 248 | Erases a line in the output stream |
GA | 249 | Go ahead – indicates permission to proceed when using half-duplex (no echo) communications |
SB | 250 | Start of option sub-negotiation |
WILL | 251 | Agreement to perform the specified option or confirmation that the specified option is now being performed |
WON’T | 252 | Refusal to perform the specified option or confirmation that the specified option will no longer be performed |
DO | 253 | Asks for the other end to perform the specified option, or acknowledges that the other end will perform the specified option |
DON’T | 254 | Demands that the other end stops performing the specified option, or confirmation that the other end is no longer performing the specified option |
IAC | 255 | Interpret as command i.e. interpret the next octet as a command. When the IAC octet appears as data, the 2-octet sequence sent will be IAC-IAC |
The IAC character to have the above meanings must precede the control code. For example, the two-octet sequence IAC-IP (or 255-244) would induce the server to abort the currently executing program.
The following command options are used by TELNET:
Option Code | Meaning |
---|---|
0 | Transmit binary – change transmission to 8-bit binary |
1 | Echo |
2 | Reconnection |
3 | Suppress go ahead – i.e. no longer send go-ahead signal after data |
4 | Approximate message size negotiation |
5 | Status request – used to obtain the status of a TELNET option from the remote machine. |
6 | Request timing mark – used to synchronize the two ends of a connection |
7 | Remote controlled transmission and echo |
8 | Output line width |
9 | Output page length |
10 | Output carriage-return action |
11 | Output horizontal tab stop setting |
12 | Output horizontal tab stop action |
13 | Output form feed action |
14 | Output vertical tab stop setting |
15 | Output vertical tab stop action |
16 | Output line feed action |
17 | Extend ASCII characters |
18 | Logout |
24 | Terminal type – used to exchange information about the make and model of a terminal being used |
25 | End of record – sent at end of data |
28 | Terminal location number |
31 | Window size |
34 | Line-mode – uses local editing and sends complete lines instead of individual characters. |
The two-byte sequence may be followed by a third byte containing optional parameters.
An optional code of 1 indicates ‘ECHO’; therefore, the three octets sequence 255-251-1 means ‘WILL ECHO’ and instructs the other end to begin echoing back the characters that it receives. A command sequence of 255-252-1 indicates that the sender either will not echo back characters or wants to stop echoing.
The negotiation of options allows clients and servers to optimize their interaction. It is also possible for newer versions of TELNET software that provide more options to work with older versions, as only the options that are recognized by both ends are negotiated.
If the server application malfunctions and stops reading data from the TCP connection, the operating system buffers will fill up until TCP eventually indicates to the client system a window size of zero, thus preventing further data flow from the client. In such a situation TELNET control codes will not be read and therefore will have no effect. To bypass the normal flow control mechanism, TELNET uses an ‘out of band’ signal. Whenever it places a control signal in the data stream, it also sends a SYNCH command and appends a data mark octet. This induces TCP to send a segment with the URGENT DATA flag set, which reaches the server directly and causes it to read and discard all data until it finds the data mark in the data stream, after which it returns to normal processing.
TELNET programs are freely available and can be downloaded via the Internet. Windows 95/98 included a simple Windows-type interface called Microsoft TELNET 1.0 but, in later versions of Windows, Microsoft has reverted to command-line instructions.
The options for the TELNET command line input is as follows:
telnet [-a][-e escape char][-f log file][-l user][-t term][host[port]]
-a | Attempt automatic logon. Same as –l option except uses the currently logged on user’s name. |
-e | Escape character to enter telnet client prompt. |
-f | File name for client side logging. |
-l | Specifies the user name to log in with on the remote system. Requires that the remote system support the TELNET ENVIRON option |
-t | Specifies terminal type. Supported term types are vt100, vt52, ansi and vtnt only. |
host | Specifies the hostname or IP address of the remote computer to connect to. |
port | Specifies the port number or service name. |
In its simplest form, the TELNET command for connecting to the POP3 server on host 192.168.0.1 would be
C:\>telnet 192.168.0.1 110
On a small TCP/IP network each individual host can (optionally) have a list of names by which it refers to other hosts. These given names can differ from host to host and are not to be confused with the ‘computer name’ (NetBIOS name) entered on each machine. They can be considered nicknames (e.g. ‘Charlie’ instead of ‘192.176.44.56’) and the mapping between these names and their associated IP addresses is maintained as a ‘flat’ database in the Hosts file on each host. The resolver process on each host translates host names into IP addresses by a simple lookup procedure. On Windows XP this file is located at C:\windows\systen32\drivers\etc\hosts. The correct Hosts file has no extension, unlike the sample Hosts file stored as hosts.sam.
In a large network the HOSTS files would have to be identical on all machines. The maintenance of these files to reflect additions and changes can become quite a tedious task. On the Internet, with its millions of names, this becomes impossible.
The Domain Name System (DNS) provides a network-wide (and in the case of the Internet – a world-wide) directory service that maps host names against IP addresses. For most users this is a transparent process and it is to them irrelevant whether the resolution takes place via a hosts file or via DNS.
When the IP address of a specific destination host has to be resolved, the DNS resolver on the requesting host contacts its designated Local (DNS) Name Server, the IP address of which is contained in the IP configuration of the requesting host. The Local Name Server then, in turn, contacts the Root Domain Name Server (maintained by InterNIC) in order to locate the Primary Name Server that contains the required domain name records. The Root Domain Name Server returns the IP address of the appropriate Primary Name Server to the Local Name Server, which now contacts the appropriate Primary Name Server directly and requests the IP address for the given name. If the Primary Name Server cannot find the name, the request goes to the Secondary Name Server.
The Primary and Secondary name servers maintain a tree-structured directory database. The collective database stored on all the DNS Name Servers forms a global namespace of all the hosts that can be referenced anywhere on the Internet.
The Internet naming scheme hierarchical namespace
The original Internet namespace was ‘flat’ i.e. it had no hierarchical structure. At this stage it was still administered by the Network Information Center (NIC). The task eventually became too large because of the rapidly increasing number of hosts and a hierarchical (tree-structured) namespace was adopted. At present, the ultimate responsibility for the maintenance of this namespace is vested in the ICANN.
In a domain name the most local domain is written first and the most global domain is written last. The domain name purdue.edu might identify Purdue University. This domain name is registered against a specific IP address. The administrator of this domain name may now create sub-domains such as, say, cs.purdue.edu for the computer science department at Purdue University. The administrator of the computer science department, in turn, may assign a Fully Qualified Domain Name (FQDN) to an individual host, such as computer1.cs.purdue.edu.
If a user is referring to a specific host within a local network, a FAQN is not needed, as the DNS resolver will automatically supply the missing high-level domain name qualifier.
The following commands are therefore equivalent when an ftp client and ftp server are located on the same network:
Standard domain names
The original namespace contained a set of standard top-level domains without any reference to a specific country. Since the original Internet was not envisaged to exist beyond the borders of the United States, the absence of any reference to a country implies an organization within the USA.
The following are some of the common top-level domains administered by ICANN. More detailed information can be obtained from www.icann.org.
Domain names for the .com, .net and .org domains can be obtained from various accredited re-sellers.
Country codes
As the Internet backbone was extended into countries other than the USA, the top-level domain names were extended with a two-letter country code as per ISO 3166 (e.g. uk for the United Kingdom, au for Australia, za for South Africa, ca for Canada). The complete list of all Country Code Top-Level Domains (CCTLDs) can be obtained from the ICANN website. This site also contains the basic information for each CCTLD such as the governing agency, administrative and technical contact names telephone and fax numbers, and server information. This information can also be obtained from the Network Solutions web site.
DNS clients and servers
Each host on a network that uses the DNS system runs the DNS client software, which implements the resolver function. Designated servers, in turn, implement the DNS name server functions. In processing a command that uses a domain name instead of an IP address, the TCP/IP software automatically invokes the DNS resolver function.
On a small network one name server may be sufficient and the name server software may run on a machine already used for other server purposes (such as a Windows 2003 machine acting as a file server). On large networks it is prudent to run at least two Local Name Servers for reasons of availability, viz. a primary and a secondary name server. On large internetworks it is also common to use multiple Primary Name Servers, each of which contains a portion of the namespace. It is also possible to replicate portions of the namespace across several servers in order to increase availability.
A network connected to the Internet needs access to at least one Primary Name Server and one Secondary Name Server, both capable of performing naming operations for the registered domain names on the Internet. In the case of the Internet, the number of domain names is so large that the namespace is distributed across multiple Primary and Secondary servers in different countries. For example, all the co.za domain names are hosted across several name servers located in South Africa.
The DNS client resolver software can implement a caching function by storing the results from the name resolution operation. In this way the resolver can resolve a future query by looking up the cache rather than actually contacting the name server. Cache entries are given a Time To Live so that they are purged from the cache after a given amount of time.
DNS frame format
The message format for DNS messages is as follows.
The DNS message contains a query type field, since the name server database consists of many types of information. The following list shows some of the types:
WINS is not a general TCP/IP Application layer protocol, but rather a Microsoft Windows-specific utility with the primary role of NetBIOS name registration and resolution on TCP/IP. In many respects WINS is like DNS. However, while DNS resolves TCP/IP host names to static IP addresses, WINS resolves NetBIOS names to dynamic addresses assigned by DHCP.
WINS maintains a database on the WINS server. This database provides a computer name to IP address mapping, allowing computers on the network to interconnect on the basis of machine names. It also prevents two machines from registering the same name. With traditional NetBIOS name resolution techniques that rely on broadcast messages, it is not possible to browse across an IP router. WINS overcomes this problem by providing name resolution regardless of host location on the network. Since it reduces the number of the broadcast packets normally used to resolve NetBIOS names, it can also improve the network performance.
WINS uses a client/server model and, in order to run it on a network, at least one WINS server is needed. The WINS server must have a statically assigned IP address, entered into the TCP/IP configuration for all machines on the network that want to take advantage of the WINS server for name resolution and name registration.
WINS is configured on XP client machines by selecting Control Panel ->Network Connections ->Local Area Connection ->Properties ->Internet Protocol (TCP/IP) ->Properties ->Advanced ->WINS.
When a WINS client is turned on for the first time it tries to register its NetBIOS name and IP address with the WINS server by sending a name registration request via UDP. When the WINS server receives the request it checks its database to ensure the requested NetBIOS name is not in use on the network. If the name registration is successful, the server sends a name registration acknowledgment back to the client. This acknowledgment includes the Time To Live for the name registration. The TTL indicates how long the WINS server will keep the name registration before canceling it. It is the responsibility of the WINS client to send a name refresh request to the WINS server before the name expires, in order to keep the name.
If the client tries to register a name that is already in use, the WINS server sends a denial message back to the client. The client than displays a message informing the user that the computer name is already in use on the network.
When a WINS client shuts down it sends a name release request to the WINS server, releasing its name from the WINS database.
When a WINS-enabled client needs to resolve the NetBIOS name to IP address, it uses a resolution method called ‘h-node name resolution’, which includes the following procedures:
WINS proxy agents are used to allow non-WINS-enabled clients to interact with a WINS service. A WINS proxy agent listens to the local network for clients trying to use broadcasting to resolve NetBIOS names. The WINS proxy agent picks these requests off the network and forwards them to the WINS server, which responds with the resolved IP address. The WINS proxy agent then provides this information to the client requesting the name resolution.
The advantage of this system is that there is no need to make any changes to the existing non-WINS-enabled clients, and in fact they are completely unaware that the name resolution has been provided by the WINS service.
The Simple Network Management Protocol (SNMP) is an Application layer protocol that facilitates the exchange of management information between network devices. It allows network administrators to manage network performance, find and solve network problems, and plan for network growth.
There are three versions of SNMP namely SNMPv1, SNMPv2 and SNMPv3). They all have a number of features in common, but SNMPv2 includes enhancements such as additional protocol operations, and SNMPv3 adds security and administration.
An SNMP managed network consists of three key components namely Managed Devices, Agents and Network Management Systems:
Managed devices are monitored and controlled using four basic SNMP commands namely read, write, trap, and traversal operations:
A Management Information Base (MIB) is a collection of information that is organized hierarchically. MIBs are accessed using a network management protocol such as SNMP. They comprise managed objects and are identified by object identifiers.
A managed object (sometimes called a ‘MIB object’, an ‘object’, or a ‘MIB’) is one of any number of specific characteristics of a managed device. Managed objects comprise one or more ‘object instances’, which are essentially variables.
There are two types of managed objects, namely scalar and tabular. Scalar objects define a single object instance. Tabular objects define multiple related object instances that are grouped together in MIB tables. An example of a managed object is ‘at Input’, which is a scalar object containing a single object instance viz. the integer value that indicates the total number of input Novell Netware packets on a router interface. An object identifier (or object ID) uniquely identifies a managed object in the MIB hierarchy. The MIB hierarchy can be depicted as a tree with a nameless root, the levels of which are assigned by different organizations.
The top-level MIB object IDs belong to different standards organizations, while lower-level object IDs are allocated by associated organizations. Vendors can define private branches that include managed objects for their own products. MIBs that have not been standardized are typically positioned in the experimental branch. The managed object at Input can be uniquely identified either by the object name viz. iso.identified-organization. dod.internet.private.enterprise.cisco.temporary_variables.Novell.atInput, or by the equivalent object descriptor 1.3.6.1.4.1.9.3.4.1.
SNMP is a simple request–response protocol. The network-management system issues a request, and managed devices return responses. This behavior is implemented by using one of four protocol operations viz. Get, GetNext, Set, and Trap.
The Get, GetNext, and Set operations used in SNMPv2 are exactly the same as that used in SNMPv1. SNMPv2, however, adds and enhances some operations. The SNMPv2 Trap operation, for example, serves the same function as that used in SNMPv1. However, it uses a different message format and is designed to replace the SNMPv1 Trap. SNMPv2 also defines two additional protocol operations: GetBulk and Inform:
SNMPv1 and SNMPv2 are, however, not secure, with the result that hackers can easily exploit them to gain unauthorized access to SNMP-enabled devices. These weaknesses have been addressed in SNMPv3, which supports authentication, privacy and access control.
TCP/IP defines an electronic messaging protocol named Simple Mail Transfer Protocol or SMTP. SMTP is used by e-mail clients such as Outlook Express or Eudora to send messages and files to an e-mail server on a remote network, usually that of an ISP.
SMTP defines the interchange between the user’s e-mail client and the ISP’s mail server. It does not define how the mail is to be delivered to the ultimate recipient. Although mail is normally forwarded to an SMTP server by an e-mail client, it is also possible to log directly into the server via TELNET and interact with command-line entries.
The first step in the transmission of the data is the connection setup, whereby the SMTP client opens a TCP connection to the remote SMTP server at port 25. The client then sends an optional ‘Helo’ (sic) command and the SMTP server sends a reply indicating its ability to receive mail. TELNET users will have to enter the IP address or the domain name of the server (e.g. smtp1.attglobal.net), the relevant port number (25) and possibly a terminal type (e.g. VT-100). The first two items are necessary for TCP to create the socket.
The second step in the process involves the actual mail transfer. Mail transfer begins with a ‘Mail From’ command containing the name of the sender, followed by a ‘Receipt’ command indicating the recipient. A ‘Data’ command is followed by the actual message. SMTP can be considered a reliable delivery service in that the underlying TCP protocol ensures correct delivery to the SMTP server. SMTP, however, neither guarantees nor offers mechanisms for reliable delivery from the SMTP server to the recipient.
When the message transfer is complete another message can be sent, the direction of transfer changed, or the connection closed. Closing the connection involves the SMTP client issuing a ‘Quit’ command. Both sides then execute a TCP close operation in order to release the connection.
SMTP commands begin with a four-byte command code (in ASCII), which can be followed by an argument. The SMTP replies use the same format as FTP, i.e. a 3-digit numeric value followed by a text string. Here follows some SMTP commands.
In the following example a simple ASCII string is sent to a recipient via a TELNET connection. The ‘period’ (full stop), in a separate line, is required to send the message. Some lines have been printed in italics to indicate that they are responses from the server.
220 prserv.net – Maillenium ESMTP/ MULTIBOX out4 #30
MAIL FROM: john@hotmail.com
250 ok
RCPT TO: deb@iinet.net.au
250 ok; forward to <deb@iinet.net.au)
DATA
354 ok
‘This is only a test message.’
.
250 ok, id = 2000042800030723903cb80fe
QUIT
The current version of the Post Office Protocol is POP3. POP3 uses the well-known port 110. Like SMTP, it involves a client running on a local machine and a server running on a remote machine. POP3 is very much the opposite of SMTP in that its function is to retrieve mail from a remote POP3 server to a local POP3 client.
It was developed to ease the load on mail servers. Instead of multiple clients logging in for long periods to a remote mail server (as is the case, for example, with Hotmail) the POP3 client makes a quick connection to the mail server, retrieves the mail (optionally deleting it from the server), then breaks the connection. As in the case of SMTP, it uses a TCP connection for this purpose. Unlike SMTP, proper authentication with a user name and a password is required.
POP3 commands include the following.
The following example shows the interaction with a POP3 server via a
TELNET connection. Server responses are printed in italics.
+OK POP Server Version 1.0 at pop1a.prserv.net
USER auinet.deb
+OK Password required for deb
PASS geronimo
+OK deb has 2 messages (750 octets)_
LIST
+OK 2 messages (75 octets)
1 374
2 376
TOP 1 10
+OK 274 octets
Received from out4.prserv.net [32.97.166.34] by in2.prserv.net id 956880005.59882-1:
Fri, 28 Apr 2000- 00:00:00 +0000
Received: from <unknown domain> ([129:37:1675:208] by prserv.net (out4) with SMTP
Id <2000042723591123901fj001e; Thu, 27 Apr 2000 23:59:48: +0000
This is only a test message.’
QUIT
The Bootstrap Protocol BOOTP (RFC 951) is a modern alternative to RARP. When a diskless workstation (for example a PLC) is powered up, it broadcasts a BOOTP request on the network.
A BOOTP server hears the request, looks up the requesting client’s MAC address in its BOOTP file, and responds by telling the requesting client machine the server’s IP address, its NetBIOS name, and the fully qualified name of the file that is to be loaded into the memory of the requesting machine and executed at boot-up
Although BOOTP is an alternative to RARP, it operates in an entirely different way. RARP operates at the Data Link layer and the RARP packets are contained within the local network (e.g. Ethernet) frames; hence it cannot cross any routers. With BOOTP the information is carried by UDP via IP, hence it can operate across routers and the server can be several hops away from the client. Although BOOTP uses IP and UDP, it is still small enough to fit within a bootstrap ROM on a client workstation.
Figure 8.9 depicts the BOOTP message format.
RFC 1532 and RFC 1533 contain subsequent clarifications and extensions to BOOTP.
The Dynamic Host Configuration Protocol, as defined by RFC 1533, 1534, 1541 and 1542, was developed out of BOOTP in order to centralize and streamline the allocation of IP addresses. DHCP’s purpose is to centrally control IP-related information and eliminate the need to manually keep track of the allocation of individual IP addresses.
When TCP/IP starts up on a DHCP-enabled host, a request is broadcast requesting an IP address and a subnet mask. The DHCP server, upon hearing the request, checks its internal database and replies with an IP address. DHCP can also respond with a default gateway address, DNS address(es), or the address of a NetBIOS name server such as a WINS server. When the IP offer is accepted, it is extended to the client for a specified period of time, called a lease. If the DHCP server runs out of IP addresses, no IP addressing information can be offered to the clients, causing TCP/IP initialization to fail.
DHCP’s advantages include the following:
It does, however, have some drawbacks such as:
IP lease request
This is the first step in obtaining an IP address under DHCP. It is initiated by a TCP/IP host, configured to obtain an IP address automatically when booting up. Since the requesting host is not aware of its own IP address, or that belonging to the DHCP server, it will use 0.0.0.0 and 255.255.255.255 respectively. This is known as a ‘DHCP discover’ message. The broadcast is created with UDP ports 67 (BOOTP client) and 68 (BOOTP server). This message contains the MAC address and NetBIOS name for the client system to be used in the next phase of sending a lease offer. If no DHCP server responds, the client may try several times before giving up and resorting to other tactics, such as APIPA (Automatic IP Address Allocation) if supported by the Operating System. APIPA will allow the machine to assume an IP address in the range 169.254/16.
IP lease offer
The second phase involves the actual information given by all DHCP servers that have valid addressing information to offer. Their offers consist of an IP address, subnet mask, lease period (in seconds), and the IP address of the proposing DHCP server. These offers are sent to the requesting client’s MAC address. The pending IP address offer is reserved temporarily to prevent it from being taken simultaneously by other machines, which would otherwise create chaos. Since multiple DHCP servers can be configured, it also adds a degree of fault tolerance, should one of the DHCP servers go down.
IP lease selection
During this phase, the client machine selects the first IP address offer it receives. The client replies by broadcasting an acceptance message, requesting to lease IP information. Just as in stage one, this message will be broadcast as a DHCP request, but this time, it will additionally include the IP address of the DHCP server of which the offer was accepted. All other DHCP servers will then revoke their offers.
IP lease acknowledgment
The accepted DHCP server proceeds to assign an IP address to the client and then sends an acknowledgement message, called a DHCPACK, back to the client. Occasionally, a negative acknowledgment, called a DHCPNACK, is returned. This type of message is most often generated if the client is attempting to re-lease its old IP address, which has since been reassigned elsewhere. Negative acceptance messages can also mean that the requesting client has an inaccurate IP address, resulting from physically changing locations to an alternate subnet.
After this final phase has been successfully completed, the client machine integrates the new IP information into its TCP/IP configuration. It is then usable with all utilities, as if it has been manually entered into the client host.
Lease renewal:
Regardless of the length of time an IP address is leased, the leasing client will send a DHCPREQUEST to the DHCP server when its lease period has elapsed by 50%. If the DHCP server is available, and there are no reasons for rejecting the request, a DHCP acknowledge message is sent to the client, updating the configuration and resetting the lease time. If the server is unavailable, the client will receive an ‘eviction’ notice stating that it had not been renewed. In this event, that client would still have a remaining 50% lease time, and would be allowed full usage privileges for its duration. The rejected client would react by sending out an additional lease renewal attempt when 87.5% of its lease time had elapsed. Any available DHCP server could respond to this DHCPREQUEST message with a DHCPACK, and renew the lease. However, if the client received a DHCPNACK (negative) message, it would have to stop using the IP address immediately, and start the leasing process over, from the beginning.
Lease release
If the client elects to cancel the lease, or is unable to contact the DHCP server before the lease elapses, the lease is automatically released.
Note that DHCP leases are not automatically released at system shutdown. A system that has lost its lease will attempt to re-lease the same address that it had previously used.
The DHCP message format is based on the BOOTP format, and is illustrated in Fig 8.11.
The fields are as follows:
When you have completed study of this chapter you should able to apply the following utilities:
The TCP/IP utilities are mentioned throughout the book. This section is designed to bring them all together in one section for ease of reference, as they are very important in network management and troubleshooting.
Most of the older utilities are DOS-based. However, many sophisticated Windows-based utilities are available nowadays, many of them as freeware or shareware.
PINGing is one of the easiest ways to test connectivity across the network and confirm that an IP address is reachable. The DOS ping utility (ping.exe) uses ICMP to forward an Echo Request message to the destination address. The destination then responds with an ICMP Echo Reply message. Although the test seems trivial at first sight, it is a powerful diagnostic tool and can demonstrate correct operation between the Internet layers of two hosts across a WAN regardless of the distance and number of intermediate routers involved.
Technically speaking the PING utility can only target an IP address and not, for example, a NetBIOS name or domain name. This is due to the fact that the ICMP messages are carried within IP datagrams, which require the source and destination IP addresses in their headers. Without this feature it would have been impossible to ping across a router. If, therefore, the user does not know the IP address of the target host, the name resolver on the local host has to look it up e.g. via the Domain Name System or in the HOSTS file.
The IP datagram, in turn, is transported by means of a Network Interface layer frame (e.g. Ethernet), which requires, in its header, the MAC addresses of the sender and the next recipient on the local network. If this is not to be found in the ARP cache, the ARP protocol is invoked in order to obtain the MAC address. The result of this action (the mapping of MAC address against IP address) is then stored in the ARP cache. The easiest way to get an overall impression of the process is to capture the events described here by means of a protocol analyzer.
If the IP address is known, the following format can be used:
If the IP address is unknown, one of the following ways can be used to define the target machine:
There are several options (switches) available under the ping command, as shown below:
C:\WINDOWS>ping
Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS] [-r count] [-s count] [[-j host-list][-k host-list]][-w timeout] destination-list
|
Ping the specified host until stopped To see statistics and continue – type Control-Break To stop – type Control-C |
||
|
Resolve addresses to hostnames | ||
|
count | Number of echo requests to send | |
|
size | Send buffer size | |
|
Set don’t fragment flag in packet | ||
|
TTL | Time To Live | |
|
TOS | Type of service | |
|
count | Record route for count hops | |
|
count | Time-stamp for count hops | |
|
host-list | Loose source route along host-list | |
|
host-list | Strict source route along host-list | |
|
timeout | Timeout in milliseconds to wait for each reply |
The following examples show how some of the ping options can be applied:
Here are some examples of what could be learnt by using the ping command.
Example 1: A host with IP address 207.194.66.100/24 is being ‘pinged’ by another host on the same subnet, i.e. with the same NetID. Note that the screen display differs between operating systems, although the basic parameters are the same.
The following response is obtained:
C:\WINDOWS>ping 207.194.66.100
Pinging 207.194.66.100 with 32 bytes of data:
Reply from 207.194.66.100: bytes=32 time<10ms TTL=128
Reply from 207.194.66.100: bytes=32 time=1ms TTL=128
Reply from 207.194.66.100: bytes=32 time=1ms TTL=128
Reply from 207.194.66.100: bytes=32 time=1ms TTL=128
Ping statistics for 207.194.66.100:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milliseconds:
Minimum = 0ms, Maximum = 1ms, Average = 0ms
C:\WINDOWS>
From the result, the following can be observed:
Example 2: A host with IP address 207.194.66.101/24 is now ‘pinged’. Although this host is, in fact, nonexistent, it seems legitimate since the NetIDs match. The originating host will therefore attempt a ping, but a timeout will occur.
C:\WINDOWS>ping 207.194.66.101
Pinging 207.194.66.101 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Ping statistics for 207.194.66.101:
Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
Approximate round trip times in milliseconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
C:\WINDOWS>
Example 3. As before, but this time the NetID differs i.e. the target host is assumed to reside on a different network. Since, in this case, no default gateway has been specified, the originating host does not even attempt to issue an ICMP message, and immediately issues a ‘host unreachable’ response.
C:\WINDOWS>ping 208.194.66.100
Pinging 208.194.66.100 with 32 bytes of data:
Destination host unreachable.
Destination host unreachable.
Destination host unreachable.
Destination host unreachable.
Ping statistics for 208.194.66.100:
Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
Approximate round trip times in milliseconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
C:\WINDOWS>
The (DOS) PING command is not particularly ‘user friendly’. It is, for example, not possible to ping a large number of hosts sequentially. There are, however, several Windows-based Ping utilities available as freeware or shareware, of which TJPingPro is an example.
The following example shows how a block of contiguous IP addresses can be pinged with a single ‘click’, after setting up ‘start’ and ‘end’ IP addresses on the options screen.
The ARP utility (arp.exe) is used to display the ARP cache that holds the IP to MAC address translation table of hosts on the local subnet. This utility is not to be confused with the Address Resolution Protocol that actually determines the IP to MAC address translation. The ARP utility can also be used to manually add entries to the cache, using the -s option.
C:\WINDOWS>arp
Displays and modifies the IP-to-physical address translation tables used by address resolution protocol (ARP).
ARP -s | inet_addr eth_addr [if_addr] ARP -d inet_addr [if_addr] ARP -a [inet_addr] [-N if_addr] |
-a | Displays current ARP entries by interrogating the current protocol data. If inet_addr is specified, the IP and physical addresses for only the specified computer are displayed. If more than one network interface uses ARP, entries for each ARP table are displayed. |
-g | Same as -a. |
inet_addr | Specifies an Internet address. |
-N if_addr | Displays the ARP entries for the network interface specified by if_addr. |
-d | Deletes the host specified by inet_addr. |
-s | Adds the host and associates the Internet address inet_addr with the physical address eth_addr. The physical address is given as 6 hexadecimal bytes separated by hyphens. The entry is permanent. |
eth_addr | Specifies a physical address. |
if_addr | If present, this specifies the Internet address of the interface whose address translation table should be modified. If not present, the first applicable interface will be used. |
> arp -s 157.55.85.212 00-aa-00-62-c6-09 …Adds a static entry.
> arp –a …. Displays the arp table.
The following shows a typical display in response to the arp -a command. Note the third column, which indicates type. Entries in the arp cache can be entered manually as static entries, but that poses a problem as IP addresses can be changed and physical network cards (and hence MAC addresses) can be swapped, rendering the stored IP to MAC address mapping useless unless updated. For this reason the ARP protocol binds IP addresses and physical (MAC) addresses in a temporary (i.e. dynamic) way. Dynamic entries are deleted from the cache after a few minutes, unless used.
C:\WINDOWS>arp -a
Interface: 0.0.0.0 on Interface 0x1000002
Internet Address Physical Address Type 192.100.100.7 00-00-c6-f6-34-43 static 192.100.100.99 00-00-fe-c6-57-a8 dynamic C:\WINDOWS>
This is used for obtaining protocol statistics and current active connections using TCP/IP. Nowadays there are many Windows-based utilities that can do much more; yet in an emergency NETSTAT is certainly better than nothing at all. Here follows the NETSTAT options.
C:\WINDOWS>netstat /?
Displays protocol statistics and current TCP/IP network connections.
NETSTAT [-a] [-e] [-n] [-s] [-p proto] [-r] [interval]
-a | Displays all connections and listening ports. |
-e | Displays Ethernet statistics. This may be combined with the -s option. |
-n | Displays addresses and port numbers in numerical form. |
-p proto | Shows connections for the protocol specified by proto; proto may be TCP or UDP. If used with the -s option to display per-protocol statistics, proto may be TCP, UDP, or IP. |
-r | Displays the routing table. |
-s | Displays per-protocol statistics. By default, statistics are shown for TCP, UDP and IP; the -p option may be used to specify a subset of the default. |
interval | Re-displays selected statistics, pausing interval seconds between each display. Press CTRL+C to stop re-displaying statistics. If omitted, netstat will print the current configuration information once. |
C:\WINDOWS> |
In response to the netstat -e command the following packet and protocol statistics are displayed. This is a summary of events on the network since the last re-boot.
C:\WINDOWS>netstat -e | ||
Interface Statistics | ||
Received | Sent | |
Bytes | 2442301 | 1000682 |
Unicast packets | 4769 | 3776 |
Non-unicast packets | 113 | 4566 |
Discards | 0 | 0 |
Errors | 0 | 0 |
Unknown protocols | 19 | |
C:\WINDOWS> |
The ‘-p’ option is also helpful as it shows the current connections for a given protocol. For example, C:\>netstat –p tcp shows the current TCP connections
This provides protocol statistics and current TCP/IP connections using NBT (NetBIOS over TCP/IP).
C:\WINDOWS>nbtstat /?
Displays protocol statistics and current TCP/IP connections using NBT (NetBIOS over TCP/IP).
NBTSTAT [-a RemoteName] [-A IP address] [-c] [-n] [-r]
[-R] [-s] [S] [interval] ]
-a | (adapter status) Lists the remote machine’s name table given its name |
-A | (Adapter status) Lists the remote machine’s name table given its IP address. |
-c | (cache) Lists the remote name cache including the IP addresses |
-n | (names) Lists local NetBIOS names. |
-r | (resolved) Lists names resolved by broadcast and via WINS |
-R | (Reload) Purges and reloads the remote cache name table |
-S | (Sessions) Lists sessions table with the destination IP addresses |
-s | (sessions) Lists sessions table converting destination IP addresses to host names via the hosts file. |
RemoteName | Remote host machine name. |
IP address | Dotted decimal representation of the IP address. |
Interval | Re-displays selected statistics, pausing interval seconds between each display. Press Ctrl+C to stop re-displaying statistics. |
C:\WINDOWS> |
This shows the entire TCP/IP configuration present in a host. It also has the additional versatility of interfacing with a DHCP server to renew a leased IP address.
IPCONFIG will return, amongst other things, the host’s IP address, its subnet mask and default gateway.
C:\WINDOWS>ipconfig /?
Windows 98 IP Configuration
Command line options:
/All – Display detailed information. /Batch [file] – Write to file or ./WINIPCFG.OUT /renew_all – Renew all adapters. /release_all – Release all adapters. /renew N – Renew adapter N. /release N – Release adapter N. C:\WINDOWS>
An option often used is ipconfig /all. This command will display the configuration details of all NICs as well as that of the dial-up connection.
Note that IPCONFIG will list the generic name of the adapter. Therefore, a 3010 3Com US Robotics 56K modem is simply listed as a PPP adapter, while a Linksys Ethernet 10BaseT/10Base2 Combo PCMCIA card is listed as a generic Novell 2000 adapter.
C:\WINDOWS>ipconfig /all
Windows 98 IP Configuration
Host Name . . . . . . . . . : COMPUTER100 DNS Servers . . . . . . . . : Node Type . . . . . . . . . : Broadcast NetBIOS Scope ID. . . . . . : IP Routing Enabled. . . . . : No WINS Proxy Enabled. . . . . : No NetBIOS Resolution Uses DNS : No
0 Ethernet adapter :
Description . . . . . . . . : PPP Adapter. Physical Address. . . . . . : 44-45-53-54-00-00 DHCP Enabled. . . . . . . . : Yes IP Address. . . . . . . . . : 0.0.0.0 Subnet Mask . . . . . . . . : 0.0.0.0 Default Gateway . . . . . . : DHCP Server . . . . . . . . : 255.255.255.255 Primary WINS Server . . . . : Secondary WINS Server . . . : Lease Obtained. . . . . . . : Lease Expires . . . . . . . :
1 Ethernet adapter :
Description . . . . . . . . : Novell 2000 Adapter. Physical Address. . . . . . : 00-E0-98-71-57-AF DHCP Enabled. . . . . . . . : No IP Address. . . . . . . . . : 207.194.66.100 Subnet Mask . . . . . . . . : 255.255.255.224 Default Gateway . . . . . . : Primary WINS Server . . . . : Secondary WINS Server . . . : Lease Obtained. . . . . . . : Lease Expires . . . . . . . :
C:\WINDOWS>
WINIPCFG (Windows IP Configuration for Windows 95/98) and WNTIPCFG (Windows NT IP Configuration for NT, 2000 and XP) provide the same information as IPCONFIG, but in a Windows format. Like IPCONFIG, it is capable to force a DHCP server into releasing and re-issuing leased IP addresses.
Click the ‘more details’ tab for an expanded view.
This is often used to trace a specific TCP/IP communications path. The spelling of the command varies slightly. For UNIX it is traceroute, for Windows it is tracert.
The following figure shows the TRACERT options.
C:\WINDOWS>tracert
Usage: tracert [-d] [-h maximum_hops] [-j host-list] [-w timeout] target_name
-d | Do not resolve addresses to hostnames. |
-h maximum_hops | Maximum number of hops to search for target. |
-j host-list | Loose source route along host-list. |
-w timeout | Wait timeout milliseconds for each reply. |
C:\WINDOWS>
Here follows a route trace from Perth, Australia, to a server in the USA.
C:\WINDOWS>tracert www.idc-online.com
Tracing route to www.idc-online.com [216.55.154.228] over a maximum of 30 hops:
1 | 169ms 160ms 174ms slip202-135-15-3-0.sy.au.ibm.net [202.135.15.30] |
2 | 213ms 297ms 296ms 152.158.248.250 |
3 | 624ms 589ms 533ms sfra1sr1-2-0-0-5.ca.us.prserv.net [165.87.225.46] |
4 | 545ms 535ms 628ms sfra1sr2-101-0.ca.us.prserv.net [165.87.33.185] |
5 | 564ms 562ms 573ms 165.87.160.193 |
6 | 558ms 564ms 573ms 114.ATM3-0.XR1.SFO1.ALTER.NET [146.188.148.210] |
7 | 574ms 701ms 555ms 187.at-2-10.TR1.SAC1.ALTER.NET [152.63.50.230] |
8 | 491ms 480ms 500ms 127.at-6-10.TR1.LAX9.ALTER.NET [152.63.5.101] |
9 | 504ms 534ms 511ms 297.ATM7-0.XR1.LAX2.ALTER.NET [152.63.112.149] |
10 | 500ms 478ms 491ms 195.ATM9-0-0.GW2.SDG1.ALTER.NET [146.188.249.81] |
11 | 491ms 564ms 584ms anet-gw.customer.ALTER.NET [157.130.224.154] |
12 | 575ms 554ms 613ms www.idc-online.com [216.55.154.228] |
Trace complete.
C:\WINDOWS>
As is often the case, the DOS approach is not the most user-friendly option. Notice the result when the same type of trace (albeit to a different address) is done with TJPingPro. The same TCP/IP protocols are still used, but now they are accessed through a third-party application program.
The most comprehensive tracing is, however, done via application programs such as Neotrace. The following figures give some of the results of a trace to the same location used for the previous two examples.
Even a single-homed host needs to make routing decisions. These are made on the basis of information contained in the route table. This table is automatically built by Windows on the basis of the hosts’s IP configuration. The ROUTE command can be used to manually configure network routing tables on TCP/IP hosts. This may be a tedious task but is sometimes necessary for reasons of security or because a specific route has to be added.
The following shows the route command options (switches).
C:\WINDOWS>route /?
Manipulates network routing tables.
ROUTE [-f] [command [destination] [MASK netmask] [gateway] [METRIC metric]]
-f Clears the routing tables of all gateway entries. If this is used in conjunction with one of the commands, the tables are cleared prior to running the command. command Must be one of four: Prints a route ADD Adds a route DELETE Deletes a route CHANGE Modifies an existing route destination Specifies the destination host. MASK Specifies that the next parameter is the ‘netmask’ value. netmask Specifies a subnet mask value to be associated with this route entry. If not specified, it defaults to 255.255.255.255. METRIC Specifies that the next parameter metric’ is the cost for this destination
All symbolic names used for destination are looked up in the network database file NETWORKS. The symbolic names for gateway are looked up in the host name database file HOSTS.
If the command is PRINT or DELETE, wildcards may be used for the destination and gateway, or the gateway argument may be omitted.
Diagnostic notes:
Invalid MASK generates an error, that is when (DEST & MASK) != DEST.
Example> route ADD 255.0.0.0 157.0.0.0 MASK 155.0.0.0
157.55.80.1
The route addition failed: 87
Examples:
> route PRINT
> route ADD 157.0.0.0 MASK 255.0.0.0 157.55.80.1 METRIC 3
^destination ^mask ^gateway ^metric
> route PRINT
> route DELETE 157.0.0.0
> route PRINT
C:\WINDOWS>
The route table resides on all TCP/IP hosts and is not to be confused with the routing table maintained by individual routers, as the exact details of these depend on the routing protocol used by the router, such as RIP or OSPF. An individual entry is read from left to right as follows: ‘If a message is destined for network 157.0.0.0, with subnet mask 255.0.0.0, then route it through to the gateway (router) address 157.55.80.1’. Remember that a HostID equal to 0, as used here, does not refer to a specific host but rather to a network as a whole. The ‘metric’ is a yardstick for making routing decisions. The calculation thereof is a topic on its own, but in the simplest case it could represent ‘hops’. Any destination on the local network is one hop, with one hop added for each additional router to be traversed.
Routes can also be added with the route add and route delete commands.
Route add 192.100.100.0 mask 255.255.255.0 192.100.100.1 will add a route and route delete 192.100.100.0 will delete a particular route. Manual adding of routes are sometimes necessary, for example in the case where the installation of dial-up proxy server software on a given host overwrites the existing default gateway setting on that host in order to ‘point’ to the ISP’s default gateway. This makes it impossible for the host to reach an existing adjacent network across the intermediate router, unless a manual entry is made. If said entry ‘does the job’ but disappears when the host is re-booted, the appropriate route command needs to be included in the autoexec.bat file or made persistent if that option is available for the operating system being used, through the use of the [-p] switch.
The following response was obtained from the route print command.
Active routes: | ||||
Network Address | Netmask | Gateway Address | Interface | Metric |
127.0.0.0 | 255.0.0.0 | 127.0.0.1 1 | 27.0.0.1 | 1 |
207.194.66.96 | 255.255.255.224 | 207.194.66.100 | 207.194.66.100 | 1 |
207.194.66.100 | 255.255.255.255 | 127.0.0.1 | 127.0.0.1 | 1 |
207.194.66.255 | 255.255.255.255 | 207.194.66.100 | 207.194.66.100 | 1 |
224.0.0.0 | 224.0.0.0 | 207.194.66.100 | 207.194.66.100 | 1 |
255.255.255.255 | 255.255.255.255 | 207.194.66.100 | 0.0.0.0 | 1 |
C:\WINDOWS.000> |
The first column, Network Address, is the destination and can contain the host address, the subnet address, the network address or the default gateway. The search order is in this sequence, from host address (most unique route) to default gateway (most generic). In the example above:
The netmask dictates what portion of the destination IP address must match the Network Address for a specific route to be used. When the Network Address and Netmask are written in binary, then where the Netmask is a ‘1’ the destination address must match the Network address, but where the Netmask is ‘o’ they need not match.
The HOSTS file is used on UNIX and Windows systems to resolve the mapping of a ‘name’ (any given name) to an IP address.
The following is an example of a typical Windows hosts file. The file (‘hosts’) is located in c:\WINNT\system32\drivers\etc for Windows 2000 and C:\Windows\drivers\etc for XP. If a user is uncertain about the correct format of the entries, a sample file is stored in the same folder as hosts.sam. As a matter of convenience the hosts.sam file can be edited as in the accompanying example, but it MUST then be saved as hosts only, i.e. without the ‘.sam’ extension.
In the example, host 192.100.100.2 can be interrogated by typing ping john.
The following is the contents of the Windows XP hosts file.
When you have completed this chapter you should be able to:
In the design of an Ethernet system there are a number of different components to be used. These include:
The lengths of LAN segments are limited due to physical and/or collision domain constraints and there is often a need to increase this range. This can be achieved by means of a number of interconnecting devices, ranging from repeaters to gateways. It may also be necessary to partition an existing network into separate networks for reasons of security or traffic overload.
In modern network devices the functions mentioned above are often mixed:
These examples are not meant to confuse the reader, but serve to emphasize the fact that the functions should be understood, rather than the ‘boxes’ in which they are packaged.
A repeater operates at the Physical layer of the OSI model (layer 1) and simply retransmits incoming electrical signals. This involves amplifying and re-timing the signals received on one segment onto all other segments, without considering any possible collisions. All segments need to operate with the same media access mechanism and the repeater is unconcerned with the meaning of the individual bits in the packets. Collisions, truncated packets or electrical noise on one segment are transmitted onto all other segments.
Repeaters are only used in legacy (CSMA/CD) Ethernet networks as they are not required in modern switched networks.
Repeaters are packaged either as stand-alone units (i.e. desktop models or small cigarette package-sized units) or 19-inch rack-mount units. Some of these can link two segments only, while larger rack-mount modular units (called concentrators) are used for linking multiple segments. Regardless of packaging, repeaters can be classified either as local repeaters (for linking network segments that are physically in close proximity), or as remote repeaters for linking segments that are some distance apart.
Several options are available:
Remote repeaters, on the other hand, have to be used in pairs with one repeater connected to each network segment and a fiber-optic link between the repeaters. On the network side they typically offer 10/100Base-T options. On the interconnecting side the choices include ‘single pair Ethernet’, using telephone cable up to several hundred meters in length, or single mode/multi-mode optic fiber, with various connector options. With 10BaseFL (backwards compatible with the old FOIRL standard), this distance can be up to 1.6 km bit with current technology single mode fiber links available up to several tens of kilometers.
Although repeaters are probable the cheapest way to extend a network, they do so without separating collision domains, or network traffic. They simply extend the physical size of the network. All segments joined by repeaters therefore share the same bandwidth and collision domain.
Media converters are essentially repeaters, but interconnect mixed media viz. copper and fiber. An example would be 10BaseT/ 10BaseFL. As in the case of repeaters, they are available in single and multi-port options, and in stand-alone or chassis type configurations. The latter option often features remote management via SNMP.
Models may vary between manufacturers, but generally Ethernet media converters support:
Distances supported are typically 2 km for 10BaseFL, 5 km for 100Base-FX (multimode) and 40 Km for 100Base-SX (single mode). An added advantage of the fast and Gigabit Ethernet media converters is that they support full duplex operation, which effectively doubles the available bandwidth.
Bridges operate at the Data Link layer of the OSI model (layer 2) and are used to interconnect two separate networks to form a single large continuous LAN. From a TCP/IP perspective, however, the overall network still remains one network with a single NetID. The bridge only divides the network up into two segments, each with its own collision domain and each retaining its full (say, 10 Mbps) bandwidth. Broadcast transmissions are seen by all nodes, on both sides of the bridge.
The bridge exists as a node on each network, although it does not have a MAC address of its own, and passes only valid messages across to destination addresses on the other network. The decision as to whether or not a frame should be passed across the bridge is based on the layer 2 address, i.e. the media (MAC) address. The bridge examines the destination MAC addresses of all frames on both sides to determine whether they should be forwarded across the bridge or not.
Bridges can be classified either as MAC or LLC bridges, the MAC sub-layer being the lower half of the Data Link layer and the LLC sub-layer being the upper half. For MAC bridges the media access control mechanism on both sides must be identical; thus it can bridge only Ethernet to Ethernet, Token Ring to Token Ring and so on. For LLC bridges, the Data Link layer protocol must be identical on both sides of the bridge (e.g. IEEE 802.2 LLC); however, the physical layers or MAC sub-layers do not necessarily have to be the same. Thus the bridge isolates the media access mechanisms of the networks. Data can therefore be transferred, for example, between Ethernet and Token Ring LANs. In this case, collisions on the Ethernet system do not cross the bridge nor do the tokens.
Bridges can be used to extend the length of a network (as with repeaters) but in addition they improve network performance. For example, if a shared (i.e. CSMA/CD) network is demonstrating fairly slow response times, the nodes that mainly communicate with each other can be grouped together on one segment and the remaining nodes can be grouped together in the other segment. The busy segment may not see much improvement in response rates (as it is already quite busy) but the lower activity segment may see quite an improvement in response times. Bridges should be designed so that 80% or more of the traffic remains within the segment and only 20% cross the bridge. Stations generating excessive traffic across the bridge should be identified by a protocol analyzer and relocated to another LAN.
Because bridges are store-and-forward devices, they truncate the collision domains i.e. collisions on one side of the bridge have no relevance on the other side.
Intelligent bridges (also referred to as transparent or spanning-tree bridges) are the most commonly used bridges because they are very efficient in operation and do not need to be taught the network topology. A transparent bridge learns and maintains two address lists corresponding to the networks it is connected to. When the bridge observes a frame from the one Ethernet segment, its source MAC address is added to the list of source addresses for that segment. The destination address is then compared with those on the address list for each segment and a decision is made whether to transmit the frame onto the other segment or not. If no address corresponding with the destination node is recorded in either of these two lists, the message is re-transmitted across the bridge to ensure that the message is delivered to the correct network. When the destination host replies, the bridge becomes aware of its location and updates the lists accordingly. Over a period of time the bridge learns about all the addresses on the network and thus avoids transmitting unnecessary traffic to adjacent segments. The bridge also maintains time-out data for each entry to ensure the table is kept up to date and old entries purged.
Transparent bridges cannot have loops as these could cause endless circulation of packets. If the network contains bridges that could form a loop as shown in Figure 10.3, any redundant paths need to be deactivated.
Refer to Fig 10.1. Assume that bridges A and B have not yet determined the whereabouts of node 2. Node 1 now sends a packet to node 2. Both bridges receive this packet on their ‘1’ inputs (A1 and B1). Since neither of them know where the destination node is, they simultaneously pass the packet to their ‘2’ side (A2 and B2). A2 now detects a broadcast message from B2 and passes it across to A1, since it does not know where the destination is. B2, in similar fashion, detects the broadcast message from A2 and passes it on to B1. The process is repeated and leads to an exponential increase in the number of packets on the network.
In order to solve the problem, the bridges need to communicate amongst themselves and decide on an optimum path between nodes A and B, and disable any redundant path(s). This is achieved by means of the Spanning Tree Algorithm (IEEE 802.1d). The bridges exchange configuration information via Bridge Protocol Data Units (BPDUs) and agree on the optimum path.
Local bridges have two network ports and hence interconnect two adjacent network segments at one point. This function is currently often performed by switches, being essentially intelligent multi-port bridges.
A very useful type of local bridge is a 10/100 Mbps Ethernet bridge, which allows interconnection of 10BaseT, 100Base-TX and 100Base-FX networks, thereby performing the required speed translation. These bridges typically provide full duplex operation on 100BastTX and 100Base-FX, and employ internal buffers to prevent saturation of the 10BaseT port.
Remote bridges, on the other hand, operate in pairs with some form of interconnection between them. This interconnection can be with or without modems, and include
RS-232, V.35, RS-422, RS-530, X.21, 4-wire, or fiber (both single and multi-mode). The distance between bridges can typically be up to 40 Km.
Wireless Ethernet bridges are available from several vendors. They typically transmit at1.5 Mbps and use the 900 MHz band, unlicensed in the USA, Israel and Australia.
Hubs are used to interconnect hosts in a physical star configuration. This section will deal with Ethernet hubs, which are of the 10/100Base-T variety. They are available in many configurations, some of which will be discussed below.
Smaller desktop units are intended for stand-alone applications, and typically have 5 to 8 ports. Stackable hubs, on the other hand, typically have up to 24 ports and can be stacked and interconnected to act as one large hub without any repeater count restrictions. These stacks are often mounted in 19-inch cabinets.
Shared hubs interconnect all ports on the hub in order to form a logical bus. This is typical of the cheaper workgroup hubs. All hosts connected to the hub share the available bandwidth since they all form part of the same collision domain.
Although they physically look alike, switched hubs (better known as switches) allow each port to retain and share its full bandwidth only with the hosts connected to that port. Each port (and the segment connected to that port) functions as a separate collision domain. This attribute will be discussed in more detail in the section on switches.
Because of the speed advantages of switches over shared hubs, and the decreasing cost of switches, hubs are seldom used in new applications.
Managed hubs have an on-board processor with its own MAC and IP address. Once the hub has been set up via a PC on the hub’s serial (COM) port, it can be monitored and controlled via the network using SNMP or RMON. The user can perform activities such as enabling/disabling individual ports, performing segmentation (see next section), monitoring the traffic on a given port, or setting alarm conditions for a given port.
On a non-segmentable shared hub, all hosts share the same bandwidth. On a segmentable hub, however, the ports can be grouped under software control into several shared segments. All hosts on each segment then share the full bandwidth on that segment, which means that a 24-port 10Base-T hub segmented into four groups effectively supports 40 Mbps. The configured segments are internally connected via bridges, so that all ports can still communicate with each other if needed.
Some hubs offer dual-speed ports, e.g. 10BaseT/100Base-T. These ports are auto-configured, i.e. each port senses the speed of the NIC connected to it, and adjusts its own speed accordingly. All the 10BaseT ports form a common low-speed internal segment, while all the 100Base-T ports form a common high-speed internal segment. The two internal segments are interconnected via a speed-matching bridge.
Some stackable hubs are modular, allowing the user to configure the hub by plugging in a separate module for each port. Ethernet options typically include both 10 and 100 Mbps, with either copper or fiber. These hubs are sometimes referred to as chassis hubs.
Stackable hubs are best interconnected by means of special stacking cables attached to the appropriate connectors on the back of the chassis.
An alternative method for non-stackable hubs is by ‘daisy-chaining’ an interconnecting port on each hub by means of a UTP patch cord. Care has to be taken not to connect the Transmit pins on the ports together (and, for that matter, the receive pins) – it simply will not work. This is similar to interconnecting two COM ports with a ‘straight’ cable i.e. without a null modem. Connect transmit to receive and vice versa by (a) using a crossover cable and interconnecting two ‘normal’ ports, or (b) using a normal (‘straight’) cable and utilizing a crossover port on one of the hubs. Some hubs have a dedicated uplink (crossover) port while others have a port that can be manually switched into crossover mode.
A third method that can be used on older hubs with a 10Base2 port is to create a backbone. Attach a BNC T-piece to each hub, and interconnect the T-pieces with RG-58 coax cable. The open connections on the extreme ends of the backbone obviously have to be terminated.
Fast Ethernet hubs need to be deployed with caution because the inherent propagation delay of the hub is significant in terms of the 5.12 microsecond collision domain size. Fast Ethernet hubs are classified as Class I, II or II+, and the class dictates the number of hubs that can be interconnected. For example, Class II dictates that there may be no more than two hubs between any given pair of nodes, that the maximum distance between the two hubs shall not exceed 5 m, and that the maximum distance between any two nodes shall not exceed 205m. The safest approach, however, is to follow the guidelines of the manufacturer.
Ethernet switches are an expansion of the concept of bridging and are, in fact, intelligent (self-learning) multi-port bridges. They allow frame transfers to be accomplished between any pair of devices on a network on a per-frame basis. Only the two ports involved ‘see’ the specific frame. Illustrated below is an example of an 8 port switch, with 8 hosts attached (one attached node per switch port). This comprises a physical star configuration, but it does not operate as a logical bus as an ordinary hub does. Since each port on the switch represents a separate segment with its own collision domain, it means that in this case there are only two devices on each segment, namely the host and the switch port. Hence, in this particular case, there can be no collisions on any segment!
In the sketch below hosts 1 & 7, 3 & 5 and 4 & 8 need to communicate at a given moment, and are connected directly for the duration of the frame transfer. For example, host 7 sends a packet to the switch, which determines the destination address, and directs the package to port 1 at 10 Mbps.
If host 3 wishes to communicate with host 5, the same procedure is repeated. Provided that there are no conflicting destinations, a 16-port switch could allow 8 concurrent frame exchanges at 10 Mbps, rendering an effective bandwidth of 80 Mbps. If the switch allowed full duplex operation, this figure could be doubled.
Switches have two basic architectures, viz. cut-through and store-and-forward. In the past, cut-through switches were faster because they examined the packet destination address only before forwarding the frame to the destination segment. A store-and-forward switch, on the other hand, accepts and analyses the entire packet before forwarding it to its destination. It takes more time to examine the entire packet, but it allows the switch to catch certain packet errors and keep them from propagating through the network. The speed of modern store-and-forward switches has caught up with cut-through switches so that the speed difference between the two is minimal. There are also a number of hybrid designs that mix the two architectures.
Since a store-and-forward switch buffers the frame, it can delay forwarding the frame if there is traffic on the destination segment, thereby adhering to the CSMA/CD protocol. In the case of a cut-through switch this could be a problem, since a busy destination segment means that the frame cannot be forwarded, yet it cannot be stored either unless the switch can buffer the packet. On older switches the solution was to force a collision on the source segment, thereby enticing the source host to retransmit the frame.
Layer 2 switches operate at the Data Link layer of the OSI model and derive their addressing information from the destination MAC address in the Ethernet header. Layer 3 switches, on the other hand, obtain addressing information from the Network layer, namely from the destination IP address in the IP header. Layer 3 switches are used to replace routers in LANs as they can do basic IP routing (supporting protocols such as and RIPv2) at almost ‘wire-speed’; hence they are significantly faster than routers. They do not, however, replace ‘real’ routers as Gateways to the WAN.
An additional advantage is full duplex Ethernet, where a device can simultaneously transmit AND receive data over one Ethernet connection. This requires an Ethernet NIC as well as a switch that supports full duplex. This enables two devices to transmit and receive simultaneously via a switch. The node automatically negotiates with the switch and uses full duplex if both devices can support it.
Full duplex is useful in situations where large amounts of data are to be moved around quickly, for example between graphics workstations and file servers.
Switches are very efficient in providing a high-speed aggregated connection to a server or backbone. Apart from the normal lower-speed (say, 100Base-TX) ports, switches have a high-speed uplink port (e.g. 1000Base-T). This port is simply another port on the switch, accessible by all the other ports, but features a speed conversion from 10 Mbps to 100 Mbps.
Assume that the uplink port was connected to a file server. If all the other ports (say, eight times 100Base-T) wanted to access the server concurrently, this would necessitate a bandwidth of 800 Mbps in order to avoid a bottleneck and subsequent delays. With a 100Base-TX uplink port this would create a serious problem. However, with a 1000Base-T uplink there is still 200 Mbps of bandwidth to spare.
Switches are very effective in backbone applications, linking several hub-based (CSMA/CD) LANs together as one, yet segregating the collision domains. An example could be a switch located in the basement of a building, linking the networks on different floors of the building. Since the actual ‘backbone’ is contained within the switch, it is known in this application as a ‘collapsed backbone’.
Provided that a LAN is constructed around switches that support VLANs, individual hosts on the physical LAN can be grouped into smaller Virtual LANs (VLANs), totally invisible to their fellow hosts. Unfortunately, the ‘standard’ Ethernet/ IEEE802.3 header does not contain sufficient information to identify members of each VLAN; hence, the frame had to be modified by the insertion of a ‘tag’ between the Source MAC address and the Type/Length fields. This modified frame is known as an IEEE 802.1Q tagged frame and is used for communication between the switches.
The IEEE802.1p committee has defined a standard for packet-based LANs that supports layer 2 traffic prioritization in a switched LAN environment. IEEE802.1p is part of a larger initiative (IEEE802.1p/Q) that adds more information to the Ethernet header (as shown in Fig 10.11) to allow networks to support VLANs and traffic prioritization.
802.1p/Q adds 32 bits to the header. Sixteen are used for a unique number (0x8100) to identify the frame as a ‘tagged’ frame, three are for a Priority tag, one is a Canonical Format Indicator (CFI) to indicate whether the MSB of the VLAN ID is on the left or the right, and twelve are for the VLAN ID number. This allows for eight discrete priority layers from 0 (low) to 7 (high), which supports different kinds of traffic in terms of their delay-sensitivity. Since IEEE802.1p/Q operates at layer 2, it supports prioritization for all traffic on the VLAN, both IP and non-IP. This introduction of priority layers enables so-called deterministic Ethernet where, instead of contending for access to a bus, a source node can pass a frame almost immediately to a destination node on the basis of its priority, and without risk of any collisions.
The alternative method for creating a VLAN is a ‘port based VLAN’. On a port-based VLAN switch the user must configure each port to accept packets (only) from certain other ports. Assume ports 1, 2 and 3 on a switch have to be set up as one VLAN. The switch will then be configured as follows:
EPSR is a reliable switching technology to protect the access network aligned with failures in the link, node and equipment of a network. EPSR selects primary and alternate paths for the service traffic. It takes less than 50 msec for failure detection and will switch over to alternate facilities.
EPSR, much like STP (Spanning Tree Protocol) provides a polling mechanism to detect ring-based faults and failover accordingly. It uses a fault detection scheme that alerts the ring about the break and indicates that it must take action. During a fault, the ring automatically heals itself by sending traffic over a protected reverse path.
Master Node is the controlling node for an EPSR Domain, responsible for status polling, collecting error messages, and controlling the flow of traffic in the Domain. All other nodes are transit nodes. Transit nodes generate failure notices and receive control messages from the Master.
Unlike bridges and layer 2 switches, routers operate at layer 3 of the OSI model, viz. the Network layer (or, the Internet layer of the DoD model). They therefore ignore address information contained within the Data Link layer header (the MAC addresses) and, instead, delve deeper into each frame and extract the address information contained in the Network layer. For IP this is the IP address.
Like bridges or switches, routers appear as hosts on the networks to which they are connected. They are connected to each participating network through a NIC, each with its own MAC address and IP address. Each NIC has to be assigned an IP address with the same NetID as that of the network it is connected to. The IP address of the router on each network is known as the Default Gateway for that network and each host on the internetwork requires at least one Default Gateway (but could have more). The Default Gateway is the IP address to which any host on that network must forward a packet if it finds that the NetID of the destination host and the local NetID does not match, which implies remote delivery of the packet.
A second major difference between routers and bridges or switches is that routers will not act autonomously, but instead have to be given the frames that need to be forwarded. Consequently, if the Default Gateway is not configured on a particular host, it is unable to access the router.
Because routers operate at the Network layer, they are used to transfer data between two networks that have the same Internet layer protocols (such as IP) but not necessarily the same Physical layer or Data Link layer protocols. Routers are therefore said to be protocol dependent, and have to be able to handle all the Internet layer protocols present on a particular network. A network using Novell Netware therefore requires routers that can accommodate IPX (Internet Packet exchange) – the Network layer component of SPX/IPX. If this network has to handle Internet access as well, it can only do this via IP, and hence the routers will need to be upgraded to handle both IPX and IP.
Routers maintain tables of the networks that they are connected to and of the optimum path to reach a particular destination network. They then direct the message to the next router along that path.
Multi-port routers have a modular construction and can interconnect several networks. The most common type of router is, however, a 2-port router. Since these are invariably used to implement WANs, they connect LANs to a ‘communications cloud’; the one port will be a local LAN port e.g. 100Base-TX, but the second port will be a WAN port such as X.25.
Routers within an Autonomous System normally communicate with each other using an Interior Gateway Protocol such as RIP. However, routers on the perimeter of the Autonomous System, which also communicate with remote Autonomous Systems, need to do that via an Exterior Gateway Protocol such as BGP-4. Whilst doing this, they still have to communicate with other routers within their own Autonomous System, e.g. via RIPv2. These routers are referred to as Border Routers.
It sometimes happens that a router is confronted with a layer 3 (Network layer) address it does not understand. In the case of an IP router, this may be a Novell IPX address. A similar situation will arise in the case of NetBIOS/NetBEUI, which is non-routable. A ‘brouter’ (bridging router) will revert to a bridge if it cannot understand the layer 3 protocol, and in this way make a decision as to how to deal with the packet. Most modern routers have this function built in.
Gateways are network interconnection devices, not to be confused with Default Gateways which are the IP addresses of the routers to which packets are forwarded for subsequent routing (indirect delivery).
A gateway is designed to connect dissimilar networks and could operate anywhere from layer 4 to layer 7 of the OSI model. In a worst case scenario, a gateway may be required to decode and re-encode all seven layers of two dissimilar network protocol stacks, for example when connecting an Ethernet network to an IBM SNA network. Gateways thus have the highest overhead and the lowest performance of all the internetworking devices. The gateway translates from one protocol to the other and handles differences in physical signals, data format, and speed.
Since gateways are, per definition, protocol converters, it so happens that a 2-port (WAN) router could also be classified as a gateway since it has to convert both layer 1 and layer 2 on the LAN side (say, Ethernet) to layer 1 and Layer 2 on the WAN side (say, X.25). This leads to the confusing practice of referring to routers as gateways.
Print servers are network nodes through which printers can be made available to all users. Typical print servers cater for both serial and parallel printers. Some also provide concurrent multi-protocol support, which means that they support multiple protocols and will execute print jobs on a first-come first-served basis regardless of the protocol used. Protocols supported could include SPX/IPX, TCP/IP, AppleTalk/EtherTalk, NetBIOS/NetBEUI, or DEC LAT.
Terminal servers connect multiple serial (e.g. RS-232) devices such as system consoles, data entry terminals, bar code readers, scanners, and serial printers to a network. They support multiple protocols such as TCP/IP, SPX/IPX, NetBIOS/NetBEUI, AppleTalk and DEC LAT, which means that they not only can handle devices which support different protocols, but can also provide protocol translation between ports.
Thin servers are essentially single-channel terminal servers. They provide connectivity between Ethernet (e.g. 10BaseT/100Base-TX) and any serial devices with RS-232 or RS-485 ports. They implement the bottom 4 layers of the OSI model with Ethernet and layer 3/4 protocols such as TCP/IP, SPX/IPX and DEC LAT.
A special version, the Industrial Thin Server, is mounted in a rugged DIN rail package. It can be configured over one of its serial ports, and managed via Telnet or SNMP. A software redirector package enables a user to remove a serial device such as a weigh-bridge from its controlling computer, locate it elsewhere, then connect it via a Thin Server to an Ethernet network through the nearest available switch or hub. All this is done without modifying any software. The port redirector software makes the computer ‘think’ that it is still communicating to the weighbridge via the COM port while, in fact, the data and control messages to the device are routed via the network.
A Remote Access Server (RAS) is a device that allows users to dial into a network via analog telephone or ISDN. Typical RASs support between 1 and 32 dial-in users via PPP or SLIP. User authentication can be done, for example, via PAP, CHAP, Radius, Kerberos or SecurID. Some offer dial-back facilities whereby the user authenticates to the server’s internal table, after which the server dials back to the user so that the cost of the connection is carried by the network and not by the remote user.
Network time servers are stand-alone devices that compute the correct local time by means of a GPS receiver, and then distribute it across the network by means of the Network Time Protocol (NTP).
After studying this chapter you will:
Initially LANs were just that: networks serving users in a small area, usually a single department or a workgroup with one server and a few clients. Very often individual floors in an office each had their own LAN. Sometimes these LANs were interconnected by routers in order to obtain a larger network (Figure 1.1).
The routers interconnecting the LANs allowed communication between LANs, but they were often slow and expensive. With the advent of Ethernet switches a more elegant solution was devised. Switches were now used to interconnect LANs, while the routers were pushed to the periphery of the network, and used to access the communication links to the external world (e.g. WAN or the Internet). See Figure 11.2.
A switched LAN, however, also has some disadvantages. While switches are able to segment unicast traffic (one node to another), multicast or broadcast traffic is allowed to pass through the switch, which is not the case with routers. In other words, while each segment is a collision domain, the entire LAN is a broadcast domain. This can become a bottleneck restricting LAN throughput.
The other factor that affects performance is due to changes in the business environment. Very often personnel involved in a particular project, or those belonging to a particular department, are not confined to a given physical area and they are spread across a building or campus. Product design teams may be cross-functional groups and often exist for short periods of time. In these cases, grouping users into one physical segment is not feasible. Refer to Figure 11.3. VLANs offer a solution to these problems.
A VLAN logically groups switch ports into workgroups. Since the number of broadcasts and multicasts between the users of a workgroup is likely to be high, a VLAN that includes only members of a given workgroup limits broadcast traffic to the particular workgroup. Thus a VLAN performs like a virtual broadcast domain. Figure 11.4 shows the logically defined VLAN network.
VLANs offer a number of advantages over the traditional switched LAN implementation. In networks with a high proportion of broadcast traffic, a VLAN can improve network performance by withholding broadcast messages from unintended destinations. This function could be performed by a router, but the greater amount of processing required for a router increases latency and therefore reduces the performance of the network. A VLAN is simply more efficient.
A VLAN allows a network administrator to allocate a group of users to a virtual broadcast domain, irrespective of their physical location. The only alternative to a VLAN would be to physically move the users. Given that present-day organizations are generally more dynamic, frequent changes in workgroups have also become necessary. This can be achieved by a VLAN without the need of physical relocation every time a change takes place.
A considerable part of the network administration efforts can be attributed to additions, movements and changes; all of which involve reconfiguration of hubs, routers, station addressing and sometimes even re-cabling. It is estimated that roughly 70% or more of administration time is spent on such activities. A VLAN reduces the need for these changes and, with good management tools, all that needs to be done is a simple drag and drop action to change a user from one VLAN to another. The ease of administration, the avoidance of physical relocation and cabling changes, and the elimination of expensive routers to contain broadcast activity keep the costs lower.
Through the ability to contain all transmissions (including broadcasts) within a workgroup, access to sensitive data is limited to members of a specific VLAN. This improves the data security of the network.
VLANs are not without problems. When resources such as printers have to be shared within a logical group, it may inconvenience printer users physically located far away from the printer.
The present trend is to group different servers in a common server farm with increased physical security, environmental control and fire prevention measures. These servers may have to be accessed by members of more than one VLAN. If it is not possible to assign the server to more than one VLAN, such access is impossible unless the servers are put in a separate VLAN and connected to the other VLANs via a router. This can affect network throughput.
Initial VLAN implementations were by and large proprietary, which hampered inter-operability since VLAN switches for a specific system had to be obtained from one vendor. Standards such as IEEE 802.1Q, supported by multiple vendors, have simplified the implementation of VLANs.
The switch is the core component of any VLAN implementation, and serves as the infrastructure for frames to or from each node. How VLANs will be grouped is decided by the network administrator, and the switch simply implements this decision. The intelligence required to identify the group to which a frame belongs is provided by the switch, and can be carried out either by packet filtering or by packet identification (also called tagging).
Packet filtering is similar to the technique used by routers. In this method, each packet received by the switch is examined and compared with a filtering table (also called a filtering database). The filtering table, developed for each switch, contains information on user groups based either on port address, MAC address, protocol type or application type. The switch takes appropriate action based on the comparison. Packet filtering thus provides an additional layer of processing for deciding as to how the packet should be handled, which increases the switch latency. It is also necessary for all the switches in the network to maintain and synchronize their filtering tables, which involves administrative overheads.
Packet identification is a relatively new method and consists of placing a unique identifier in the header of each packet before it travels through the switch fabric. The identifier is examined by each switch prior to any broadcast or transmission to other switches, routers or workstations in the network. When a packet exits the switch fabric (routed to the target end station) the switch removes the identifier from the header. Packet identification functions in layer 2 of the OSI model and involves little overhead, apart from the extra four bytes inserted in the Ethernet frame header.
The method of adding an identifier to the frame header for identification purposes is called ‘explicit’ tagging. On the other hand, the method of packet filtering is said to involve ‘implicit’ tagging.
In explicit tagging, the switch needs to know whether the target device is VLAN-aware or not, i.e. whether the target device has the intelligence to interpret the tag information. The identifier is forwarded to a device only if it is VLAN-aware.
The membership of an end user device in a VLAN can be implemented in several ways.
In general, they will fall under one of the following five categories:
We will review each of the approaches.
In this method, the membership to a VLAN depends on the switch port to which it is connected. Two types of grouping are possible.
In the first, grouping can be within a single switch, say ports 1, 3, 5 and 7 make up one group and 2, 4, 6 and 8 make up a second group. (Refer to Figure 11.5).
In the other method, which will involve more than one switch, a VLAN can include specific ports of different switches. For example: ports 1, 2, 3 and 4 of Switch 1 and ports 1,2, 3 and 4 of Switch 2 form VLAN A and ports 5, 6, 7 and 8 of Switch 1 and ports 5, 6, 7 and 8 of Switch 2 form VLAN B. (Refer to Figure 11.6).
The limitation of the method using switch ports to indicate grouping is that whenever a particular user moves to a new physical location involving a different switch/port, the network administrator has to reconfigure the VLAN membership information.
This grouping also works best for nodes that are directly connected to a switch port. If a host is connected to a VLAN switch port through a hub, all the machines connected to the hub have to be members of the same VLAN as one physical segment cannot be assigned to multiple VLANs. However, hubs with a multi-backplane connection feature overcome this limitation to some extent, since each backplane of the hub can be allocated to a particular VLAN.
The switch port grouped implementation works well in an environment where network moves are done in a controlled fashion and where robust VLAN management software is available to configure the ports.
Port based VLANs are set up as follows. The user must configure each port to accept packets (only) from certain other ports. Assume that ports 1, 2 and 3 on a switch have to be set up as members of one VLAN. The switch will then be configured as follows:
The problem of manual reconfiguration is overcome by grouping different machines based on their MAC addresses. These addresses are machine-specific and are stored on the host’s NIC. This ensures that, whenever a user moves from one switch port to another, the grouping is kept intact and, wherever the move, the user’s VLAN membership is retained without change. This method can therefore be thought of as a user-based VLAN implementation.
This method has some drawbacks. The MAC address-based implementation requires that, initially, each MAC address is assigned to at least one VLAN. This means that to begin with, there will be one large VLAN containing a very large number of users. This may lead to some performance degradation. The problem is overcome by some vendors providing automated tools to create groups based on the initial network configuration. In other words, a MAC address based VLAN is created automatically for each subnet of the LAN.
Serious performance issues can arise if members of different VLANs co-exist in a shared media environment (within a single segment). In the case of large networks the method of communicating the VLAN membership information between different switches to keep them synchronized can cause performance degradation as well.
The use of MAC addresses to identify VLAN membership can also be problematic in a network where a large number of laptop computers are connected to the network by means of docking stations. The NIC, and therefore the MAC address, is a part of the docking station which usually remains on a particular desk. In a situation where the user and the laptop moves around to different desks/docking stations, their MAC address changes when they move to a different desk. This makes tracking groups based on MAC addresses difficult, since reconfiguration is needed whenever a user moves to a different docking station.
This method is also referred to as layer 3-based VLAN implementation. Here, grouping is based on information such as the protocol type, or in many cases Network-layer addresses such as the subnet address in IP networks. The switches examine the information regarding the subnet address contained in each packet to decide to which VLAN the user belongs, and makes the forwarding decision on this basis. It should be noted that even though this method uses layer 3 information, it does not constitute routing. This is so because no route calculation is done, frames are usually bridged according to the Spanning Tree Algorithm, and connectivity within any VLAN is seen as a flat bridged topology. See Figure 11.7 for an example of this type of implementation.
Some vendors incorporate varying amounts of intelligence in their switches so that they carry out functions normally associated with layer 3 devices. Some of these layer 3-aware devices have the packet forwarding or routing function built into ASIC chips on the switch, which makes them faster than CPU-based routers. However, the process of examining layer 3 addresses in a packet is definitely slower than looking at MAC addresses in frames since it is located further into the frame. These implementations are therefore slower than those that use layer 2 information.
VLANs using layer 3 information are particularly effective in dealing with networks using TCP/IP. This is not the case with protocols such as IPX or DECnet that do not involve manual configuration at the desktop. In particular, end stations using un-routable protocols such as NetBIOS cannot be differentiated and therefore cannot be included in these VLANs.
This represents a different approach to VLANs although the concept of a VLAN as a broadcast domain is still applicable. When an IP packet is sent via multicast, it is sent to an address that is a proxy for a group of IP addresses dynamically defined. Each workstation is given a chance to join this group for a certain period of time by sending an affirmative response to a broadcast. They are, however, members of the multicast group only for a certain period of time. This approach is thus highly flexible and exhibits a high degree of application sensitivity. Such groups can also span routers and therefore WAN connections.
VLANs implemented in this way are defined on the basis of applications or services or both. For example, FTP applications can belong to one VLAN and TELNET applications to another. Such complex VLAN implementations need a high degree of automation in their configuration and management.
Devices in a VLAN can be connected in different ways depending on whether the device is ‘VLAN-aware’ or ‘VLAN-unaware’. As we saw earlier, a VLAN-aware device can interpret explicitly tagged frames and decide to which VLAN the frame is to be directed.
These connections are referred to as:
We will now discuss these in more detail.
A Trunk link only carries frames in which the tagging information is present. This is normally the case with the links that interconnect switches, but some higher-end NICs are also capable of interpreting the tags. All devices connected to a Trunk link should be ‘VLAN-aware’, including workstations, and all frames on a Trunk link should be explicitly tagged. Figure 11.8 below shows an example.
An Access link connects a VLAN-unaware device (e.g. a NIC that is incapable of reading the tag information in the frame) to a VLAN-aware switch. VLAN-unaware LAN segments are generally found in legacy LANs. Since VLAN-unaware devices cannot handle explicitly tagged frames, the frames must be implicitly tagged, or in other words, the switch must strip out the tag information before forwarding the frame to the device. Refer to Figure 11.9 for an illustration.
When VLAN- unaware end stations are attached to a Trunk link, the resultant link is called a Hybrid link. A Hybrid link is one with VLAN-aware as well as VLAN-unaware devices attached to it. A Hybrid link can therefore carry tagged as well as untagged frames. However, all the frames egressing from a switch to a specific device must be either tagged or untagged, depending on whether that particular device is VLAN-aware or VLAN-unaware. Refer to Figure 11.10 below.
All three types of links can be present in a single network.
The filtering table in each switch is a database, which stores the VLAN membership information permitting the switch to decide on how a packet is to be handled. There are essentially two types of entries: static and dynamic.
Static entries are created, modified or deleted by the network administrator and are not automatically deleted by ageing, unlike dynamic entries. Static entries can either be filtering entries or registration entries. A filtering entry is made for each port, indicating whether frames to a specific address in a VLAN should be forwarded, discarded or whether they should be subject to a dynamic entry. Static registration entries specify whether frames for a specific VLAN should be explicitly tagged and which ports are members of which VLAN.
Dynamic entries are automatically created by the switch and cannot be added or changed by the administrator. Unlike static entries they are automatically removed from the table after a certain time that is decided by the administrator. The learning process takes place by observing the packets sent by a port, identifying the address and VLAN membership. Entries are made accordingly.
The entries can be of the dynamic filtering type, dynamic registration type or group registration type. Dynamic filtration entries specify whether a packet to be sent to a specific address in a particular VLAN should be forwarded or discarded. Dynamic registration entries specify which ports are registered in a specific VLAN. Addition or deletion of entries is done using the Generic Attribute Registration Protocol (GARP) or the VLAN Registration Protocol (GVRP). Group registration entries indicate, for each port, whether frames to be sent to a group MAC address should be filtered or discarded. These entries are added or deleted using Group Multicast Registration Protocol or GMRP. GVRP has another function: communicating information to other VLAN-aware switches. Such switches register and propagate VLAN membership to all the ports that are part of the currently active topology of the VLAN.
Explicit tagging is the process of adding information to the header of a frame (e.g. Ethernet) to indicate the specific VLAN membership for that frame. Unfortunately the ‘standard’ Ethernet/ IEEE 802.3 header does not contain sufficient information to identify VLAN membership; hence the frame had to be modified by the insertion of a ‘tag’ between the two MAC addresses and the Type/Length field. This modified frame is known as an IEEE 802.1Q tagged frame.
The IEEE 802.1p committee defined a standard for packet-based LANs to support layer 2 traffic prioritization in a switched LAN. IEEE 802.1p is part of the larger IEEE VLAN initiative (IEEE 802.1p/Q) that adds more information to the Ethernet header (as shown in Figure 10.11) in order to allow both VLAN operation and traffic prioritization.
802.1p/Q adds 4 bytes to the Ethernet header. The first field (two bytes) is known as the ‘Tag Identifier’ and assumes the value of 0x8100 for Ethernet. This simply indicates that the frame is tagged and therefore not a standard Ethernet frame.
The following field (2 bytes) contains the tag itself and is subdivided into three fields:
Large, multi-site businesses have always emphasized centralized information systems as an important objective in their IT policy framework. With the explosion of communication technologies, many of them have achieved this objective by establishing WANs via leased communication channels. This has made possible the access of corporate information by all branch offices in a continuous and almost real-time manner, which, in turn, led to better decision-making. WANs are, however, expensive to set up and maintain, given the high cost of leased lines. They are therefore beyond the reach of many small and medium businesses.
Another aspect of corporate networks is remote access. Employees who work from home or are constantly on the move often have to access their corporate LAN to access their e-mail and to obtain other pertinent information needed for their business interactions. This type of communication was traditionally done via dial-up connections using modems and public telephone networks, often over long distances or sometimes even through international circuits. This is more expensive and less secure compared with dedicated leased circuits. Figure 12.1 shows a typical corporate network with inter-site WAN connectivity and remote user access.
There has been a paradigm shift in the approach to corporate networking since the advent of the Internet as a business communication medium. Given the ubiquitous nature and low access cost of the Internet, other more attractive networking solutions have started arriving in the IT marketplace. Virtual Private Network (VPN) technology has started replacing conventional leased circuit-based WANs. In this chapter we will explore the fundamental concepts of this technology and see how it is helping businesses to redefine their IT approach while simultaneously effecting considerable savings in their communication budgets.
The Internet has affected computing in more than one way. Information dissemination within the organization through Intranets uses the same standards as the Internet. Web-enabled applications use multi-tier architectures for database access and are deployed both within the corporation and for external customers. This has enabled the adoption of a unified application architecture for employees, business partners and the public domain realized by the Intranet, Extranet and Internet respectively. A VPN is the infrastructure that makes such deployment possible in a cost-effective manner. A typical view of an Enterprise network using VPN technology is shown in Figure 12.2.
A VPN is basically a corporate network built around the communication infrastructure of the Internet instead of leased lines or RAS using direct dial-in. Since the Internet is a public medium where the traffic is prone to interception or modification, security issues play an important role in the implementation of a VPN. A VPN is, however, a highly cost effective proposition as dedicated lines are required only to connect the corporate network to an ISP (usually located within the same city). Remote users of a VPN also connect to the network through the Internet using local dialup access numbers (Points of Presence) of the ISP. This offers a very high degree of availability and Quality of Service (QoS) as opposed to long distance dialing. The actual savings will depend on many factors including the geographical spread of enterprise branch locations, the number of remote users, locations from where remote access is generally made and the average time of use by the remote users. It has been reported that a saving of up to 50% is possible by changing over from a conventional WAN to a VPN. Capital expenditure payback periods for VPN implementation can be as low as four months.
VPN solutions are essentially of three distinct types.
While all three of these types of connectivity are essential from the enterprise viewpoint, most of the savings result from Remote Access VPN. This is because:
Any enterprise planning to implement a VPN system must carefully evaluate the various issues of importance. A 5-tier model proposed by the Gartner Group summarizes these issues and can be a starting point. See Figure 12.3 below.
The five tiers are security, scalability, manageability, simplicity and quality of service. Security is a factor decided by the corporate policy. Scalability, manageability and simplicity are functional requirements and will depend on present and perceived future needs, particularly the issue of scalability. QoS will be primarily dependent on the ISP whose infrastructure will be used for the VPN. The task of implementation can thus be divided into:
We will discuss these aspects in further detail in the following sections.
In any network, certain policy guidelines are established by the IT department to ensure appropriate and optimal use of network resources. Most enterprises would have developed such guidelines even before considering a VPN. The main issues to be addressed are:
The access rights and cost threshold issues are quite straightforward and are defined as a part of organizational structure of the enterprise. Class of Service issues such as bandwidth commitments to different functional groups require a more thorough evaluation. In addition, such facilities can only be realized using specialized VPN implementations.
It is, however, the issues concerning security that normally need very careful consideration. We will go through this aspect in detail later in this chapter.
While planning to implement a VPN, it is necessary to understand the needs and expectations of users as well as network administrators in order to choose a suitable system. While network administrators will look for scalability and manageability, the users will need simplicity of usage. The VPN should satisfy all these three basic parameters namely scalability, manageability and simplicity.
It is an extremely difficult task to decide the scalability of VPN systems. Scalability is determined by the growth of user population and growth of usage in terms of both time and bandwidth. Growth itself is driven by the following factors.
In short, continued improvement in connectivity at a lower cost will significantly boost the use of VPN as more employees will tend to work away from the office, in increasing numbers and for increased hours.
User population is a critical design element. If there are 600 remote users in a corporation, each user will require a dedicated channel (called a Tunnel in VPN terminology) to access the central host. Since not all users are expected to connect at the same time, it is usual to have 200 to 300 tunnels for 600 users (over-subscription by a ratio of 2:1 to 3:1). However, as the time of usage increases, the ratio must become lower, if one is to obtain a connection on demand. Increase in user population also puts increasing demands on Internet connectivity resources, such as bandwidth of the Internet connectivity at the central site as well as its routers, firewalls, authentication systems and server hardware. All these must scale upwards to keep up with the increased demand. Any inadequacy in these resources will increase the response time and will result in poor network performance and user dissatisfaction.
It is advisable to plan for a little excess capacity in the beginning so that capacity constraints do not become an issue immediately after the initial implementation.
There are two ways in which a server cluster can be used. One method (Figure 12.4) is to have a pure cluster with a single Internet access point. The other method (Figure 12.5) is to have parallel servers, each with its own Internet connection.
An alternative to the clustering of multiple servers is to have distributed servers by placing the tunnel servers closer to different application servers. While the type of automated load sharing as done in clustered servers is not possible, the architecture itself has the effect of balancing the load and provides adequate redundancy.
A VPN system should have quantifiable scalability attributes and should have effective data compression. It should also provide for protocol extensions such as variable window size and selective packet re-transmission. Such capabilities will ensure better performance.
The next important functional requirement from the administrator’s point of view is end-to-end manageability from a single control point. Manageability should include information on user population, login status, system status, traffic loading, managing of remote user desktops, Internet connection management, etc. We will review some of the important requirements.
A VPN system is essentially quite simple in concept. However, for a user who is not a systems specialist, the procedures of access and connectivity can be daunting. By designing a simple and easy-to-use interface, these complexities can be hidden from the user and all that he/she will see is a consistent screen irrespective of the method of connectivity. In addition, all ISP/route/cost decisions should be automated. Access telephone numbers and dial-up codes at remote locations should be based on table look-up and should be transparent to the user. The user should not be required to remember multiple passwords irrespective of the ISP and location of POP.
The user interface should have the necessary intelligence to gather information on connectivity performance, QoS issues such as difficulty in access, lower than committed speeds, etc. and report them to the centralized VPN management system for analysis and corrections.
Dynamic fault recovery is another prerequisite. Rather than just detecting a fault and reporting it to the user, the client application should resolve the problem by itself to the extent that this is feasible. It should also automatically download and update drivers as well as look-up tables containing access information whenever there are changes to these data or files. Automatic reconnection in case of sudden disconnection is another desirable feature.
A simple and automated user interface will result in fewer help desk calls and result in better user satisfaction and cost savings.
In almost all cases where a VPN implementation is planned, an existing network with both Internet access and RAS is likely to be in operation. The VPN implementation should integrate properly into this legacy environment. Figure 12.7 shows such a typical legacy environment.
Any VPN must be designed to complement the existing equipment. For example, the Internet router will still be needed for Internet access. Usually such routers also act as layer 3 firewalls, a function that VPN equipment do not provide. Similarly, corporate LAN routers that enable multi-protocol routing should also be retained, as VPNs do not have LAN routing capability. The firewall hardware has to be retained as well because a VPN’s basic access control is, by itself, not an adequate protection measure.
Figure 12.8 shows the possible ways in which the tunnel server can be placed in an existing corporate network.
In the arrangement shown in Figure 12.8A, the inbound VPN traffic is also made to go through the firewall before accessing the tunnel server. This may require the firewall to be upgraded as VPN traffic increases. In figure 12.8B, the VPN traffic is handled directly by the tunnel server whereas non-VPN traffic is sent through the firewall. This arrangement is more scalable.
Security is a vital concern in the planning of a VPN. As the Internet is a notoriously uncontrolled and public medium, any traffic through the Internet poses many security risks. These are addressed using the following protective measures.
Authentication identifies a user to the system, the most common method being the use of a password. Passwords must follow a clear policy. Users should take care not to make the password too easy to guess and not to share the password with others. The system may enforce a minimum length of passwords, use of one or more special characters in the password; force periodic changes of password and disallow password repetition. If proper password policies are not adopted and enforced, the security of any network can be seriously compromised. The authentication system compares the password that the user provides with the password database in the server and, if they match, permits access.
Some of the protocols used for VPNs are based on PPP, which was traditionally used for RAS access with authentication by the Password Authentication Protocol (PAP).
Sending the password in clear text over a public network is risky, as passwords can be revealed to those who eavesdrop on the network. A variety of challenge-response protocols is therefore used. One example is the Challenge Handshake Authentication Protocol or CHAP. In this system, the authentication of users is done by the following steps.
It is possible to use stronger authentication models, which require that, in addition to the password, the user also has to identify himself through some other means. This may be a hardware-based token card or identification based on scanning of a thumbprint or iris scan. This combines something that the user knows (password), with something that he has (a card or thumbprint) as means of identification. This is, however, subject to the support by the VPN protocol being used.
Many VPNs support SecurID by Security Dynamics, a token card that combines secret key encryption with a one-time password. The password is automatically generated by encrypting a timestamp with a secret key. This one-time password will be valid for a short interval, usually 30 to 60 seconds.
Digital certificates are also becoming more prevalent as an authentication mechanism for VPNs. A digital certificate (based on the X.509 standard) is an electronic document that is issued to an individual by a ‘Certificate Authority’ (CA) that can vouch for an individual’s identity. It essentially binds the identity of an individual to a public key. A digital certificate will contain a public key, information specific to the user (name, company, etc.), information specific to the issuer, a validity period and additional management information. This information will be used to create a message digest, which is encrypted with the CA’s private key to ‘sign’ the certificate.
By using the digital signature verification procedure described above, participants in a conversation can mutually authenticate each other. Although this process sounds simple, it involves a complex system of key generation, certification, revocation and management, all part of a Public Key Infrastructure (PKI). A PKI is a broad set of technologies that are used to manage public keys, private keys and certificates. The deployment of a PKI solution should be done with due care as there are major issues such as scalability and interoperability.
Manual authentication provides a means for authenticating network devices including firewalls, gateways and desktop clients by the CA. The process allows the registration authority to control the certificate lifecycle including approval and renewal of certificates. The sequence goes something like this:
For remote users, other methods such as pass code authentication and automatic authentication can be used to obtain the digital certificate.
Increasingly, disparate remote access and VPN systems are managed by a central user account and policy database. This requires a schema that can be used irrespective of the VPN vendor and the authentication method that the vendor deploys. The Remote Access Dial-In User Services (RADIUS) protocol is one such initiative for unifying devices from multiple vendors into one authentication scheme. Any VPN or RAS device that supports RADIUS can authenticate against a central RADIUS server which not only defines the user names and passwords, but can also maintain remote user policies such as IP addresses, allowed length of call and number of concurrent logins.
Step 1 | Client requests a connection |
Step 2 | Management server sends random challenge to client |
Step 3 | Client calculates a hash based on the challenge and its password and returns the hash to the Management server |
Step 4 | Management server sends challenge and hash to RADIUS server in an Access-Request message |
Step 5 | RADIUS server calculates the hash and compares it to the client’s hash, if they match then it responds with an Access-Accept message |
Step 6 | Management server grants connection to client |
Figure 12.9 shows a typical RADIUS scheme in operation.
Microsoft and Novell have released RADIUS support for their respective network operating systems in order to provide remote user access management.
Authentication does not stop with user identification. The packets that go through the Internet are vulnerable to several threats such as spoofing, session hijacking, sniffing, and man-in-the-middle attacks. It is therefore necessary to examine the packets and make sure that they originate from the purported sender and not someone masquerading as the sender. The checks include:
We will discuss these aspects further when we review VPN protocols.
Encryption refers to the alteration of data using a key (which is a fixed length string of bits) so that the data is meaningless to anyone intercepting the message unless he has the key. The reverse process of getting the data back from the encrypted form is called decryption and is done by the receiver using the same key. It is essential that both sender and receiver share the key. This method where a shared key is used by both sender and receiver is called symmetrical encryption or Private Key encryption. It is possible to attack an encrypted message using ‘brute force’ method through an automated process of trying out all possible key combinations. Therefore the longer the key length, the more difficult it is to break. A 128-bit string will take thousands of years of computational time to break by ‘brute force’ methods and is therefore considered safe.
The other method that is increasingly used requires a set of two keys, a public key and a private key. Diffie-Hellman (DH) and Rivest-Shamir-Adleman (RSA) are two of the popular algorithms in common use.
In this method every user has a set of two keys, a public key that is revealed to anyone who wishes to contact him and a private key that is not shared with anyone. When a message is sent by A to B it is encrypted by using the public key of B. Once it is received by B, he decrypts it using his private key. Decryption is not possible without knowing this key. Refer to Figure 12.10, which illustrates this operation.
In addition to user authentication, Certification Authorities (CAs) also provide digital certificates containing the Public key. An enterprise can implement its own certification using Certification Servers, or use third party services such as Verisign for issue and sharing of their public key. This process of managing public keys is known as the Public Key Infrastructure (PKI).
VPN works over the Internet but it is possible that the communication protocol used in the corporate LAN is different and not the TCP/IP protocol used on the Internet. It may be IPX (Novell) or NetBEUI (Microsoft). If these protocols have to work between a remote client and the central server, there must be a mechanism to encapsulate their protocol packets within IP packets for transportation and reassemble them into the native LAN protocol at the receiving end. This process is called encapsulation.
In addition, the internal network addresses should be preferably hidden from public view, particularly when TCP/IP is used as the LAN protocol, by encrypting the address along with the data. This way, the packet travels from sender to receiver through the Internet in a secure tunnel and comes out at the other end to be processed further. Thus, a VPN can be implemented in such a way that no change is required in the LAN or application architecture.
A tunnel is required for each remote user and the tunnel server should be capable of providing the required number of tunnels on the basis of simultaneous remote connections at any given point in time.
Four protocols are generally used in VPNs. They are the Point-to-Point Tunneling Protocol (PPTP), Layer-2 Forwarding protocol (L2F), Layer 2 Tunneling Protocol (L2TP), and the IP Security Protocol (IPSec).
PPTP, L2F and L2TP are aimed at remote users connecting through dial-up access. IPSec is aimed at LAN-to-LAN VPNs.
PPTP
PPTP is derived from the popular Point-to-Point Protocol (PPP) used for remote access to the Internet. PPTP has added the functionality of tunneling through the Internet to a destination site. The encapsulation of PPP packets is done using a modified Generic Routing Encapsulation (GRE) protocol. PPTP can be used for any protocol including IP, IPX and NETBEUI. It has, however, some limitations such as its inability to support stronger encryption and token based user authentication.
With PPTP in a tunneling implementation, the dial-in user has the ability to choose the PPTP tunnel destination after the initial PPP negotiation. This is important if the tunnel destination will change frequently, and no modifications are needed by mechanics in the transit path. It is also a significant advantage that the PPTP tunnels are transparent to the service provider, and no advance configuration is required between the NAS operator and the overlay dial access VPN (VPDN). In such a case, the service provider does not house the PPTP server, and simply passes the PPTP traffic along with the same processing and forwarding policies as all other IP traffic.
In fact, this could be considered a benefit, since configuration and support of a tunneling mechanism within the service provider network need not be managed by the service provider, and the PPTP tunnel can transparently span multiple service providers without any explicit configuration.
The other advantage is that the subscriber does not have to pay a higher subscription fee for a VPN service. The situation changes dramatically if the service provider houses the PPTP servers, or if the subscriber resides within a sub-network in which the parent organization wants the network to make the decision concerning where tunnels are terminated. The major part of PPTP-based VPDN user base is one of a roaming client type, where the clients of the VPDN use a local connection to the public Internet data network, and then overlay a private data tunnel from the client’s system to the desired remote service point.
L2F
L2F was developed in the early stages of VPN and also uses tunneling. Like PPTP, L2F also uses PPP for authentication but also supports Terminal Access Controller Access Control System (TACACS+) and RADIUS. Unlike PPTP, L2F tunneling is not dependent on IP but can work directly with other media such as ATM and Frame Relay. In addition, L2F allows tunnels to support more than one connection. This protocol has now given way to the later L2TP.
L2TP
L2TP, designed by the IETF working group to succeed PPTP and L2F, uses PPP to provide dial-up access that can be tunneled to a corporate site through the Internet, but defines its own tunneling protocol. Like L2F, it can also handle multiple media and offers support to RADIUS authentication. L2TP can support multiple, simultaneous tunnels for each user.
With L2TP in a ‘compulsory’ tunneling implementation, the service provider controls where the PPP session is terminated. This can be extremely important in situations where the service provider to whom the subscriber is actually dialing into (let’s call this the ‘modem pool provider’ network) must transparently hand off the subscriber’s PPP session to another network (let’s call this network the ‘content provider’).
To the subscriber, it appears as though he is directly attached to the content provider’s network, when in fact he has been passed transparently through the modem pool provider’s network to the service to which he is subscribed. Very large content providers may outsource the provisioning and maintenance of thousands of modem ports to a third-party provider, who in turn agrees to pass the subscriber’s traffic back to the content provider. This is generally called ‘wholesale dial.’ The major motivation for such L2TP-based wholesale dial lies in the typical architecture of the PSTN (Public Switched Telephone) Network.
The PSTN network is typically constructed in a hierarchical fashion, where a local PSTN exchange directly connects a set of PSTN subscribers, which is in turn connected via a trunk bearer to a central office or metropolitan exchange, which may be connected to a larger regional office or major exchange. A very efficient means of terminating large volumes of data PSTN calls is to use a single common call termination point within the local or central exchange to terminate all local data PSTN calls, and then hand the call data over to a client service provider using high volume data transmission services. This cost efficiencies that can result from this architecture form a large part of the motivation for such L2TP-based VPDNs, so a broad characterization of the demand for this style of VPDN can be characterized as a wholesale/retail dial access structure. Another perspective is to view this approach as ‘static’ VPDN access.
Although L2TP provides cost-effective access, multi-protocol transport, and remote LAN access, it does not provide cryptographically robust security features. For example,
Some of these issued can be addressed by the use of the IPSec protocol.
IPSec
IPSec is the de facto standard for VPNs and is a set of authentication and encryption protocols, developed by the IETF and designed to address the inherent lack of security for IP-based networks. It is designed to address data confidentiality, integrity, authentication and key management, in addition to tunneling.
IPSec, which was originally meant for the next generation of IP (IPv6), can now be used with IPv4 as well. The issue of exchange and management of keys used to encrypt session data is done using the ISAKMP/Oakley scheme, now called Internet Key Exchange (IKE). IPSec allows authentication or encryption of an IP packet, or both. There are two different methods of using IPSec. These are the Transport mode in which only the Transport layer segment of an IP packet is authenticated or encrypted, or the Tunnel mode, where the entire IP packet is authenticated or encrypted.
IPSec uses Diffie-Hellman key exchanges for delivering secret (private) keys on a public network. It uses public-key cryptography for signing Diffie-Hellman exchanges so that the identity of the parties is hidden from the ‘man-in-the-middle’. It encrypts data using the Advanced Encryption standard (AES). It uses the HMAC, MD5 and SHA keyed hash algorithms for packet authentication and validates public keys using Digital Certificates.
However, IPSec is only suitable for networks that use the IP environment and can only handle IP packets. Multi-protocol environments will still need to use PPTP or L2TP.
The IPSec protocol suite typically works on the edges of a security domain. Basically, IPSec encapsulates a packet by wrapping another packet around it. It then encrypts the entire packet. This encrypted stream of traffic forms a secure tunnel across an otherwise unsecured network.
The principal IPSec protocols are:
We will discuss these in the following paragraphs.
Authentication Header
The IP Authentication Header provides connectionless (that is, per-packet) integrity and data origin authentication for IP datagrams, and offers protection against replay. Data integrity is assured by the checksumↃ generated by a message authentication code (for example, MD5); data origin authentication is assured by including a secret shared key in the data to be authenticated and replay protection is provided by use of a sequence number field within the AH Header. In the IPSec vocabulary, these three distinct functions are lumped together and simply referred to as ‘authentication’.
The algorithms used by the AH protocol are known as Hashed Message Authentication Codes (HMAC). HMAC applies a conventional keyed message authentication code twice in succession, as shown in schematic form in Figure 12.11, first to a secret key and the data, and then to the secret key and the output of the first round. Since the underlying message authentication code in Figure 12.11 is MD5, this algorithm is referred to as HMAC-MD5. The AH protocol also supports the use of HMAC-SHA in which the Secure Hash Algorithm (SHA) is used as the base message authentication code instead of MD5.
AH protects the entire contents of an IP datagram except for certain fields in the IP header (called mutable fields) ↃↃthat could normally be modified while the datagram is in transit.
For the purpose of calculating an integrity check value, the mutable fields are treated as if they contained all zeros. The integrity check value is carried in the AH header Ↄfield shown in Figure 12.12. AH can be applied in either of two modes: transport mode or tunnel mode. Figure 12.12 shows how the AH protocol operates on an original IP datagram in each of these two modes.
In transport mode, the original datagram’s IP header is the outermost IP header, followed by the AH header, and then the payload of the original IP datagram. The entire original datagram, as well as the AH Header itself, is authenticated and any change to any field (except for the mutable fields) can be detected. All information in the datagram is in clear text form, and is therefore subject to eavesdropping while in transit.
In tunnel mode, a new IP header is generated for use as the outer IP header of the resultant datagram. The source and destination addresses of the new header will generally differ from those used in the original header. The new header is then followed by the AH header, and then by the original datagram in its entirety, including both IP header and the original payload. The entire datagram (new IP Header, AH Header, IP Header, and IP payload) is protected by the AH protocol. Any change to any field (except the mutable fields) in the tunnel mode datagram can be detected.
ESP
The Encapsulating Security Payload provides data confidentiality (encryption), connectionless (per-packet) integrity, data origin authentication, and protection against replay. ESP always provides data confidentiality, and can optionally provide data origin authentication, data integrity checking, and replay protection. When comparing ESP to AH, it is clear that only ESP provides encryption, while either can provide authentication, integrity checking, and replay protection. ESP’s encryption uses a symmetric shared key: That is, a shared key is used by both parties for encrypting and decrypting the data that is exchanged between them.
When ESP is used to provide authentication functions, it uses the same HMAC algorithms (HMAC-MD5 or HMAC-SHA) as in the case of the AH protocol. However, the coverage is different, as shown in Figure 12.13.
In transport mode, ESP authentication functions protect only the original IP payload, but not the original IP header unlike in AH. ↃIn tunnel mode, ESP authentication protects the original IP header and the IP payload, but not the new IP header.
ESP can be applied in either of two modes: Transport mode or Tunnel mode. ↃIn Transport mode, the datagram’s original IP header is retained. Only the payload of the original IP datagram and the ESP trailer are encrypted. Note that the IP header itself is neither authenticated nor encrypted. Hence, the addressing information in the outer header is visible to an attacker while the datagram is in transit. In tunnel mode, a new IP header is generated. The entire original IP datagram (both IP header and IP payload) and the ESP trailer are encrypted. Because the original IP header is encrypted, its contents are not visible to an attacker while it is in transit. A common use of ESP tunnel mode is thus to hide internal address information while a datagram is tunneledↃ between two firewalls.
IKE
In any IPSec protocol based communication, a Security Association (SA) contains all the relevant information that communicating systems need in order to execute the IPSec protocols, such as AH or ESP. For example, an SA will identify the cryptographic algorithm to be used, the keying information, the identities of the participating parties, etc. ISAKMP defines a standardized framework to support negotiation of SAs, initial generation of all cryptographic keys, and subsequent refresh of these keys. Oakley is the mandatory key management protocol that is required to be used within the IKE framework. IKE supports automated negotiation of security associations, and automated generation and refresh of cryptographic keys. The ability to perform these functions with little or no manual configuration of machines is critical as a VPN grows in size. The secure exchange of keys is the most critical factor in establishing a secure communications environment—no matter how strong the authentication and encryption are, they are worthless if a key is compromised. Since the IKE procedures deal with initializing the keys, they must be capable of running over links where no security can be assumed to exist—that is, they are used to bootstrap Ↄthe IPSec protocols. Hence, the IKE protocols use the most complex and processor-intensive operations in the IPSec protocol suite.
IKE requires that all information exchanges must be both encrypted and authenticated: no one can eavesdrop on the keying material, and the keying material will be exchanged only amongst authenticated parties. In addition, the IKE methods have been designed with the explicit goals of providing protection against several well-known exposures:
After studying this chapter you should be able to describe, in basic terms, the concept of Voice over IP (VoIP) with reference to the following:
The protocols supporting VoIP, in particular Multicast IP, RTP, RTCP, RSVP and RTSP.
The H.323 standard, with reference to:
This chapter deals with the convergence of conventional PSTN networks and IP based internetworks, in particular the Internet. As a result of this convergence, Voice over IP is making major inroads into the telecommunications industry. This chapter will introduce the ITU-T H.232 standard for multimedia (audio, video and data) transmission.
H.323 requires IP as a Network layer protocol and TCP/UDP as Transport layer protocols. These protocols have already been covered in chapters 6 and 7.
The major advantage of VoIP over conventional PSTN usage is cost. It is not uncommon for a large company to recover the installation cost of VoIP equipment in less than a year. The following are a couple of implementations.
In most applications IP is used for unicasting, i.e. messages are delivered on a one-to-one basis. IP Multicasting delivers data on a one-to-many basis simultaneously. To support multicasting, the end stations must support the Internet Group Management Protocol, IGMP (RFC 1112), which enables multicast routers to determine which end stations on their attached subnets are members of multicast groups. IPv4 class D addresses ranging from 224.0.0.0 to 239.255.255.255 were originally reserved for multicasting.
In order to deliver the multicast datagrams, the routers involved have to use modified routing protocols, such as Distance Vector Multicast Routing Protocol (DVMRP) and Multicast Extensions to OSPF (MOSPF).
IP provides a connectionless (unreliable) service between two hosts. It does not confirm delivery at all, and cannot deliver packets on a real-time basis. It is up to a protocol at a higher level to determine whether the packets have arrived at their destination at all. There is an added complication in the case of real-time data such as voice and video, which require a deterministic type of delivery.
This problem is addressed by the real-time transport protocol (RTP), an Application layer protocol described in RFC1889, which provides end-to-end delivery services such as sequence numbering and stamping for data with real-time characteristics such as interactive audio and video. User applications typically run RTP on top of UDP to make use of its multiplexing and checksum services, but RTP may also be used with other suitable underlying network or transport protocols.
RTP itself does not provide any mechanism to ensure timely delivery nor does it provide other quality-of-service guarantees, but it relies on lower-layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable or that packets are delivered in the correct sequence. The sequence numbers included in RTP allow the receiver to reconstruct the sender’s packet sequence, but sequence numbers might also be used to determine the proper location of a packet, for example in video decoding, without necessarily decoding packets in sequence.
RTP consists of two parts, namely the Real-time Transport Protocol (RTP), which carries real-time data, and the RTP Control Protocol (RTCP), which monitors the quality of service and conveys information about the participants in a session.
The basic RTP specification is, on purpose, not complete. A complete specification of RTP for a particular application will require one or more companion documents such as (a) a profile specification document, which defines a set of payload type codes and their mapping to payload formats (e.g., media encoding), and (b) payload format specification documents, which define how a particular payload, such as an audio or video encoding, is to be carried within RTP.
The following RTP definitions are given in RFC 1889. It is necessary to understand them before trying to understand the RTP header.
The first section (12 bytes fixed) is present in all RTP headers. The second section is optional (and variable in length) and consists of CSRC identifiers inserted by a mixer. The third section is also optional and variable in length, and is used for experimental purposes.
Version (V): 2 bits. This field identifies the version of RTP. The current version is 2.
Padding (P): 1 bit. If the padding bit is set, the packet contains one or more additional padding bytes at the end which are not part of the payload. The last byte of the padding contains a count of how many padding bytes should be ignored. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.
Extension (X): 1 bit. If the extension bit is set, the fixed header is followed by exactly one header extension, with a format defined in RFC 1889 Section 5.3.1.
CSRC count (CC): 4 bits. The CSRC count contains the number of CSRC identifiers that follow the fixed header.
Marker (M): 1 bit. The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream.
Payload type (PT): 7 bits. This field identifies the format of the RTP payload and determines its interpretation by the application. A profile specifies a default static mapping of payload type codes to payload formats.
Sequence number: 16 bits. The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number is random (unpredictable) to make known plaintext attacks on encryption more difficult, even if the source itself does not encrypt, because the packets may flow through a translator that does.
Timestamp: 32 bits. The timestamp reflects the sampling instant of the first byte in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations. The resolution of the clock must be sufficient for the desired synchronization accuracy and for measuring packet arrival jitter (one tick per video frame is typically not sufficient). The clock frequency is dependent on the format of data carried as payload and is specified statically in the profile or payload format specification that defines the format, or may be specified dynamically for payload formats defined through non-RTP means. The initial value of the timestamp is random, as for the sequence number. Several consecutive RTP packets may have equal timestamps if they are (logically) generated at once, e.g., belong to the same video frame. Consecutive RTP packets may contain timestamps that are not monotonic if the data is not transmitted in the order it was sampled, as in the case of MPEG interpolated video frames.
SSRC: 32 bits. The SSRC field identifies the synchronization source. This identifier is chosen randomly, with the intent that no two synchronization sources within the same RTP session will have the same SSRC identifier.
CSRC list: 0 to 15 items, 32 bits each. The CSRC list identifies the contributing sources for the payload contained in this packet. The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 may be identified. CSRC identifiers are the SSRC identifiers of contributing sources. For example, for audio packets the SSRC identifiers of all sources that were mixed together to create a packet are listed, allowing correct talker indication at the receiver.
RTP header extension. An extension mechanism is provided to allow individual implementations to experiment with new payload-format-independent functions that require additional information to be carried in the RTP data packet header.
The Real-Time Control Protocol (RTCP) is based on the periodic transmission of control packets to all participants in the session, using the same distribution mechanism as the data packets. The underlying protocol must provide multiplexing of the data and control packets, for example using separate port numbers with UDP. RTCP performs the following functions.
The primary function is to provide feedback on the quality of the data distribution. This is an integral part of RTP’s role and is related to the flow and congestion control functions of other transport protocols. The feedback may be directly useful for control of adaptive encoding, but experiments with IP multicasting have shown that it is also critical to get feedback from the receivers to diagnose faults in the distribution. Sending reception feedback reports to all participants allows a person who is observing problems to evaluate whether those problems are local or global. With a distribution mechanism like IP multicast, it is also possible for an entity such as an ISP who is not otherwise involved in the session to receive the feedback information and act as a third-party monitor to diagnose network problems. This feedback function is performed by the RTCP sender and receiver reports and is described in RFC 1889 Section 6.3.
RTCP carries a persistent transport-level identifier for an RTP source called the ‘canonical name’ or CNAME. Since the SSRC identifier may change if a conflict is discovered or a program is restarted, receivers require the CNAME to keep track of each participant. Receivers also require the CNAME to associate multiple data streams from a given participant in a set of related RTP sessions, for example to synchronies audio and video.
The first two functions require that all participants send RTCP packets, therefore the rate must be controlled in order for RTP to scale up to a large number of participants. By having each participant send its control packets to all the others, each can independently observe the number of participants. This number is used to calculate the rate at which the packets are sent.
A fourth, optional function is to convey minimal session control information, for example participant identification to be displayed in the user interface. This is most likely to be useful in ‘loosely controlled’ sessions where participants enter and leave without membership control or parameter negotiation. RTCP serves as a convenient channel to reach all the participants, but it is not necessarily expected to support all the control communication requirements of an application.
This specification defines several RTCP packet types to carry a variety of control information, namely:
Each RTCP packet begins with a fixed part, similar to that of RTP data packets, followed by structured elements that may be of variable length according to the packet type but always end on a 32-bit boundary. The alignment requirement and a length field in the fixed part are included to make RTCP packets ‘stackable’. Multiple RTCP packets may be concatenated without any intervening separators to form a compound RTCP packet that is sent in a single packet of the lower layer protocol, for example UDP. There is no explicit count of individual RTCP packets in the compound packet since the lower layer protocols are expected to provide an overall length to determine the end of the compound packet.
For further details on RTCP refer to RFC 1889 and RFC 1890.
The Session Description Protocol, SDP (RFC 2327), is a session description protocol for multimedia sessions. A multimedia session, for these purposes, is defined as a set of media streams that exist for some duration of time. Media streams can be many-to-many and the times during which the session is active need not be continuous.
SDP is used for purposes such as session announcement or session invitation. On the Internet Multicast Backbone (MBone), SDP is used to advertise multimedia conferences and communicate the conference addresses and conference tool-specific information necessary for participation. The MBone is the part of the Internet that supports IP multicast, and thus permits efficient many-to-many communication. It is used extensively for multimedia conferencing.
SDP is purely a format for session description. It does not incorporate a transport protocol and has to use different transport protocols as appropriate, including the Session Announcement Protocol (SAP), Session Initiation Protocol (SIP), Real-Time Streaming Protocol (RTSP), email using the MIME extensions, and the Hypertext Transport Protocol (HTTP).
A common approach is for a client to announce a conference session by periodically multicasting an announcement packet to a well-known multicast address and port using the Session Announcement Protocol (SAP). An SAP packet consists of an SAP header with a text payload, which is an SDP session description no greater than 1 Kbytes in length. If announced by SAP, only one session announcement is permitted in a single packet.
Alternative means of conveying session descriptions include email and the World Wide Web. For both email and WWW distribution, the use of the MIME content type ‘application/sd’ is used. This enables the automatic launching of applications for participation in the session from the browser or mail reader.
SDP serves two primary purposes. It is a means to communicate the existence of a session, as well as a means to convey sufficient information to enable joining and participating in the session. In this regard SDP includes information such as the session name and purpose, the time(s) the session is active, the media comprising the session, information to receive those media (addresses, ports, formats etc.), information about the bandwidth to be used by the conference, and contact information for the person responsible for the session.
Detailed information regarding the media include the type of media (video, audio, etc), the transport protocol (RTP, UDP, IP, H.320, etc), as well as the format of the media (H.261 video, MPEG video, etc).
For an IP multicast session, the multicast address as well as the port number for media are supplied. This IP address and port number are the destination address and destination port of the multicast stream, whether being sent, received, or both. For an IP unicast session, the remote IP address for media as well as the port number for the contact address is supplied.
Sessions may either be bounded or unbounded in time. Whether or not they are bounded, they may be only active at specific times. With regard to timing information, SDP can convey an arbitrary list of start and stop times bounding the session. It can also include repeat times for each bound, such as ‘every Monday at 10am for one hour’. This timing information is globally consistent, irrespective of local time zone or daylight saving time.
The Real-Time Streaming Protocol (RTSP) is an Application layer protocol that establishes and controls either a single or several time-synchronized streams of continuous media such as audio and video. It normally does not deliver the continuous streams itself, although interleaving of the continuous media stream with the control stream is possible. In other words, RTSP acts as a ‘network remote control’ for multimedia servers.
There is no notion of an RTSP connection. Instead, a server maintains a session labeled by an identifier. An RTSP session is not tied to a transport-level connection such as a TCP connection. During an RTSP session, an RTSP client may open and close many reliable transport connections to the server to issue RTSP requests. Alternatively, it may use a connectionless transport protocol such as UDP.
The streams controlled by RTSP may use RTP, but the operation of RTSP does not depend on the transport mechanism used to carry continuous media. The protocol is intentionally similar in syntax and operation to HTTP version 1.1 so that extension mechanisms to HTTP can, in most cases, also be added to RTSP. However, RTSP differs fundamentally from HTTP/1.1in that data delivery takes place ‘out-of-band’ in a different protocol. HTTP is an asymmetric protocol where the client issues requests and the server responds. In RTSP, both the media client and media server can issue requests. RTSP requests are also not stateless; they may set parameters and continue to control a media stream long after the request has been acknowledged.
The need for a Quality of Service (QoS) on the Internet arose because of the increasing number of time-sensitive applications involving voice and video. RSVP (RFC 2205) is a resource reservation set-up protocol designed for that purpose. It is used by a host to request specific qualities of service from the network for particular application data streams or flows. It is also used by routers to deliver QoS requests to all nodes along the path(s) of the flows and to establish and maintain the requested service. RSVP requests will generally result in resources being reserved in each node along the data path.
RSVP requests resources for ‘simplex’ flows, i.e. it requests resources in only one direction. Therefore, RSVP treats a sender as logically distinct from a receiver, although the same application process may act as both a sender and a receiver at the same time. RSVP operates on top of IPv4 or IPv6, occupying the place of a transport protocol in the protocol stack. However, RSVP does not transport application data, but rather an Internet control protocol such as ICMP, IGMP, or routing protocols. Like the implementations of routing and management protocols, an implementation of RSVP will typically execute in the background and not in the data path.
RSVP is not itself a routing protocol. Instead, it is designed to operate with current and future unicast and multicast routing protocols. An RSVP process consults the local routing database(s) to obtain routes. In the multicast case, for example, a host sends IGMP messages to join a multicast group and then sends RSVP messages to reserve resources along the delivery path(s) of that group whereas routing protocols determine where packets get forwarded. RSVP, however, is only concerned with the QoS of those packets that are forwarded in accordance with routing decisions.
QoS is implemented for a particular data flow by mechanisms collectively called ‘traffic control’. These mechanisms include a packet classifier, admission control, and a packet scheduler or some other Data Link layer-dependent mechanism to determine when particular packets are forwarded. The packet classifier determines the QoS class (and perhaps the route) for each packet. For each outgoing interface, the packet scheduler or other Data Link layer-dependent mechanism achieves the promised QoS. Traffic control implements QoS service models defined by the Integrated Services Working Group.
RSVP does the following:
The RSVP message consists of 3 sections namely a Common Header (8 bytes), an Object Header (4 bits) and Object Contents (variable length).
The Version field (4 bits) contains the current version, which is 2.
The Flags field (4 bits) is reserved for future use.
The Message Type (1 byte) indicates 1 of 7 currently defined messages. They are:
The RSVP Checksum field (2 bytes) provides error control.
The Send_TTL field (1byte) is the IP Time-to-Live value with which the message was sent.
The RSVP length field (2 bytes) is the total length of the RSVP message in bytes.
Each Object consists of one or more 32 bit long words with a 1 byte Object Header. The Object contents are fairly complex and beyond the scope of this document and readers are referred to RFC 2205 for details.
The following sketch summarizes the protocol stack implementation. It shows a 20mS voice sample from a codec, preceded by an RTP, UDP, IP and Ethernet Header. Note that the drawing is not upside-down, but has been drawn according to the actual sequence in which the bytes are transmitted. The operation of the coders and decoders (codecs) are discussed in the following paragraphs.
When you have completed study of this chapter you should be able to:
Although people tend to refer to the ‘Internet’ as one global entity, there are in fact three clearly defined subsets of this global network. Four, in fact, if one wishes to include the so-called ‘community network’. It just depends on where the conceptual boundaries are drawn.
This expansion of the Internet into organizations, in fact right down to the factory floor, has opened the door to incredible opportunities. Unfortunately it has also opened the door to pirates and hackers. Therefore, as the use of the Internet has grown, so has the need for security. The TCP/IP protocols and network technologies are inherently designed to be open in order to allow interoperability. Therefore, unless proper precautions are taken, data can readily be intercepted and altered – often without the sending or the receiving party being aware of the security breach. Because dedicated links between the parties in a communication are often not established in advance, it is easy for hackers to impersonate one of the parties involved.
There is a misconception that attacks on a network will always take place from the outside. This is as true of networks as it is true of governments. The growth in network size and complexity has increased the potential points of attack both from outside and from within.
Without going into too much detail, the following list attempts to give an idea of the magnitude of the threat experienced by intranets and extranets:
Here follows a list of some additional threats:
These are not merely theoretical concerns. While computer hackers breaking into corporate computer systems over the Internet have received a great deal of press in recent years, in reality insiders such as employees, former employees, contractors working on- site and other suppliers are far more likely to attack their own company’s computer systems via an intranet. In a 1998 survey of 520 security practitioners in US corporations and other institutions conducted by the Computer Security Institute (CSI) with the participation of the FBI, 44 per cent reported unauthorized access by employees compared with 24 per cent reporting system penetration from the outside!
Such insider security breaches are likely to result in greater losses than attacks from the outside. Of the organizations that were able to quantify their losses, the CSI survey found that the most serious financial losses occurred through unauthorized access by insiders, with 18 companies reporting total losses of $51 million as compared with $86 million for the remaining 223 companies. The following list gives the average losses from various types of attacks as per the CSI/FBI Survey:
Fortunately technology has kept up with the problem, and the rest of this chapter will deal with possible solutions to the threat. Keep in mind that securing a network is a continuous process, not a one-time prescription drug that can be bought over the counter.
Also, remember that the most sensible approach is a defense-in-depth (‘belt-and-braces’) approach as used by the nuclear industry. In other words, one should not rely on a single approach, but rather a combination of measures with varying levels of complexity and cost.
An organization whose LAN (or intranet) is not routed to the Internet mainly faces internal threats to its network. In order to allow access only to authorized personnel, authentication is often performed by means of passwords. A password, however, is mainly used to ‘keep the good guys out’ since it is usually very easy to figure out someone’s password, or to capture un-encrypted passwords with a protocol analyzer as they travel across the network.
To provide proper authentication, two or three items from the following list are required.
There are several password authentication protocols in use, such as the Password Authentication Protocol (PAP) and the Challenge Handshake Authentication protocol (CHAP). However, the most secure method of authentication at this point in time is IEEE 802.1X port-based access control in conjunction with a RADIUS server. This is the case for wireless as well as wired access.
A router can be used as a simple firewall that connects the intranet to the ‘outside world’. Despite the fact that its primary purpose is to route packets, it can also assist in protecting the intranet.
In comparison to firewalls, routers are extremely simple devices and are clearly not as effective in properly securing a network perimeter access point. However, despite their lack of sophistication, there is much that can be done with routers to improve security on a network. In many cases these changes involve little administrative overhead.
There are two broad objectives in securing a router, namely:
The following approaches can be taken:
Routers can be used to block unwanted traffic and therefore act as a first line of defense against unwanted network traffic, thereby performing basic firewall functions. It must, however, be kept in mind that they were developed for a different purpose, namely routing, and that their ability to assist in protecting the network is just an additional advantage. Routers, however sophisticated, generally do not make particularly intricate decisions about the content or source of a data packet. For this reason network managers have to revert to dedicated firewalls.
Firewalls can be one of the following types:
We will review each of these types in detail.
Packet Filter firewalls are essentially routing devices that include access control functionality for system addresses and communication sessions. The access control functionality of a Packet Filter firewall is governed by a set of directives collectively referred to as a rule-set. A typical rule-set is shown in table 6.1 below.
S.No | Source address | Source port | Destination address | Destination port | Action | Description |
---|---|---|---|---|---|---|
1 | Any | Any | 192.168.1.0 | >1023 | Allow | Rule to allow returning TCP connections to internal subnet |
2 | 192.168.1.1 | Any | Any | Any | Deny | Prevent Firewall system itself from directly connecting to anything |
3 | Any | Any | 192.168.1.1 | Any | Deny | Prevent external users from directly accessing the Firewall system. |
4 | 192.168.1.0 | Any | Any | Any | Allow | Internal users can access external servers |
5 | Any | Any | 192.168.1.2 | SMTP | Allow | Allow external users to send email in |
6 | Any | Any | 192.168.1.3 | HTTP | Allow | Allow external Users to access Internet |
7 | Any | Any | Any | Any | Deny | “Catch-all” rule – everything not previously allowed is explicitly denied |
Table 14.2
Typical rule-set of a Packet Filter firewallIn their most basic form, packet firewalls provide network access control based upon several pieces of information contained in a network packet:
Packet Filter firewalls are generally deployed within TCP/IP network infrastructures. In the context of modern network infrastructures, a firewall at layer 2 is used in load balancing and/or high-availability applications in which two or more firewalls are employed to increase throughput or for fail-safe operations. Packet Filtering firewalls and routers can also filter network traffic based upon certain characteristics of that traffic, such as whether the packet’s layer 3 protocol is ICMP. Attackers have used this protocol to flood networks with traffic, thereby creating Denial-Of-Service (DOS) attacks. Packet Filter firewalls also have the capability to block other attacks that take advantage of weaknesses in the TCP/IP suite.
Packet Filter firewalls have two main strengths: speed and flexibility. Since Packet Filters do not usually examine data above layer 3 of the OSI model, they can operate very fast. Likewise, since most modern network protocols can be accommodated using layer 3 and below, Packet Filter firewalls can be used to secure nearly any type of network communication or protocol. This simplicity allows Packet Filter firewalls to be deployed into nearly any enterprise network infrastructure. Their speed and flexibility, as well as their capability to block denial-of-service and related attacks, makes them ideal for placement at the outermost boundary of the trusted network. From that position it can block certain attacks, possibly filter unwanted protocols, perform simple access control, and then pass the traffic on to other firewalls that examine higher layers of the OSI stack.
Packet Filter firewalls also have a few shortcomings:
Packet Filter type firewalls are therefore limited to applications where logging and user authentication are not important issues, but where high speed is essential.
Stateful Inspection firewalls are packet filters that incorporate added awareness of layer 4 (OSI model) protocols. Stateful Inspection evolved from the need to accommodate certain features of the TCP/IP protocol suite that make firewall deployment difficult. When a TCP (connection-oriented) application creates a session with a remote host system, a port is also created on the source system for the purpose of receiving network traffic from the destination system. According to the TCP specifications, this client source port will be some number greater than 1023 and less than 49151. According to convention, the destination port on the remote host will likely be a low-numbered port, less than 1024. This will be 25 for SMTP, for example. Packet Filter firewalls must permit inbound network traffic on all of these high-numbered ports for connection-oriented transport to occur, i.e., to allow return packets from the destination system. Opening this many ports creates an immense risk of intrusion by unauthorized users who may employ a variety of techniques to abuse the expected conventions.
Stateful Inspection firewalls solve this problem by creating a directory of outbound TCP connections, along with each session’s corresponding high-numbered client port. This state table is then used to validate any inbound traffic. The Stateful Inspection solution is more secure because the firewall tracks client ports individually rather than opening all high-numbered ports for external access. An example of such a state table is shown below in table 14.3.
Source address | Source port | Destination address | Destination port | Connection state |
---|---|---|---|---|
192.168.1.100 | 1030 | 210.9.88.29 | 80 | Established |
192.168.1.102 | 1031 | 216.32.42.123 | 80 | Established |
192.168.1.101 | 1033 | 173.66.32.122 | 25 | Established |
192.168.1.106 | 1035 | 177.231.32.12 | 79 | Established |
223.43.21.231 | 1990 | 192.168.1.6 | 80 | Established |
219.22.123.32 | 2112 | 192.168.1.6 | 80 | Established |
210.99.212.18 | 3321 | 192.168.1.6 | 80 | Established |
24.102.32.23 | 1025 | 192.168.1.6 | 80 | Established |
223.212.212 | 1046 | 192.168.1.6 | 80 | Established |
Table 14.3
State table of a Stateful Inspection firewallA Stateful Inspection firewall also differs from a Packet Filter firewall in that Stateful Inspection is useful or applicable only within TCP/IP network infrastructures. Stateful Inspection firewalls can accommodate other network protocols in the same manner as Packet Filters, but the actual Stateful Inspection technology is relevant only to TCP/IP.
Application-Proxy Gateway firewalls are advanced firewalls that combine lower layer access control with upper layer (OSI layer 7) functionality. These firewalls do not require a layer 3 (Network layer) route between the inside and outside interfaces of the firewall; the firewall software performs the routing. In the event of the Application-Proxy Gateway software ceasing to function, the firewall system will be unable to pass network packets through the firewall system, since all network packets that traverse the firewall must do so under software control.
Each individual application proxy, also referred to as a proxy agent, interfaces directly with the firewall access control rule set to determine whether a given piece of network traffic should be permitted to transit the firewall. In addition to the rule set, each proxy agent has the ability to require authentication of each individual network user.
This user authentication can take many forms, including the following:
The advantages of Application-Proxy Gateway firewalls are as follows:
There are some disadvantages too. To inspect the entire content of the packet takes time. This makes the firewall slower and not well suited to high-bandwidth applications. New network applications and protocols are also not well supported in this type of firewall, as each type of network traffic handled by the firewall requires an individual proxy agent.
In order not to compromise the speed of the Application Proxy Firewall, it is customary to have the proxy capability and firewall function segregated by using a dedicated proxy server, so that firewall function can be performed faster.
Dedicated proxy servers are typically deployed behind traditional firewalls. In typical use a main firewall might accept inbound traffic, determine which application is being targeted, and then hand off the traffic to the appropriate proxy server. An example of this would be an HTTP proxy deployed behind the firewall; users would need to connect to this proxy en route to connecting to external web servers. Typically, dedicated proxy servers are used to decrease the workload on the firewall and to perform more specialized filtering and logging that otherwise might be difficult to perform on the firewall itself. Dedicated proxies allow an organization to enforce user authentication requirements as well as other filtering and logging on any traffic that traverses the proxy server. The implications are that an organization can restrict outbound traffic to certain locations or could examine all outbound e-mail for viruses or restrict internal users from writing to the organization’s web server. At the same time, filtering outbound traffic would place a heavier load on the firewall and increase administration costs.
Recent advances in network infrastructure engineering and information security have resulted in current firewall products that incorporate functionalities from several different classifications of firewall platforms. For example, many application-proxy gateway firewall vendors have implemented basic Packet Filter functionality in order to provide better support for UDP based applications.
Likewise, many Packet Filter or Stateful Inspection Packet Filter firewall vendors have implemented basic application-proxy functionality to offset some of the weaknesses associated with their firewall platforms. In most cases, Packet Filter or Stateful Inspection Packet Filter firewall vendors implement application proxies to provide improved network traffic logging and user authentication in their firewalls.
NAT (Network Address Translation) is the method of hiding the internal address scheme of a network behind a firewall. In essence, NAT allows an organization to deploy an addressing scheme of its choosing behind a firewall, while still maintaining the ability to connect to external resources through the firewall. Another advantage of NAT is the mapping of non-routable (private) IP addresses to a smaller set of legitimate addresses, which is useful in the current scenario of IP address depletion.
There are two types of address translations that are possible viz. ‘static’ and ‘hiding’. In static network address translation each internal system on the private network has a corresponding external, routable IP address associated with it. This particular technique is seldom used, due to the scarcity of available IP address resources. By using this method, an external system could access an internal web server of which the address has been mapped with static network address translation. The firewall would perform mappings in either direction, outbound or inbound. Table 14.4 below shows a typical example of translated addresses.
Internal (RFC 1918) IP address | External (globally routable) IP address |
---|---|
192.168.1.100 | 207.119.32.81 |
192.168.1.101 | 207.119.32.82 |
192.168.1.102 | 207.119.32.83 |
192.168.1.103 | 207.119.32.84 |
192.168.1.104 | 207.119.32.85 |
192.168.1.105 | 207.119.32.86 |
192.168.1.106 | 207.119.32.87 |
192.168.1.107 | 207.119.32.88 |
192.168.1.108 | 207.119.32.89 |
192.168.1.109 | 207.119.32.90 |
Table 14.4
Example of static NATWith ‘hiding’ NAT, all systems behind a firewall share the same external, routable IP address. Thus, with a hiding NAT system, five thousand hosts behind a firewall will still look like only one host to the outside world. This type of NAT is fairly common, but it has one glaring weakness in that it is not possible to make resources available to external users once they are placed behind a firewall. Mapping in reverse from outside systems to internal systems is impossible; therefore systems that must be accessible to external systems must not have their addresses mapped statically.
This is another method of address protection. PAT works by using the client port address to identify inbound connections. For example, if a system behind a firewall employing PAT were to TELNET out to a system on the Internet, the external system would see a connection from the firewall’s external interface, along with the client source port. When the external system replies to the network connection, it would use the above addressing information. When the PAT firewall receives the response, it would look at the client source port provided by the remote system. Based on that source port, it would determine which internal system requested the session.
There are two advantages to this approach. First, PAT is not required to use the IP address of the external firewall interface for all network traffic; another address can be created for this purpose. Second, with PAT it is possible to place resources behind a firewall system and still make them selectively accessible to external users.
Firewall applications are available as a part of some operating systems such as Linux or Windows XP or, in some cases, as add-ons. They can be used to secure an individual host only. These are known as host-based firewalls.
Host-based firewall applications typically provide access-control capability for restricting traffic to and from servers running on the host, and there is usually some limited logging available. While a host-based firewall is less desirable for high-traffic, high-security environments, in internal network environments or regional offices they offer greater security, usually at a lower cost. Some host-based firewalls must be administered separately, and after a certain number of installations it becomes easier and less expensive to simply place all servers behind a dedicated firewall configuration.
Home users dialing an ISP may have little firewall protection available to them because the ISP has to accommodate many different security policies. Many of these users are actually using their computers for remote connectivity with their Enterprise networks. Personal firewalls have been developed to fulfill the needs of such users and provide protection for remote systems by performing many of the same functions as larger firewalls.
Two possible configurations are usually employed. One is a Personal Firewall, which is software-based and installed on the system it is meant to protect. This type of firewall does not offer protection to other systems or resources, but only protects the computer system they are installed on.
The second configuration is called a Personal Firewall Appliance, which is in concept more similar to that of a traditional firewall. In most cases, personal firewall appliances are designed to protect small networks such as those in home offices.
They run on specialized hardware and are usually integrated with some other additional components such as a:
Although personal firewalls and personal firewall appliances lack some of the advanced, enterprise-scale features of traditional firewall platforms, they can still form an effective component of the overall security posture of an organization. In terms of deployment strategies, personal firewalls and personal firewall appliances normally address the connectivity concerns associated with remote users or branch offices.
However, some organizations employ these devices on the organizational intranet. Personal firewalls and personal firewall appliances can also be used to terminate VPNs. Many vendors currently offering firewall-based VPN termination offer a personal firewall client as well. Personal firewalls and firewall appliances used by an Enterprise on remote users’ machines pose special problems. While the Enterprise will normally like to extend the firewall policies adopted for its internal network to remote users as well, many remote users who use their computers to connect to an ISP for non-work related use would like to implement a different set of policies. The best solution is, of course, to use separate laptops or desktop computers for work-related and non-work related use; in which case the systems used for work-related tasks can be configured to connect to the enterprise network only.
The following simple rules are applicable in building firewalls in enterprise networks.
A simple design minimizes errors and is easier to manage; Complex designs may lead to configuration errors and are difficult to manage.
It is better to use devices that are meant to be used as firewalls rather than using another device to perform this as an additional function. For example, while a router may have the capacity to function as a firewall, it is better to install a firewall server or appliance designed for this task alone.
Layers of security are better than a single layer so that if there is a breach in one layer, the enterprise network does not become totally exposed and vulnerable.
Sensitive systems, such as those relating to enterprise finances, should be protected by their own firewalls so that they will not be broken into by internal users who work behind the enterprise firewalls.
Intrusion detection is a new technology that enables network and security administrators to detect patterns of misuse within the context of their network traffic. IDS is a growing field and there are several excellent intrusion detection systems available today, not just traffic monitoring devices.
These systems are capable of centralized configuration management, alarm reporting, and attack information logging from many remote IDS sensors. IDSs are intended to be used in conjunction with firewalls and other filtering devices, not as the only defense against attacks.
Intrusion detection systems are implemented as host-based systems and network-based systems.
Host-based IDSs use information from the operating system audit records to watch all operations occurring on the host on which the intrusion detection software has been installed. These operations are then compared with a pre-defined security policy. This analysis of the audit trail, however, imposes potentially significant overhead requirements on the system because of the increased amount of processing power required by the intrusion detection software. Depending on the size of the audit trail and the processing power of the system, the review of audit data could result in the loss of a real-time analysis capability.
Network-based intrusion detection is performed by dedicated devices (probes) attached to the network at several points and passively monitor network activity for indications of attacks. Network monitoring offers several advantages over host-based intrusion detection systems. Because intrusions might occur at many possible points over a network, this technique is an excellent method of detecting attacks which may be missed by host-based intrusion detection mechanisms.
The greatest advantage of network monitoring mechanisms is their independence from reliance on audit data (logs). Because these methods do not require input from any operating system’s audit trail, they can use standard network protocols to monitor heterogeneous sets of operating systems and hosts.
Independence from audit trails also frees network monitoring systems from an inherent weakness caused by the vulnerability of the audit trail to attack. Intruder actions that interfere with audit functions or modify audit data can lead to the prevention of intrusion detection or the inability to identify the nature of an attack. Network monitors are able to avoid attracting the attention of intruders by passively observing network activity and reporting unusual occurrences.
Another significant advantage of detecting intrusions without relying on audit data is the improvement of system performance resulting from the removal of the overhead imposed by the audit trail analysis. Techniques that move audit data across network connections reduce the bandwidth available to other functions.
Certification is the process of proving that the performance of a particular piece of equipment conforms to the existing policies and specifications. Whereas this is easy in the case of electrical wiring and wall sockets, where Underwriters’ Laboratory can certify the product, it is a different case with networks where no official bodies or guidelines exist.
In certifying a network security solution, there are only two options namely trusting someone else’s assumptions about one’s network, or certifying it oneself. It certainly is possible to certify a network by oneself. This exercise will demand some time but will leave the certifier with a deeper knowledge of how the system operates.
The following are needed for self-certification:
To simplify this discussion, we will assume we are certifying a firewall configuration. Let us look at each requirement individually.
One of the biggest weaknesses in security practice is the large number of cases in which a formal vulnerability analysis finds a hole that simply cannot be fixed. Often the causes are a combination of existing network conditions, office politics, budgetary constraints, or lack of management support. Regardless of who is doing the analysis, management needs to clear up the political or budgetary obstacles that might prevent implementation of security.
In this case ‘policy’ means the access control rules that the network security product is intended to enforce. In the case of the firewall, the policy should list:
Many firewalls expose details of TCP/IP application behavior to the end user. Unfortunately, there have been cases where individuals bought firewalls and took advantage of the firewall’s easy ‘point and click’ interface, believing they were safe because they had a firewall. One needs to understand how each service to be allowed in and out operates in order to make an informed decision about whether or not to permit it.
When starting to certify components of a system, one will need to research existing holes in the version of the components to be deployed. The Internet with its search engines is an invaluable tool for finding vendor-provided information about vulnerabilities, hacker-provided information about vulnerabilities, and wild rumors that are totally inaccurate. Once the certification process has been deployed, researching the components will be a periodic maintenance effort.
Research takes time. Management needs to support this and invest in the time necessary to do the job right. Depending on the size or complexity of the security system in question, one could be looking at anything between a day’s work and several weeks.
The ultimate reason for having security policies is to save money.
This is accomplished by:
In the process of developing a corporate security consciousness one will, amongst other things, have to:
The corporate security policies are not limited to minimizing the possibility on internal and external intrusions, but also to:
The topics covered in the security policy document should include:
In the process of implementing security policies one need not re-invent the wheel. Security policy implementation guidelines are available on the Internet, hard copy and CD-ROM. By using a word processing package one can generate or update a professional policy statement in a couple of days.
There are several security advisory services available to systems administrators, for example CSI, CERT and the vendors (e.g. Microsoft) themselves.
CERT (the Computer Emergency Response Team) co-ordination center is based at the Carnegie Mellon Software Engineering Institute and offers a security advisory service on the Internet. Their services include CERT advisories, incident notes, vulnerability notes and security improvement modules. The latter includes topics such as detecting signs of intrusions, security for public web sites, security for information technology service contracts, securing desktop stations, preparing to detect signs of intrusion, responding to intrusions and securing network services. These modules can be downloaded from the Internet in PDF or PostScript versions and are written for system and network administrators within an organization. These are the people whose day-to-day activities include installation, configuration and maintenance of the computers and networks.
A particular case in point is the CERT report on the Melissa virus (CA-99-04-MELISSA-MICRO-VIRUS.HTML) dated March 27, 1999 which dealt with the Melissa virus which was first reported at approximately 2:00 pm GMT-5 on Friday, 26 March 1999. This indicates the swiftness with which organizations such as CERT react to threats.
CSI (the Computer Security Institute) is a membership organization dedicated to serving and training information computer and network security professionals. CSI also hosts seminars on encryption, intrusion, management, firewalls and awareness. They publish surveys and reports on topics such as computer crime and information security program assessment.
The concept of securing messages through cryptography has a long history. Indeed, Julius Caesar is credited with creating one of the earliest cryptographic systems to send military messages to his generals.
Throughout history there has been one central problem limiting widespread use of cryptography. That problem is key management. In cryptographic systems the term ‘key’ refers to a numerical value used by an algorithm to alter information, making that information secure and visible only to individuals who have the corresponding key to recover the information. Consequently, the term key management refers to the secure administration of keys in order to provide them to users where and when required.
Historically, encryption systems used what is known as symmetric cryptography. Symmetric cryptography uses the same key for both encryption and decryption. Using symmetric cryptography, it is safe to send encrypted messages without fear of interception, because an interceptor is unlikely to be able to decipher the message. However, there always remains the difficult problem of how to securely transfer the key to the recipients of a message so that they can decrypt the message.
A major advance in cryptography occurred with the invention of public-key cryptography. The primary feature of public-key cryptography is that it removes the need to use the same key for encryption and decryption. With public key cryptography, keys come in pairs of matched public and private keys. The public portion of the key pair can be distributed openly without compromising the private portion, which must be kept secret by its owner. Encryption done with the public key can only be undone with the corresponding private key.
Prior to the invention of public key cryptography it was essentially impossible to provide key management for large-scale networks. With symmetric cryptography, as the number of users increases on a network, the number of keys required to provide secure communications among those users increases rapidly. For example, a network of 100 users would require almost 5000 keys if only symmetric cryptography was used. Doubling such a network to 200 users increases the number of keys to almost 20 000. When only using symmetric cryptography, key management quickly becomes unwieldy even for relatively small networks.
The invention of public key cryptography was of central importance to the field of cryptography and provided answers to many key management problems for large-scale networks. For all its benefits, however, public-key cryptography did not provide a comprehensive solution to the key management problem. Indeed, the possibilities brought forth by public-key cryptography heightened the need for sophisticated key management systems to answer questions such as the following:
The next section provides an introduction to the mechanics of encryption and digital signatures.
To better understand how cryptography is used to secure electronic communications, a good everyday analogy is the process of writing and sending a check to a bank.
Remember that both the client and the bank are in possession of matching private key/public key sets. The private keys need to be guarded closely, but the public keys can be safely transmitted across the Internet since all it can do is unlock a message locked (encrypted) with its matching private key. Apart from that it is pretty useless to anybody else.
The simplest electronic version of the check can be a text file, created with a word processor, asking a bank to pay someone a specific sum. However, sending this check over an electronic network poses several security problems:
To address these issues, the verification software performs a number of steps hidden behind a simple user interface. The first step is to ‘sign’ the check with a digital signature.
The process of digitally signing starts by taking a mathematical summary (called a hash code) of the check. This hash code is a uniquely identifying digital fingerprint of the check. If even a single bit of the check changes, the hash code will dramatically change.
The next step in creating a digital signature is to sign the hash code with the sender’s private key. This signed hash code is then appended to the check.
This acts like a signature since the recipient (in this case the bank) can verify the hash code sent to it, using the sender’s public key. At the same time, a new hash code can be created from the received check and compared with the original signed hash code. If the hash codes match, then the bank has verified that the check has not been altered. The bank also knows that only the genuine originator could have sent the check because only he has the private key that signed the original hash code.
Once the electronic check is digitally signed, it can be encrypted using a high-speed mathematical transformation with a key that will be used later to decrypt the document. This is often referred to as a symmetric key system because the same key is used at both ends of the process.
As the check is sent over the network, it is unreadable without the key, and hence cannot be intercepted. The next challenge is to securely deliver the symmetric key to the bank.
Public-key encryption is used to solve the problem of delivering the symmetric encryption key to the bank in a secure manner. To do so, the sender would encrypt the symmetric key using the bank’s public key. Since only the bank has the corresponding private key, only the bank will be able to recover the symmetric key and decrypt the check.
The reason for this combination of public-key and symmetric cryptography is simple. Public-key cryptography is relatively slow and is only suitable for encrypting small amounts of information, such as symmetric keys. Symmetric cryptography is much faster and is suitable for encrypting large amounts of information such as files.
Organizations must not only develop sound security measures; they must also find a way to ensure consistent compliance with them. If users find security measures cumbersome and time consuming to use, they are likely to find ways to circumvent them, thereby putting the company’s Intranet at risk.
Organizations can ensure the consistent compliance to their security policy through:
Imagine a company that wants to conduct business electronically, exchanging quotes and purchase orders with business partners over the Internet.
Parties exchanging sensitive information over the Internet should always digitally sign communications so that the sender can securely identify themselves, assuring business partners that the purchase order really came from the party claiming to have sent it (providing a source authentication service), and a trusted third party cannot alter the purchase orders to request hypodermic needles instead of sewing needles (data integrity), If the company is also concerned about keeping particulars of their business private, they may also choose to encrypt these communications (confidentiality).
The most convenient way to secure communications on the Internet is to employ public-key cryptography techniques. But before doing so, the user will need to find and verify the public keys of the party with whom he or she wishes to communicate. This is where a PKI comes into the picture.
A successful public key infrastructure needs to perform the following:
Let us now look at each of these in turn.
Deploying a successful public key infrastructure requires looking beyond technology. As one might imagine, when deploying a full-scale PKI system, there may be dozens or hundreds of servers and routers, as well as thousands or tens of thousands of users with certificates. These certificates form the basis of trust and interoperability for the entire network. As a result the quality, integrity, and trustworthiness of a PKI depend on the technology, infrastructure, and practices of the CA that issues and manages these certificates.
CAs have several important duties. First and foremost, they must determine the policies and procedures that govern the use of certificates throughout the system.
The CA is a ‘trusted third party’, similar to a passport office, and its duties include:
Since the quality, efficiency and integrity of any PKI depends on the CA, the trustworthiness of the CA must be beyond reproach.
On the one end of the spectrum, certain users prefer one centralized CA that controls all certificates. Whilst this would be the ideal case, the actual implementation would be a mammoth task.
At the other end of the spectrum, some parties elect not to employ a CA for signing certificates at all. With no CAs, the individual parties are responsible for signing each other’s certificates. If a certificate is signed by the user or by another party trusted by the user, then the certificate can be considered valid. This is sometimes called a ‘web of trust’ certification model. This is the model popularized by the PGP (Pretty Good Privacy) encryption product.
Somewhere in the middle ground lies a hybrid approach that relies on independent CAs as well as peer-to-peer certification. In such an approach a business may act as its own CA, issuing certificates for its employees and trading partners. Alternatively, trading partners may agree to honor certificates signed by trusted third party CAs. This decentralized model most closely mimics today’s typical business relationships, and it is likely to be the way PKIs will mature.
Building a public key infrastructure is not an easy task. There are a lot of technical details to address, but the concept behind an effective PKI is quite simple: a PKI provides the support elements necessary to enable the use of public key cryptography. One thing is certain: the PKI will eventually, whether directly or indirectly, reach every Internet user.
E-commerce transactions don’t always involve parties that share a previously established relationship. For this reason a PKI provides a means for retrieving certificates. If provided with the identity of the person of interest, the PKI’s directory service will provide the certificate. If the validity of a certificate needs to be verified, the PKI’s certificate directory can also provide the means for obtaining the signer’s certificate.
Occasionally, certificates must be taken out of circulation or revoked. After a period of time a certificate will expire. In other cases, an employee may leave the company or individuals may suspect that their private key has been compromised. In such circumstances simply waiting for a certificate to expire is not the best option, but it is nearly impossible to physically recall all possible copies of a certificate already in circulation. To address this problem, CAs publish certificate revocation lists (CRLs) and compromised key lists (KRLs).
The true value of a PKI is that it provides all the pieces necessary to verify certificates. The certification process links public keys to individual entities, directories supply certificates as needed, and revocation mechanisms help ensure that expired or untrustworthy certificates are not used.
Certificates are verified where they are used, placing responsibility on all PKI elements to keep current copies of all relevant CRLs and KRLs. On-line Certificate Status Protocol (OCSP) servers may take on CRL/KRL tracking responsibilities and perform verification duties when requested.
When you have completed study of this chapter you should be able to:
In the past, Supervisory Control And Data Acquisition (SCADA) functions were primarily performed by dedicated computer-based SCADA systems. Whereas these systems still do exist and are widely used in industry, the SCADA functions can increasingly be performed by TCP/IP/Ethernet-based systems. The advantage of the latter approach is that the system is open; hence hardware and software components from various vendors can be seamlessly and easily integrated to perform control and data acquisition functions.
One of the most far-reaching implications of the Internet type approach is that plants can be controlled and/or monitored from anywhere in the world, using the technologies discussed in this chapter.
Stand-alone SCADA systems are still being marketed. However, many SCADA vendors are now also manufacturing Internet compatible SCADA systems that can be integrated into an existing TCP/IP/Ethernet plant automation system.
Traditionally, automation systems have implemented networking in a hierarchical fashion, with different techniques used for the so-called ‘enterprise’, ‘control’ and ‘device’ layers.
Interfaces between ‘levels’ as well as between different types of network at the same level require intricate data collection and application gateway techniques. The task of configuring these devices, in addition to configuring the PLCs and enterprise layer computers, provides much scope for confusion and delays. In addition to this, the need for different network hardware and maintenance techniques in the three levels complicates spares holding and technician training. In order to overcome this problem, there is a growing tendency to use a single set of networking techniques (such as Ethernet and TCP/IP), to communicate at and between all three levels.
At the enterprise layer the networking infrastructure is primarily used to transfer large units of information on an irregular basis. Examples are sending email messages, downloading web pages, making ad-hoc SQL queries, printing documents, and fetching computer programs from file servers. Deterministic and/or real-time operation is not required here.
A particular problem area is the two lower levels, where there is often an attempt to mix routine scanning of data values with on-demand signaling of alarm conditions, along with transfer of large items such as control device programs, batch reports and process recipes. There are many networks used here, such as ProfiBus (DP and FMS), FIP, Modbus Plus, DeviceNet, ControlNet and Foundation Fieldbus (H-1 and HSE). Even worse, the design characteristics of each are sufficiently different to make seamless interconnection very difficult. All these networks have their own techniques for addressing, error checking, statistics gathering, and configuration. This imposes complications even when the underlying data itself is handled in a consistent way.
One technique commonly used to offset this problem is to divide the information available at each layer into ‘domains’, and make the devices interconnecting these domains responsible for ‘translating’ requests for information. As an example, the PLC might use its device bus to scan raw input values, and then make a subset of them available as ‘data points’ on the ‘control bus’. Similarly, a cell control computer or operator station might scan data points from its various ‘control bus’ segments and make selected data available in response to queries on the ‘enterprise’ network.
Although these techniques can be made to work, they have a number of significant disadvantages:
A typical MTBF (Mean Time Between Failures) of general-purpose computer systems is 50 000 hours for hardware and 14 000 hours for software. A typical MTBF for PLC systems are 100 000 hours. At these rates, a plant with 100 PLCs or computers would expect to experience about one failure requiring hardware replacement PER MONTH. Losing a number of hours’ production each month due to hardware problems is an untenable situation. That is why automation vendors consider the ability to reinstall and restart a PLC or control system from virgin hardware in a rapid and reliable way to be mandatory.
It is widely recognized that the traditional hierarchical structure of factory automation systems can be replaced with a single network, using the Internet as a model. In this model, all stations can conceivably intercommunicate and all stations can also communicate to the outside world via a firewall. Such a network would obviously be segmented for performance and security. The traditional computer-based gateways separating the three layers (enterprise, device, control) can now be replaced with off-the-shelf bridges, switches and routers with a high degree of reliability.
One of the challenges in designing such a network-based solution is the choice of a common interconnection technology. It should ideally be universal and vendor neutral and inexpensive to deploy in terms of hardware, cabling and training time. It should facilitate integration of all equipment within the plant; it should be simple to understand and configure; and it should be scalable in performance to support the future growth of the network.
Five specific areas have to be addressed in order to enable the implementation of fully open (or ‘transparent’) control systems architecture for modern day factories. They are:
The ideal choice here is a TCP/IP. This allows integration with the Internet, enabling access to the plant on a global basis. TCP/IP is a military standard protocol suite, originally designed for end-to-end connection oriented control over long-haul networks. It is tolerant of wide speed variations and poor quality point-to-point links, and is compatible with the firewalls and proxy servers required for network security. It can operate across the Internet as well as across WANs, and is easy to troubleshoot with standard TCP/IP utilities and freeware protocol analyzers.
Considering all the advantages of TCP/IP, the overhead of 40 bytes per packet is a small price to pay.
One solution for a ‘generic’ implementation of the Application layer is Modbus, a vendor neutral data representation protocol. This is the path followed by the Modbus-IDA Group in Europe and North America (see www.modbus-ida.org and www2.modbus-ida.org). Modbus has been referred to as ‘every vendor’s second choice but every integrator’s first choice’. Reasons for this include the fact that the specification is open and freely downloadable, and that the minimal implementation involves only two messages. It is easy to adapt existing serial interface software for Modbus, and also very easy to perform automatic protocol translation.
Although most implementations of the Modbus protocol are used on low-speed point-to-point serial links or twisted pair multidrop networks, it has been adapted successfully for radio, microwave, public switched telephone, infra-red, and almost any other communication mechanism conceivable. Despite the simplicity of Modbus it performs its intended function extremely well. Later in this chapter we will show how Modbus has been adapted to run over TCP.
Alternatively, vendors can modify their existing upper layer stacks to run over TCP/IP. A good example is the ODVA’s DeviceNet, which has been modified to run over TCP/IP and Ethernet and marketed under the name Ethernet/IP. The ‘IP’ or ‘Industrial Protocol” in this case refers to the Application layer protocol, namely the Control and Information Protocol or CIP. CIP has been left untouched, so the operators see no difference.
We will look at two solutions to this problem, viz. (a) the use of embedded web servers and (b) the use of OPC.
Once all computers and control devices are connected via a seamless Internet-compatible network, it becomes possible to use web servers to make plant information available to operators. This can be done in two different ways.
Firstly, a control device can incorporate its own local web server. This means that plant I/O information, accessible only by that device, can be reported in a legible form as a set of web pages, and therefore displayed on any computer on the intranet, extranet, or Internet by means of a web browser.
Alternatively, a general-purpose computer can act as a web server, and gather data for individual web page requests by generating the native Modbus requests used by the control devices. These requests can then be sent out over the Modbus/TCP network, and will interrogate either the control devices directly (if they have TCP/IP interfaces) or via simple protocol converters (such as an off-the-shelf Modbus gateway) to convert requests into a form that legacy equipment would understand.
In both cases, there are two specific obstacles, namely that of reconfiguring a computer that has replaced a defective one, and maintaining the data directory.
In order to solve the problem of reconfiguring a computer after replacing a defective one, a network computer can be used as a web server, which means that the latter would be self-configuring on installation. The network computers then install themselves from a server elsewhere on the network when powered up. This means that if a network computer were ever to fail, a new computer could be installed in its place, powered up, and it would immediately take on the same identity as its predecessor.
Another problem is how to present and maintain the directory that stores and maintains the attributes of all data items. Despite a variety of proprietary solutions to this problem, there is an emerging standard called LDAP (Lightweight Directory Access Protocol), which was originally intended for keeping a registry of email addresses for an organization. Under this scheme, LDAP would maintain a hierarchical ‘picture’ of plant points within machines, machines within locations, and areas within an organization. LDAP makes it easy to reorganize the directory if the organization of the physical machines and data points needs to be modified.
Each plant point could have attributes such as tag name, data type, scale, input limits, units, reference number, size, orientation and physical machine name. In addition to this, each physical machine would have attributes such as IP address and MAC address.
Another solution gaining popularity is OPC. Many modern SCADA systems are built entirely around OPC, and systems such as ProfiNet are inherently OPC compatible.
OPC was originally designed around Microsoft DCOM, although several vendors have implemented their own version of DCOM. COM is Microsoft’s Component Object Model and DCOM is the distributed version thereof, i.e. DCOM objects can reside on different processors, even on a LAN or a WAN. This is a component software architecture that can be used to build software components (black boxes) that can interoperate regardless of where they are situated. DCOM also provides the mechanisms for the communication between these components, shared memory management between them, and the dynamic loading of components if and when required. These objects are written in an object oriented language such as C++ or VB and have software interfaces that can be used to access their functions (or methods as they are called). Interfaces are generally given names starting with an uppercase ‘I’ such as IOPCShutdown. Each object and each interface is considered unique and is given a 128-bit ‘GUID’ (Globally Unique ID) number to identify it.
DCOM, in turn, typically uses a TCP/IP stack with RPC (Remote Procedure Call) in the Application layer for communication between OPC clients and servers across a network.
OPC initially stood for OLE (Object Linking and Embedding) for Process Control. In the meantime OLE has been replaced with ActiveX, but the initialism ‘OPC’ remains. OPC is an open, standard software infrastructure for the exchange of process data. It specifies a set of software interfaces and logical objects as well as the functions (methods) of those objects. Vendors and users can develop their own clients and servers if they wish, with or without programming knowledge. The main reason for the existence of OPC is the need to access process control information regardless of the operating system or hardware involved. In other words, OPC data is accessed in a consistent way regardless of who the vendor of the system is. OPC is built around the client-server concept where the server collects plant data and passes it on to a client for display.
Since OPC was designed around Microsoft DCOM it was initially limited to Windows operating systems. Despite its legacy, DCOM is currently implemented on all major operating systems as well as several imbedded real-time operating systems and is therefore effectively vendor neutral. In fact, much of the non-real-time PROFInet communication is built on DCOM.
OPC effectively creates a ‘software bus’ allowing multiple clients to access data from multiple servers without the need for any special drivers.
The following figure shows an application (a client, for example) obtaining plant data from an OPC server. The OPC server, in turn, obtains its information from some physical I/O or by virtue of a bridging process from a SCADA system which, in turn, gathers physical plant information through some physical I/O.
There are several OP specifications. The most common one is OPC DA (Data Access). This standard defines a set of application interfaces, allowing software developers to develop clients that retrieve data from a server. Through the client, the user can locate individual OPC servers and perform simple browsing in the name spaces (i.e. all the available tags) of the OPC server.
Although OPC DA allows a client to write back data to the server, it is primarily intended to read data from the server and hence to create a ‘window’ into the plant.
Another emerging OPC standard is OPC DX (Data Exchange). OPC DX is an extension of the OPC DA specification and defines a communication standard for the higher-level exchange of non-time-critical user data at system levels between different makes of control systems, e.g. between Foundation Fieldbus HSE, Ethernet/IP and PROFInet.
OPC DX allows system integrators to configure controllers from different vendors through a standard browser interface without any regard for the manufacturer of a specific node. In other words, it allows a common view of all devices.
The advantage of using Ethernet at all levels in the enterprise is that these levels can then be interlinked by means of standard off-the-shelf Ethernet compatible products such as routers and switches.
The use of switching hubs is the key to high performance coupling between the different plant network layers since it becomes easy to mix stations of different speeds. Inserting switches between sub-networks requires no change to hardware or software and effectively isolates the traffic on the two network layers joined by it (i.e. the traffic on the subnets connected via the switch does not ‘leak’ across the switch). In order to preserve bandwidth, it is imperative not to use broadcast techniques.
In terms of inter-layer connection, routers can augment the speed adaptation function by being deployed in series with a switching hub. A throttling router, connected in series with the switch, can impose delays in order to achieve flow control in a situation where the destination network cannot cope with the data flow. Routers also assist in implementing security measures, such as Access Control Lists (ACLs). They can even be combined with firewalls for added security.
Ethernet is becoming the de facto standard for the implementation of the Physical and Data Link layers of the OSI model because of its scalability and relatively low cost. The following paragraphs will briefly deal with the factors that, until recently, have been Ethernet shortcomings namely throughput, determinism and redundancy.
The entry-level IEEE 802.3 Ethernet standard used to be 10BaseT but this can almost be regarded as a legacy technology; so the trend is to use Fast Ethernet (100BaseX) in industrial environments. In theory one could even deploy Gigabit Ethernet (1000BaseX), but IP67 rated connectors to accommodate this speed is still a problem. For the enterprise level, 10 Gigabit Ethernet can be used.
Initially Ethernet (the 10 Mbps versions) operated in half-duplex mode with contention (CSMA/CD) as a medium access control mechanism. Consequently it has been argued that Ethernet does not possess sufficient determinism. This problem was solved to some extent by switches, full duplex operation, and IEEE 802.1p ‘traffic class expediting’ or ‘message prioritization’, which allows the delivery of time-critical messages in a deterministic fashion. Initially designed for multimedia applications, it directly impacts on Ethernet as a control network by allowing system designers to prioritize messages, guaranteeing the delivery of time critical data with deterministic response times. This ability has been used to produce Ethernet systems that typically provide a deterministic scan time in the 2 to 3 millisecond range for one I/O rack with 128 points.
For motor control applications this is still not good enough; hence there are several vendors marketing Ethernet ‘field buses’ that employ the IEEE 1588 clock synchronization standard and are capable of scanning in excess of a thousand I/O points per millisecond, and can synchronize servo axes with accuracies in the microsecond region. Examples are Sercos III, EtherCat and ProfiNet RT.
The older IEEE 802.1d ‘Spanning Tree Protocol’ (STP) provided the ability to add redundant links to a network device such as a bridge or switch. This facilitates automatic recovery of network connectivity when there is a link or component failure anywhere in the network path. This standard obviates the need for custom solutions when redundancy is required as part of the control solution, and allows Ethernet switches to be wired into rings for redundancy purposes. Redundant or even dual-redundant switched Ethernet rings are now commonplace. What’s more, these switches can be interconnected with fiber links of up to 120 km for certain brands.
Unfortunately STP was too slow (30-60 seconds needed for reconfiguration) for Industrial applications. As a result, several vendors including Cisco, Siemens and Hirschmann developed their own versions of, or enhancements to, STP. The newer IEEE 802.1W ‘Rapid Spanning Tree Protocol’ (RSTP) can reconfigure itself in less than 10 seconds if it detects a failure in the redundant system.
A universal thin server is an appliance that network-enables any serial device such as a printer or weighbridge that has an RS-232 port. In addition to the operating system and protocol independence of general thin servers, a universal thin server is application- independent by virtue of its ability to network any serial device.
The universal thin server is a product developed primarily for environments in which machinery, instruments, sensors and other discrete ‘devices’ generate data that was previously inaccessible through enterprise networks. They allow nearly any device to be connected, managed and controlled over a network or the Internet.
In general, thin servers can be used for data acquisition, factory floor automation, security systems, scanning devices and medical devices.
An interesting thin server application regulates cattle feed in stock yards. Cattle wear radio frequency ID tags in their ears that relay data over a TCP/IP network as they step on to a scale. By the time the cattle put their heads in a trough to eat, the system has distributed the proper mix of feed.
Thin servers control video cameras used to monitor highway traffic. The US Border Patrol uses similar cameras to spot illegal border crossings. Food processing companies use the technology to track inventory in a warehouse, or the weight of consumable items rolling off an assembly line.
The IEEE 1451 activities comprise two parts, each managed by its own working group. IEEE 1451.1 targets the interface between the smart device and the network, while 1451.2 focuses on the interface between the sensor/transducer and the on-board microprocessor within the smart device.
IEEE 1451.1 defines a ‘Network Capable Application Processor (NCAP) Information Model’ that allows smart sensors and actuators to interface with many networks including Ethernet. The standard strives to achieve this goal by means of a common network object model and use of a standard API, but it specifically does not define device algorithms or message content.
IEEE 1451.2 is concerned with transducer-to-microprocessor communication protocols and transducer ‘Electronic Data Sheet’ formats. This standard provides an interface that sensor and actuator suppliers can use to connect transducers to microprocessors within their smart device without worrying about what kind of microprocessor is on-board. A second part of the P1451.2 activity is the specification of Electronic Data Sheets and their formats. These data sheets, which amount to physically placing the device descriptions inside the smart sensor, provide a standard means for describing smart devices to other systems. These Transducer Electronic Data Sheets (TEDS) also allow self-identification of the device on the network.
The Modbus Messaging protocol is an Application layer (OSI layer 7) protocol that provides communication between devices connected to different types of buses or networks. It implements a client/server architecture and operates essentially in a ‘request/response’ mode, irrespective of the media access control method used at layer 2. The client (on the controller) issues a request; the server (on the target device) then performs the required action and initiates a response.
The Modbus Messaging protocol needs additional support at the lower layers and in the case of Modbus Serial implementations a master/slave (half-duplex) layer 2 protocol transmits the data in asynchronous mode over RS-232, RS-485 or Bell 202 type modem links. Alternatively, HDLC (a token passing layer 2 protocol) transmits data in synchronous mode over RS-485 for Modbus Plus. This section illustrates the TCP/IP/Ethernet approach, which enables client/server interaction over routed networks albeit at the cost of additional overheads such as processing time and more headers.
In order to match the Modbus Messaging protocol to TCP, an additional sub-layer is required. The function of this sub-layer is to encapsulate the Modbus PDU so that it can be transported as a packet of data by TCP/IP (see Figure 15.6). Strictly speaking this should have been called an APDU (Application Protocol Data Unit) but we will stick with the Modbus designation.
One might well ask why connection-oriented TCP is used, rather than the datagram-oriented UDP. TCP has more overheads, and as a result it is slower than UDP. The main reason for this choice is to keep control of individual ‘transactions’ by enclosing them in connections that can be identified, supervised, and canceled without requiring specific action on the part of the client or server applications. This gives the mechanism a wide tolerance to network performance changes, and allows security features such as firewalls and proxies to be easily added.
The PDU consisting of data and function code is encapsulated by adding a ‘Modbus on TCP Application Protocol’ (MBAP) header in front of the PDU. The resulting Modbus/TCP ADU, consisting of the PDU plus MBAP header, is then transported as a chunk of data via TCP/IP and Ethernet. Once again, this should have been called a TADU (Transport Application Data Unit) but we will use the Modbus designation.
Whereas Modbus Serial forms the ADU by simply appending a 1-bit unit identifier to the PDU, the MBAP header is much larger, although it still contains the 1-byte unit identifier (‘slave address’) for communicating with serial devices. A byte count is included so that in the case of long messages being split up by TCP, the recipient is kept informed of the exact number of bytes transmitted. The ADU no longer contains a checksum (as in the serial implementation), as this function is performed by TCP.
Fields | Length | Description | Client | Server |
---|---|---|---|---|
Transaction Identifier | 2 Bytes | Identification of a Modbus Request/ Response transaction | Initialized by the client | Recopied by the server from the received request |
Protocol Identifier | 2 Bytes | 0 = Modbus protocol | Initialized by the client | Recopied by the server from the received request |
Length | 2 Bytes | Number of following bytes | Initialized by the client (request) | Initialized by the server (response) |
Unit Identifier | 1 Byte | Identification of a remote device (slave) connected on a serial line or on other buses | Initialized by the client | Recopied by the server from the received request |
Table 15.1
MBAP fieldsThe MBAP header is 7 bytes long and comprises the following 4 fields (see Table 15/.1):
The Modbus/TCP ADU is therefore constructed as follows:
All Modbus/TCP ADUs are sent via registered port 502 and the fields are encoded big-endian, which means that if a number is represented by more than one byte, the most significant byte is sent first. TCP/IP transports the entire Modbus ADU as data (as shown in the following figure).
DeviceNet™ and ControlNet™ are two well-known industrial networks based on CIP, the Control and Information Protocol. Both networks have been developed by Rockwell Automation, but are now owned and maintained by the two manufacturers’ organizations viz. ODVA (Open DeviceNet Vendors Association) and CI (ControlNet International). ODVA and CI have recently introduced the newest member of this family viz. EtherNet/IP. This section describes the techniques and mechanisms that are used to implement Ethernet/IP and will attempt to give an overall view of the system, taking into account the fact that layers 1 thru 4 of the OSI model (the bottom three layers of the TCP model) have already been dealt with earlier in this manual.
Ethernet/IP is an open industrial network standard based on Ethernet, using commercial off-the-shelf (COTS) technology and TCP/IP. It allows users to collect, configure and control data, and provides interoperability between equipment from various vendors, of which there are several hundred already.
The system is defined in terms of several open standards, which have a wide level of acceptance. They are Ethernet (IEEE802.3), TCP/IP, and CIP (Control Information Protocol, EN50170 and IEC 61158).
As Figure 15.9 shows, CIP has already been in use with DeviceNet and ControlNet, the only difference between those two systems being the implementation of the four bottom layers. Now TCP/IP has been added as an alternative network layer/transport layer, but CIP remains intact.
TCP is typically used to download ladder programs between a workstation and a PLC, for MMI software that reads or writes PLC data tables, or for peer-to-peer messaging between two PLCs. This type of communication is referred to as ‘explicit’ communication.
Through TCP, Ethernet/IP is able to send explicit (connection oriented) messages, in which the data field carries both protocol information and instructions. Here, nodes must interpret each message, execute the requested task, and generate responses. These types of messages are used for device configuration and various diagnostics.
UDP is typically used for network management functions, applications that do not require reliable data transmission, applications that are willing to implement their own reliability scheme, such a flash memory programming of network devices, and for input/output (I/O) operations.
For connectionless real-time messaging, multicasting, and for sending implicit messages, Ethernet/IP uses UDP. Implicit messages contain no protocol information in the data field, only real-time I/O data. The meaning of the data is predefined in advance; therefore processing time in the node during runtime is reduced. Because these messages are low on overhead and short, they can pass quickly enough to be useful for certain time-critical control applications.
Between TCP/UDP and CIP is an encapsulating protocol, which appends its own encapsulating header and checksum to the CIP data to be sent before passing it on to TCP or UDP. In this way the CIP information is simply passed on by TCP/IP or UDP/IP as if it is a chunk of data.
As shown in Figure 15.9, CIP covers not only layers 5, 6 and 7 of the OSI model, but also the ‘User layer’ (layer 8), which includes the user device profiles. Apart from the common device profiles, CIP also includes the common object library, the common control services and the common routing services. The OSI model has no layer 8, but items such as device profiles do not fit within the conceptual structure of the OSI model, hence vendors often add a ‘layer 8’ above layer 7 for this purpose.
CIP co-exists with the other TCP/IP application layer protocols as shown in Figure 15.11.
This section addresses common faults on Ethernet networks. Ethernet encompasses layers 1 and 2, namely the Physical and Data Link layers of the OSI model. This is equivalent to the bottom layer (the Network Interface layer) in the ARPA (DoD) model. This section will focus on those layers only, as well as on the actual medium over which the communication takes place.
Ethernet hardware is fairly simple and robust, and once a network has been commissioned, providing the cabling has been done professionally and certified, the network should be fairly trouble-free.
Most problems will be experienced at the commissioning phase, and could theoretically be attributed to the cabling, the LAN devices (such as hubs and switches), the NICs or the protocol stack configuration on the hosts.
The wiring system should be installed and commissioned by a certified installer (the suppliers of high-speed Ethernet cabling systems, such as ITT, will not guarantee their wiring if not installed by a certified installer). This effectively rules out wiring problems for new installations.
If the LAN devices such as hubs and switches are from reputable vendors, it is highly unlikely that they will malfunction in the beginning. Care should nevertheless be taken to ensure that intelligent (managed) hubs and switches are correctly set up.
The same applies to NICs. NICs rarely fail and nine times out of ten the problem lies with a faulty setup or incorrect driver installation, or an incorrect configuration of the higher level protocols such as IP.
Apart from a fundamental understanding of the technologies involved, sufficient time, a pair of eyes, patience, and many cups of coffee or tea, the following tools are helpful in isolating Ethernet-related problems.
A simple multimeter can be used to check for continuity and cable resistance, as will be explained in this section.
There are many versions on the market, ranging from simple devices that basically check for wiring continuity to sophisticated devices that comply with all the prerequisites for 1000Base–T wiring infrastructure tests. Testers are available from several vendors such as MicroTest, Fluke, and Scope.
Fiber optic testers are simpler than UDP testers, since they basically only have to measure continuity and attenuation loss. Some UDP testers can be turned into fiber optic testers by purchasing an attachment that fits onto the existing tester. For more complex problems such as finding the location of a damaged section on a fiber optic cable, an alternative is to use a proper Optical Time Domain Reflectometer (OTDR) but these are expensive instruments and it is often cheaper to employ the services of a professional wiring installer (with his own OTDR) if this is required.
A traffic generator is a device that can generate a pre-programmed data pattern on the network. Although they are not strictly speaking used for fault finding, they can be used to predict network behavior due to increased traffic, for example, when planning network changes or upgrades. Traffic generators can be stand-alone devices or they can be integrated into hardware LAN analyzers such as the Hewlett Packard 3217.
An RMON (Remote MONitoring) probe is a device that can examine a network at a given point and keep track of captured information at a detailed level. The advantage of a RMON probe is that it can monitor a network at a remote location. The data captured by the RMON probe can then be uploaded and remotely displayed by the appropriate RMON management software. RMON probes and the associated management software are available from several vendors such as 3COM, Bay Networks and NetScout. It is also possible to create an RMON probe by running commercially available RMON software on a normal PC, although the data collection capability will not be as good as that of a dedicated RMON probe.
Handheld frame analyzers are manufactured by several vendors such as Fluke, Scope, Finisar and PsiberNet, for up to Gigabit Ethernet speeds. These devices can perform link testing, traffic statistics gathering etc. and can even break down frames by protocol type. The drawback of these testers is the small display and the lack of memory, which results in a lack of historical or logging functions on these devices.
An interesting feature of some probes is that they are non-intrusive, i.e. they simply clamp on to the wire and do not have to be attached to a hub or switch port.
Software protocol analyzers are software packages that run on PCs and use either a general purpose or a specialized NIC to capture frames from the network. The NIC is controlled by a so-called promiscuous driver, which enables the NIC to capture all packets on the medium and not only those addressed to it in broadcast or unicast mode.
There are excellent freeware protocol analyzers such as Ethereal or Analyzer, the latter being developed at the Polytechnic of Torino in Italy. Top of the range software products such as Network Associates’ Sniffer or WaveTek Wandel Goltemann’s Domino Suite have sophisticated expert systems that can aid in the analysis of the captured software, but unfortunately this comes at a price.
Several manufacturers such as Hewlett Packard, Network Associates and WaveTek Wandel & Goltemann also supply hardware based protocol analyzers using their protocol analysis software running on a proprietary hardware infrastructure. This makes them very expensive but dramatically increases the power of the analyzer. For fast and gigabit Ethernet, this is probably the better approach. As a compromise one could use their software on a PC, but with a specially supplied NIC. The advantage of a hardware-based approach is that one can capture packets at level 1 of the OSI model. A software-based analyzer can only capture packets once they have been read by the NIC, and can therefore not capture malformed packets
If excessive noise is suspected on a coax or UTP cable, an oscilloscope can be connected between the signal conductor(s) and ground. This method will show up noise on the conductor, but will not necessarily give a true indication of the amount of power in the noise. A simple and cheap method to pick up noise on the wire is to connect a small loudspeaker between the conductor and ground. An operational amplifier can be used as an input buffer, so as not to ‘load’ the wire under observation. The noise will be heard as an audible signal.
The quickest way to get rid of a noise problem, apart from using screened UTP (ScTP), is to change to a fiber-based instead of a copper-based network, for example, by using 100Base–FX instead of 100Base-TX.
Noise can to some extent be counteracted on a coax-based network by earthing the screen AT ONE END ONLY. Earthing it on both sides will create an earth loop. This is normally accomplished by means of an earthing chain or an earthing screw on one of the terminators. Care should also be taken not to allow contact between any of the other connectors on the segment and ground.
Incorrect cable type
This is still used in some legacy systems. The correct cable for thin Ethernet is RG58-A/U or RG-58C/U. This is a 5 mm diameter coaxial cable with 50 ohm characteristic impedance and a stranded center conductor. Incorrect cable used in a thin Ethernet system can cause reflections, resulting in CRC errors, and hence many retransmitted frames.
The characteristic impedance of coaxial cable is a function of the ratio between the center conductor diameter and the screen diameter. Hence other types of coax may closely resemble RG-58, but may have different characteristic impedance.
Loose connectors
The BNC coaxial connectors used on RG-58 should be of the correct diameter, and should be properly crimped onto the cable. An incorrect size connector or a poor crimp could lead to intermittent contact problems, which are very hard to locate. Even worse is the ‘Radio Shack’ hobbyist type screw-on BNC connector that can be used to quickly make up a cable without the use of a crimping tool. These more often than not lead to very poor connections. A good test is to grip the cable in one hand, and the connector in another, and pull very hard. If the connector comes off, the connector mounting procedures need to be seriously reviewed.
Excessive number of connectors
The total length of a thin Ethernet segment is 185 m and the total number of stations on the segment should not exceed 30. However, each station involves a BNC T-piece plus two coax connectors and there could be additional BNC barrel connectors joining the cable. Although the resistance of each BNC connector is small, they are still finite and can add up. The total resistance of the segment (cable plus connectors) should not exceed 10 ohms otherwise problems can surface.
An easy method of checking the loop resistance (the resistance to the other end of the cable and back) is to remove the terminator on one end of the cable and measure the resistance between the connector body and the center contact. The total resistance equals the resistance of the cable plus connectors plus the terminator on the far side. This should be between 50 and 60 ohms. Anything more than this is indicative of a problem.
Overlong cable segments
The maximum length of a thin net segment is 185 m. This constraint is not imposed by collision domain considerations, but rather by the attenuation characteristics of the cable. If it is suspected that the cable is too long, its length should be confirmed. Usually, the cable is within a cable trench and hence it cannot be visually measured. In this case, a TDR can be used to confirm its length.
Stub cables
For thin Ethernet (10Base2), the maximum distance between the bus and the transceiver electronics is 4 cm. In practice, this is taken up by the physical connector plus the PC board tracks leading to the transceiver, which means that there is no scope for a drop cable or ‘stub’ between the NIC and the bus. The BNC T-piece has to be mounted directly on to the NIC.
Users do occasionally get away with putting a short stub between the T-piece and the NIC, but this invariably leads to problems in the long run.
Incorrect terminations
10Base2 is designed around 50-ohm coax and hence requires a 50-ohm terminator at each end. Without the terminators in place, there would be so many reflections from each end that the network would collapse. A slightly incorrect terminator is better than no terminator, yet may still create reflections of such magnitude that it affects the operation of the network.
A 93-ohm terminator looks no different than a 50-ohm terminator; therefore it should not be automatically assumed that a terminator is of the correct value.
If two 10Base2 segments are joined with a repeater, the internal termination on the repeater can be mistakenly left enabled. This leads to three terminators on the segment, creating reflections and hence affecting the network performance.
The easiest way to check for proper termination is by alternatively removing the terminators at each end, and measuring the resistance between connector body and center pin. In each case, the result should be 50 to 60 ohms. Alternatively, one of the T-pieces in the middle of the segment can be removed from its NIC and the resistance between the connector body and the center pin measured. The result should be the value of the two half cable segments (including terminators) in parallel, i.e. 25 to 30 ohms.
Invisible insulation damage
If the internal insulation of coax is inadvertently damaged, for example, by placing a heavy point load on the cable, the outer cover could return to its original shape whilst leaving the internal dielectric deformed. This leads to a change of characteristic impedance at the damaged point resulting in reflections. This, in turn, could lead to standing waves being formed on the cable.
An indication of this problem is when a work station experiences problems when attached to a specific point on a cable, yet functions normally when moved a few meters to either side. The only solution is to remove the offending section of the cable. Because of the nature of the damage, it cannot be seen by the naked eye and the position of the damage has to be located with a TDR. Alternatively, the whole cable segment has to be replaced.
Invisible cable break
This problem is similar to the previous one, with the difference that the conductor has been completely severed at a specific point. Despite the terminators at both ends of the cable, the cable break effectively creates two half segments, each with an un-terminated end, and hence nothing will work.
Often the only way to discover the location of the break is by using a TDR.
Thick coax (RG-8), as used for 10Base5 or thick Ethernet, will basically exhibit the same problems as thin coax yet there are a few additional complications.
Loose connectors
10Base5 uses N-type male screw-on connectors on the cable. As with BNC connectors, incorrect procedures or a wrong sized crimping tool can cause sloppy joints. This can lead to intermittent problems that are difficult to locate.
Again, a good test is to grab hold of the connector and to try and rip it off the cable with brute force. If the connector comes off, it was not properly installed in the first place.
Dirty taps
The MAU transceiver is often installed on a thick coax by using a vampire tap, which necessitates pre-drilling into the cable in order to allow the center pin of the tap to contact the center conductor of the coax. The hole has to go through two layers of braided screen and two layers of foil. If the hole is not properly cleaned, pieces of the foil and braid can remain and cause short circuits between the signal conductor and ground.
Open tap holes
When a transceiver is removed from a location on the cable, the abandoned hole should be sealed. If not, dirt or water could enter the hole and create problems in the long run.
Tight cable bends
The bend radius on a thick coax cable may not exceed 10 inches. If it does, the insulation can deform to such an extent that reflections are created leading to CRC errors. Excessive cable bends can be detected with a TDR.
Excessive loop resistance
The resistance of a cable segment may not exceed 5 ohms. As in the case of thin coax, the easiest way to do this is to remove a terminator at one end and measure the loop resistance. It should be in a range of 50–55 ohms.
The most commonly used tool for UTP troubleshooting is a cable meter or pair scanner. At the bottom end of the scale, a cable tester can be an inexpensive tool, only able to check for the presence of wire on the appropriate pins of a RJ-45 connector. High-end cable testers can also test for noise on the cable, cable length, and crosstalk (such as near end signal crosstalk or NEXT) at various frequencies. It can check the cable against CAT5/5e specifications and can download cable test reports to a PC for subsequent evaluation.
The following is a description of some wiring practices that can lead to problems.
Incorrect wire type (solid/stranded)
Patch cords must be made with stranded wire. Solid wire will eventually suffer from metal fatigue and crack right at the RJ-45 connector, leading to permanent or intermittent open connections. Some RJ-45 plugs, designed for stranded wire, will actually cut through the solid conductor during installation, leading to an immediate open connection. This can lead to CRC errors resulting in slow network performance, or can even disable a workstation permanently. The length of the patch cords or flyleads on either end of the link may not exceed 5m, and these cables must be made from stranded wire.
The permanently installed cable between hub and workstation, on the other hand, should not exceed 90m and must be of the solid variety. Not only is stranded wire more expensive for this application, but the capacitance is higher, which may lead to a degradation of performance.
Incorrect wire system components
The performance of the wire link between a hub and a workstation is not only dependent on the grade of wire used, but also on the associated components such as patch panels, Surface Mount Units (SMUs) and RJ-45 type connectors. A single substandard connector on a wire link is sufficient to degrade the performance of the entire link.
High quality Fast and Gigabit Ethernet wiring systems use high-grade RJ-45 connectors that are visibly different from standard RJ-45 type connectors.
Incorrect cable type
Care must be taken to ensure that the existing UTP wiring is of the correct category for the type of Ethernet being used. For 10BaseT, Cat3 UTP is sufficient, while Fast Ethernet (100Base-TX) requires Cat5 and Gigabit Ethernet requires Cat5e or better. This applies to patch cords as well as the permanently installed (‘infrastructure’) wiring.
Most industrial Ethernet systems nowadays are 100Base-X-based and hence they use Cat5 wiring. For such applications it might be prudent to install screened Cat5 wiring (ScTP) for better noise immunity. ScTP is available with a common foil screen around the 4 pairs or with an individual foil screen around each pair. Better still is Cat5i wiring which has only two wire pairs of a thicker gauge, as well as a braided shield.
A common mistake is to use telephone grade patch (‘silver satin’) cable for the connection between an RJ-45 wall socket (SMU) and the NIC. Telephone patch cables use very thin wires that are untwisted, leading to high signal loss and large amounts of crosstalk. This will lead to signal errors causing retransmission of lost packets, which will eventually slow the network down.
‘Straight’ vs. crossover cable
A 10BaseT or 100Base-TX patch cable consists of 4 wires (two pairs) with an RJ-45 connector at each end. The pins used for the TX and RX signals are typically 1, 2 and 3, 6. Although a typical patch cord has 8 wires (4 pairs), the 4 unused wires are nevertheless crimped into the connector for mechanical strength. In order to facilitate communication between computer and hub, the TX and RX ports on the hub are reversed, so that the TX on the computer and the RX on the hub are interconnected whilst the TX on the hub is connected to the RX on the hub. This requires a ‘straight’ interconnection cable with pin 1 wired to pin 1, pin 2 wired to pin 2, etc.
If the NICs on two computers are to be interconnected without the benefit of a hub, a normal straight cable cannot be used since it will connect TX to TX and RX to RX. For this purpose, a crossover cable has to be used, in the same way as a ‘null’ modem cable. Crossover cables are normally color coded (for example, green or yellow) in order to differentiate them from straight cables.
A crossover cable can create problems when it looks like a normal straight cable and the unsuspecting person uses it to connect a NIC to a hub or a wall outlet. A quick way to identify a crossover cable is to hold the two RJ-45 connectors side by side and observe the colors of the 8 wires in the cable through the clear plastic of the connector body. The sequence of the colors should be the same for both connectors.
Hydra cables
Some 10BaseT hubs feature 50 pin connectors to conserve space on the hub. Alternatively, some building wire systems use 50 pin connectors on the wiring panels but the hub equipment has RJ-45 connectors. In some cases, hydra or octopus cable has to be used. This consists of a 50 pin connector connected to a length of 25-pair cable, which is then broken out as a set of 12 small cables, each with an RJ-45 connector. Depending on the vendor the 50-pin connector can be attached through locking clips, Velcro strips or screws. It does not always lock down properly, although at a glance it may seem so. This can cause a permanent or intermittent break of contact on some ports.
For 10BaseT systems, near end crosstalk (NEXT), which occurs when a signal is coupled from a transmitting wire pair to a receiving wire pair close to the transmitter, (where the signal is strongest) causes most problems. On a single 4 pair cable, this is not a serious problem, as only two pairs are used but on the 25 pair cable, with many signals in close proximity, this can create problems. This can be very difficult to troubleshoot since it will require test equipment that can transmit on all pairs simultaneously.
Excessive untwists
On Cat5 cable, crosstalk is minimized by twisting each cable pair. However, in order to attach a connector at the end the cable has to be untwisted slightly. Great care has to be taken since excessive untwists (more than 1 cm) is enough to create excessive crosstalk, which can lead to signal errors. This problem can be detected with a high quality cable tester.
Stubs
A stub cable is an abandoned telephone cable leading from a punch-down block to some other point. This does not create a problem for telephone systems, but if the same Cat3 telephone cabling is used to support 10BaseT, then the stub cables may cause signal reflections that result in bit errors. Again, a high quality cable tester only will be able to detect this problem.
Damaged RJ-45 connectors
On RJ-45 connectors without protective boots, the retaining clip can easily break off especially on cheaper connectors made of brittle plastic. The connector will still mate with the receptacle but will retract with the least amount of pull on the cable, thereby breaking contact. This problem can be checked by alternatively pushing and pulling on the connector and observing the LED on the hub, media coupler or NIC where the suspect connector is inserted. Because of the mechanical deficiencies of ‘standard’ RJ-45 connectors they are not commonly used on Industrial Ethernet systems.
Since fiber does not suffer from noise, interference and crosstalk problems, there are basically only two issues to contend with namely attenuation and continuity.
The simplest way of checking a link is to plug each end of the cable into a fiber hub, NIC or fiber optic transceiver. If the cable is OK, the LEDs at each end will light up. Another way of checking continuity is by using an inexpensive fiber optic cable tester consisting of a light source and a light meter to test the segment.
More sophisticated tests can be done with an OTDR. OTDRs not only measure losses across a fiber link, but can also determine the nature and location of the losses. Unfortunately, they are very expensive but most professional cable installers will own one.
10BaseFX and 100Base-FX use LED transmitters that are not harmful to the eyes, but Gigabit Ethernet uses laser devices that can damage the retina of the eye. It is therefore dangerous to try and stare into the fiber (all systems are infrared and therefore invisible anyway!).
Incorrect connector installation
Fiber optic connectors can propagate light even if the two connector ends are not touching each other. Eventually, the gap between the fiber ends may be so far apart that the link stops working. It is therefore imperative to ensure that the connectors are properly latched.
Dirty cable ends
Because of the small fiber diameter (8.5–62.5 microns) and the low light intensity, a speck of dust or some finger oil deposited by touching the connector end is sufficient to affect communication. For this reason, dust caps must be left in place when the cable is not in use and a fiber optic cleaning pad must be used to remove dirt and oils from the connector point before installation.
Component aging
The amount of power that a fiber optic transmitter can radiate diminishes during the working life of the transmitter. This is taken into account during the design of the link but in the case of a marginal design, the link could start failing intermittently towards the end of the design life of the equipment. A fiber optic power meter can be used to confirm the actual amount of loss across the link but an easy way to troubleshoot the link is to replace the transceivers at both ends of the link with new ones.
Excessive cable length
The maximum length of the AUI cable is 50m but this assumes that the cable is a proper IEEE 802.3 cable. Some installations use lightweight office grade cables that are limited to 12m in length. If these cables are too long, the excessive attenuation can lead to intermittent problems.
DIX latches
The DIX version of the 15 pin D-connector uses a sliding latch. Unfortunately, not all vendors adhered to the IEEE 802 specifications and some used lightweight latch hardware which resulted in a connector that can very easily become unstuck. There are basically two solutions to the problem. The first solution is to use a lightweight (office grade) AUI cable, providing that distance is not a problem. This places less stress on the connector. The second solution is to use a special plastic retainer such as the ‘ET Lock’ made specifically for this purpose.
SQE test
The Signal Quality Error (SQE) test signal is used on all AUI based equipment to test the collision circuitry. This method is only used on the old 15 pin AUI based external transceivers (MAUs) and sends a short signal burst (about 10 bit times in length) to the NIC just after each frame transmission. This tests both the collision detection circuitry and the signal paths. The SQE operation can be observed by means of an LED on the MAU.
The SQE signal is only sent from the transceiver to the NIC and not on to the network itself. It does not delay frame transmissions but occurs during the inter-frame gap and is not interpreted as a collision.
The SQE test signal must, however, be disabled if an external transceiver (MAU) is attached to a repeater. If this is not done the repeater will detect the SQE signal as a collision and will issue a jam signal. As this happens after each packet, it can seriously delay transmissions over the network. The problem is that it is not possible to detect this with a protocol analyzer.
Basic card diagnostics
The easiest way to check if a particular NIC is faulty is to replace it with another (working) NIC. Modern NICs for desktop PCs usually have auto-diagnostics included and these can be accessed, for example, from the device manager in Windows. Some cards can even participate in a card to card diagnostic. Provided there are two identical cards, one can be set up as an initiator and one as a responder. Since the two cards will communicate at the data link level, the packets exchanged will, to some extent, contribute to the network traffic but will not affect any other devices or protocols present on the network.
The drivers used for card auto-diagnostics will usually conflict with the NDIS and ODI drivers present on the host, and a message is usually generated, advising the user that the Windows drivers will be shut down, or that the user should re-boot in DOS.
With PCMCIA cards, there is an additional complication in that the card diagnostics will only run under DOS, but under DOS the IRQ (interrupt address) of the NIC typically defaults to 5, which happens to be the IRQ for the sound card! Therefore the diagnostics will usually pass every test, but fail on the IRQ test. This result can then be ignored safely if the card passes the other diagnostics. If the card works, it works!
Incorrect media selection
Some older 10 Mbps NICs support more than one medium, for example, 10Base2/10Base5, or 10Base5/10BaseT, or even all three. It may then happen that the card fails to operate since it fails to ‘see’ the attached medium.
It is imperative to know how the selection is done. Some cards have an auto-detect function but the media detection only takes place when the machine is booted up. It does NOT re-detect the medium if it is changed afterwards. If the connection to a machine is therefore changed from 10BaseT to 10Base2, for example, the machine has to be re-booted.
Some older cards need to have the medium set via a setup program, whilst even older cards have DIP switches on which the medium has to be selected.
Wire hogging
Older interface cards find it difficult to maintain the minimum 9.6 microsecond Inter Frame Spacing (IFS) and, as a result of this, nodes tend to return to and compete for access to the bus in a random fashion. Modern interface cards are so fast that they can sustain the minimum 9.6 microsecond IFS rate. As a result of this, it becomes possible for a single card to gain repetitive sequential access to the bus in the face of slower competition and hence ‘hogging’ the bus.
With a protocol analyzer, this can be detected by displaying a chart of network utilization versus time and looking for broad spikes above 50 percent. The solution to this problem is to replace shared hubs with switched hubs (switches) and increase the bandwidth of the system by migrating from 10 to 100 Mbps, for example.
Jabbers
A jabber is a faulty NIC that transmits continuously. NICs have a built-in jabber control that is supposed to detect a situation whereby the card transmits frames longer than the allowed 1518 bytes, and shut the card down. However, if this does not happen, the defective card can bring the network down. This situation is indicated by a very high collision rate coupled with a very low or non-existent data transfer rate. A protocol analyzer might not show any packets, since the jabbering card is not transmitting any sensible data. The easiest way to detect the offending card is by removing the cables from the NICs or the hub one-by-one until the problem disappears, in which case the offending card has been located.
Faulty CSMA/CD mechanism
A card with a faulty CSMA/CD mechanism will create a large number of collisions since it transmits legitimate frames, but does not wait for the bus to become quiet before transmitting. As in the previous case, the easiest way to detect this problem is to isolate the cards one by one until the culprit is detected.
Too many nodes
A problem with CSMA/CD networks is that the network efficiency decreases as the network traffic increases. Although Ethernet networks can theoretically utilize well over 90% of the available bandwidth, the access time of individual nodes increase dramatically as network loading increases. The problem is similar to that encountered on many urban roads during peak hours. During rush hours, the traffic approaches the design limit of the road. This does not mean that the road stops functioning. In fact, it carries a very large number of vehicles, but to get into the main traffic from a side road becomes problematic.
For office type applications, an average loading of around 30% is deemed acceptable while for industrial applications 3% is considered the maximum. Should the loading of the network be a problem, the network can be segmented using switches instead of shared hubs. In many applications, it will be found that the improvement created by changing from shared to switched hubs, is larger than the improvement to be gained by upgrading from 10 Mbps to Fast Ethernet.
Improper packet distribution
Improper packet distribution takes place when one or more nodes dominate most of the bandwidth. This can be monitored by using a protocol analyzer and checking the source address of individual packets. .
Nodes like this are typically performing tasks such as video-conferencing or database access, which require a large bandwidth. The solution to the problem is to give these nodes separate switch connections or to group them together on a faster 100Base-T or 1000Base-T segment.
Excessive broadcasting
A broadcast packet is intended to reach all the nodes in the network and is sent to a MAC address of ff-ff-ff-ff-ff-ff. Unlike routers, bridges and switches forward broadcast packets throughout the network and therefore cannot contain the broadcast traffic. Too many simultaneous broadcast packets can degrade network performance.
In general, it is considered that if broadcast packets exceed 5% of the total traffic on the network, it would indicate a broadcast overload problem. Broadcasting is a particular problem with Netware servers and networks using NetBIOS/NetBEUI.
A broadcast overload problem can be addressed by adding routers, layer 3 switches or VLAN switches with broadcast filtering capabilities.
Bad packets
Bad packets can be caused by poor cabling infrastructure, defective NICs, external noise, or faulty devices such as hubs, devices or repeaters. The problem with bad packets is that they cannot be analyzed by software protocol analyzers.
Software protocol analyzers obtain packets that have already been successfully received by the NIC. That means they are one level removed from the actual medium on which the frames exist and hence cannot capture frames that are rejected by the NIC. The only solution to this problem is to use a software protocol analyzer that has a special custom NIC, capable of capturing information regarding packet deformities or by using a more expensive hardware protocol analyzer.
Faulty packets include:
Runt packets are shorter than the minimum 64 bytes and are typically created by a collision taking place during the slot time.
As a solution, try to determine whether the frames are collisions or under-runs. If they are collisions, the problem can be addressed by segmentation through bridges and switches. If the frames are genuine under-runs, the packet has to be traced back to the generating node that is obviously faulty.
CRC errors occur when the CRC check at the receiving end does not match the CRC checksum calculated by the transmitter.
As a solution, trace the frame back to the transmitting node. The problem is either caused by excessive noise induced into the wire, corrupting some of the bits in the frames, or by a faulty CRC generator in the transmitting node.
Late collisions on half-duplex (CSMA/CD) systems are typically caused when the network diameter exceeds the maximum permissible size. This problem can be eliminated by ensuring that the collision domains are within specified values, i.e. 2500 meters for 10 Mbps Ethernet and 250 m for Fast Ethernet. All Gigabit Ethernet systems are full duplex.
Check the network diameter as outlined above by physical inspection or by using a TDR. If that is found to be a problem, segment the network by using bridges or switches.
Misaligned frames are frames that get out of sync by a bit or two, due to excessive delays somewhere along the path or frames that have several bits appended after the CRC checksum.
As a solution, try and trace the signal back to its source. The problem could have been introduced anywhere along the path.
Faulty auto-negotiation
Auto-negotiation is specified for
It allows two stations on a link segment (a segment with only two devices on it) e.g. an NIC in a computer and a port on a switching hub to negotiate a speed (10/100/1000Mbps) and an operating mode (full/half duplex). If auto-negotiation is faulty or switched off on one device, the two devices might be set for different operating modes and, as a result, they will not be able to communicate.
On the NIC side the solution might be to run the card diagnostics and to confirm that auto-negotiation is, in fact, enabled. Alternatively, disable auto-negotiation and set the operational parameters to match that of the switch.
On the switch side, this depends on the diagnostics available for that particular switch. It might also be an idea to select another port, or to plug the cable into another switch.
10/100 Mbps mismatch
This issue is related to the previous one since auto-negotiation normally takes care of the speed issue.
Some system managers prefer to set the speeds on all NICs manually, for example, to 10 Mbps. If such an NIC is connected to a dual-speed switch port, the switch port will automatically sense the NIC speed and revert to 10 Mbps. If, however, the switch port is only capable of 100 Mbps, then the two devices will not be able to communicate.
This problem can only be resolved by knowing the speed (s) at which the devices are supposed to operate, and then by checking the settings via the setup software.
Full/half-duplex mismatch
This problem is related to the previous two.
A 10BaseT device can only operate in half-duplex (CSMA/CD) whilst a 100Base-TX can operate in full duplex OR half-duplex.
If, for example, a 100Base-TX device is connected to a 10BaseT hub, its auto-negotiation circuitry will detect the absence of a similar facility on the hub. It will therefore know, by default, that it is ‘talking’ to 10BaseT and it will set its mode to half-duplex. If, however, the NIC has been set to operate in full duplex only, communications will be impossible.
Incorrect host setup
Ethernet V2 (or IEEE 802.3 plus IEEE 802.2) only supplies the bottom layer of the DoD model. It is therefore able to convey data from one node to another by placing it in the data field of an Ethernet frame, but nothing more. The additional protocols to implement the protocol stack have to be installed above it, in order to make networked communications possible.
In Industrial Ethernet networks this will typically be the TCP/IP suite, implementing the remaining layers of the ARPA model as follows.
The second layer of the DoD model (the Internet layer) is implemented with IP (as well as its associated protocols such as ARP and ICMP).
The next layer (the Host-to-host layer) is implemented with TCP and UDP.
The upper layer (the Application layer) is implemented with the various application layer protocols such as FTP, Telnet, etc. The host might also require a suitable Application layer protocol to support its operating system in communicating with the operating system on other hosts. On Windows, that is NetBIOS by default.
As if this is not enough, each host needs a network ‘client’ in order to access resources on other hosts, and a network ‘service’ to allow other hosts to access its own resources in turn. The network client and network service on each host do not form part of the communications stack, but reside above it and communicate with each other across the stack.
Finally, the driver software for the specific NIC needs to be installed, in order to create a binding (‘link’) between the lower layer software (firmware) on the NIC and the next layer software (for example, IP) on the host. The presence of the bindings can be observed, for example, on a Windows XP host by clicking ‘Control Panel’ –> ‘networks’ –> ‘configuration,’ then selecting the appropriate NIC and clicking ‘Properties’ –> ‘Bindings.’
Without these, regardless of the Ethernet NIC installed, networking is not possible.
Failure to log in
When booting a PC, the Windows dialog will prompt the user to log on to the server, or to log on to his/her own machine. Failure to log in will not prevent Windows from completing its boot-up sequence but the network card will not be enabled. This is clearly visible as the LEDs on the NIC and hub will not light up.
Faulty individual port
A port on a hub may simply be ‘dead.’ Everybody else on the hub can ‘see’ each other, except the user on the suspect port. Closer inspection will show that the LED for that particular channel does not light up. The quickest way to verify this is to remove the UTP cable from the suspect hub port and plugging it into another port. If the LEDs light up on the alternative port, it means that the original port is not operational.
On managed hubs the configuration of the hub has to be checked by using the hub’s management software to verify that the particular port has not in fact been disabled by the network supervisor.
Faulty hub
This will be indicated by the fact that none of the LEDs on the hub are illuminated and that none of the users on that particular hub are able to access the network. The easiest to check this is by temporarily replacing the hub with a similar one and checking if the problem disappears.
Incorrect hub interconnection
If hubs are interconnected in a daisy chain fashion by means of interconnecting ports with a UTP cable, care must be taken to ensure that either a crossover cable is used or that the crossover/uplink port on one hub ONLY is used. Failure to comply with this precaution will prevent the interconnected hubs from communicating with each other although it will not damage any electronics.
A symptom of this problem will be that all users on either side of the faulty link will be able to see each other but that nobody will be able to see anything across the faulty link. This problem can be rectified by ensuring that a proper crossover cable is being used or, if a straight cable is being used, that it is plugged into the crossover/uplink port on one hub only. On the other hub, it must be plugged into a normal port.
Troubleshooting in a shared network is fairly easy since all packets are visible everywhere in the segment and as a result, the protocol analysis software can run on any host within that segment. In a switched network, the situation changes radically since each switch port effectively resides in its own segment and packets transferred through the switch are not seen by ports for which they are not intended.
In order to address the problem, many vendors have built traffic monitoring modules into their switches. These modules use either RMON or SNMP to build up statistics on each port and report switch statistics to switched management software.
Capturing the packets on a particular switched port is also a problem, since packets are not forwarded to all ports in a switch; hence there is no place to plug in a LAN analyzer and view the packets.
One solution implemented by vendors is port aliasing, also known as port mirroring or port spanning. The aliasing has to be set up by the user; the switch then copies the packets from the port under observation to a designated spare port. This allows the LAN user to plug in a LAN analyzer onto the spare port in order to observe the original port.
Another solution is to insert a shared hub in the segment under observation, that is, between the host and the switch port to which it was originally connected. The LAN analyzer can then be connected to the hub in order to observe the passing traffic.
The most diagnostic software is PC based and it uses a NIC with a promiscuous mode driver. This makes it easy to upgrade the system by simply adding a new NIC and driver. However, most PCs are not powerful enough to receive, store and analyze incoming data rates. It might therefore be necessary to rather consider the purchase of a dedicated hardware analyzer.
Most of the typical problems experienced with fast Ethernet, have already been discussed. These include a physical network diameter that is too large, the presence of Cat3 wiring in the system, trying to run 100Base-T4 on 2 pairs, mismatched 10BaseT/100Base-TX ports, and noise.
Although Gigabit Ethernet is very similar to its predecessors, the packets arrive so fast that they cannot be analyzed by normal means. A Gigabit Ethernet link is capable of transporting around 125 MB of data per second and few analyzers have the memory capability to handle this. Gigabit Ethernet analyzers such as those made by Hewlett Packard (LAN Internet Advisor), Network Associates (Gigabit Sniffer Pro) and WaveTek Wandel & Goltemann (Domino Gigabit Analyzer) are highly specialized gigabit Ethernet analyzers. They minimize storage requirements by filtering and analyzing capture packets in real time, looking for a problem. Unfortunately, they come at a price tag of around US$50,000.
This chapter deals with problems related to the TCP/IP protocol suite. The TCP/IP protocols are implemented in software and cover the second (Internet), third (Host-to-host) and the upper (Application) layers of the DoD (ARPA) model. These protocols need a network infrastructure as well as a medium in order to communicate. In a LAN environment the infrastructure is typically Ethernet.
The tools that can be used are DOS-based TCP/IP utilities, third party (Windows) utilities, software protocol analyzers and hardware protocol analyzers.
The DOS utilities form part of the TCP/IP software. They are not protocols, but simply executable programs (.exe) that utilize some of the protocols in the suite. These utilities include ping, arp and tracert.
Windows based TCP/IP utilities are more powerful and user-friendly, and is often available as freeware or shareware.
Protocol analyzers have already been discussed in the previous chapter.
The following are typical TCP/IP-related problems, with solutions.
The easiest way to confirm this, apart from checking the network configuration via the control panel and visually confirming that TCP/IP is installed for the particular NIC used on the host, is to perform a loopback test by pinging the host itself. This is done by executing ping localhost or ping 127.0.0.1. If a response is received, it means that the stack is correctly installed. If the test fails, then TCP/IP should be uninstalled, the machine rebooted, and TCP/IP re-installed. This is dome from the network control panel.
If it needs to be confirmed that a remote host is available, the particular machine can be checked by pinging it. The format of the command is:
This is an extremely powerful test, since a positive acknowledgment means that the bottom three layers (OSI) of the local and the remote hosts as well as all routers and all communication links between the two hosts are operational. Failure of the remote machine is, however, not always a sign of a failure as the firewall on that hosts could have been set up not to allow pings.
When TCP/IP is configured and the upper radio button is highlighted (indicating that an IP address has to be obtained automatically) that host, upon booting up, will broadcast a request for the benefit of the local Dynamic Host Configuration Protocol (DHCP) server. Upon hearing the request, the DHCP server will offer an IP address to the requesting host. If the host is unable to obtain such IP address, it can mean one of two things.
Reserved IP addresses are IP addresses in the ranges 10.0.0.0/8 – 10.255.255.255/8, 172.16.0.0/12 – 172.16.255.255/12 and 192.168.0.0/16 – 192.168.255.255/16. These IP addresses are only allocated to networks that do not have access to the Internet. These addresses are never allocated by ISPs and all Internet routers are pre-programmed to ignore them. If a user therefore tries to access such an IP address over the Internet, the message will not be transported across the Internet and hence the desired network cannot be reached.
Since an IP address is the Internet equivalent of a postal address, it is obvious that duplicate IP addresses cannot be tolerated. If a host is booted up, it tries to establish if any other host with the same IP address is available on the local network. If this is found to be true, the booting up machine will not proceed with logging on to the network and both machines with the duplicate IP address will display error messages in this regard.
As explained in Chapter 6, an IP address consists of two parts namely a NetID that is the equivalent of a postal code, and a HostID that is the equivalent of a street address. If two machines on the same network have different NetIDs, their ‘postal codes’ will differ and hence the system will not recognize them as coexisting on the same network. Even if they are physically connected to the same Ethernet network, they will not be able to communicate directly with each other via TCP/IP.
As explained in Chapter 6, the subnet mask indicates the boundary between the NetID and the HostID. A faulty subnet mask, when applied to an IP address, could result in a NetID (postal code) that includes bits from the adjacent HostID and hence appears different from the NetID of the machine wishing to send a message. The sending host will therefore erroneously believe that the destination host exists on another network and that the packets have to be forwarded to the local router for delivery to the remote network.
If the local router is not present (no default gateway specified) the sender will give up and not even try to deliver the packet. If, on the other hand, a router is present (default gateway specified), the sender will deliver the packet to the router. The router will then realize that there is nothing to forward since the recipient does in fact, live on the local network, and it will try to deliver the packet to the intended recipient and also send a redirect message to the offending host. Although the packet eventually gets delivered, it leads to a lot of unnecessary packets transmitted as well as unnecessary time delays.
An incorrect or absent default gateway in the TCP configuration screen means that hosts on the network cannot send messages to hosts on a different network. The following is an example:
Assuming that a host with IP address 192.168.0.1 wishes to ping a non-existent host on IP address 192.168.0.2. The subnet mask is 255.255.255.0. The pinging host applies the mask to both IP addresses and comes up with a result of 192.168.0.0 in both cases. Realizing that the destination host resides on the same network as the sender, it proceeds to ping 192.168.0.2. Obviously, there will be no response from the missing machine and hence a time–out will occur. The sending host will issue a time–out message in this regard.
Now consider a scenario where the destination host is 193.168.0.2 and there is no valid default gateway entry. After applying the subnet mask, the sending host realizes that the destination resides on another network and that it therefore needs a valid default gateway. Not being able to find one, it does not even try to ping but simply issues a message to the effect that the destination host is unreachable.
The reason for describing these scenarios is that the user can often figure out the problem by simply observing the error messages returned by the ping utility.
The MAC address of a device such as a PLC is normally displayed on a sticker attached to the casing. If the MAC address of the device is not known, it can be pinged by its IP address (for example, ping 101.3.4.5) after which the MAC address can be obtained by displaying the ARP cache on the machine that did the pinging. This is done by means of the arp –a command.
On a computer this is not a problem since the IP address can simply be looked up in the TCP/IP configuration screen. Alternatively, the IP address can be displayed by commands such as winipcfg, wntipcfg or ipconfig /all.
On a PLC this might not be so easy unless the user knows how to attach a terminal to the COM (serial) port on the PLC and has the configuration software handy. An easier approach to confirm the IP address setup on the PLC (if any) is to attach it to a network and run a utility such as WebBoy or EtherBoy. The software will pick up the device on the network, regardless of its NetID, and display the IP address.
It is possible that all devices on a network could have valid and correct IP addresses but that a specific host fails to respond to a message sent to it by a client program residing on another host. A typical scenario could be a supervisory computer sending a command to a PLC. Before assuming that the PLC is defective, one has to ascertain that the supervisory computer is, in fact, using the correct IP address when trying to communicate with the PLC. This can only be ascertained by using a protocol analyzer and capturing the communication or attempt at communication between the computer and the PLC. The packets exchanged between the computer and PLC can be identified by means of the MAC addresses in the appropriate Ethernet headers. It then has to be confirmed that the IP headers carried within these frames do, in fact, contain the correct IP addresses.
The following are a few TCP-related problems that can easily be resolved by means of a protocol analyzer.
Before any two devices can communicate using TCP/IP, they need to establish a so–called ‘triple handshake’. This will be clearly indicated as a SYN, ACK_SYN, ACK sequence to the relevant port. Without a triple handshake the devices cannot communicate at all.
To confirm this, simply try to establish a connection (for example, by using an FTP client to log into an FTP server) and use a protocol analyzer to capture the handshake. Check the IP addresses and port numbers for the client as well as the server.
TCP identifies different programs (processes) on a host by means of port numbers. For example, an FTP server uses ports 21 and 22, a POP3 server (for e-mail) uses port 110 and a web server (HTTP) uses port 80. Any other process wishing to communicate with one of these has to use the correct port number. The port number is visible right at the beginning of the TCP header that can be captured with a protocol analyzer. Port numbers 1 to 1023 are referred to as ‘well known’ ports. Port numbers are administered by ICANN and detailed in RFC 1700.
TCP is often used over satellite-based communications channels. One of the greatest challenges with satellite-based networks is the high-latency or delays in transmission and the appropriate solutions in dealing with this problem.
This chapter is broken up into the following sections:
Satellite communications has been around for a considerable time with the VSAT (Very Small Aperture Terminal) system being very popular for general use. This system could deliver up to 24 Mbps in a point-to-multi-point link. A point-to-point link can deliver up to 2 Mbps in both directions, although typical Internet speeds for home users are in the range of 256 kbps down /64 kbps up, or 512 kbps down/128 kbps up.
Customers have traditionally bought very specific time slots on a specific satellite. This is where the use of satellite communications distinguishes itself – for predictable communications. Typical applications here have been periodic uplinks by news providers. The more unpredictable Internet usage with surges in demand that often requires a quick response is not very suited to the use of satellites. NASA pioneered a satellite more focused on personal usage such as Internet access with the launch of its Advanced Communications Technology Satellite (ACTS). This is capable of delivering 100 Mbps of bandwidth using a Ka-band (20–30 GHz) spot-beam Geosynchronous Earth Orbit (GEO) satellite system.
When a satellite is used in a telecommunications link there is a delay (referred to as latency) due to the path length from the sending station to the satellite and back to the receiving station at some other location on the surface of the earth. For a GEO satellite, about 36 000 km above the equator, the propagation time for the radio signal to go to the satellite and return to earth is about 240 milliseconds. When the ground station is at the edge of the satellite footprint this one-way delay can increase to as much as 280 milliseconds. Additional delays may be incurred in the on-board processing in the satellite and any terrestrial- or satellite-to-satellite linking. The comparable delay for a 10 000 km fiber optic cable would be about 60 milliseconds.
Low Earth Orbit (LEO) satellites are typically located about 500–700 km above the surface of the earth. At this height the satellites are moving rapidly relative to the ground station and to provide continuous coverage a large number of satellites are used, together with inter-satellite linking. The IRIDIUM system uses about 93 satellites. The propagation delay from the ground station to the satellite varies due to the satellite position, from about 2 milliseconds when the satellite is directly overhead to about 80 milliseconds when the satellite is near the horizon. Mobile LEO users normally operate with quasi-omnidirectional antennas while large feeder fixed earth stations need steerable antenna with good tracking capability. Large users need seamless hand-off to acquire the next satellite before the previous one disappears over the horizon.
The main application for satellite based communications systems is in providing high-bandwidth access to places where the landline system does not provide this type of infrastructure. The greatest challenge with satellite systems, however, is the time it takes for data to get from one point to another. A possible solution to reduce this latency is the use of LEO satellites. However, the problem with LEOs is the need to provide a large number to get the relevant coverage of the earth’s surface. In partnership with Boeing, Teledisc is targeting the provision of 288 LEO satellites. A further practical problem with LEO satellites is they only last 10 to 12 years before they burn up while falling to earth through the atmosphere. GEOs don’t have this particular problem as they are merely ‘parked’ a lot higher up and left. A further challenge with LEOs is tracking these swiftly moving satellites, as they are only visible for up to 30 minutes before passing over the horizon. A phased-array antenna comprising a multitude of smaller antennas solves this antenna problem by tracking several different satellites simultaneously with different signals from each satellite. At least two satellites are kept in view at all times and the antenna initiates a link to a new one before it breaks the existing connection to the satellite moving to the bottom of the horizon.
The focus on GEO and LEO satellites will probably be as follows:
Two other interesting issues for satellites that have not been quite resolved yet are the questions of security and the cost of the service. Theoretically, anyone with a scanner can tune in to the satellite broadcasts, although encryption is being used for critical applications. Vendors also claim that the costs of using satellites will be similar to that of existing landline systems. This is difficult to believe as the investment costs (e.g. in the case of Iridium) tend to be extremely high.
A representation of the different satellite systems in use is indicated in the figure on the next page.
The important issues with satellites are the distance in which they orbit the earth and the radio frequency they use. This impacts on the latency, the power of the signal and the data transfer rate.
The various satellite bands can be listed as per the tables below.
Most TCP/IP traffic occurs over terrestrial networks (such as landlines, cable, telephone, fiber) with bandwidths ranging from 9600 bps to OC-12 with 622 Mbps and even higher. There are a number of opportunities for using satellite networks as a useful supplement to these terrestrial services. It is unlikely that the satellite will ever replace landline-based systems, but they will form a useful supplement.
According to Satellite Communications in the Global Internet – Issues, Pitfalls and Potential, using satellites for the Internet (and by default TCP/IP) has the following advantages:
The following are some typical applications with their features, according to Satellite Communications in the Global Interne – Issues, Pitfalls, and Potential.
Remote control and login
These applications are very sensitive to any delays in communications. If the delay extended beyond 300 to 400 ms the user would notice it and not be very happy. Interestingly enough, one advantage of satellite systems over terrestrial systems is the fact that although the delays can be significant, they are predictable and constant. This can be compared to the Internet where response times can vary dramatically from one moment to the next.
Videoconferencing
Assuming that the videoconferencing application can tolerate a certain amount of loss (i.e. caused by transmission errors), UDP can be used. This has far less overhead than TCP as it does not require any handshaking to transfer data and is more compact. Hence satellites would provide an improvement over normal terrestrial communications with a better quality picture due to a greater bandwidth and a simpler topology. Another benefit that satellites would provide for video transmission is the ability to provide isochronous transmission of frames (i.e. a fixed time relationship between frames and thus no jerky pictures).
Electronic mail
This does not require instantaneous responses from the recipient and hence satellite communications would be well-suited to this form of communications.
Information retrieval
The transmission of computer files requires a considerable level of integrity (i.e. no errors) and hence a reliable protocol such as TCP has to be used on top of IP. This means that, if a particularly fast response is required and a number of small transfers are used to communicate the data, satellite communications will not be a very effective of communication.
Bulk information broadcasting
Bulk data (such as from stock market databases/medical data/web casting/TV programs) can effectively be distributed by satellite with a vast improvement over the typically mesh-like landline-based systems.
Interactive gaming
Computer games played on an interactive basis often require instantaneous reaction times. The inherent latency in a satellite system would mean that this is not particularly effective. Games that require some thought before a response is transmitted (e.g. chess and card games) would however be effective using satellite transmission.
The following diagrams summarize the discussion above.
There are a number of weaknesses with TCP/IP that are exacerbated with the use of high latency satellite links. These are discussed below:
In order to use the bandwidth of a satellite channel more effectively, TCP needs to have a larger window size. If a satellite channel has a round trip delay of, say, 600 ms and the bandwidth is 1.54 Mbps, then the bandwidth-delay product would be 0.924 Mb, which equates to 113 kB (= 924/8). This is considerably larger than the 64 kB maximum window size for TCP/IP.
Due to the significant latency in satellite links, TCP adapts rather slowly to bandwidth changes in the channel. TCP adjusts the window size upwards when the channel becomes congested and downward when the bandwidth increases.
When a segment is lost or corrupted, TCP senders will retransmit all the data from the missing segment onwards, regardless of whether subsequent segments were received correctly or not. This loss of a segment is considered evidence of congestion and the window size is reduced by half. A more selective mechanism is required. There is a big difference between loss of segments due to real errors on the communications channel and congestion, but TCP cannot distinguish between the two.
When a TCP transaction is commenced, an initial window size of one segment (normally about 512 bytes) is selected. It then doubles the window size as successful acknowledgements are received from the destination up and until it reaches the network saturation state (where a packet is dropped). Once again, this is a very slow way of ramping up to full bandwidth utilization. The total time for a TCP slow start period is calculated as:
Slow start time = RTT * log (B/MSS)
where:
RTT = Round Trip Time
B = Bandwidth
MSS = TCP Maximum Segment Size
A TCP/IP transaction involves the use of the client–server interaction. The client sends a request to the server and the server then responds with the appropriate information (i.e. it provides a service to the client). With HTTP (the HyperText Transfer Protocol), which is what the World Wide Web is based on, the transmission of a web page as well as every item (e.g. gif file) on it has to be-commenced with the standard three-way handshake. This is particularly inefficient for small data transactions, as the process has to be repeated every time.
There are various ways to optimize the use of TCP/IP over satellite, especially with regard to mitigating the effects of latency. Interestingly enough, if these concerns with satellites can be addressed this will assist in the design and operation of future high-speed terrestrial networks because of the similar bandwidth-delay characteristic. The major problems for both satellites and high-speed networks with TCP/IP have been the need for a larger window size, the slow start period and ineffective bandwidth adaptation effects.
The various issues are discussed below:
Large Windows (TCP-LW)
A modification to the basic TCP protocol (RFC 13237) allows for a large window size by increasing the existing one from 216 to 232 bytes in size. This is accomplished by means of a multiplier in the ‘options’ field of the TCP header. This allows more effective use of the communications channel with large bandwidth-delay products. Note that both the receiver and sender have to use a version of TCP that implements TCP-LW.
Selective Acknowledgment (TCP-SACK)
A newly defined enhancement entitled Selective Acknowledgment (RFC 2018) allows the receiving node to advise the sender immediately of the loss of a packet. The sender then immediately sends a replacement packet, thus avoiding the timeout condition and the consequent lengthy recovery in TCP (which would otherwise then have reduced its window size and then very slowly increased bandwidth utilization).
Nearly all commercial TCP implementations now support SACK and LW although it may be necessary to explicitly enable them on some systems.
Congestion avoidance
There are two congestion avoidance techniques; but neither has been popular as yet. The first approach, which has to be implemented in a router, is called Random Early Detection (RED) where the router sends an explicit notice of congestion, using ICMP, when it believes that congestion will occur shortly unless it takes corrective action.
On the other hand an algorithm can be implemented in the sender where it observes the minimum RTT for the packets it is transmitting in order to calculate the amount of data queued in the communications channel. If the number of packets being queued is increasing, it can reduce the congestion window. It will then increase the congestion window when it sees the number of queued packets decreasing.
TCP for Transactions (T/TCP)
The three-way handshake represents a considerable overhead for small data transactions, often associated with HTTP transfers. An extension called T/TCP bypasses the three-way handshake and the slow-start procedure by using the data stored in a cache from previous transactions.
Middleware
It is also possible to effect significant improvements to the operation of TCP/IP without actually modifying the TCP/IP protocol itself, using so-called ‘middleware’ where split-TCP and TCP spoofing could be used.
Split-TCP
Here the end-to-end TCP connection is broken into two or three segments. This is indicated in the figure below. Each segment is in itself a complete TCP link. This means that the outer two links (which have minimal latency) can be set up as usual. However the middle TCP satellite link with its significant latency would have extensions to TCP such as TCP-LW and T/TCP. This results in only minor modifications to the application software at each end of the link.
TCP spoofing
With TCP spoofing (RFC 3135) an intermediate router (such as one connected to the satellite uplink) immediately acknowledges all TCP packets coming through it to the sender. All the receiver acknowledgments are suppressed so that the sender does not get confused. If the receiver does not receive a specific packet and the satellite uplink router has timed out, it will then retransmit this (missing) segment to the receiver. The resultant effect is that the originator believes that it is dealing with a low latency network.
Application protocol approaches
There are three approaches possible here:
Persistent TCP connections
In some client–server applications with very small amounts of data transfer, there are considerable inefficiencies. Later versions of HTTP minimize this problem and use a persistent connection to combine all these transfers into one fetch. Further to this it pipelines the individual transfers so that there is an overlap of transmission delays thus creating an efficient implementation.
Caching
In this case, commonly used documents (such as used with HTTP and FTP) are broadcast to local caches. The web clients then access these local caches rather than having to go through a satellite connection. The web clients thus have a resultant low latency and low network utilization, meaning more bandwidth available for higher speed applications.
Application specific proxies
In this case, an application specific proxy can use its domain knowledge to pre-fetch web pages so that web clients subsequently requesting these pages considerably reduce the effects of latency.
10Base2
IEEE 802.3 (Ethernet) implementation on thin coaxial cable (RG58/AU).
10Base5
IEEE 802.3 (Ethernet) implementation on thick coaxial cable (RG8).
10BaseT
IEEE 802.3 (Ethernet) implementation on unshielded 22 AWG twisted pair cable.
Access control mechanism
The way in which the LAN manages the access to the physical transmission medium.
Address
A normally unique designator for the location of data in memory, or the identity of a peripheral device that allows each device on a shared communications medium to respond to messages directed at it.
Address Resolution Protocol (ARP)
A TCP/IP protocol used by a router or a source host to translate the IP address of the destination host into the physical hardware (MAC) address, for delivery of the message to a destination on the local physical network.
Algorithm
Normally used as a basis for writing a computer program. This is a set of rules with a finite number of steps for solving a problem.
Alias frequency
A false lower frequency component that appears in analog data reconstructed from sampled data that was acquired by sampling the original signal at too low a rate. The sampling rate should be at least twice the maximum frequency component in the original signal.
ALU
Arithmetic Logic Unit
Amplitude modulation
A modulation technique (also referred to as AM or ASK) used to allow data to be transmitted across an analog network, such as a switched telephone network. The amplitude of a single carrier frequency is varied or modulated between two levels; one for binary ‘0’ and one for binary ‘1’.
Analog
A continuous real time phenomenon where the information values are represented in a variable and continuous waveform.
ANSI
American National Standards Institute. The national standards development body in the USA.
AppleTalk
A proprietary computer networking standard initiated by Apple Computer for use in connecting the Macintosh range of computers and peripherals. This standard operates at 230 kilobits/second.
Application layer
The highest layer of the seven-layer OSI reference model structure, which acts as the interface between the user application and the lower layers of the protocol stack.
Application Programming Interface (API)
A set of interface definitions (functions, subroutines, data structures or class descriptions) which, together, provide a convenient interface to the functions of a subsystem and which insulate the application from the details of the implementation..
Arithmetic Logic Unit (ALU)
The element(s) in a processing system that perform(s) the mathematical functions such as addition, subtraction, multiplication, division, inversion and Boolean computations (AND, OR, etc).
ARP
Address Resolution Protocol.
ARPANET
The packet switching network, funded by the DARPA, which has evolved into the world-wide Internet.
ARP cache
A table of recent mappings of IP addresses to physical (MAC) addresses, maintained in each host and router.
AS
Australian Standard
ASCII
American Standard Code for Information Interchange. A universal standard for encoding alphanumeric characters into 7 or 8 bits.
ASIC
Application Specific Integrated Circuit
ASN.1
Abstract Syntax Notation One. An abstract syntax used to define the structure of the protocol data units associated with a particular protocol entity.
Asynchronous
Communications where data can be transmitted at an arbitrary unsynchronized point in time without synchronization of the clock signals in the transmitter and receiver. Synchronization is controlled by a start bit at the beginning of each transmission.
Attenuation
The decrease in the magnitude of strength (or power) of a signal. In cables, generally expressed in dB per unit length.
Attenuator
A passive device that decreases the amplitude of a signal without introducing any undesirable characteristics to the signal, such as distortion.
AUI
Attachment Unit Interface. This cable, sometimes called the drop cable, was used to attach terminals to the transceiver unit in 10Base5 Ethernet systems.
AWG
American Wire Gauge.
Balanced circuit
A circuit so arranged that the voltages on each conductor of the pair are equal in magnitude but opposite in polarity with respect to ground.
Bandwidth
A range of frequencies available expressed as the difference between the highest and lowest frequencies expressed in hertz (or cycles per second). Also used as an indication of capacity of the communications link.
Base address
A memory address that serves as the reference point. All other points are located by offsets in relation to the base address.
Baseband
Baseband operation is the direct transmission of data over a transmission medium without the prior modulation of a high frequency carrier signal.
Baud
Unit of signaling speed derived from the number of signal changes per second.
BCC
Block Check Character. Error checking scheme with one check character; a good example being block checksum created by adding the data units together.
BCD
Binary Coded Decimal. A code used for representing decimal digits in binary code.
BIOS
Basic Input/Output System.
Bipolar
A signal range that includes both positive and negative values.
BIT
Derived from ‘binary digit’, a ‘1’or ‘0’ condition in the binary system.
BIT stuffing
Bit stuffing with zero bit insertion. A technique used to allow pure binary data to be transmitted on a synchronous transmission line. Each message block (frame) is encapsulated between two flags, which are special bit sequences (e.g.01111110). Then, if the message data contains a possibly similar sequence, an additional (0) bit is inserted into the data stream by the sender, so that the data cannot be mistaken for a flag character, and is subsequently removed by the receiving device. The transmission method is then said to be data transparent.
Bits per second (bps)
Unit of data transmission rate.
BNC
Bayonet type coaxial cable connector.
Bridge
A device used to connect similar sub-networks at layer 2 (the Data Link layer) of the OSI model. Used mostly to reduce the network load.
Broadband
Opposite of baseband. Several data steams to be transmitted are first modulated on separate high frequency carrier signals. They are then be transmitted simultaneously on a common the same transmission medium.
Broadcast
A message intended for all devices on a bus.
BS
British Standard.
BSC
Binary Synchronous Control protocol (a.k.a. Bi-Sync). One of the oldest communication protocols, developed by IBM in the 1960’s. It uses a defined set of ASCII control characters for the transmission of binary coded data between a master and several remote terminals
Buffer
An intermediate temporary storage device used to compensate for a difference in data rate and data flow between two devices (also called a spooler for interfacing a computer and a printer).
Burst mode
A high-speed data transfer mode in which the address of the data is sent followed by back-to- back data units as long as a physical control signal is asserted.
Bus
A data path shared by many devices, with one or more conductors for transmitting control signals, data or power.
Byte
A term referring to eight associated bits of information; sometimes called an ‘octet’.
Capacitance
Ability of a device to store electrically separated charges between two plates having different potentials. The value is proportional to the surface area of the plates and inversely proportional to the distance between them.
Capacitance (mutual)
The capacitance between two conductors with all other conductors, including the cable shield, short-circuited to the ground.
Cascade
Two or more electrical circuits in which the output of one is fed into the input of the next one.
CCITT (see ITU-T)
Consultative Committee on International Telegraphs and Telephone. A committee of the International Telecommunications Union (ITU) that sets world-wide telecommunications standards (e.g. V.21, V.22, V.22bis).
Character
Letter, numeral, punctuation, control code or any other symbol contained in a message.
Characteristic impedance
The resistance that, when connected to the output terminals of a transmission line of any length, makes the line appear infinitely long. Also defined as the ratio of voltage to current at every point along a transmission line on which there are no standing waves.
Clock
The source of timing signals for sequencing electronic events e.g. synchronous
data transfer.
CMRR
Common Mode Rejection Ratio.
CMV
Common Mode Voltage.
CNR
Carrier to Noise Ratio. An indication of the quality of the modulated signal.
Collision
The situation arising when two or more LAN nodes attempt to transmit at the same time.
Common mode signal
The common component to both signal voltages on a differential balanced circuit. This is usually caused by a difference in ground potential between the sending and receiving circuits.
Common carrier
A private data communications utility that offers communications services to the general public.
Contention
The situation arising when two or more devices contend for the same resources, for example the facility provided by a dial-up network or data PABX that allows multiple terminals to compete on a first come, first served basis for a smaller number of computer posts.
CPU
Central Processing Unit.
CRC
Cyclic Redundancy Check. An error-checking mechanism using a polynomial algorithm based on the content of a message frame at the transmitter and included in a field appended to the frame. At the receiver, it is then compared with the result of a similar calculation performed by the receiver.
Crosstalk
A situation where a signal from a communications channel interferes with an associated channel’s signals.
CSMA/CD
Carrier Sense Multiple Access/Collision Detection. When two stations realize that they are transmitting simultaneously on a Local Area Network, they both cease transmission and signal that a collision has occurred. Each then tries again after waiting for a predetermined time period. This forms the basis of the original IEEE 802.3 specifications.
Data Link layer
This corresponds to layer 2 of the OSI reference model. This layer is concerned with the reliable transfer of frames (packets) across the medium.
Datagram
A type of service offered on a packet-switched data network. A datagram is a self-contained packet of information that is sent through the network with minimum protocol overheads.
Decibel (dB)
A logarithmic measure of the ratio of two signal levels, measured in decibels (dB):
Ratio of voltages = 20 log10V1/V2 dB, or
Ratio of power = 10log10 P1/P2 dB
where V refers to Voltage and P refers to Power.
Default
An assigned value or set-up condition, which is automatically assumed for the system at start-up unless otherwise explicitly specified.
Delay distortion
Distortion of a signal caused by the frequency components of the signal having different propagation velocities across a transmission medium.
DES
Data Encryption Standard,
Dielectric constant (E)
The ratio of the capacitance using the material in question as the dielectric, to the capacitance resulting when the material is replaced by air.
Digital
A signal that has definite states (normally two).
DIN
Deutsches Institut für Normierung (i.e. German Standards Institute).
DIP
Acronym for dual-in-line package referring to integrated circuits and switches.
Direct Memory Access (DMA)
A technique of transferring data between the computer memory and a device on the computer bus without the intervention of the central processor unit.
DNA
Distributed Network Architecture.
Driver software
A program that acts as the interface between a higher level coding structure (e.g. the software outputting data to a printer attached to a computer) and the lower level hardware/firmware component of the printer.
DSP
Digital Signal Processing.
Duplex
The ability to send and receive data over the same communications line.
Dynamic range
The difference in decibels between the overload (maximum) and minimum discernible signal level in a system.
EBCDIC
Extended Binary Coded Decimal Interchange Code. An eight-bit character code used primarily in IBM equipment. The code allows for 256 different bit patterns.
EEPROM
Electrically Erasable Programmable Read Only Memory. Non-volatile memory in which individual locations can be erased and re-programmed.
EIA
Electronic Industries Association. A standards organization in the USA specializing in
the electrical and functional characteristics of interface equipment. Now known as the TIA.
TIA-232-F
Interface between DTE and DCE, employing serial binary data exchange. Typical maximum specifications are 15 m at 19200 baud. Generally known as RS-232.
EIA-422
Interface between DTE and DCE employing the electrical characteristics of balanced (differential) voltage interface circuits. Also known as RS-422.
EIA-423
Interface between DTE and DCE, employing the electrical characteristics of unbalanced voltage digital interface circuits. Also known as RS-423.
EIA-449
General-purpose 37-pin and 9-pin interface for DCE and DTE employing serial
binary interchange. Also known as RS-449.
EIA-485
A standard that specifies the electrical characteristics of drivers and receivers for use in balanced digital multi-point systems. Also known as RS-485.
EISA
Enhanced Industry Standard Architecture.
EMI/RFI
Electromagnetic Interference/Radio Frequency Interference. ‘Background noise’ that could modify or destroy data transmission.
Emulation
The imitation of a computer system performed by a combination of hardware and software, that allows programs to run between incompatible systems.
Enabling
The activation of a function of a device by a pre-defined signal.
Encoder
A circuit that changes a given signal into a coded combination for purposes of optimum transmission of the signal.
EPROM
Erasable Programmable Read Only Memory. Non-volatile semiconductor memory that is erasable in an ultraviolet light and is reprogrammable.
Equalizer
A device that compensates for the unequal gain characteristic of the
signal received.
Error rate
The ratio of the average number of bits corrupted to the total number of bits transmitted for a data link or system.
Ethernet
The general name for IEEE 802.3 networks.
Farad
Unit of capacitance whereby a charge of one coulomb produces a one volt potential difference.
FCC
Federal Communications Commission.
FCS
Frame Check Sequence. A general term given to the additional bits appended to a transmitted frame or message by the source to enable the receiver to detect transmission errors.
FIFO
First In First Out.
Filled cable
A cable construction in which the cable core is filled with a material that prevents moisture from entering or passing along the cable.
FIP
Factory Instrumentation Protocol.
Firmware
A computer program or software stored permanently in PROM or ROM or semi-permanently in EPROM.
Flame retardancy
The ability of a material not to propagate flame once the flame source is removed.
Floating
An electrical circuit in which none of the power terminals are connected to ground.
Flow control
The procedure for regulating the flow of data between two devices, preventing the loss of data once a device’s buffer has reached its capacity.
Frame
The unit of information transferred across a data link. Typically there are control frames for link management and information frames for the transfer of message data.
Frequency
Refers to the number of cycles per second.
Full duplex
Simultaneous two-way independent transmission in both directions. On a baseband system this requires 4 wires, in a broadband system it can be done with 2 wires.
Giga
Metric system prefix – 109.
Gateway
A device connecting two different networks that are incompatible in the lowest 3 layers of the OSI model. An alternative definition is a protocol converter.
Ground
An electrically neutral circuit having the same potential as earth. A reference point for an electrical system also intended for safety purposes.
Half duplex
Transmissions in either direction, but not simultaneously.
Hamming distance
A measure of the effectiveness of an error checking mechanism. The higher the Hamming Distance (HD) index, the smaller is the risk of undetected errors.
Handshaking
Exchange of predetermined control signals between two devices or protocols.
HDLC
High Level Data Link Control. An older communication protocol defined by the ISO to control the exchange of data across point-to-point or multi-drop data links.
Hertz (Hz)
A term replacing cycles per second as a unit of frequency.
Hex
Hexadecimal.
Host
This is normally a computer belonging to a user that contains (hosts) the communication hardware and software necessary to connect the computer to a data communications network.
A number that allows the CPU to distinguish between different boards in an input/output system. All computer interface boards must have different addresses.
IEC
International Electrotechnical Commission.
IEE
Institution of Electrical Engineers.
IEEE
Institute of Electrical and Electronic Engineers. A US-based international professional society that issues its own standards and is a member of ANSI and ISO.
IFC
International FieldBus Consortium.
Impedance (Z)
The total opposition that a circuit offers to the flow of alternating current or any other varying current at a particular frequency. It is a combination of resistance R and reactance X, measured in ohms.
Inductance
The property of a circuit or circuit element that opposes a change in current flow, thus causing current changes to lag behind voltage changes. It is measured in henrys.
Insulation Resistance (IR)
That resistance offered by cable insulation to a leakage current caused by an impressed voltage.
Interface
A shared boundary defined by common physical interconnection characteristics, signal characteristics and measurement of interchanged signals.
Interrupt
An external event indicating that the CPU should suspend its current task to service a designated activity.
Interrupt handler
The section of the program that performs the necessary operation to service an interrupt when it occurs.
IP
Internet Protocol
ISA
Industry Standard Architecture (for IBM Personal Computers).
ISB
Intrinsically Safe Barrier.
ISDN
Integrated Services Digital Network. A telecommunications network that utilizes digital techniques for both transmission and switching. It supports both voice and data communications.
ISO
International Organization for Standardization.
ISR
Interrupt Service Routine. See Interrupt Handler.
ITU
International Telecommunications Union.
Jabber
Garbage that is transmitted when a LAN NIC fails and continuously transmits.
Jumper
A connection between two pins on a circuit board to select an operating function.
k
This is 210 or 1024 in computer terminology, e.g. 1 kb = 1024 bytes.
LAN
Local Area Network. A data communications system confined to a limited geographic area with data rates up to 10 Gbps.
LCD
Liquid Crystal Display. A low-power display system used on many laptops and other digital equipment.
Leased (or private) line
A private telephone line without inter-exchange switching arrangements.
LED
Light Emitting Diode. A semiconductor light source that emits visible light or infrared radiation.
Line driver
A signal converter that conditions a signal to ensure reliable transmission over an extended distance.
Linearity
A relationship where the output is directly proportional to the input.
Link layer
Layer two of the OSI reference model. Also known as the Data Link layer.
LLC
Logical Link Control (IEEE 802.2).
Loop resistance
The measured resistance of two conductors forming a circuit.
Loopback
Type of diagnostic test in which the transmitted signal is returned to the sending device after passing through all, or a portion of, a data communication link or network. A loopback test permits the comparison of a returned signal with the transmitted signal.
m
Meter. Metric system unit for length.
M
Mega. Metric system prefix for 106.
MAC
Media Access Control
Manchester encoding
Digital technique in which each bit period is divided into two complementary halves; a negative to positive voltage transition in the middle of the bit period designates a binary ‘1’, whilst a positive to negative transition represents a ‘0’. The encoding technique also allows the receiving device to recover the transmitted clock from the incoming data stream (self clocking). Some variations of Manchester use a polarity opposite to the one described above.
Mark
This is equivalent to a binary 1.
MAU
Media Access Unit.
MAU
Multi-station Access Unit for IBM Token Ring.
Media Access Unit
This is the Ethernet transceiver for 10Base5 units situated on the coaxial cable that then connects to the terminal with an AUI drop cable.
Microwave
AC signals having frequencies of 1 GHz or more.
MIPS
Million Instructions Per Second.
MODEM
MOdulator–DEModulator. A device converting serial digital data from a transmitting terminal to a signal suitable for transmission over a telephone channel or to reconvert the transmitted signal to serial digital data for the receiving terminal.
MOS
Metal Oxide Semiconductor.
MOV
Metal Oxide Varistor.
MTBF
Mean Time Between Failures.
MTTR
Mean Time To Repair.
Multidrop
A single communication line or bus used to connect three or more points.
Multiplexer (MUX)
A device used for division of a communication link into two or more channels either by using frequency division or time division.
Multistation Access Unit
Passive coupling unit, containing relays and transformers, used to star-wire the lobes of an IBM Token Ring system.
Narrowband
A device that can only operate over a narrow band of frequencies, typically less than 128 kbps. Opposite of Wideband.
Network architecture
A set of design principles including the organization of functions and the description of data formats and procedures used as the basis for the design and implementation of a network.
Network driver
Program to provide interface between the network card (NIC) and higher layer protocols.
Network layer
Layer 3 in the OSI reference model, the logical network entity that services the Transport layer. Responsible for routing data through the network.
Network topology
The physical and logical relationship of nodes in a network; the schematic arrangement of the links and nodes of a network typically in the form of a star, ring, tree or bus topology.
Network
An interconnected group of nodes.
Node
A device connected to a network.
Noise
A term given to the extraneous electrical signals that may be generated or picked up in a transmission line. If the noise signal is large compared with the data carrying signal, the latter may be corrupted resulting in transmission errors.
Non-linearity
A type of error in which the output from a device does not relate to the input in a
linear manner.
NRZ
Non Return to Zero. A method of mapping a binary signal to a physical signal for transmission over some transmission media. Logical ‘1’ is represented by one voltage and logical ‘0’ represented by another voltage.
NRZI
Non Return to Zero Inverted. Similar to NRZ, but the NRZI signal has a transition at a clock boundary if the bit being transmitted is a logical ‘0’, and does not have a transition if the bit being transmitted is a logical ‘1’.
OHM (Ω)
Unit of resistance such that a constant current of one ampere produces a potential difference of one volt across a resistive element.
Optical isolation
Two networks with no electrical continuity in their connection because an optoelectronic transmitter and receiver have been used.
OSI
Open Systems Interconnection.
Packet
A group of bits (including data and control signals) transmitted as a whole on a packet switching network.
PAD
Packet Assembler/Disassembler. An interface between a terminal or computer and a packet switching network.
Parallel transmission
The transmission mode where a number of bits are sent simultaneously over separate parallel lines. Usually uni-directional e.g. the Centronics interface for a printer.
PCM
Pulse Code Modulation. The sampling of a signal and encoding the amplitude of each sample into a series of uniform pulses.
PCMCIA
Personal Computer Manufacturers Industries Association. Standard interface for peripherals for laptop computers.
PDU
Protocol Data Unit.
Peripherals
The input/output and data storage devices attached to a computer e.g. disk drives, printers, keyboards, display, communication boards, etc.
Physical layer
Layer 1 of the OSI reference model, concerned with the electrical and mechanical specifications of the network termination equipment.
PLC
Programmable Logic Controller.
PLL
Phase-Locked Loop.
Point-to-Point
A connection between only two devices.
Polyethylene
A family of insulators derived from the polymerization of ethylene gas and characterized by outstanding electrical properties, including low dielectric constant and low dielectric loss across the frequency spectrum.
PolyVinyl Chloride (PVC)
A general purpose family of insulation of which the basic constituent is polyvinyl chloride or its co-polymer with Vinyl Acetate. Plasticizers, stabilizers, pigments and fillers are added to improve mechanical and/or electrical properties of this material.
Port
A place of access to a device or network, used for input/output of digital and analog signals. Also a number used by TCP to identify clients and servers.
Presentation layer
Layer 6 of the OSI reference model, concerned with negotiation of suitable transfer syntax for use during an application. If this is different from the local syntax, the translation to/from this syntax.
Protocol
A formal set of conventions governing the formatting, control procedures and relative timing of message exchange between two communicating systems.
PSDN
Public Switched Data Network. Any switching data communications system, such as Telex and public telephone networks, that provides circuit switching to
many customers.
PSTN
Public Switched Telephone Network. This is the term used to describe the (analog) public telephone network.
PTT
Post, Telephone and Telecommunications Authority.
R/W
Read/Write.
RAM
Random Access Memory. Semiconductor read/write volatile memory. Data is lost if the power is turned off.
Reactance
The opposition offered to the flow of alternating current by inductance or capacitance of a component or circuit.
Repeater
An amplifier that regenerates the signal and thus expands the network.
Resistance
The ratio of voltage to electrical current for a given circuit, measured in ohms.
Response time
The elapsed time between the generation of the last character of a message at a terminal and the receipt of the first character of the reply. It includes terminal delay and network delay.
RF
Radio Frequency.
RFI
Radio Frequency Interference.
Ring
Network topology for interconnection of network nodes (e.g. in IBM Token Ring). Each device is connected to its nearest neighbors until all the devices are connected in a closed loop or ring. Data is transmitted in one direction only. As a message circulates around the ring, it is read by each device connected in the ring.
Rise time
The time required for a waveform to reach a specified higher value from some lower value. The time is usually measured between the two points representing 10% and 90% of the total amplitude change.
RMS
Root Mean Square.
ROM
Read Only Memory. Computer memory in which data can be routinely read but written to only once using special means when the ROM is manufactured. A ROM is used for storing data or programs on a permanent basis.
Router
A linking device between network segments which may differ in layers 1 and 2 of the OSI model.
SAA
Standards Association of Australia.
SAP
Service Access Point.
SDLC
Synchronous Data Link Control. Old IBM standard protocol superseding the BSC protocol, but preceding HDLC.
Serial transmission
The most common transmission mode in which information bits are sent sequentially on a single data channel.
Session layer
Layer 5 of the OSI model, concerned with controlling the dialog (message exchange) between to logical entities.
Simplex transmissions
Data transmission in one direction only.
Slew rate
The maximum rate which an amplifier’s output can change, generally expressed in V/µs.
SNA
Systems Network Architecture (IBM).
Standing wave ratio
The ratio of the maximum to minimum voltage (or current) on a transmission line at least a quarter-wavelength long. (VSWR refers to voltage standing wave ratio.)
Star
A type of network topology in which there is a central node that performs the interconnection between the nodes attached to it.
STP
Shielded Twisted Pair.
Switched line
A communication link for which the physical path may vary with each usage, such as a dial-up connection on the public telephone network.
Synchronization
The co-ordination of the activities of several circuit elements (e.g. clocks).
Synchronous transmission
Transmission in which the transmitter and receiver clocks are synchronized. Synchronized transmission eliminates the need for start and stop bits, but requires a self-clocking encoding method such as Manchester, as well as one or more synchronizing bytes at the beginning of the message.
TCP
Transmission Control Protocol.
TDR
Time Domain Reflectometer. This testing device enables the user to determine cable quality with providing information and distance to cable defects, by measuring the time taken by a signal to and from reflective i.e. impedance mismatch points (such as connectors or a break) on the cable.
Telegram
In general, a data block transmitted on the network. Usually comprises address, information and check characters. Basically a synonym for a packet or frame.
Temperature rating
The maximum and minimum temperature at which an insulating material may be used in continuous operation without loss of its basic properties.
TIA
Telecommunications Industry Association.
Time sharing
A method of computer operation that allows several interactive terminals to use one computer.
Token Ring
Collision free, deterministic bus access method as per IEEE 802.5 ring topology.
Topology
Physical configuration of network nodes, e.g. bus, ring, star, tree.
Transceiver
A combination of transmitter and receiver.
Transient
An abrupt change in voltage of short duration.
Transmission line
One or more conductors used to convey electrical energy from one point to another.
Transport layer
Layer 4 of the OSI model, concerned with providing a network independent reliable message interchange service to the application oriented layers (layers 5 through 7).
Twisted pair
A data transmission medium, consisting of two insulated copper wires twisted together. This improves its immunity to interference from nearby electrical sources that may corrupt the transmitted signal.
Unbalanced circuit
A transmission line in which voltages on the two conductors are unequal with respect to ground e.g. a coaxial cable.
UTP
Unshielded Twisted Pair.
Velocity of propagation
The ratio of the speed of an electrical signal down a length of cable compared to speed in free space, expressed as a percentage.
VFD
Virtual Field Device. A software image of a field device describing the objects supplied by it e.g. measured data, events, status etc, which can be accessed by another network.
VHF
Very High Frequency.
Volatile memory
An electronic storage medium that loses all data when power is removed.
Voltage rating
The highest voltage that may be continuously applied to a wire in conformance with standards or specifications.
VSD
Variable Speed Drive.
VT
Virtual Terminal.
WAN
Wide Area Network.
Word
The standard number of bits that a processor or memory manipulates at one time i.e. 16, 32 or 64 bits. Alternatively it means 16 bits, as opposed to 8 bits (byte) and 32 bits (double word)
X.21
ITU standard governing the interface between DTE and DCE devices for synchronous operation on public data networks.
X.25
ITU standard governing interface between DTE and DCE device for terminals operating in packet mode on public data networks.
X.25 PAD
A device that permits communication between non-X.25 devices and the devices in an X.25 network.
X.3/X.28/X.29
A set of internationally agreed-to standard protocols defined to allow a character oriented device, such as a visual display terminal to be connected to a packet switched data network.
As discussed earlier, there are three levels of addressing in an Ethernet TCP/IP environment:
Each host is assumed to have multiple applications running concurrently. An identifier known as the registered port number identifies the client application on the initiating host, and the well-known port number specifies the server application on the target host. These port numbers are 16 bits long and are standardized according to their use.
Port numbers 0 to 1023 are assigned by ICANN as ‘well known’ numbers while port numbers 1024 to 49151 are issued as ‘registered numbers’ by ICANN. Numbers 49152 to 65535 are used for testing, software development, etc. A complete listing of assigned ports is contained in RFC 1700 but an abbreviated list is contained below.
IP is responsible for the delivery of packets (datagram) between hosts. It forwards (routes) and delivers datagrams on the basis of IP addresses attached to the datagrams. The IP Address is a 32-bit entity containing both the network address and the host address. IP address can be configured in two ways. They are static configuration and dynamic configuration.
Follow the procedure given below, to set up static IP address for the network shown in Figure 1.1.
Note: Exclude the IP addresses allocated to the routers, viz. ‘192.168.0.1’ and ‘202.168.0.1’.
Note: If this exercise is performed on corporate laptops, please confirm with the IT department regarding the changes that are to be made to Domain name. If this is a possibility of network conflicts, leave the ‘Domain Name’ and ‘NetBIOS’ names as they are, unless, provided help from a member of the IT department.
Note: To check the new IP configuration immediately after changing IP address details, remember to close the configuration dialog window, otherwise the IP settings will not be updated.
Dynamic Host Configuration Protocol (DHCP) allows a server to dynamically distribute IP addressing and configuration information to clients. Normally the DHCP server provides the client with at least this basic information of IP Address, Subnet mask, Default Gateway.
The Ping program is simple tool that allows verifying if a host is live or not. The Ping program in the source host sends a request packet to the target IP address; if the target is live, the Ping program in the target host responds by sending a reply packet back to the source host. Both of these Ping packets (echo request and echo response) are ICMP (Internet Control Message Protocol) packets.
ping command – Syntax:
Note: The HTTP GET message is carried inside a TCP segment, which is in turn carried inside an IP datagram, which is carried inside an Ethernet frame.
The instructor will install a web server (Xitami) on one of the hosts. There is no need to install a custom home page as the default home page is sufficient for this exercise. Alternatively connect the Netgear 526T switch to the network (do not replace a hub; just connect one of the switch ports to a hub port). Unfortunately the switch does not have an FTP server built in.
If process described above does not work, then set up IE6 to look for web pages on the LAN and not, for example, on a dialup connection.
Port Scanning can also be performed using ‘NMap’ (Zenmap – provides graphical user interface)
Note: Select the host from the ‘Target’ dropdown box to re-scan a host.
This exercise is to study TCP connection setup and investigate packet trace of the TCP file from client computer to a remote server using Wireshark. TCP Sequence Numbers and Acknowledgements are studied using Wireshark. Obtaining port numbers of the source and destination systems.
Note: This cannot be possible if the netgear switch used does not support FTP.
A packet capture tool, Wireshark, can be used to monitor the packets going to and from a specific device. To be able to capture those packets, the packet capture tool has to be sharing the network segment.
One simple device that is useful to monitor another device’s packet flow is a hub. Using shared media or hubs to connect the Ethernet nodes together, meaning all packets could be received by all nodes on that network. Subsequently, all these packets can be monitored from any port on that hub. This shared feature makes Hubs a security risk.
Note: Switches optimize traffic and do not broadcast the data to all the computers in the network.
If, for some reasons routers are not available, they may be replaced with two laptops set up as routers. The procedure for each laptop to act as a router is as follows:
The result is a simple two-port router. We are not running any dynamic routing protocols here, so we have to use static routing table entries (once again with reference to Figure 9.1).
For the ‘router’ on the left:
Route add 202.168.0.0 mask 255.255.255.0 200.200.200.1
For the ‘router’ on the right:
Route add 192.168.0.0. 255.255.255.0 200.200.200.1
These settings will disappear when the machine shuts down. To make them persistent across boots, you have to add the –p switch to the route commands shown above.
When troubleshooting a routed network at packet level, it is imperative to understand the actual (physical) movement of the packets, as opposed to the ‘logical’ sequence of events. In this exercise host ‘A’ (e.g. 192.168.0.2) will ping host ‘B’ (e.g. 202.168.0.2) repetitively (ping 202.168.0.2 –t), and capture the packets with Ethereal.
Now select an ICMP Echo Message, go to the middle of the screen, and check the MAC addresses (source and destination) against your drawing. Suddenly things don’t look so right, or do they?
The utility “wntipcfg” produces the same results. Click the ‘More Info’ tab and analyse details.
In this exercise host ‘A’ (e.g. 192.168.0.2) will ping host ‘B’ (e.g. 202.168.0.2) repetitively (ping 202.168.0.2 –t), and everybody will capture the packets with Ethereal.
When we look at the MAC addresses (Figure 9.2), things seem different. The sender’s MAC address is correct, but the recipient’s address seems wrong. It is, in fact, not the MAC address for 202.168.0.3, but that of the router (192.168.0.1) instead.
The explanation is simple. The packet travels from the sender to the first router, then from router to router, and ultimately to the ping recipient. Using protocol analyzers on the two subnets we can observe two parts of the journey: from sender to Router 2 and from Router 1 to recipient.
With suitable equipment it would also be possible to capture the packets traversing the cable between Router 1 and Router 2.
Subscribe to our newsletter for all the latest updates and receive free access to an online engineering short course.