Beaven

US005627766A

[11] Patent Number: 5,627,766
[45] Date of Patent: May 6, 1997

[54] PERFORMANCE AND STATUS MONITORING IN A COMPUTER NETWORK

[75] Inventor: Paul A. Beaven, Romsey, Great Britain

[73] Assignee: International Business Machines Corporation, Armonk, N.Y.

[21] Appl. No.: 368,075

[22] Filed: Jan. 3, 1995

[30] Foreign Application Priority Data
Feb. 8, 1994 [GB] United Kingdom ................... 9402380

[51] Int. Cl.: G06F 11/34
[52] U.S. Cl.: 364/551.01; 364/514 B; 364/550; 370/241; 395/200.11
[58] Field of Search: 364/514 B, 550, 551.01; 370/13, 17; 395/200.11

[56] References Cited

FOREIGN PATENT DOCUMENTS
0510822  10/1992  European Pat. Off.

OTHER PUBLICATIONS
IBM, "MQSeries", Message Queue Interface Technical Reference (SC33-0850-01), Third edition, Nov. 1994.

Primary Examiner: Edward R. Cosimano
Attorney, Agent, or Firm: John J. Timar

[57] ABSTRACT
`
Provided is a method and a system for computer network monitoring, implemented in a network in which processes communicate using message queuing. Each node of the network has a network management program installed thereon which includes two independent components: a Point Of Control (POC) program for initiating network tests by injecting a test message into the network and for receiving responses from all the nodes of the network; and a Network Test Program (NTP) for sending a reply message to the single POC for a particular test when the NTP receives test messages within that test, and for propagating the test by forwarding a message to all of the current node's adjacent nodes. Test results are analyzed at the POC for display to the network administrator. Injected test messages propagate throughout the network in a self-exploring manner, exploiting the parallelism of the network. The individual nodes are not required to know the network topology other than to know their nearest neighbor nodes.
`
`18 Claims, 3 Drawing Sheets
`
[Front-page figure: duplicate of the FIG. 4 flowchart (NTP processing of a received test message; reference numerals 300-380).]
`
`
`
[FIG. 1 (Sheet 1 of 3): schematic of message queuing communication: process A (30) issues Put Message commands; message delivery systems (queue managers) 70 route messages to the input queues 40, 50 and 60, from which processes B, C and D issue Get Message commands.]

[FIG. 2: message layout: message header (110) followed by application data (100).]
`
`
`
[FIG. 3 (Sheet 2 of 3): Point Of Control process flowchart (reference numerals 200-240): POC triggered; set global test identifier; set expiry time T for test; send message to first node test program; wait T seconds; inspect all reply messages for this test; analyse and display results.]
`
`
`
[FIG. 4 (Sheet 3 of 3): Network Test Program flowchart (reference numerals 300-380): NTP receives test message; record time of receipt; has test expired?; create a reply test message and send to POC; record global test identifier; construct propagation test message; send to each adjacent node; end local NTP.]
`
`
`
`1.
`PERFORMANCE AND STATUS
`MONITORNG IN A COMPUTER NETWORK
`FIELD OF INVENTION
`The present invention relates to computer networks and
`more particularly to monitoring the nodes and communica
`tions links of a computer network, for the determination of
`either performance or status information or to determine the
`topology of the network.
BACKGROUND

Increasingly, business enterprises require large, complex distributed networks to satisfy their communications and data processing requirements, and many are moving towards implementing large scale computing systems that integrate all the disparate components of the enterprise's computing resources. The efficiency and reliability of communications within these networks is becoming increasingly important to the overall efficiency of the computing resources. Network management facilities are provided to rectify communications problems, and also to recognize potential problems before they result in communications outages, unacceptable response times, or other impairments (i.e. problem recognition as well as problem resolution). Complex networks often require computer-based systems and network tools to monitor network equipment and facilities, as part of the provision of network management facilities. Concerns about communications performance and operating costs, and the effects on these variables of node and link failures and reductions in availability, have increased with device and network complexity and sophistication. Hence, the need for monitoring has increased, together with the need to enable network reconfiguration from a central location and the generation of alarms when predefined conditions occur.
A highly desirable attribute of network monitoring systems is that they provide the facilities to obtain information, from a single node in the network, about: the state (operational or failed) of any accessible link in the network; the performance of any such operational link (the time taken for inter-node transmissions to traverse that link); and possibly also a specified set of status parameters for each node in the network (in this context, a network node may be either a computer within a network or an application program entity running on the computer).
A monitoring facility is provided in TCP/IP (the Transmission Control Protocol/Internet Protocol suite of communications protocols), in the Internet Control Message Protocol (ICMP). ICMP provides error reporting, handling several types of error conditions and always reporting errors back to the original source of the message which led to the error being detected. Any computer using IP accepts ICMP error messages and will change behavior in response to reported errors. Communications links between specific nodes of the network are tested by a first network node (A) sending a timestamped "ICMP Echo Request" message to a second specified node (B). The receiving node (B) then generates a timestamped "ICMP Echo Reply" message (reversing the request datagram's source and destination addresses) and transmits it to node A. On receipt of the reply, node A timestamps the received reply. The time taken to traverse the links between the nodes in each direction (the performance of the communication links) can then be calculated. This test facility, known as "pinging" between the nodes, is limited to testing end-to-end performance (from node A to target node B, and vice versa).
U.S. Pat. No. 5,095,444 describes a system for measuring application message transmission delays in a communications network, providing measurement of delays in the transmission on the various inter-node links of a predetermined communications route between a source node and a destination node. The source node requests (across the communications route) a response from the destination node and a monitor program determines the issue time of the request. The source node then receives the response and the monitor program determines the time of receipt. A transmission delay between the source node and the destination node is determined by calculating a difference between the issue time and the response time. An intermediate transmission delay between any two adjacent intermediate nodes, or between an intermediate node and an adjacent destination node, is determined by calculating a difference between the transmission delay between the source node and one of the adjacent nodes and the transmission delay between the source node and the other of the adjacent nodes. The source node is required to specify the route between it and the destination node. The system described does not make provision for a changing topology.
EP-A-0510822 describes a system for monitoring node and link status in a distributed network, in which the network monitoring function is distributed among each of the nodes of the network. A first node, designated as a dispatching node, dispatches a status table to another node which is on-line. Selected status information about the receiving node is written into the circulating status table (CST) and selected information about the other nodes is read, and the CST is then forwarded to another on-line node. The CST thus circulates around the network according to an adaptive routing sequence, forming a master record of status information which both accumulates and disseminates information about the various on-line nodes of the network. When it has circulated to each on-line node, the CST returns to the dispatching node.
SUMMARY OF INVENTION

The present invention provides a method and a system for monitoring the performance and status of links and/or nodes of a communications network from a single point of control (POC) node, by propagating a test message between the nodes of the network. The method comprises the following steps:

a first process injects into the network a test message requiring specific information by sending the test message to a node test program entity (NTP) on one of the network nodes, the test message including a designation of the POC node for the test;

automatically in response to receipt of the test message, the receiving NTP sends to the POC a reply message including information from the receiving node, and forwards a test message to an NTP on each of said receiving node's own adjacent connected nodes;

each subsequent receiving NTP, automatically in response to receipt of the forwarded test message, also sends to the POC a reply message including information from said subsequent receiving NTP's node, and forwards a test message to an NTP on each of its own adjacent connected nodes.

When the point of control node has received the replies, it can perform an analysis of the received information, compute the performance of any live link (or all live links) or determine the current topology of the network, and possibly display the results and take control actions to modify the network.
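The two operations performed by each receiving NTP can be summarized in a short sketch (Python, purely illustrative; function and field names such as send_message, test_id and poc_node are assumptions, not the patent's implementation):

    # Illustrative sketch of the method steps above; all names are
    # assumptions rather than part of the patent.
    MY_NODE = "nodeA"                    # this node's name
    ADJACENT_NODES = ["nodeB", "nodeC"]  # nearest neighbors only

    def collect_local_status():
        return {"node": MY_NODE}         # placeholder status information

    def ntp_on_test_message(msg, send_message):
        """On receipt of a test message: reply to the POC, then forward."""
        # The reply carries the requested information back to the POC node.
        send_message(msg["poc_node"], {"test_id": msg["test_id"],
                                       "reply_from": MY_NODE,
                                       "status": collect_local_status()})
        # Forward the test to each adjacent node; each receiving NTP
        # repeats these two operations, so the test explores the network
        # without any node holding global topology knowledge.
        for neighbor in ADJACENT_NODES:
            send_message(neighbor, msg)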
In a second aspect, the present invention provides a system for monitoring the performance and status of links and/or nodes of a communications network from a first Point Of Control (POC) node, by propagating a test message between the nodes of the network, the system comprising:

a process for initiating a test by sending to a first Node Test Program entity (NTP) on one of the network nodes a test message requiring specific information, the test message including a designation of the POC node for the test;

an NTP at the POC node and at every other node of the network, each of which nodes can be a current node for test activity, each NTP including means for receiving the test message and means for performing the following two operations, automatically in response to the received test message: sending to the POC a reply message including information from the current node; and forwarding a test message to an NTP on each of the current node's adjacent nodes;

wherein the POC node has means for receiving the reply messages.
The reply messages received by the POC can then be analyzed to determine the performance of links and nodes and their status (perhaps determining whether links are operative or inoperative, and analyzing and displaying results of whatever other node status information was included in the reply messages).
A major advantage of the invention is that a node of the network is not required to know the topology of the network, other than to know its nearest neighbor adjacent nodes, in order to take part in the test. Knowledge of how to address adjacent nodes is a general requirement for message communication rather than a requirement which is specific to the monitoring mechanism of the invention. The test message is similarly not required by the invention to include lists of all the links to be tested, or routing advice for the test. This not only reduces the amount of information which must be included in (i.e. the size of) the test messages but also means that test messages need not be modified if the network is reconfigured. In comparison, a system which relies on a centralized database to provide status and performance information has the considerable overhead of maintaining that database, e.g. updating it to respond to dynamic changes to the network topology.
Thus, the test message which is initially created preferably does not include propagation routing information, except that it includes some designation of the node (or a program thereon) to which it is initially to be sent. This may comprise a queue name or the network address of the node. Each node preferably has address tables of its local nearest neighbors, which tables are available for the onward transmission of test messages and of other messages. It is, however, preferred that each node knows not only how to reach its nearest neighbors but also how to transmit reply messages to the point of control, both of these information items being part of the setup information specified when the node is configured as part of the network. That knowledge of how to reply to the POC is not essential, as will be explained later.
It is preferred that each one of the injected test messages, each forwarded test message and each reply message is timestamped when sent by the sender, and that each received message is timestamped on receipt by the receiver. These timestamps, which designate the beginning and the end times of the traversal of each link, are returned to the point of control as part of the reply message so that an analysis process associated with the point of control node can calculate the time taken to traverse each link, i.e. the performance of that link.
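As a sketch of that calculation (field names are illustrative assumptions, and comparable clocks on sender and receiver are assumed):

    # Illustrative only: one-way link traversal time from the send and
    # receive timestamps carried back to the POC in a reply message.
    def link_traversal_seconds(reply):
        return reply["received_at"] - reply["sent_at"]

    reply = {"sent_at": 100.00, "received_at": 100.25}
    print(link_traversal_seconds(reply))   # 0.25 s to traverse this link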
`
`55
`
`35
`
`40
`
`50
`
`65
`
`5,627,766
`
`10
`
`15
`
`20
`
`25
`
`4
Another major advantage of the invention is that individual links that are far removed from the point of control may be tested by the injection of a single message: all nodes reply to the point of control, so the point of control accumulates information about the whole connected network. The initially injected message will preferably be sent by a POC process on the POC node to an NTP on either the POC node or one of its adjacent neighbors. The technique of the invention enables simultaneous monitoring of multiple connections between two nodes (automatically detecting all alternate paths) where multiple connections exist, and enables multiple tests initiated at different nodes to be running simultaneously. The running of a test does not prevent normal communication proceeding simultaneously but merely produces a message flow which adds to the normal network traffic, the monitoring method using specialized test messages which are transferred between nodes in the same way as the normal data transmission messages.
In a preferred implementation of the present invention in a network using message queuing communication between programs (as described later), specific test message queues are used to separate network monitoring test messages from other network traffic to prevent resource contention. In that embodiment, a test message is not addressed to an NTP directly, but to a message queue which is serviced by the NTP. Generally, the test messages are sent to a network node on which some mechanism is provided for ensuring that test messages are received by the NTP.
According to the present invention, the test activity propagates throughout the network in a self-exploring manner, exploiting the parallelism of the network, potentially until all live nodes (i.e. nodes which are not failed) which have live direct or indirect communication links to the point of control have been visited by messages generated within the test and have forwarded reply messages to the point of control. Thus, a single action of injecting a test message into the network results in network-wide information being accumulated at the point of control, unless parts of the network are inaccessible due to failed links or nodes. This is a distinction over certain prior art systems which monitor only the existing network traffic rather than using specialized test messages, as such prior art systems are unable to test links and nodes which are not addressed by normal data transmission messages during the monitoring period. The reply messages according to a preferred embodiment of the present invention, as well as carrying performance information, can equally carry any other status information requested by the point of control when it instigated the test.
The desirability of testing the whole network from a single POC node may depend on the size of the network: if the network is very large and complex, then the additional network traffic generated by a test of the whole network will be undesirable unless information about the whole network is actually needed. Thus, it is preferred to define a specific domain within the network which is to be tested from the POC when limitation of the test is desired. Alternatively, a test may be limited by specifying a maximum number of node hops from the POC beyond which the test is not propagated.
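For illustration, such a hop-count limit could be enforced as in the following sketch (the max_hops field is an assumption, not part of the patent):

    # Sketch of a hop-count limit: the POC sets max_hops in the injected
    # test message; each NTP decrements it and stops forwarding at zero.
    def forward_copies(msg, adjacent_nodes, send_message):
        if msg["max_hops"] <= 0:
            return                        # test is not propagated further
        fwd = dict(msg)
        fwd["max_hops"] = msg["max_hops"] - 1
        for neighbor in adjacent_nodes:
            send_message(neighbor, fwd)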
Preferably, each node of the network has installed thereon at network configuration (e.g. when that node is added to the network) a computer program for monitoring, which is similar for all nodes. This program includes an NTP component and a POC component. Then any node may act as a point of control for monitoring the network.
`
BRIEF DESCRIPTION OF DRAWINGS

Embodiments of the present invention will now be described in more detail, by way of example, with reference to the accompanying drawings, in which:

FIG. 1 is a schematic representation of message queuing communication between processes in a simple distributed computer network;

FIG. 2 is a representation of the data fields of an example data transmission message;

FIG. 3 is a flow diagram showing the steps taken by a Point Of Control process in the execution of a network monitoring method according to an embodiment of the present invention; and

FIG. 4 is a flow diagram showing the steps taken by a Network Test Program on receipt of a network monitoring test message, according to an embodiment of the network monitoring method of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The problem of how to obtain performance and status information about the nodes and interconnecting links of a computer network from a single node of the network occurs in messaging networks, where participating nodes communicate by the asynchronous passing of messages between adjacent nodes in the network. The facility in such systems for inter-process data exchange is often based on management of message queues. The message model is simple: a sending process enqueues messages by putting to a queue (the sending process issues a "Put Message"-type command); and then a receiving process dequeues (issuing a "Get Message"-type command to take the message from the queue, either for processing or for transferring it onwards towards the destination). The enqueue and dequeue operations are performed asynchronously, dequeue occurring when the receiving process chooses rather than at a time dictated by the sender. Queue managers deliver each message to the proper queue associated with the destination process; it is the network of interconnected queue managers that is responsible for moving messages to the intended queues.
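As a minimal sketch of that put/get model (an in-process Python queue stands in for a named queue maintained by a queue manager; real message queuing middleware is networked and persistent):

    # Illustrative only: the asynchronous enqueue/dequeue pattern above,
    # with an in-process queue standing in for a managed message queue.
    import queue

    inbox = queue.Queue()                # stands in for a named queue

    inbox.put({"body": "hello"})         # sender issues a Put Message
    msg = inbox.get()                    # receiver dequeues when it chooses
    print(msg["body"])                   # -> hello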
Message queuing is thus a method of inter-program communication which allows programs to send and receive application-specific data without having a direct connection between them, and without necessarily being operational simultaneously. Application programs can run independently of each other, at different speeds and times. The application programs communicate by agreeing to use particular named message queues, sending messages to the specific queues that the target programs have agreed to read from. The locations of the queues are not generally apparent to the applications which send the messages; each application interacts only with its local queue manager. Applications are thereby shielded from network complexities and variations by the queue managers. All of the work involved in maintaining message queues, in maintaining the relationships between messages and queues, in handling network failures and restarts, and in moving messages around the network can be handled by the queue managers. Since cross-network communication sessions are established between queue managers rather than between individual programs, programs are less vulnerable to network failures than in other types of communication.
A message queue is thus a named object in which messages accumulate and from which they are later removed, which is maintained by a particular queue manager. The physical representation of a queue depends on the environment (e.g. it may be a buffer in main storage or a file on disk). A message queue is a storage means whereby messages are normally added and removed in FIFO order, but facilities also exist allowing messages to be read in other than the order in which they arrive on the queue.
Such message queuing communication is further described in IBM Document Number GC33-0805-00, "IBM Messaging and Queuing Series: An Introduction to Messaging and Queuing", and is implemented in the IBM Messaging and Queuing Series products (see the "IBM Messaging and Queuing Series Technical Reference", IBM Document Number SC33-0850-01) such as the IBM Message Queue Manager MVS/ESA.
A schematic representation of message queuing communication is shown in FIG. 1, within a computing network which comprises a set of nodes 10 and interconnecting links 20. There may be a plurality of different physical communication paths between nodes of the network. In this context the "nodes" are the computers of a network, but unless otherwise required by the context the word "node" is intended in the following description to apply equally to program entities running on those computers. The present invention can monitor the status and performance of interconnected computers or of individual communicating processes (e.g. application programs) within the same computer system or between different systems (in the latter case providing an application-connectivity test tool).
A first process A (30) transmits messages to other processes B, C, D by putting messages to message queues (40, 50, 60) from which the target processes have agreed to receive messages. Process A simply issues a Put Message command, specifying the name of the destination queue (and possibly the name of a particular queue manager which is responsible for managing that queue). A message delivery system (queue manager) 70 on the computer system on which process A is located is responsible for determining where to route incoming messages and where the messages from A should be sent, for example to the incoming queue 40 of another process B on the same computer or across the network to a queue (50 or 60) of a remote process (C or D). In actuality, a message sent to a remote node is transferred in multiple hops between queues on adjacent nodes en route to the destination node, but this is invisible to the origin and destination processes. Each remote process (C or D) uses a message delivery system (queue manager) which is responsible for putting all incoming messages to the appropriate queue for that process. A queue manager may be responsible for many queues. The processes C and D take the messages from their input queues 50 and 60 when they are ready.
The present invention is particularly applicable to networks using message queuing communication, enabling the monitoring of message queue parameters such as the number of messages on (i.e. the size of) queues, the time taken to service a queue, and other data on the throughflow of messages. Such information is very valuable for systems management, for example for enabling the determination of whether load-balancing network reconfiguration is required (e.g. increasing or decreasing the number of application program instances servicing a particular queue).
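A sketch of the kind of metric meant here (names are illustrative; a real queue manager would expose queue depth and service times directly):

    # Illustrative queue metrics: depth (messages currently on the queue)
    # and service time (how long a message waited before being dequeued).
    def queue_depth(q):
        return q.qsize()                  # number of messages on the queue

    def service_time(enqueued_at, dequeued_at):
        return dequeued_at - enqueued_at  # time the message sat on the queue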
One aspect of the invention which is particularly relevant to network management in a message queuing system is the ability of the monitoring method of the present invention to provide performance information for links between nodes in both directions, which may be different from each other, since the throughflow of messages across a link depends on the nature and number of application instances servicing (receiving messages from) the queue and the number of applications which are putting messages to the queue, rather than just the limitations of the communications protocol and the capacity of the underlying hardware links.
As is known in the prior art, a message consists of two components: an application data component 100 and a data structure called the message header 110, containing control information, as shown in FIG. 2. The application data in a message is defined and supplied by the application which sends the message. There are no constraints on the nature of the data in the message (for example, it could consist of one or more bit strings, character strings, or binary integers; if the message is a data transmission message relating to a financial transaction, the application data items within the message may include, for example, a four-byte unsigned binary integer containing an account number and a twenty-byte character string containing a customer name).
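For illustration, the two-part layout of FIG. 2 might be modelled as follows (the field names and example contents are assumptions, following the financial-transaction example in the text):

    # Illustrative model of FIG. 2: control information in a header (110)
    # plus application-defined data (100), which the transport treats as
    # uninterpreted bytes.
    from dataclasses import dataclass

    @dataclass
    class Message:
        header: dict                     # ancillary control information
        data: bytes                      # application data, uninterpreted

    account = (12345).to_bytes(4, "big") # four-byte unsigned integer
    name = b"J. Smith".ljust(20)         # twenty-byte character string
    m = Message(header={"destination": "some.queue"}, data=account + name)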
In addition to the application data, a message has associated with it some ancillary data. This is information that specifies the properties of the message, and is used on receipt of the message to decide how the message should be processed. The ancillary control information is contained in the message header 110. Some of this information must be specified by the sending application program. With this message structure, routing information may be included in the message header so that it is possible to determine from the message header whether the message is destined for the local node (e.g. for an application program running on a computer which receives the message) or for an adjacent or remote node of the network, whereby to route the message. In principle, there is no specific distinction between the information which can be contained in the data portion and that which can be contained in the header portion: information on the origin, destination, time of sending, etc., may be in either portion.
Where the routing information is included in the message's header portion, messages are generally transferred between intermediate nodes en route to a destination node without the application data being analyzed (or possibly even reformatted) by the intermediate nodes. If different operating systems are running on the various computers of the network, then communication between computers requires, however, reconfiguration of messages according to conventions of a network protocol stack existing between them.
According to the present invention, a test message is a specific type of message which is distinguished from application data messages in that it includes in its message header either: the name or other designation of a queue which is specific to test messages on the target node, so that the receiving queue manager can route the test message appropriately; or a flag by which the receiving node can identify it as a test message without reading the message's data content. The former is the preferred implementation. Three different types of "test messages" may be identified which are associated with the monitoring function of the present invention. These are described below.
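The two alternatives above could be tested as in this sketch (both field names are assumptions):

    # Illustrative check for the two alternatives: a test-specific
    # destination queue name, or a header flag marking a test message.
    def is_test_message(header):
        return (header.get("destination_queue", "").endswith("Test_Queue")
                or header.get("test_flag", False))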
`Each node of the network has a network management
`program, which can be identical for each node and which
`provides the network monitoring facility, installed thereon.
`The network management program provides the means for
`propagating a test message throughout the network from a
`single point of control, in a self-exploring manner, so that
`the test reaches all of the connected nodes of the network
`and each node replies to the point of control.
`
`65
`
`25
`
`30
`
`40
`
`45
`
`50
`
`55
`
`5,627,766
`
`O
`
`15
`
`20
`
`8
Each node of the network has means for maintaining a list of its adjacent nodes, in a table which contains the names and addresses of the computers (and possibly also the processes thereon or their associated queue names) which have direct connections to the node in question. For a given node, the local network manager program has access to the list of adjacent nodes, as a source of information on how to communicate with these adjacent nodes. This list is generated when the system is configured, or more precisely when the particular node is configured as part of the system, and is updated when a new adjacent node is added to the system. Reconfiguration can be done dynamically without taking the relevant nodes out of service. On addition of a new node to the network, that node may broadcast its configuration information to its connected nodes.
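Such a table might look like the following sketch (all entries are made up for illustration):

    # Illustrative adjacency table: names and addresses of the directly
    # connected nodes, generated at configuration time and updated when
    # a new neighbor is added.
    ADJACENT = {
        "nodeB": {"address": "10.0.0.2"},
        "nodeC": {"address": "10.0.0.3"},
    }

    def add_neighbor(name, address):
        ADJACENT[name] = {"address": address}   # dynamic reconfiguration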
A simple naming convention is adopted, in that the input queue for a node's system management program is: Node_Name.Test_Queue. The test program of the invention then puts messages to "Node_Name.Test_Queue" for each of its neighbor nodes in order to send messages to them. This is sufficient to reach the desired destination neighbors since they are known to the current node.
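As a one-line helper (the exact spelling of the queue-name suffix is an assumption reconstructed from the convention above):

    # Build the test-queue name for a neighbor under the naming convention.
    def test_queue_for(node_name):
        return node_name + ".Test_Queue"   # e.g. "nodeB.Test_Queue"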
The network manager program also has the ability to store a global test identifier for the tests for which it receives messages. The network manager program thereby forms a record of test identifiers that can be referred to on receipt of a further test message, to determine whether the node has already participated in the particular test of which that new message is a part. The importance of the global test identifier is explained below. Because all nodes have installed the same (or an equivalent) network management program, any node can be the point of control for a particular test instance, i.e. the node to which test information is returned, and in the preferred embodiment of the invention also the node from which the test is initiated.
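A sketch of that record of test identifiers (illustrative; the text only requires that a node can tell whether it has already taken part in a given test):

    # Illustrative record of global test identifiers already seen by this
    # node, used to decide whether to take part in a newly received test.
    SEEN_TEST_IDS = set()

    def already_participated(test_id):
        if test_id in SEEN_TEST_IDS:
            return True                    # node already joined this test
        SEEN_TEST_IDS.add(test_id)
        return False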
The logical steps of the network monitoring function of the network management program for each node are represented in FIGS. 3 and 4. The network management program has two effectively independent components or subroutines which provide the monitoring facility and which can send messages to each other: the first component, referred to as the POC process, provides the means for the node to operate as a point of control (POC) node, initiating a test and receiving responses to the test; and the second, referred to as the network test program (NTP), provides the means at each node to