Hypercube Network Simulation --------- ------- ---------- Preface ------- Hypercube networks have been used to build high performance parallel computers. They are closely related to other kinds of networks, Butterfly, Benes, and Fat Tree, that have also been so used. In all such networks, nodes communicate through links between the nodes. A node may be a computer with its own memory. A link is just a wire that connects one node to another and over which messages can be sent. One problem that parallel computer networks must solve is permuting data among the nodes, e.g. among the computers. This means that each node has some data that belongs in some other destination node, and each node is the destination of only one such piece of data. There are permutation patterns that can cause many messages to pass through a single node, causing a bottleneck, otherwise known as network congestion. It is an interesting fact that if the permutation is random, then for all practical purposes congestion cannot happen. What do we mean by this? We mean that the probability of congestion happening is so small that it can be expected to happen only one in a zillion times the age of the universe. So for engineering purposes, it never happens. This leads to the clever idea of decomposing a permutation that might suffer from congestion into two random permutations. Each node picks another node at random, and sends the data first to that random node, and then the random node forwards the data to its true destination. For engineering purposes, the network will never be congested, even if the permutation would have congested the network were data sent directly to its destination. Problem ------- To investigate these ideas future you have been asked to build a simulation of a hypercube network. Here we will give a precise description of this network. The network contains N nodes, where N is a power of two, so for some B, N = 2**B (2 to the B'th power). Each node has a unique B bit unsigned integer address. Each node is attached to B links, and these links are numbered 0, 1, ..., B-1. The bits of a node address are also numbered 0, 1, ..., B-1, with the lowest order (units) bit being number 0, and the highest order bit being number B-1. Link j of node i is the same as link j of node k = i ^ ( 1 << j), in the notation of the C programming language, and connects nodes i and k. The node address k is just the node address i with the j'th bit complemented; and conversely, i is just k with the j'th bit complemented. In the C programming language ^ denotes `exclusive OR' and `1 << j' = 2**j as `<<' denotes left shift. Sending a message through link j complements the j'th bit of the address of the node the message was at. Thus one path through a 4 node hypercube would be: 3 -------> 2 -------> 0 link 0 link 1 where the 3 --> 2 part of the path is accomplished by sending the message through link 0 of node 3, and the 2 --> 0 part by sending through link 1 of node 2. Writing the node numbers in binary, it is a bit easier to see what is going on, and the above path would be: 11 -------> 10 -------> 00 link 0 link 1 Messages in the hypercube network (at least in the one we are simulating) are always sent so the link numbers of the message path increase as time increases. Or in other words, the lowest order wrong bit in the current message location is fixed first. Thus if we use binary addresses we have the path: 010011 -------> 010010 -------> 010110 -------> 110110 link 0 link 2 link 5 Each link can send messages in either direction. The network operates in time cycles. During each cycle, a message, but at most one, can be sent in a particular direction on each link. Each end of each link has a send queue of messages waiting to be sent on the link, and each end also has a receive buffer for a single message, the last message received on the link. The network is initialized by each node's being given exactly one message to be sent to some node, possibly to itself. This is because we are simulating the permuta- tion problem described above. If the message is to be sent to another node, the message is put on one of the sender's link send queues; otherwise the message has already arrived at its destination and is discarded. Then during each cycle the simulator performs the following two steps in order. 1) Removes from each non-empty send queue in each network node the queue's first message, and copies the message to the receive buffer on the other end of the queue's link. 2) For each node i, looking at the receive buffers of the links attached to node i in order of their link number (0, 1, ...), removes from each non-empty receive buffer its message, and if the message does not have node i as its final destination, puts the message on the end of the send queue of the link that will correct the low- est order address bit that needs to be corrected in the message. I.e., if d is the final destina- tion, i^d is the exclusive OR of i and d, and b is the bit number of the lowest order on bit in i^d, puts the message on the end of the send queue for link b. If the message has node i as its final destination, the message is discarded. A network simulation run is done when all send queues are empty after a cycle (or before the first cycle). Input ----- The input consists of input for any number of runs. For each run, the input consists of a command letter, the number of bits in a node address (B above), and N destinations for the messages sent by each of the N = 2**B nodes. The destinations are listed in order of the node number of the sending node. Thus the input r 3 1 0 5 4 3 7 2 6 has the command letter `r', B = 3, N = 2**3 = 8, and sends messages from node 0 to node 1 from node 1 to node 0 from node 2 to node 5 from node 3 to node 4 from node 4 to node 3 from node 5 to node 7 from node 6 to node 2 from node 7 to node 6 For each run input, a simulation is run, and the output is controlled by the command letter. The command letter and integers in the input are separa- ted by whitespace, which may consist of any combination of single space and newline characters. Newlines may appear anywhere between command letters and integers. B must be at least 1 and at most 10. The input ends with an end of file. Output ------ The output for the possible command letters is r At the end of the run, the following line is printed: RUN #: # cycles, # sends, # max queue length. Here each # denotes an unsigned integer. Runs are numbered 1, 2, 3, .... The total number of messages sent across all links is printed. The maximum length of any send queue at the end of any cycle is printed. q Just like `r' but also prints the following just before the first cycle and just after each cycle. First the line: RUN # CYCLE # QUEUE LENGTH is printed. Cycles are numbered 1, 2, ...; and the cycle number 0 is printed for the printout that occurs before the first cycle. Then for each node a line is printed consisting of B integers each in 4 columns. The first line is for node 0, the last for node N-1. The first integer in a line is for link 0, the last for link B-1. The integer for link j and node i is the length of the send queue for link j in node i. Example Input ------- ----- q 2 0 2 1 3 q 2 0 1 2 3 r 3 1 0 3 2 5 4 7 6 r 3 7 6 5 4 3 2 1 0 r 3 0 4 2 6 1 5 3 7 r 4 0 8 4 12 2 10 6 14 1 9 5 13 3 11 7 15 Example Output ------- ------ RUN 1 CYCLE 0 QUEUE LENGTHS: 0 0 1 0 1 0 0 0 RUN 1 CYCLE 1 QUEUE LENGTHS: 0 1 0 0 0 0 0 1 RUN 1 CYCLE 2 QUEUE LENGTHS: 0 0 0 0 0 0 0 0 RUN 1: 2 cycles, 4 sends, 1 max queue length. RUN 2 CYCLE 0 QUEUE LENGTHS: 0 0 0 0 0 0 0 0 RUN 2: 0 cycles, 0 sends, 0 max queue length. RUN 3: 1 cycles, 8 sends, 1 max queue length. RUN 4: 3 cycles, 24 sends, 1 max queue length. RUN 5: 2 cycles, 8 sends, 1 max queue length. RUN 6: 4 cycles, 32 sends, 1 max queue length. File: hypercube.txt Author: Bob Walton Date: Sat Aug 31 11:55:40 EDT 2002 The authors have placed this file in the public domain; they make no warranty and accept no liability for this file. RCS Info (may not be true date or author): $Author: hc3 $ $Date: 2002/08/31 15:59:21 $ $RCSfile: hypercube.txt,v $ $Revision: 1.11 $