[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Reliability analysis and fault tolerance for hypercube multi-computer networks

Published: 20 August 2014 Publication History

Abstract

A multi-computer system (MCS) offers the high speed and throughput needed in solving computing-intensive problems. Two main components constitute a MCS. These are the processing elements (PEs) and the interconnection network (IN). A faulty IN can lead to data losses and/or throughput degradation. Hence, it is necessary to consider the fault tolerance and reliability aspects in assessing the efficacy of INs. This paper provides coverage of the fault tolerance and reliability aspects of hypercube multi-computer networks (HCNs). In particular, we cover three broad aspects: task-based reliability, fault-tolerant routing, and communication in faulty HCNs. Our coverage includes introducing the particular HC architecture, analyze its reliability and assess its fault tolerance. The analysis provided in the paper is deemed helpful to HCN designers in making informed decisions about the appropriate approaches that can be used to assess the reliability and fault tolerance of existing HCNs.

References

[1]
Sequent Computer Systems, Inc., Balance Technology Summary, Beaverton OR 97006-6063, 1984.
[2]
Intel Scientific Computers, IPSC System Overview, Beaverton, Oregon, 1986.
[3]
Intel ipsc/2, Intel Scientific Computers, 1988.
[4]
Ncube Corporation, Ncube 2 Processor Manual, December 1990.
[5]
M.H. Abd-El-Barr, H. Benten, M.A. Hai, Subcube reliability of a modular fault-tolerant hypercube architecture, Kuwait Journal of Science and Engineering, 2 (1996) 7-25.
[6]
M.H. Abd-El-Barr, M. Nadeem, K. Al-Tawil, A heuristic-based routing algorithm for hypercube multicomputer networks, Cluster Computing Journal, 4 (2001) 253-262.
[7]
S. Abraham, K. Padmanabhan, Reliability of the hypercube, in: International Conference on Parallel Processing, October 1988, pp. 7-19.
[8]
J. Al-Sadi, O.-K.M. Day K, Unsafety vectors: a new fault-tolerant routing for k-ary n-cubes, Microprocessors and Microsystems, 25 (2001) 239-246.
[9]
J. Al-Sadi, K. Day, M. Ould-Khaoua, Probability-based fault-tolerant routing in hypercubes, The Computer Journal, 44 (2001) 368-373.
[10]
J. Al-Sadi, K. Day, M. Ould-Khaoua, Unsafety vectors: a new fault-tolerant routing for the binary n-cube, Journal of Systems Architecture, 47 (2002) 783-793.
[11]
N. Allahverdi, A fault-tolerant routing algorithm based on cube algebra for hypercube systems, Journal of Systems Architecture, 46 (2000) 201-205.
[12]
N. Allahverdi, S. Kahramanli, K. Erciyes, A fault-tolerant routing algorithm based on cube algebra for hypercube systems, Journal of Systems Architecture, 46 (2000) 201-205.
[13]
G. Angransson, R. Greenlaw, Graph Theory: Modeling, Applications, and Algorithms, Prentice-Hall, 2006.
[14]
S. Balakrishnan, F. Ozguner, B. Izadi, Fault tolerance in hypercubes <http://www.engr.newpaltz.edu/bai/Research/nato.pdf>.
[15]
P. Banerjee, M. Peercy, Design and evaluation of hardware strategies of reconfiguring hypercube and meshes under faults, IEEE Transactions on Computers, 43 (1994) 841-848.
[16]
J. Brian, R. Vijayagopal, S. Latifi, Testing the reliability of a hypercube and folded hypercube in the presence of random and dynamic faults, in: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'03), vol. 3, Las Vegas, Nevada, 2003, pp. 1267-1271.
[17]
J. Brian, R. Vijayagopal, S. Latifi, Testing the reliability of a hypercube and folded hypercube in the presence of random dynamic faults, in: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA03), 2003, pp. 1267-1271.
[18]
H.-W. Chang, C.-W. Yu, A novel optimal routing and a fault- tolerant routing for the incrementally extensible hypercube network, WSEAS Transactions on Systems, 4 (2005) 27-31.
[19]
H.-W. Chang, C.-W. Yu, A novel optimal routing and a fault-tolerant routing for incrementally extensible hypercube network, WSEAS Transactions on Systems, 4 (2005) 27-31.
[20]
S. Chau, A. Liestman, A proposal for a fault-tolerant binary hypercube architecture, in: Proceedings International Symposium Fault-Tolerant Computing, June 1989, pp. 323-330.
[21]
J. Chen, P. Gillard, C. Li, Performance evaluation of three network-on-chip (NoC) architectures (invited), in: 1st IEEE International Conference on Communications in China (ICCC), 2012, pp. 91-96.
[22]
J. Chen, I. Kanj, G. Wang, Hypercube network fault tolerance: a probabilistic approach, in: Proceedings of the IEEE International Conference on Parallel Processing (ICPP02), 2002, pp. 65-72.
[23]
J. Chen, L. Kanj, G. Wang, Hypercube network fault tolerance: a probabilistic approach, Journal of Interconnection Networks, 6 (2005) 17-34.
[24]
M.-S. Chen, K. Shin, Depth-first approach for fault-tolerant routing in hypercube multi-computers, IEEE Transaction on Parallel and Distributed Systems, 1 (1990) 152-159.
[25]
V. Chirivlla, R. Alcover, J. Duato, Accurate reliability and availability models for direct interconnection networks, in: Proceedings International Conference on Parallel Processing, September 2001, pp. 517-524.
[26]
M. Coppola, B. Falsafi, J. Goodacre, G. Kornaros, From embedded multi-core SoCs to scale-out processors, in: Design, Automation &amp; Test in Europe Conference &amp; Exhibition (DATE), 2013, pp. 947-951.
[27]
S. Dandamudi, D. Eager, Hierarchical interconnection networks for multi-computer systems, IEEE Transactions on Computers, C-38 (1990) 786-797.
[28]
M. Dasgupta, S. Choudhury, N. Chaki, A secure hypercube based team multicast routing protocols (S-HTMRP), in: Proceedings of the 2009 IEEE International Advance Computing Conference (IACC 2009), March 2009, pp. 1265-1269.
[29]
H. El-Miligi, M.W. El-Kharashi, F. Gebali, Reliability-aware design methodology for networks-on-chip applications, in: IEEE International Conference on Design and Technology of Integrated Systems in Nanoscale Era (DTIS09), Cairo, Egypt, April 2009, pp. 107-112.
[30]
L.-J. Fan, C.-B. Yang, S.-H. Shiau, Routing algorithms on the bus-based hypercube network, IEEE Transactions on Parallel and Distributed Systems, 16 (2005) 335-348.
[31]
C. Francalanci, P. Giacomazzi, A high performance deadlock-free multicast routing algorithm for k-ary n-cubes, IEEE Transactions on Computers, 59 (2010) 174-187.
[32]
F. Gao, Z.-C. Li, Y.-H. Min, J. Wu, A fault-tolerant routing strategy based on extended safety vectors in hypercube multicomputers, Chinese Journal of Computers, 23 (2000) 248-254.
[33]
F. Gebali, Analysis of Computer and Communication Networks, Springer, New York, 2008.
[34]
F. Gebali, H. Elmiligi, M. El-Kharashi, Networks on Chips: Theory and Practice, CRC Press, Boca Raton, Florida, 2008.
[35]
M. Gomez, N. Nordbotten, J. Flich, P. Lopez, A. Robles, J. Duato, T. Skeie, O. Lysne, A routing methodology for achieving fault tolerance in direct networks, IEEE Transactions on Computers, 55 (2006).
[36]
P. Gregor, Recursive fault tolerance of fibonacci cube in hypercubes, Discrete Mathematics, 306 (2006) 1327-1341.
[37]
S. Gunes, N. Yilmaz, N. Allahverdi, A multicast routing algorithm based on parallel branching method for faulty hypercubes, Journal of Electrical and Electronics, 2 (2002) 477-481.
[38]
S. Gunes, N. Yilmaz, N. Allahverdi, A multicast routing algorithm based on parallel branching method for faulty hypercubes, Journal of Electrical &amp; Electronics, 2 (2002) 477-481.
[39]
S. Gunes, N. Yilmaz, E. Yaldiz, Fault-tolerant unicast routing algorithm based on parallel branching method for faulty hypercube, in: Proceedings of the 8th IEEE International Conference on Electronics, Circuits, and Systems, vol. 1, 2001, pp. 103-106.
[40]
B. Izadi, F. Ozguner, Real-time fault-tolerant hypercube multicomputer, IEE Proceedings of the Computer Digital Techniques, 149 (2002) 197-202.
[41]
G. Jan, Y. Hwang, M. Lin, D. Liang, Novel hierarchical interconnection networks for high performance multi-computer systems, Journal of Information Science and Engineering, 20 (2004) 1213-1229.
[42]
K. Kaneko, H. Ito, Fault-tolerant routing algorithms for hypercube interconnection networks, IEICE Transactions on Information and Systems, E84-D (2001) 121-128.
[43]
H.M.T. Kawasaki, T. Kawamura, A fault-tolerant routing algorithm for hypercube multi-computers, in: Proceedings IEEE International on Conference on Systems, Man and Cybernetics, 2002, pp. 629-633.
[44]
Y. Lan, A. Esfahanian, L. Ni, Multicast in hypercube multiprocessors, Journal of Parallel and Distributed Computing, 50 (1990) 30-41.
[45]
S. Latifi, M. Lee, Wormhole broadcast in hypercubes, Journal of Supercomputing, 15 (2000) 183-192.
[46]
Y. Li, S. Peng, W. Chu, An efficient algorithm for fault-tolerant routing based on adaptive binomial-tree techniques in hypercubes, in: Proceedings of the 5th International Conference on PDCAT, December 2004, pp. 196-201.
[47]
F. Liu, Y. Song, Broadcast in the locally k-subcube-connected hypercube network with faulty tolerance, in: International Conference on Communications, Networking, and Mobile Computing (ICCNMC 2005), 2005, pp. 305-313.
[48]
P. Loh, W. Hsu, Fault tolerance of complete Josephus cubes, Journal of Systems Architecture, 49 (2003) 1-21.
[49]
P. Loh, W. Hsu, Y. Pan, The exchanged hypercube, IEEE Transactions on Parallel and Distributed Systems, 16 (2005) 866-874.
[50]
Q. Mohammad, Adaptive fault-tolerant routing algorithm for tree-hypercube multicomputer, Journal of Computing Science, 2 (2006) 124-126.
[51]
Q. Muhammad, Adaptive fault-tolerant routing algorithm for tree-hypercube multicomputer, Journal of Computer Science, 2 (2006) 124-126.
[52]
E. Oh, Strong Fault Tolerance in Multicomputer Networks, Doctoral dissertation, Texas A&M University, August 2004.
[53]
J. Ou, On maximal 3-restricted edge connectivity and reliability analysis of hypercube networks, Journal of Applied Mathematics and Computation, 217 (2010) 2602-2607.
[54]
Y. Pan, Fault tolerance in the block shift network, IEEE Transactions on Reliability, 50 (2001) 88-90.
[55]
M. Qatawneh, Adaptive fault-tolerant algorithm for tree-hypercube multicomputer, Journal of Computer Science, 2 (2006) 124-126.
[56]
B. Qiao, F. Shi, W. Ji, THIN: a new hierarchical interconnection network-on-chip for SOC, in: 7th International Conference on Algorithms and Architectures for Parallel Processing, 2007, pp. 446-457.
[57]
M. Ramras, Minimum cut-sets in hypercubes, Discrete Mathematics, 289 (2004) 193-198.
[58]
D. Rennels, On implementing fault tolerance in binary hypercubes, in: Proceedings IEEE Fault Tolerant Computing, vol. 4, 1988, pp. 463-468.
[59]
T. Sasama, On fault tolerance of hypercubes using subcubes, International Journal of Reliability, Quality, and safety Engineering, 9 (2002) 151-161.
[60]
Q. Shang, W. Zhang, X. Chen, X. Guo, Storage performance evaluation of media server based on multi-core network processors, in: IEEE Wireless Communications and Networking Conference Workshops (WCNCW), 2013, pp. 76-79.
[61]
Y. Shi, Z. Hou, J. Song, Hierarchical interconnection networks with folded hypercube as basic cluster, in: 4th International Conference on High Performance Computing, vol. 1, 2000, pp. 134-137.
[62]
J. Shih, Fault-tolerant routing in hypercube networks without virtual channels, IEE Proceedings of Computers and Digital Techniques, 151 (2004) 377-384.
[63]
J.-D. Shih, Fault-tolerant wormhole routing for hypercube networks, Information Processing Letters, 86 (2003) 93-100.
[64]
L. Song, F. BaoHua, D. Yong, Y. XiaoDong, Clustering multicast on hypercube networks, in: High Performance Computing Conference (HPCC), LNCS, vol. 4208, 2006, pp. 61-70.
[65]
L. Song, X. Yang, A multicast path algorithm on hypercube interconnection networks, in: Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008, pp. 641-646.
[66]
S. Sultan, R. Melhem, An efficient modular spare allocation scheme and its application to fault-tolerant binary hypercube, IEEE Transactions on Parallel and Distributed Systems, 2 (1991) 117-126.
[67]
T. Takabatake, K. Kaneko, H. Ito, HCC: generalized hierarchical completely connected networks, IECE Transactions on Information &amp; Systems, 2 (2000) 1216-1224.
[68]
S. Tian, A fault-tolerant routing strategy based on extended optimal path matrices in hypercube multi-computers, Chinese Journal of Computers, 25 (2002) 87-92.
[69]
S. Tian, Y. Lu, D. Zhang, Optimal fault-tolerant routing scheme for generalized hypercube, in: Proceedings 11th Pacific International Symposium on Dependable Computing (PRDC'05), 2005, pp. 91-100.
[70]
C. Tripathy, Star-cube: A new fault-tolerant interconnection topology for massively parallel systems, Journal of Electronics and Telecom Engineering, 84 (2004) 83-92.
[71]
C. Tripathy, N. Adhikari, On a new multicomputer interconnection topology for massively parallel systems, International Journal of Distributed and Parallel Systems (IJDPS), 2 (2011) 162-180.
[72]
H. Wang, Z. Wu, A level-wise clustering algorithm for multicast on hypercube network, in: Proceedings of the 2010 Conference on Dependable Computing (CDC-2010), November 2010, pp. 263-266.
[73]
H. Wang, Z. Wu, A level-wise clustering algorithm for multicast on hypercube network, in: World Automation Congress (WAC), 2012, pp. 263-266.
[74]
H. Wang, Z. Wu, X. Yang, H. Liu, A novel ACO-based multicast path algorithm in hypercube networks, Journal of Intelligent Automation and Soft Computing, 17 (2011) 541-549.
[75]
H. Wang, Z. Wu, X. Yang, H. Liu, Dong, A fault-tolerant routing model based on locally k-subcube connected hypercube networks, Journal of Computational Information Systems, 8 (2012) 3659-3669.
[76]
L. Wang, Y.-P. Lin, Z.-P. Chen, X. Wen, Fault-tolerant routing based on safety path vectors in hypercube systems, Journal of Software, 15 (2004) 783-790.
[77]
M. Wehner, L. Oliker, J. Shalf, A real cloud computer, IEEE Spectrum, 46 (2009) 24-29.
[78]
D. West, Introduction to Graph Theory, Prentice-Hall, 2001.
[79]
E. Witzke, S. Frese, Some limitations of adjacency matrices in computer network analysis, SIGCOMM Computer Communication, Review, 18 (1988) 43-47.
[80]
J. Wu, E. Fernandez, Reliable broadcasting in faulty hypercube computers, in: Proceedings 11th Symposium on Reliable Distributed Systems, October 1992, pp. 122-129.
[81]
R.-Y. Wu, J. Chang, G.-H. Chen, Node-disjoint paths in hierarchical hypercube networks, in: 20th International Parallel and Distributed Processing Symposium, April 2006, pp. 1-5.
[82]
R.-Y. Wu, G.-H. Chen, Y. Kuo, G. Chang, Node-disjoint paths in hierarchical cubic networks, Information Sciences, 177 (2007) 4200-4207.
[83]
R.-Y. Wu, G.-H. Chen, Y.-L. Kuo, G. Chang, Node-disjoint paths in hierarchical hypercube networks, Journal of Information Sciences, 177 (2007) 4200-4207.
[84]
D. Xiang, A. Chen, Local-safety-information-based broadcasting in hypercube multicomputers with node and link failure, Journal of Interconnection Networks, 2 (2001) 365-378.
[85]
Y. Xiang, S. Pasricha, Thermal-aware semi-dynamic power management for multicore systems with energy harvesting, in: 14th International Symposium on Quality Electronic Design (ISQED), 2013, pp. 619-626.
[86]
L. Youyao, H. Jungang, D. Huimin, A hypercube-based scalable interconnection network for massively parallel computing, Journal of Computers, 3 (2008) 58-65.
[87]
Q. Zhu, J.-M. Xu, X. Hou, M. Xu, On reliability of the folded hypercubes, Information Sciences, 177 (2007) 1782-1788.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Information Sciences: an International Journal
Information Sciences: an International Journal  Volume 276, Issue C
August 2014
392 pages

Publisher

Elsevier Science Inc.

United States

Publication History

Published: 20 August 2014

Author Tags

  1. Fault tolerance
  2. Fault-tolerant routing
  3. Hypercube multi-computer networks
  4. Multicasting and broadcasting in faulty hypercube
  5. Reliability analysis
  6. Reliability computation

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 22 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2022)Reliability analysis of the pentanary n-cube based on h-extra edge-connectivity with a concentration behaviorThe Journal of Supercomputing10.1007/s11227-022-04489-178:13(15504-15531)Online publication date: 1-Sep-2022
  • (2019)Comparing the reliability of software systemsInformation Sciences: an International Journal10.1016/j.ins.2017.08.079423:C(398-411)Online publication date: 6-Jan-2019
  • (2019)Reliable networking in Ethernet ring mesh networks using regular topologiesTelecommunications Systems10.1007/s11235-019-00566-872:2(199-220)Online publication date: 1-Oct-2019
  • (2018)Fault-tolerant routing methodology for hypercube and cube-connected cycles interconnection networksThe Journal of Supercomputing10.1007/s11227-017-2033-773:10(4560-4579)Online publication date: 31-Dec-2018
  • (2018)Reliability Analysis of Fault-Tolerant Bus-Based Interconnection NetworksJournal of Electronic Testing: Theory and Applications10.1007/s10836-016-5601-532:5(541-568)Online publication date: 28-Dec-2018
  • (2017)Reliability modeling and analysis of communication networksJournal of Network and Computer Applications10.1016/j.jnca.2016.11.00878:C(191-215)Online publication date: 15-Jan-2017
  • (2016)Reliability analysis of multilayer multistage interconnection networksTelecommunications Systems10.1007/s11235-015-0093-762:3(529-551)Online publication date: 1-Jul-2016
  • (2015)Maximizing reliability with energy conservation for parallel task scheduling in a heterogeneous clusterInformation Sciences: an International Journal10.5555/2794084.2794133319:C(113-131)Online publication date: 20-Oct-2015

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media