Katsuki FUJISAWA
Laboratory of Advanced Software in Mathematics
High-Performance Computing for Graph Analysis and Mathematical Optimization Problem
Katsuki FUJISAWA
Degree: PhD (Science) (Tokyo Institute of Technology)
Research Interests: Mathematical Optimization Problem, Graph Analysis, High Performance Computing

The objective of our ongoing research project is to develop an advanced computing and optimization infrastructure for extremely large-scale graphs on post peta-scale supercomputers. The recent emergence of extremely large-scale graphs in various application fields, such as transportation, social networks, cyber-security, and bioinformatics, requires fast and scalable analysis (Figure 1). For example, a graph that represents the interconnections of all neurons of the human brain has over 89 billion vertices and over 100 trillion edges. To analyze these extremely large-scale graphs, we require an exascale supercomputer, which will not appear until the 2020s.

Figure 1 : Graph analysis and its application fields

We have entered the Graph 500 (http://www.graph500.org) and Green Graph 500 (http://green.graph500.org) benchmarks, which are designed to measure the performance of a computer system for applications that require irregular memory and network access patterns. Following its announcement in June 2010, the Graph500 list was released in November 2010. The list has been updated semiannually ever since. The Graph500 benchmark measures the performance of any supercomputer performing a breadth-first search (BFS) in terms of traversed edges per second (TEPS). We implemented the world's first GPU-based BFS on the TSUBAME 2.0 supercomputer at the Tokyo Institute of Technology and came in 4th in the 4th Graph500 list in 2012. Rapidly increasing numbers of these large-scale graphs and their applications drew significant attention in recent Graph500 lists (Figure 2). In 2013, our project team came in 1st in both the big and small data categories in the 2nd Green Graph 500 benchmarks (Figure 3). The Green Graph 500 list collects TEPS-per-watt metrics.

Figure 3 : The 2nd Green Graph500 list and our achievements

We also present our parallel implementation for large-scale mathematical optimization problems. The semidefinite programming (SDP) problem is one of the most important problems in mathematical optimization. The primal-dual interior-point method (PDIPM) is one of the most powerful algorithms for solving SDP problems, and many research groups have employed it for developing software packages. However, two well-known major bottlenecks exist in the algorithmic framework of PDIPM: the generation of the Schur complement matrix (SCM) and its Cholesky factorization. We have developed a new version of SDPARA, which is a parallel implementation on multiple CPUs and GPUs for solving extremely large-scale SDP problems that have over a million constraints. SDPARA can automatically extract the unique characteristics from an SDP problem and identify the bottleneck. When the generation of the SCM becomes a bottleneck, SDPARA can attain high scalability using a large number of CPU cores and some techniques for processor affinity and memory interleaving. If an SDP problem has over two million constraints and Cholesky factorization constitutes the bottleneck, SDPARA can also perform parallel Cholesky factorization using thousands of GPUs and techniques to overlap computation and communication. We demonstrate that SDPARA is a high-performance general solver for SDPs in various application fields through numerical experiments on the TSUBAME 2.5 supercomputer, and we solved the largest SDP problem (which has over 2.33 million constraints), thereby creating a new world record. Our implementation also achieved 1.713 PFlops in double precision for large-scale Cholesky factorization using 2,720 CPUs and 4,080 GPUs (Figure 4).
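The two PDIPM bottlenecks named above, forming the Schur complement matrix and factorizing it, can be illustrated on a toy problem. The following is only a minimal serial sketch and not SDPARA's implementation: it assumes an HKM-type search direction, for which the SCM elements are commonly written as M[i, j] = trace(A_i X A_j Z^{-1}); the text itself does not state the formula. The function names, the random data matrices A_i, and the small sizes n and m are all hypothetical.

```python
import numpy as np


def random_symmetric(rng, n):
    """Random symmetric n x n matrix (hypothetical stand-in for a data matrix A_i)."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / 2.0


def random_positive_definite(rng, n):
    """Random symmetric positive definite n x n matrix (stand-in for an iterate X or Z)."""
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)


def schur_complement_matrix(A, X, Z):
    """Dense SCM under the assumed formula M[i, j] = trace(A_i X A_j Z^{-1}).

    The cost grows with m^2 entries, each needing n x n matrix work, which is
    why this step becomes a bottleneck for large problems.
    """
    Z_inv = np.linalg.inv(Z)
    m = len(A)
    M = np.empty((m, m))
    for i in range(m):
        U = Z_inv @ A[i] @ X  # computed once per row and reused for every j
        for j in range(m):
            # trace(U A_j) written as an element-wise sum; equals trace(A_i X A_j Z^{-1})
            M[i, j] = np.sum(U * A[j].T)
    return M


def solve_with_cholesky(M, r):
    """Solve M dy = r through the Cholesky factor M = L L^T."""
    L = np.linalg.cholesky(M)  # raises LinAlgError if M is not positive definite
    # np.linalg.solve is a general solver; a dedicated triangular solver
    # would normally be used for these two substitutions.
    y = np.linalg.solve(L, r)
    return np.linalg.solve(L.T, y)


rng = np.random.default_rng(0)
n, m = 6, 5  # matrix size and number of constraints (toy values)
A = [random_symmetric(rng, n) for _ in range(m)]
X = random_positive_definite(rng, n)
Z = random_positive_definite(rng, n)

M = schur_complement_matrix(A, X, Z)
r = rng.standard_normal(m)  # hypothetical right-hand side
dy = solve_with_cholesky(M, r)
print(np.allclose(M @ dy, r))
```

In this sketch each row of M depends only on A_i and the shared matrices X and Z, so rows can be computed independently, which is one reason the SCM generation step lends itself to parallelization across many cores.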
Figure 2 : Size of graphs in various application fields and Graph500 benchmark.
Figure 4 : High-performance Computing for Mathematical Optimization Problem
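The TEPS and TEPS-per-watt metrics mentioned in the text can be made concrete with a small serial example. This is a minimal sketch under simplifying assumptions, not the benchmark code: the official Graph500 specification defines its own graph generator, validation step, and edge-counting rules, none of which are reproduced here. The function names, the tiny example graph, and the power figure are hypothetical.

```python
import time
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    return adjacency


def bfs(adjacency, root):
    """Breadth-first search from root.

    Returns the parent map of the BFS tree and the number of adjacency
    entries scanned. In an undirected graph every edge of the reached
    component is scanned twice, once from each endpoint.
    """
    parent = {root: root}
    queue = deque([root])
    scanned = 0
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            scanned += 1
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent, scanned


def teps(traversed_edges, seconds):
    """Traversed edges per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return traversed_edges / seconds


def teps_per_watt(teps_value, watts):
    """Energy-efficiency metric of the kind collected by the Green Graph 500 list."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return teps_value / watts


# Hypothetical toy graph: a 4-cycle plus one disconnected edge.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5)]
adjacency = build_adjacency(edges)

start = time.perf_counter()
parent, scanned = bfs(adjacency, root=0)
elapsed = time.perf_counter() - start

traversed = scanned // 2  # undirected edges in the component reached from the root
if elapsed > 0:
    rate = teps(traversed, elapsed)
    print(f"{traversed} edges, {rate:.3e} TEPS, "
          f"{teps_per_watt(rate, watts=100.0):.3e} TEPS/W (assumed 100 W)")
```

The neighbor scan in the inner loop touches memory in an order dictated by the graph rather than by the data layout, which is the irregular access pattern the benchmark is designed to stress.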