Katsuki FUJISAWA

Transcription

Katsuki FUJISAWA
Laboratory of Advanced Software in Mathematics
High-Performance Computing for Graph Analysis and Mathematical Optimization Problem
Katsuki FUJISAWA
Degree: PhD (Science)(Tokyo Institute of Technology)
Research Interests: Mathematical Optimization Problem, Graph Analysis, High Performance Computing
The objective of our ongoing research project is to develop an advanced
computing and optimization infrastructure for extremely large-scale graphs on
post peta-scale supercomputers. The recent emergence of extremely
large-scale graphs in various application fields, such as transportation, social
networks, cyber-security, and bioinformatics, requires fast and scalable
analysis (Figure 1). For example, a graph that represents interconnections of
all neurons of the human brain has over 89 billion vertices and over 100
trillion edges. To analyze these extremely large-scale graphs, we require an
exascale supercomputer, which will not appear until the 2020’s.
Figure 3 : The 2nd Green Graph500 list and our achievements
We also present our parallel implementation for large-scale mathematical
optimization problems. The semidefinite programming (SDP) problem is one
of the most predominant problems in mathematical optimization. The
primal-dual interior-point method (PDIPM) is one of the most powerful
algorithms for solving SDP problems, and many research groups have
employed it for developing software packages. However, two well-known
major bottleneck parts (the generation of the Schur complement matrix (SCM)
and its Cholesky factorization) exist in the algorithmic framework of PDIPM.
We have developed a new version of SDPARA, which is a parallel
Figure 1 : Graph analysis and its application fileds
implementation on multiple CPUs and GPUs for solving extremely large-scale
We have entered the Graph 500 (http://www.graph500.org) and Green
SDP problems that have over a million constraints. SDPARA can
Graph 500 (http://green.graph500.org) benchmarks, which are designed to
automatically extract the unique characteristics from an SDP problem and
measure the performance of a computer system for applications that require
identify the bottleneck. When the generation of SCM becomes a bottleneck,
irregular memory and network access patterns. Following its announcement
SDPARA can attain high scalability using a large quantity of CPU cores and
in June 2010, the Graph500 list was released in November 2010. The list has
some techniques for processor affinity and memory interleaving. SDPARA can
been updated semiannually ever since. The Graph500 benchmark measures
also perform parallel Cholesky factorization using thousands of GPUs and
the performance of any supercomputer performing a breadth-first search
techniques to overlap computation and communication if an SDP problem
(BFS) in terms of traversed edges per second (TEPS). We implemented the
has over two million constraints and Cholesky factorization constitutes a
world’s first GPU-based BFS on the TSUBAME 2.0 supercomputer at the
bottleneck. We demonstrate that SDPARA is a high-performance general
Tokyo Institute of Technology and came in 4th in the 4th Graph500 list in 2012.
solver for SDPs in various application fields through numerical experiments
Rapidly increasing numbers of these large-scale graphs and their applications
at the TSUBAME 2.5 supercomputer, and we solved the largest SDP problem
drew significant attention in recent Graph500 lists (Figure 2). In 2013, our
(which has over 2.33 million constraints), thereby creating a new world
project team came in 1st in both the big and small data categories in the 2nd
record. Our implementation also achieved 1.713 PFlops in double precision
Green Graph 500 benchmarks (Figure 3). The Green Graph 500 list collects
for large-scale Cholesky factorization using 2,720 CPUs and 4,080 GPUs
TEPS-per-watt metrics.
(Figure 4).
Figure 2 : Size of graphs in various application fields and
Graph500 benchmark.
Figure 4 : High-performance Computing for Mathematical
Optimization Problem
39