Program & Abstract

name : speaker
* : invited speaker

1st day (3 Dec. 2015)

08:45-   Registration
09:20-09:30   Opening  (Masaharu Matsumoto)
Session 1  Numerical Libraries in Post-Peta/Exa-Scale Era (session chair: Takahiro Katagiri)
09:30-10:15   Kengo Nakajima (Information Technology Center, The University of Tokyo, Japan)
ppOpen-HPC beyond Post-Peta Scale Computing

ppOpen-HPC is an open source infrastructure for development and execution of large-scale scientific applications on post-peta-scale (pp) supercomputers with automatic tuning (AT).

ppOpen-HPC is one of the 14 projects of "Development of System Software Technologies for Post-Peta Scale HPC (Post-Peta CREST, PPC) (Supervisor: Prof. Mitsuhisa Sato, RIKEN AICS)", which was initiated in 2010 by the Japan Science & Technology Agency (JST) as one of its Strategic Basic Research Programs (CREST). It aims at developing software that achieves maximum efficiency and reliability on the future HPC systems of the late 2010s. Its 14 projects cover a wide range of research areas in software for HPC, such as system software, programming languages, compilers, numerical libraries, and application frameworks. The developed software is open to the public.

ppOpen-HPC started in FY 2011 as a five-year project and is now in its final year.

ppOpen-HPC focuses on parallel computers based on many-core architectures and consists of various types of libraries covering general procedures for scientific computations. The source code, developed on a PC with a single processor, is linked with these libraries, and the parallel code generated is optimized for post-peta-scale systems. In this talk, recent achievements and progress of the ppOpen-HPC project are summarized, and future perspectives, including international collaborations, are also discussed.
10:15-11:00   * Achim Basermann (German Aerospace Center (DLR), Germany)
Equipping Sparse Solvers for Exascale - A Survey of the DFG Project ESSEX

The ESSEX project investigates computational issues arising at exascale for large-scale sparse eigenvalue problems and develops programming concepts and numerical methods for their solution. The project pursues a coherent co-design of all software layers, in which a holistic performance engineering process guides code development across the classic boundaries of application, numerical method, and basic kernel library. Within ESSEX, the numerical methods cover both widely applicable solvers, such as classic Krylov, Jacobi-Davidson, or the recent FEAST methods, and domain-specific iterative schemes relevant to the ESSEX quantum physics application. This presentation introduces the project structure and presents selected results that demonstrate the potential impact of ESSEX for efficient sparse solvers on highly scalable heterogeneous supercomputers.

In the second project phase from 2016 to 2018, the ESSEX consortium will include partners from the Universities of Tokyo and Tsukuba. Extensions of the existing work will address numerically reliable computing methods, scalability improvements by leveraging functional parallelism in asynchronous preconditioners, hiding and reducing communication cost, improved load balancing through advanced partitioning schemes, and the treatment of non-Hermitian matrix problems.
Session 2  General Session I (session chair: Takeshi Iwashita)
11:15-11:45   Masaki Satoh (Atmosphere and Ocean Research Institute, The University of Tokyo, Japan)
High-resolution global nonhydrostatic simulations by NICAM using the K computer

High-resolution atmospheric global simulations by the Nonhydrostatic Icosahedral Atmospheric Model (NICAM) using the K computer will be introduced. The K computer enabled us to perform, for the first time, sub-kilometer-mesh global simulations, long-term 30-year simulations including a future projection, ensemble simulations, and atmosphere-ocean coupled simulations with the NICAM-COCO coupled model.
11:45-12:15   Akihiro Ida, Takeshi Iwashita (Academic Center for Computing and Media Studies, Kyoto University)
HACApK: Library for Large-scale Simulations Using the Integral Equation Method

In this presentation, we introduce the HACApK library, which we have been developing as part of the ppOpen-HPC project. HACApK adopts an algorithm of hierarchical matrices (H-matrices) combined with adaptive cross approximation (ACA) as an approximation technique for the dense matrices derived from the integral equation method, and is implemented with the hybrid MPI+OpenMP programming model to run on SMP cluster systems. For the implementation, we proposed a set of parallel algorithms applicable to H-matrices with ACA. Furthermore, the algorithm of H-matrices with ACA is improved to avoid the failure of conventional H-matrices with ACA to produce efficient approximations for very large problems.
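To illustrate the approximation idea behind the library, the following is a minimal sketch of partially pivoted ACA in Python. It illustrates the general algorithm only, not HACApK's implementation; the `get_entry` callback and all names here are hypothetical.

```python
import numpy as np

def aca(get_entry, m, n, tol=1e-12, max_rank=None):
    """Partially pivoted adaptive cross approximation (ACA): build low-rank
    factors U (m x k) and V (k x n) of an implicitly given dense block,
    touching only O(k(m + n)) entries instead of all m*n."""
    max_rank = min(m, n) if max_rank is None else min(max_rank, m, n)
    U, V = [], []
    i = 0                      # current pivot row
    used_rows = {0}
    for _ in range(max_rank):
        # residual of row i: original row minus contribution of the crosses
        row = np.array([get_entry(i, j) for j in range(n)], dtype=float)
        for u, v in zip(U, V):
            row -= u[i] * v
        j = int(np.argmax(np.abs(row)))        # pivot column
        if abs(row[j]) < tol:                  # residual small: converged
            break
        v = row / row[j]
        col = np.array([get_entry(k, j) for k in range(m)], dtype=float)
        for u, vv in zip(U, V):
            col -= u * vv[j]
        U.append(col)
        V.append(v)
        # next pivot row: largest residual in the new column, not yet used
        col_abs = np.abs(col)
        col_abs[list(used_rows)] = -1.0
        i = int(np.argmax(col_abs))
        used_rows.add(i)
    if not U:
        return np.zeros((m, 0)), np.zeros((0, n))
    return np.column_stack(U), np.vstack(V)
```

For a smooth kernel such as 1/(1 + i + j), the factorization typically stops well below full rank, which is the property H-matrices exploit block by block.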
12:15-12:45   * Takane Hori (JAMSTEC, Japan)
Current status and future direction of earthquake generation simulations for evaluating possible scenarios of megathrust earthquakes

Megathrust earthquake sequences have been modeled as the spatio-temporal variation of fault slip on the subducting plate boundary with frictional instability. The governing equations are composed of an equation for fault interaction in an elastic half space, a relation among slip velocity, stress, and fault strength (in other words, a fracture criterion), and an evolution equation for the fault strength. Stress evaluation on the fault requires high resolution (around 100 m), and modeling great earthquakes requires a wide calculation area of more than 700 km x 300 km. We can qualitatively reproduce the variations of past megathrust earthquake sequences for the Nankai trough, southwest Japan, and the Japan trench, northeast Japan. Based on such simulations, we have examined possible scenarios for the near future. As a next step, we are developing an earthquake generation simulation code with a 3D heterogeneous viscoelastic finite element model.
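As one concrete instance of the friction relations described above, the widely used rate- and state-dependent friction law with the aging law reads (this is a common formulation in earthquake sequence modeling; the exact formulation used by the authors may differ):

```latex
\tau = \sigma \left[ \mu_0 + a \ln\frac{V}{V_0} + b \ln\frac{V_0\,\theta}{L} \right],
\qquad
\frac{d\theta}{dt} = 1 - \frac{V\theta}{L},
```

where \tau is the shear stress on the fault, \sigma the effective normal stress, V the slip velocity, \theta the state variable through which the fault strength evolves, L the characteristic slip distance, and \mu_0, a, b, V_0 reference friction parameters.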
12:45-14:00   Lunch Break
Session 3  Auto-Tuning (session chair: Akihiro Ida)
14:00-14:45   * Weichung Wang (Institute of Applied Mathematical Sciences, National Taiwan University)
Optimal Space-Filling Computer Experiment Designs for Auto-Tuning

In software auto-tuning, we need to choose experiment points from the search domains and then run the codes to measure the performance. We study how to choose such experiment points so that they satisfy the non-biased space-filling property over the search domain. For regular search domains, we consider Latin hypercube designs (LHDs). While LHDs are widely used in many applications, the large number of feasible designs makes the search for optimal LHDs a difficult discrete optimization problem. To tackle this problem, we propose a new population-based algorithm named LaPSO, which is adapted from standard particle swarm optimization (PSO), customized for LHDs, and accelerated on GPUs. For irregular search domains, we discuss how optimal uniform space-filling designs based on the central composite discrepancy (CCD) can be computed efficiently by a PSO-based algorithm. Numerical results and a real application in data center thermal management will be presented to show the advantages of the proposed methods.
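For illustration, the sketch below generates random Latin hypercube designs and keeps the best one under the maximin space-filling criterion. A simple random-restart search stands in for the LaPSO algorithm of the talk, and all function names are hypothetical.

```python
import numpy as np

def latin_hypercube(n_points, n_dims, rng):
    """One random LHD: each dimension is a random permutation of the levels,
    so every one-dimensional projection is evenly covered (non-biased)."""
    return np.column_stack([rng.permutation(n_points) for _ in range(n_dims)])

def maximin(design):
    """Space-filling score: the smallest pairwise distance (larger is better)."""
    d = design[:, None, :] - design[None, :, :]
    dist = np.sqrt((d ** 2).sum(-1))
    n = len(design)
    return dist[np.triu_indices(n, k=1)].min()

def best_lhd(n_points, n_dims, n_trials=200, seed=0):
    """Random-restart search for a maximin LHD (a crude stand-in for the
    PSO-based LaPSO search described in the talk)."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -1.0
    for _ in range(n_trials):
        cand = latin_hypercube(n_points, n_dims, rng)
        s = maximin(cand)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score
```

The discrete search space is the set of permutation matrices, which is why the authors treat the optimization with a customized PSO rather than exhaustive search.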
14:45-15:15   Takahiro Katagiri, Masaharu Matsumoto, Satoshi Ohshima (Information Technology Center, The University of Tokyo, Japan)
Towards Automatic Code Selection with ppOpen-AT: A Case of FDM - Variants of Numerical Computations and Its Impact on a Multi-core Processor –

In this study, we show a new ability of auto-tuning (AT): the selection of code variants based on totally different implementations of numerical computations. The selection function of the AT is carefully designed for ppOpen-AT, a computer language for adding AT functions to the simulation codes in actual use in the ppOpen-HPC project. The AT is evaluated with ppOpen-APPL/FDM (Seism_3D), a seismic wave simulation code based on the Finite Difference Method (FDM). According to the results of a performance evaluation on an advanced multi-core processor, the Xeon Phi, significant speedups are obtained by utilizing the AT code selection. Moreover, the best code variant varied with the parallel execution configuration, i.e., the number of MPI processes and OpenMP threads in hybrid MPI/OpenMP execution.
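The essence of code-variant selection can be pictured as follows: time each functionally equivalent implementation and keep the fastest. This is only a toy Python illustration; ppOpen-AT generates such selection logic from directives in the application source, and its actual mechanism differs.

```python
import time

def autotune_select(variants, *args, repeats=3):
    """Empirical code selection: run each candidate implementation of the
    same computation, keep the best of `repeats` timings per variant to
    damp timer noise, and return the fastest variant's name."""
    timings = {}
    for name, fn in variants.items():
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            fn(*args)
            best = min(best, time.perf_counter() - t0)
        timings[name] = best
    winner = min(timings, key=timings.get)
    return winner, timings

# Two functionally identical variants of the same reduction.
def sum_loop(xs):
    s = 0.0
    for x in xs:
        s += x
    return s

def sum_builtin(xs):
    return sum(xs)
```

As the abstract notes, the winner can change with the execution configuration, so the selection must be re-run per environment rather than fixed at development time.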
15:15-15:45   * Teruo Tanaka (Faculty of Informatics, Kogakuin University, Japan)
Enhancement of Functionality of ppOpen-AT with d-Spline based Incremental Performance Parameter Estimation

Automatic performance tuning (AT) is effective for optimizing the performance parameters of ordinary mathematical libraries for a given computational environment. ppOpen-AT is a scripting language (a set of directives) with features that reduce the workload of developers of mathematical libraries with AT features. We enhanced the functionality of ppOpen-AT with our proposed method, the Incremental Performance Parameter Estimation (IPPE) method. This method estimates optimal performance parameters by automatically inserting suitable sampling points based on the values of a fitting function. For the fitting function, we used the "d-Spline," which is highly adaptable and requires little estimation time. The effectiveness of the IPPE method for the simultaneous estimation of multiple performance parameters was evaluated with the algebraic multigrid method, which has many performance parameters. The results of our evaluations show that the proposed method enhances the functionality of ppOpen-AT.
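The incremental idea can be sketched as follows, with a least-squares quadratic standing in for the d-Spline fitting function; the function and parameter names are hypothetical and not those of ppOpen-AT.

```python
import numpy as np

def incremental_estimate(perf, candidates, n_init=3, n_steps=5, seed=0):
    """Incremental performance-parameter estimation (in the spirit of the
    IPPE method): fit a smooth model to the measured points, then add the
    next sample where the fitted model predicts the lowest runtime.
    `perf` measures the runtime for one parameter value."""
    rng = np.random.default_rng(seed)
    sampled = list(rng.choice(len(candidates), size=n_init, replace=False))
    for _ in range(n_steps):
        xs = np.array([candidates[i] for i in sampled], dtype=float)
        ys = np.array([perf(candidates[i]) for i in sampled])
        coef = np.polyfit(xs, ys, deg=2)          # the fitting function
        pred = np.polyval(coef, np.array(candidates, dtype=float))
        pred[sampled] = np.inf                    # only unsampled points
        sampled.append(int(np.argmin(pred)))      # insert next sampling point
    best = min(sampled, key=lambda i: perf(candidates[i]))
    return candidates[best]
```

The point of the incremental scheme is that only a handful of parameter values are ever measured, which matters when each measurement is a full run of a library kernel.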
Session 4  General Session II (session chair: Takeshi Kitayama)
16:00-16:30   Takashi Arakawa (Research Organization for Information Science and Technology, Japan)
Introduction of atmosphere-ocean model coupling and seismic-structure model coupling by a coupler ppOpen-MATH/MP

Our group has been developing the coupling software ppOpen-MATH/MP for large-scale weak-coupling simulations; it enables multi-component coupling and large-scale data transfer/interpolation. ppOpen-MATH/MP has the following features: 1) support for various grid systems, both structured and unstructured; 2) coupling of 2D and 3D fields; 3) multi-component coupling. In this presentation, the functions of ppOpen-MATH/MP will be explained and a couple of application examples will be introduced. The first example is climate model coupling; the target models in this study are the icosahedral atmospheric model NICAM and the tri-polar ocean model COCO. The second example is the coupling of a seismic model and a structure model; the seismic model employs a structured grid, while the structure model uses an unstructured grid. These examples demonstrate the wide applicability of ppOpen-MATH/MP.
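The core transfer/interpolation step of a coupler can be illustrated in one dimension: precompute interpolation weights between two grids once, then apply them every time a field is exchanged. This toy sketch is not ppOpen-MATH/MP's API, which handles 2D/3D structured and unstructured grids in parallel; all names are hypothetical.

```python
import numpy as np

def build_interp_weights(src_x, dst_x):
    """Precompute sparse linear-interpolation weights from a source grid to
    a destination grid. Couplers do this setup once, since the grids of
    the coupled components do not change between exchanges."""
    idx = np.searchsorted(src_x, dst_x)
    idx = np.clip(idx, 1, len(src_x) - 1)      # clamp to valid intervals
    x0, x1 = src_x[idx - 1], src_x[idx]
    w1 = (dst_x - x0) / (x1 - x0)
    return idx, np.clip(w1, 0.0, 1.0)

def transfer(src_field, idx, w1):
    """Apply the precomputed weights to move one field between components."""
    return (1.0 - w1) * src_field[idx - 1] + w1 * src_field[idx]
```

Separating the (expensive) weight computation from the (cheap) per-step transfer is what makes repeated exchanges between, e.g., an atmosphere and an ocean component affordable.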
16:30-17:00   Takashi Furumura (Earthquake Research Institute, The University of Tokyo, Japan)
Design of parallel FDM simulation code for earthquake ground motion simulation suitable for many-core machines

To extract efficient performance from the latest and future high-performance computers, careful design of simulation codes suitable for many-core architectures is indispensable. For this purpose we have developed the ppOpen-APPL/FDM library in the JST CREST project, an FDM library for the simulation of seismic wave propagation in heterogeneous 3D structural models. This presentation shows large-scale parallel simulations of seismic wave propagation using recently developed library packages suitable for the Intel Xeon Phi processor and GPGPU accelerators, and discusses the key parameters for speeding up the FDM simulation by extracting the performance of multi-core and many-core processors.
17:00-17:30   Masaharu Matsumoto, Takashi Arakawa, Takeshi Kitayama (Information Technology Center, The University of Tokyo, Japan)
Coupled Multi-scale Simulations using ppOpen-HPC Libraries

The demand for multi-scale and multi-physics simulations, in which multiple applications are combined by a “coupler,” will grow with the advent of new supercomputer systems. We have been developing ppOpen-MATH/MP as part of the ppOpen-HPC project. ppOpen-MATH/MP is a coupler library for coupled simulations and connects applications built on the various discretization methods of the ppOpen-APPL libraries. To demonstrate the applicability of ppOpen-MATH/MP, in this talk we show the development and usage of coupled simulations in which FDM applications (based on ppOpen-APPL/FDM) and FEM applications (based on ppOpen-APPL/FEM) are coupled using the coupler library (ppOpen-MATH/MP).
18:30-20:30   Reception Party RESTAURANT ABREUVOIR

2nd day (4 Dec. 2015)

09:00-   Registration
Session 5  Linear Solver (session chair: Hiroshi Okuda)
09:30-10:15   * Edmond Chow (Georgia Institute of Technology, USA)
Very Fine-grained Parallelization of Preconditioning Operations

Massive concurrency is required in scientific and engineering algorithms in order to run efficiently on future computer architectures. High-end compute nodes already have hundreds to thousands of accelerator cores, and core counts are anticipated to increase further. In this talk, we describe some new approaches for preconditioning operations, particularly incomplete factorizations and sparse triangular solves, that have much more concurrency than existing approaches. The main idea is to transform a problem into one that can be solved iteratively. By using asynchronous iterative methods, the coupling that must exist between processing units is respected, but with much lower overhead than in the synchronous case.
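A minimal sketch of this idea for incomplete factorizations: instead of the inherently sequential factorization, treat the ILU equations as a fixed-point problem and sweep over all nonzeros. Each nonzero update is independent within a sweep and could be done in parallel (and asynchronously); this serial dense toy is an illustration in the spirit of fine-grained iterative ILU, not the speaker's code.

```python
import numpy as np

def iterative_ilu0(A, sweeps=5):
    """Fine-grained ILU(0) as a fixed-point iteration: enforce
    (L U)_{ij} = a_{ij} on the sparsity pattern of A, with L unit lower
    triangular. Every (i, j) update uses only current L/U values, so all
    updates in a sweep are candidates for parallel, asynchronous execution."""
    n = A.shape[0]
    pattern = [(i, j) for i in range(n) for j in range(n) if A[i, j] != 0.0]
    # simple initial guess: scaled lower part of A, upper part of A
    L = np.tril(A, -1) / np.diag(A)[np.newaxis, :]
    U = np.triu(A)
    for _ in range(sweeps):
        for i, j in pattern:            # parallelizable over all nonzeros
            if i > j:
                L[i, j] = (A[i, j] - L[i, :j] @ U[:j, j]) / U[j, j]
            else:
                U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
    return L + np.eye(n), U
```

For a diagonally dominant matrix a few sweeps already give a usable preconditioner; the talk's point is that such sweeps expose concurrency at the granularity of individual matrix entries.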
10:15-11:00   * Tetsuya Sakurai (College of Information Science, University of Tsukuba, Japan)
A Scalable Parallel Eigensolver for Large-scale Simulations on Post Peta-scale Computing Environments

Large-scale eigenvalue problems arise in a wide variety of scientific and engineering applications, such as nano-scale materials simulation, vibration analysis of automobiles, and the analysis of big data. In such situations, high-performance parallel eigensolvers are required to exploit distributed parallel computational environments. In this talk, we present a parallel eigensolver, the Sakurai-Sugiura method (SSM), for large-scale interior eigenvalue problems. This method is derived using numerical quadrature and has good parallel scalability. We also show the software package "z-Pares," an implementation of SSM. This software enables users to utilize a large amount of computational resources because of its hierarchical parallel structure. Some numerical experiments illustrate the efficiency of the proposed methods.
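A toy dense version of a contour-integral eigensolver of this type can be sketched as follows. Quadrature points on a circle enclose the wanted eigenvalues, and each point contributes one independent linear solve, which is the source of the method's parallelism. This sketch is not z-Pares and omits the higher moments and refinements of the actual SSM.

```python
import numpy as np

def contour_eigensolver(A, center, radius, n_quad=32, subspace=4, seed=0):
    """Approximate the eigenvalues of A inside the circle |z - center| < radius
    by quadrature on the resolvent: S ~ (1/2 pi i) \oint (zI - A)^{-1} V dz
    projects random vectors V onto the enclosed eigenspace; Rayleigh-Ritz
    on that subspace then yields the interior eigenvalues."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((n, subspace))
    S = np.zeros((n, subspace), dtype=complex)
    for j in range(n_quad):
        theta = 2.0 * np.pi * (j + 0.5) / n_quad
        z = center + radius * np.exp(1j * theta)
        # each quadrature point is an independent solve -> parallel level 1
        Y = np.linalg.solve(z * np.eye(n) - A, V)
        S += (radius * np.exp(1j * theta) / n_quad) * Y
    Q, _ = np.linalg.qr(S)                         # basis of spectral subspace
    ritz = np.linalg.eigvals(Q.conj().T @ A @ Q)   # Rayleigh-Ritz values
    return np.array([t for t in ritz if abs(t - center) < radius])
```

The hierarchical parallelism mentioned in the abstract comes from distributing both the quadrature points and, within each point, the linear solve itself.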
Session 6  General Session III (session chair: Tadashi Hemmi)
11:15-11:45   Hiroshi Okuda, Naoki Morita, Gaku Hashimoto (Graduate School of Frontier Sciences, The University of Tokyo, Japan)
Parallel Localized Robust Incomplete Factorization Preconditioning IRIF(p) with Mixed-precision

Robust incomplete factorization (RIF) based on the A-orthogonalization process is an effective preconditioning technique for the conjugate gradient (CG) method for solving highly ill-conditioned linear systems. In this study, the preconditioning matrix uses the same sparsity pattern as the finite element coefficient matrix. By doing so, the dropping process, which is the time-consuming part of constructing the preconditioning matrix, becomes unnecessary. In parallelizing IRIF(p), which considers fill-in up to level p, the localized and mixed-precision strategies have been used to enhance the computational intensity: the preconditioning is computed in single precision. The performance of the present methods, for a plate-like structural model discretized with MITC4 shell elements, has been compared with conventional RIFs and double-precision preconditionings.
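The mixed-precision idea can be illustrated with a CG iteration kept entirely in double precision while the preconditioner is stored and applied in single precision. Here a simple Jacobi preconditioner stands in for IRIF(p); all names are hypothetical and this is not the authors' code.

```python
import numpy as np

def pcg(A, b, M_apply, tol=1e-9, maxit=500):
    """Preconditioned CG: residual and solution updates stay in float64,
    while M_apply may internally use float32 (the mixed-precision part)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_apply(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = M_apply(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

def jacobi_single(A):
    """Jacobi preconditioner stored and applied in float32; the result is
    promoted back to float64 for the CG recurrences."""
    dinv32 = (1.0 / np.diag(A)).astype(np.float32)
    return lambda r: (dinv32 * r.astype(np.float32)).astype(np.float64)
```

Since the preconditioner only needs to approximate the inverse, its lower precision typically costs few extra iterations while halving the memory traffic of the preconditioning step.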
11:45-12:15   Takeshi Kitayama, Hiroshi Okuda (Graduate School of Frontier Sciences, The University of Tokyo, Japan)
Development and Usage of ppohFEM: A Library for Parallel Application with Finite Element Method

We have developed a development support library, named ppOpen-APPL/FEM (ppohFEM), for parallelized Finite Element Method (FEM) applications. The ppohFEM library provides Application Program Interfaces (APIs) for parallelized FEM calculations, such as file I/O, stiffness matrix assembly, boundary condition setting, and linear equation solvers. The library uses the OpenMP/MPI hybrid programming model and achieves good performance and scalability on cluster systems. In this talk, we show how the ppohFEM library is used to develop FEM-based applications. A parallel structural analysis application and a parallel heat transfer analysis application have been developed with ppohFEM; both are based on the FEM formulation, so the common operations can be coded with the APIs of ppohFEM. Practical usage of these APIs will be shown, and the performance of the ppohFEM library in these applications will be discussed.
12:15-12:45   Yasuhito Takahashi, Tadashi Tokumasu, Koji Fujiwara, Takeshi Iwashita (Academic Center for Computing and Media Studies, Kyoto University, Japan)
Error Correction Method for High Performance Electromagnetic Field Simulation

We introduce an error correction method, the TP-EEC method, to accelerate finite element electromagnetic field analyses of electric machines. In an analysis of an electric motor, the time-harmonic solution in the steady state is calculated by a time-dependent non-linear analysis. Although the step-by-step method is usually used, the number of time steps needed to reach the steady state tends to be large. The TP-EEC method, an error correction method using a special mapping operator, improves the convergence rate and thus reduces the computational time.
12:45-14:00   Lunch Break
Session 7  General Session IV (session chair: Takashi Furumura)
14:00-14:30   * Hajime Yamamoto (Taisei Corporation, Japan)
Leveraging Supercomputing for Geologic Carbon Sequestration

Geologic carbon sequestration is one of the promising approaches for reducing the greenhouse gas content of the atmosphere by capturing CO2 from large emission sources and injecting it into reservoirs such as deep saline aquifers. Numerical simulation of two-phase fluid flow (normally water and CO2) is a key technology for understanding the hydrodynamic behavior of CO2 in the subsurface, both for evaluating reservoir performance (e.g., capacity) and for assessing the long-term fate of CO2 in the post-injection period. However, the simulations are often computationally demanding, because the complex coupled physical and chemical processes involved lead to highly non-linear models that must be solved at sufficient spatial and temporal resolution. We will present our efforts to leverage supercomputing with a general-purpose reservoir simulator of multiphase flow and chemically reactive transport, which has been implemented on Oakleaf-FX at The University of Tokyo.
14:30-15:00   Tadashi Hemmi (JAMSTEC, Japan)
The ppOpen-APPL/DEM-Util libraries and the DEM applications

We have developed the ppOpen-APPL/DEM-Util libraries, a set of useful tools for application programs based on the Discrete Element Method (DEM). The libraries mainly consist of two utility programs, ppOpen-APPL/DEM-Util-distance_calculate and ppOpen-APPL/DEM-Util-objects_update. The ppOpen-APPL/DEM-Util-distance_calculate library is a pre-processing utility that prepares the 3D object data for the DEM computation from stereolithography (STL) format data files. The ppOpen-APPL/DEM-Util-objects_update library updates the positions of the 3D objects in the computational space of the DEM application program. Both libraries can easily be adapted to DEM application programs and will help users who develop DEM applications. We present the details of the libraries and examples of application programs.
15:00-15:30   Miki Yamamoto (JAMSTEC, Japan)
Particle method in large scale simulation and its application

The particle method is one of the fundamental computational techniques in computational physics for solving realistic real-world problems. In this talk, we show several applications of particle computation performed in the ocean drilling program promoted by JAMSTEC. The particle computation is utilized to design the efficient operation of drilling instruments: millions to billions of particles mimic the rock cuttings produced while drilling the Earth's crust, and the massive particle computation is realized by dynamic domain decomposition (DDD) and other techniques.
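The load-balancing idea behind dynamic domain decomposition can be sketched in one dimension: place slab boundaries at particle-position quantiles so that each domain owns a nearly equal number of particles, and recompute them as particles move. This is a toy sketch of the general technique, not JAMSTEC's implementation.

```python
import numpy as np

def dynamic_slabs(x, n_domains):
    """1-D dynamic domain decomposition: slab boundaries at particle-position
    quantiles give each domain a (nearly) equal particle count. Rebalancing
    after particles move is simply recomputing these boundaries."""
    qs = np.quantile(x, np.linspace(0.0, 1.0, n_domains + 1)[1:-1])
    bounds = np.concatenate(([-np.inf], qs, [np.inf]))
    owner = np.searchsorted(qs, x, side="right")   # domain index per particle
    return bounds, owner
```

In a real DEM code the same idea is applied in 2D/3D (e.g., recursive bisection), and only particles whose owner changed are migrated between MPI ranks.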
15:35-15:45   Closing (Takahiro Katagiri)