phone: (347) 204-8933
e-mail: Click here


PhD in Computer Science: NYU Courant Institute of Mathematical Sciences
Dissertation Title: An Adaptive Fast Volume Solver in Three Dimensions for Free-Space, Periodic and Dirichlet Boundary Conditions
Honors: McCracken Fellowship
Advisors Denis Zorin and Leslie Greengard.
MS in Scientific Computing: NYU Courant Institute of Mathematical Sciences
BA in Mathematics and Russian: Bowdoin College
Research Topic: Properties of Non-Unique Factorization in Quadratic Fields
Honors: Smyth Prize in Mathematics, Magna Cum Laude, James Bowdoin Scholor
Advisors: Wells Johnson and Adam Levy.

Publications and Reports

Adaptive Inhomogeneous PDE Solvers for Complex Geometries
H. Langston, L. Greengard, and D. Zorin
In Preparation, 2012
A Free-Space Adaptive FMM-Based PDE Solver in Three Dimensions
H. Langston, L. Greengard, and D. Zorin
Communications in Applied Mathematics and Computational Science, vol. 6, no. 1, 2011, pp. 79–122
We present a kernel-independent, adaptive fast multipole method (FMM) of arbi- trary order accuracy for solving elliptic PDEs in three dimensions with radiation and periodic boundary conditions. The algorithm requires only the ability to evaluate the Green's function for the governing equation and a representation of the source distribution (the right-hand side) that can be evaluated at arbitrary points. The performance is accelerated in three ways. First, we construct a piecewise polynomial approximation of the right-hand side and compute far-field expansions in the FMM from the coefficients of this approximation. Second, we precompute tables of quadratures to handle the near-field interactions on adaptive octree data structures, keeping the total storage requirements in check through the exploitation of symmetries. Third, we employ shared-memory parallelization methods and load-balancing techniques to accelerate the major algorithmic loops of the FMM. We present numerical examples for the Laplace, modified Helmholtz and Stokes equations.
A massively parallel adaptive fast-multipole method on heterogeneous architectures
I. Lashuk, A. Chandramowlishwaran, H. Langston, T.-A. Nguyen, R. S. Sampath, A. Shringarpure, R. W. Vuduc, L. Ying, D. Zorin, and G. Biros
Proceedings of SC09 ACM/ICEE SCXY Conference Series, pp. 1-11, 2009, Best Paper Finalist
We present new scalable algorithms and a new implementation of our kernel-independent fast multipole method (Ying et al. ACM/IEEE SC '03), in which we employ both distributed memory parallelism (via MPI) and shared memory/streaming parallelism (via GPU acceleration) to rapidly evaluate two-body non-oscillatory potentials. On traditional CPU-only systems, our implementation scales well up to 30 billion unknowns on 65K cores (AMD/CRAYbased Kraken system at NSF/NICS) for highly non-uniform point distributions. We achieve scalability to such extreme core counts by adopting a new approach to scalable MPI-based tree construction and partitioning, and a new reduction algorithm for the evaluation phase. Taken together, these components show promise for ultrascalable FMM in the petascale era and beyond.
Cash: Distributed Cooperative Buffer Caching
C. Decoro, H. Langston, J. Weinberger
Technical Report, 2004
Modern servers pay a heavy price in block access time on diskbound workloads when the working set is greater than the size of the local buffer cache. We provide a mechanism for cooperating servers to coordinate and share their local buffer caches. The coordinated buffer cache can handle working sets on the order of the aggregate cache memory, greatly imptoving performance on diskbound workloads. This facility is provided with minimal communication overhead, no penalty for local cache hits, and without any explicit kernel support.
A New Parallel Kernel-Independent Fast Multipole Method
L. Ying, G. Biros, H. Langston, and D. Zorin
Proceedings of SC03 ACM/ICEE SCXY Conference Series, pp. 1-11, 2003, Best Student Paper, Gordon Bell prize finalist
We present a new adaptive fast multipole algorithm and its parallel implementation. The algorithm is kernel-independent in the sense that the evaluation of pairwise interactions does not rely on any analytic expansions, but only utilizes kernel evaluations. The new method provides the enabling technology for many important problems in computational science and engineering. Examples include viscous flows, fracture mechanics and screened Coulombic interactions. Our MPI-based parallel implementation logically separates the computation and communication phases to avoid synchronization in the upward and downward computation passes, and thus allows us to fully exploit computation and communication overlapping. We measure isogranular and fixed-size scalability for a variety of kernels on the Pittsburgh Supercomputing Center's TCS-1 Alphaserver on up to 3000 processors. We have solved viscous flow problems with up to 2.1 billion unknowns and we have achieved 1.6 Tflops/s peak performance and 1.13 Tflops/s sustained performance.

Research Projects and Codes

3D Kernel-Independent FMM-Based Free-Space and Periodic Volume Solver
This volume code will be posted soon!

Brain Tumor and Heart Image Registration Experiments
Images, movies and results of work with Rahul Sampath and George Biros. Being processed and will be posted soon.

Kernel Independent 3D Fast Multipole Method
This is the original KIFMM3d code of Lexing Ying, for which documentation is provided here.

Fluid Dynamics Visualization
Scientific visualization performed with George Biros and Lexing Ying, using their embedded integral solver for the Stokes equations. Full videos and pictures for some results as appeared at SC03.

Line Integral Convolution (LIC) code
Working with Denis Zorin, George Biros and Lexing Ying to visualize 2d fluid dynamics simulations, specifically Navier Stokes problems in both steady and unsteady-states. Resulting images include 2d unsteady cavity problems for different Reynold's numbers. Code is attached as well as some movies formed from static images.

Vortex Methods
Code and images/movies for simulating fast-moving fluids areound abjects in two and three dimensions. Code currently being processed for distribution and will be posted soon.

Fast Poisson Solver in 2D
Fast Fourier Transform based Poisson equation solver in a regular two dimensional domain with inhomogeneous Dirichlet boundary conditions. All code in Matlab.

A customizable Unix shell, designed to sit on top of an existing shell with augmented commands in Perl and Korn shell scripting languages.

File-Derived Motion Machine
A motion machine for animating an articulated figure, in this case a human figure, derived with simple cubes. Examples for specific text-file inputs and JAVA code attached.

Motion-Capture Experiments
Simple motion capture and image processing experiments and projects.