The BeBOP group is broadly interested in understanding software
performance tuning issues and their interaction with, and
implications for, hardware design.
Among our general interests are:
- the interaction between application software, compilers,
and hardware
- managing trade-offs among the various measures of performance,
such as speed, accuracy, power, storage, ...
- automating the performance tuning process, starting with
the computational kernels which dominate application
performance in scientific computing and information
retrieval
- performance modeling and evaluation of future computer
architectures
See the links below for detailed project information and status.
Principal Investigators
Affiliated Researchers
Graduate Students
Undergraduates
Previous Participants
Download BibTeX entries for these papers and reports.
- Paper: Avoiding Communication in Sparse Matrix Computations
(IEEE International Parallel and Distributed Processing Symposium, April 2008)
James Demmel, Mark Hoemmen, Marghoob Mohiyuddin, and Katherine Yelick
PDF (1 MB)
Talk slides, PDF (6.2M)
- Tech report: Avoiding Communication in Computing Krylov Subspaces
(UCB/EECS-2007-123, October 2007)
James Demmel, Mark Hoemmen, Marghoob Mohiyuddin, and Katherine Yelick
PDF (35M)
- Journal Paper: Scientific Computing Kernels on the Cell Processor
(International Journal of Parallel Programming, April 2007)
Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry
Husbands, Katherine Yelick
PDF (376k)
- Journal Paper: When Cache Blocking Sparse Matrix Vector Multiply Works and Why
(Applicable Algebra in Engineering, Communication and Computing, March 2007)
Rajesh Nishtala, Richard W. Vuduc, James W. Demmel, Katherine Yelick
PDF (390k)
- Paper: Benchmarking Sparse Matrix-Vector Multiply in Five Minutes
(SPEC Benchmark Workshop 2007, Austin, TX, January 2007)
Hormozd Gahvari, Mark Hoemmen, James Demmel, Katherine Yelick
PDF (1 MB)
PPT slides (6.4 MB)
- Master's Thesis: Benchmarking Sparse Matrix-Vector Multiply
(Computer Science Division, U.C. Berkeley, December 2006)
Hormozd Gahvari
PDF (11.5 MB)
- Paper: The Potential of the Cell Processor for Scientific
Computing
(Computing Frontiers 2006, Ischia, Italy, May 2006)
Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry
Husbands, Katherine Yelick
PDF (216k)
- Paper: Implicit and Explicit Optimizations for Stencil Computations
(Memory Systems Performance and Correctness, San Jose, California, USA, October 2006)
Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine Yelick
PDF (604k)
- Paper: OSKI: A library of automatically tuned sparse matrix kernels
(Proceedings of SciDAC 2005, Journal of Physics: Conference Series, June 2005)
Richard Vuduc, James Demmel, Katherine Yelick.
PDF (190k)
- Paper: Fast sparse matrix-vector multiplication by exploiting variable blocks
(Proceedings of the International Conference on
High-Performance Computing and Communications, Sorrento, Italy, September 2005)
Richard Vuduc, Hyun-Jin Moon.
PDF (322k)
- Paper: Self-Adapting Linear Algebra Algorithms and Software
(Proceedings of the IEEE, Special Issue on Program Generation, Optimization, and Adaptation, 93(2), February 2005)
James Demmel, Jack Dongarra, Victor Eijkhout, Erika Fuentes,
Antoine Petitet, Richard Vuduc, R. Clint Whaley, Katherine Yelick.
PDF (600k)
- Paper: Performance Models for Evaluation and Automatic Tuning of Symmetric Sparse Matrix-Vector Multiply
(International Conference on Parallel Processing, Montreal, Quebec, Canada, August 2004) [Winner, Best Paper Award]
Benjamin C. Lee, Richard Vuduc, James Demmel, Katherine Yelick.
PDF (178k)
| Gzip'd PostScript (204k) | PDF (540k)
- Paper: Toward automatic performance tuning of matrix
triple products based on matrix structure
(PARA'04 Workshop on State-of-the-art in Scientific Computing, Copenhagen, Denmark, June 2004.)
Eun-Jin Im, Ismail Bustany, Cleve Ashcraft, James Demmel,
Katherine Yelick.
- Paper: SPARSITY: An Optimization Framework for Sparse Matrix Kernels
(International Journal of High Performance Computing Applications, 18 (1), pp. 135-158, February 2004)
Eun-Jin Im, Katherine A. Yelick, Richard Vuduc.
PDF (1.1M)
| Gzip'd PostScript (1.2M)
- Paper: Statistical Models for Empirical Search-Based Performance Tuning
(International Journal of High Performance Computing Applications, 18 (1), pp. 65-94, February 2004)
Richard Vuduc, James W. Demmel, Jeff A. Bilmes.
PDF (950k)
| Gzip'd PostScript (983k)
- Ph.D. Thesis: Automatic Performance Tuning of Sparse Matrix Kernels
(Computer Science Division, U.C. Berkeley, December 2003)
Richard Vuduc.
PDF (7.6M)
- Tech report: Performance Optimizations and Bounds for Sparse Symmetric Matrix-Multiple Vector Multiply
(UCB/CSD-03-1297, November 2003)
Benjamin C. Lee, Richard W. Vuduc, James W. Demmel,
Katherine A. Yelick, Michael de Lorimier, Lijue Zhong.
PDF (867k)
| Gzip'd PostScript (1.3M)
- Tech Report: Performance Modeling and Analysis of Cache Blocking in Sparse Matrix Vector Multiply
(UCB/CSD-04-1335, June 2004)
Rajesh Nishtala, Richard W. Vuduc, James W. Demmel, Katherine A. Yelick
PDF (~8MB)
- Senior Thesis: Effects of Block Size on the Block Lanczos Algorithm
(Dept. of Mathematics, U.C. Berkeley, June 2003)
Christopher Hsu
MS Word (2.2 MB)
| PDF (273k)
- Paper: Memory Hierarchy Optimizations and Performance Bounds for Sparse A^T A*x
(ICCS 2003: Workshop on Parallel Linear Algebra, Melbourne, Australia, June 2003)
Richard Vuduc, Attila Gyulassy, James W. Demmel, Katherine A. Yelick.
Abstract
| PDF (328k)
| Gzip'd PostScript (91k)
Talk slides, PDF (735k)
| Talk slides, gzip'd PostScript, 4-up (138k)
Extended version:
U.C. Berkeley Technical Report UCB/CS-03-1232
- Paper: Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply
(Proceedings of the IEEE/ACM Conference on Supercomputing, 2002, Baltimore, MD, USA, November 2002)
Richard Vuduc, James W. Demmel, Katherine A. Yelick, Shoaib Kamil, Rajesh Nishtala, Benjamin Lee.
Abstract
| PDF (834k)
| Gzip'd PostScript (2.7M)
Talk slides, PDF (1.0M)
| Talk slides, gzip'd PostScript, 4-up (159k)
- Paper: Automatic Performance Tuning and Analysis of Sparse Triangular Solve
(ICS 2002: Workshop on Performance Optimization via High-Level Languages and Libraries, New York, NY, USA, June 2002)
Richard Vuduc, Shoaib Kamil, Jen Hsu, Rajesh Nishtala, James W. Demmel, Katherine A. Yelick.
Abstract
| PDF (548k)
| PostScript (1.2M)
Talk Slides, PDF (681k)
- Paper: Optimizing Sparse Matrix-Vector Multiplication for Register Reuse
Eun-jin Im and Katherine A. Yelick.
International Conference on Computational Science, San Francisco, California, May 2001.
PDF (164k)
Gzip'd PostScript (132k)
- Paper: Statistical Models for Automatic Performance Tuning
Richard Vuduc, James Demmel, Jeff Bilmes.
International Conference on Computational Science,
San Francisco, CA, USA, May 2001.
PDF (410k)
| Gzip'd PostScript (163k)
| Talk Slides, PDF (685k)
Writing in preparation:
- Tech report: Matrix Splitting and Reordering for Sparse Matrix-Vector Multiply
Hyun Jin Moon, Richard Vuduc, James W. Demmel, Katherine A. Yelick.
UCB Technical Report, 2003. (in preparation)
Abstract (DRAFT)
(Talks delivered in conjunction with conference papers appear
under Publications.)
- Talk: Avoiding Communication in Linear Algebra
James Demmel.
(SIAM PP 2008, Atlanta, Georgia, USA,
March 12-14, 2008)
Powerpoint (940k) | PDF (1.7MB)
- Talk: Bandwidth Avoiding Stencil Computations
Kaushik Datta, Sam Williams, James Demmel, Katherine Yelick.
(SIAM PP 2008, Atlanta, Georgia, USA,
March 12-14, 2008)
Powerpoint (2.45MB) | PDF (3.05MB)
- Talk: Fast Implementations of the Akx Kernel
Marghoob Mohiyuddin, Mark Hoemmen, James Demmel, Katherine Yelick.
(SIAM PP 2008, Atlanta, Georgia, USA,
March 12-14, 2008)
PDF (9.3MB)
- Talk: Communication-avoiding Krylov subspace methods
Mark Hoemmen.
(SIAM PP 2008, Atlanta, Georgia, USA,
March 12-14, 2008)
PDF (952k)
- Talk: The Future of Numerical Linear Algebra Libraries: Automatic Tuning of Sparse Matrix Codes and the Next LAPACK and ScaLAPACK
(SciDAC 2005 Meeting, San Francisco, California, USA, June 2005)
PowerPoint (305k)
- Talk: OSKI: An automatically tuned library of sparse
matrix kernels
(SIAM CSE 2005, Orlando, Florida, USA, February 2005)
Richard Vuduc, James Demmel, Katherine Yelick.
PowerPoint
| PDF
- Talk: When Cache Blocking Sparse Matrix Multiply Works
and Why
(PARA'04 Workshop on State-of-the-art in Scientific Computing, Copenhagen, Denmark, June 2004)
Rajesh Nishtala, Richard Vuduc, James Demmel, Katherine Yelick.
PDF (147k)
| Gzip'd PostScript (138k) | Talk Slides: MS PPT (3MB)
- Talk: Adaptable benchmarks for register blocked sparse matrix-vector multiplication
Mark Hoemmen.
(Matrix Computations Seminar at U.C. Berkeley, May 5, 2004)
PowerPoint (420k)
| PDF (420k)
| Link to Software
- Talk: Performance Optimizations and Bounds for
Symmetric Sparse Matrix-Multiple Vector Multiply
SIAM Parallel Processing Meeting, San Francisco, California, USA, March 2004.
PowerPoint (319k)
| PDF (331k)
- Poster: A Computationally Efficient Triple Matrix Product
for a Class of Sparse Schur-Complement Matrices
SIAM Parallel Processing Meeting, San Francisco, California, USA, March 2004.
PowerPoint (678k)
| PDF (177k)
| PNG (504k)
| GIF (436k)
| JPEG (985k)
- Poster: Automatic Tuning of Collective Communications
in MPI
An overview of the Probabilistic Algorithm Selection System (PASS) for finding and choosing the best implementation of a given MPI collective operation on a given hardware platform.
PowerPoint (5.3M)
| PDF (118k)
Full report: PDF (1.9 MB)
- Poster: Automatic Performance Tuning of Sparse Matrix Kernels
This poster, shown at the Berkeley-Stanford CS Day (2002, 2003),
Bay Area Scientific Computing Day (2002), and the SIAM CSE 2003 Meeting,
summarizes our overall approach to tuning a number of
sparse matrix kernels, including matrix-vector
multiply (non-symmetric and symmetric),
triangular solve, and multiplication by A^T A.
PowerPoint (1.7 MB)
| PDF (617k)
| PNG (520 kB)
| GIF (715 kB)
| JPEG (1.5 MB)
- Talk: Automatic Performance Tuning of Sparse Matrix Kernels
Presentation of recent work to Intel, with a discussion
of ideas related to acceleration of the Google PageRank
algorithm.
(24 Jan 2003)
PowerPoint (806 kB)
| PDF (830k)
Google matrix pics, PowerPoint (142 kB)
| PDF (139k)
- Talk: Automatic Performance Tuning of Linear Algebra Kernels
This presentation, given at the TOPS-SciDAC
kick-off meeting (25 Jan 2002), describes recent results
in performance tuning of sparse linear algebra kernels.
PowerPoint (1.5 MB)
| PDF (2.7 MB)
Current Local Activities
- XBLAS: Extended and Mixed-Precision BLAS
- Sparsity: A Toolkit for Optimizing Sparse Matrix-Vector Multiply
- IRAM: Intelligent RAM Project
- Titanium: High-Performance Java Compiler
Prior Projects
- PHiPAC: An Automatic Tuning System for Matrix Multiply
External Collaborations
This research was supported in part by the National Science
Foundation under NSF Cooperative Agreement No. ACI-9813362 and
NSF Cooperative Agreement No. ACI-9619020, the Department of Energy
under DOE Grant No. DE-FC02-01ER25478, and a gift from Intel.
The information presented here does not necessarily reflect
the position or the policy of the Government and no
official endorsement should be inferred.
Standard disclaimer: Publications are presented to ensure timely
dissemination of scholarly and technical work. Copyright and all
rights therein are retained by authors or by other copyright
holders. All persons copying this information are expected to
adhere to the terms and constraints invoked by each author's
copyright. In most cases, these works may not be reposted without
the explicit permission of the copyright holder.