Z. Li, S. Liu, X. Yu, K. Bhavya, J. Cao, J. Diffenderfer, P.T. Bremer, V. Pascucci. “Understanding Robustness Lottery”: A Comparative Visual Analysis of Neural Network Pruning Approaches, Subtitled arXiv preprint arXiv:2206.07918, 2022.
Deep learning approaches have provided state-of-the-art performance in many applications by relying on extremely large and heavily overparameterized neural networks. However, such networks have been shown to be very brittle, not generalize well to new uses cases, and are often difficult if not impossible to deploy on resources limited platforms. Model pruning, i.e., reducing the size of the network, is a widely adopted strategy that can lead to more robust and generalizable network -- usually orders of magnitude smaller with the same or even improved performance. While there exist many heuristics for model pruning, our understanding of the pruning process remains limited. Empirical studies show that some heuristics improve performance while others can make models more brittle or have other side effects. This work aims to shed light on how different pruning methods alter the network's internal feature representation, and the corresponding impact on model performance. To provide a meaningful comparison and characterization of model feature space, we use three geometric metrics that are decomposed from the common adopted classification loss. With these metrics, we design a visualization system to highlight the impact of pruning on model prediction as well as the latent feature embedding. The proposed tool provides an environment for exploring and studying differences among pruning methods and between pruned and original model. By leveraging our visualization, the ML researchers can not only identify samples that are fragile to model pruning and data corruption but also obtain insights and explanations on how some pruned …
Across domains massive amounts of scientific data are generated. Because of the large volume of information, data discoverability is often hard if not impossible, especially for scientists who have not generated the data or are from other domains. As part of the NSF-funded National Science Data Fabric (NSDF) initiative, we develop a testbed to demonstrate that these boundaries to data discoverability can be overcome. In support of this effort, we identify the need for indexing large-amounts of scientific data across scientific domains. We propose NSDF-Catalog, a lightweight indexing service with minimal metadata that complements existing domain-specific and rich-metadata collections. NSDF-Catalog is designed to facilitate multiple related objectives within a flexible microservice to: (i) coordinate data movements and replication of data from origin repositories within the NSDF federation; (ii) build an inventory of existing scientific data to inform the design of next-generation cyberinfrastructure; and (iii) provide a suite of tools for discovery of datasets for cross-disciplinary research. Our service indexes scientific data at a fine-granularity at the file or object level to inform data distribution strategies and to improve the experience for users from the consumer perspective, with the goal of allowing end-to-end dataflow optimizations
P. Olaya, J. Luettgau, N. Zhou, J. Lofstead, G. Scorzelli, V. Pascucci, M. Taufer. NSDF-FUSE: A Testbed for Studying Object Storage via FUSE File Systems, In Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, Association for Computing Machinery, pp. 277–278. 2022.
This work presents NSDF-FUSE, a testbed for evaluating settings and performance of FUSE-based file systems on top of S3-compatible object storage; the testbed is part of a suite of services from the National Science Data Fabric (NSDF) project (an NSF-funded project that is delivering cyberinfrastructures for data scientists). We demonstrate how NSDF-FUSE can be deployed to evaluate eight different mapping packages that mount S3-compatible object storage to a file system, as well as six data patterns representing different I/O operations on two cloud platforms. NSDF-FUSE is open-source and can be easily extended to run with other software mapping packages and different cloud platforms.
A. Venkat, D. Hoang, A. Gyulassy, P.T. Bremer, F. Federer, V. Pascucci. High-Quality Progressive Alignment of Large 3D Microscopy Data, In 2022 IEEE 12th Symposium on Large Data Analysis and Visualization (LDAV), pp. 1--10. 2022.
Large-scale three-dimensional (3D) microscopy acquisitions fre-quently create terabytes of image data at high resolution and magni-fication. Imaging large specimens at high magnifications requires acquiring 3D overlapping image stacks as tiles arranged on a two-dimensional (2D) grid that must subsequently be aligned and fused into a single 3D volume. Due to their sheer size, aligning many overlapping gigabyte-sized 3D tiles in parallel and at full resolution is memory intensive and often I/O bound. Current techniques trade accuracy for scalability, perform alignment on subsampled images, and require additional postprocess algorithms to refine the alignment quality, usually with high computational requirements. One common solution to the memory problem is to subdivide the overlap region into smaller chunks (sub-blocks) and align the sub-block pairs in parallel, choosing the pair with the most reliable alignment to determine the global transformation. Yet aligning all sub-block pairs at full resolution remains computationally expensive. The key to quickly developing a fast, high-quality, low-memory solution is to identify a single or a small set of sub-blocks that give good alignment at full resolution without touching all the overlapping data. In this paper, we present a new iterative approach that leverages coarse resolution alignments to progressively refine and align only the promising candidates at finer resolutions, thereby aligning only a small user-defined number of sub-blocks at full resolution to determine the lowest error transformation between pairwise overlapping tiles. Our progressive approach is 2.6x faster than the state of the art, requires less than 450MB of peak RAM (per parallel thread), and offers a higher quality alignment without the need for additional postprocessing refinement steps to correct for alignment errors.
E. Deelman, A. Mandal, A. P. Murillo, J. Nabrzyski, V. Pascucci, R. Ricci, I. Baldin, S. Sons, L. Christopherson, C. Vardeman, R. F. da Silva, J. Wyngaard, S. Petruzza, M. Rynge, K. Vahi, W. R. Whitcup, J. Drake, E. Scott. Blueprint: Cyberinfrastructure Center of Excellence, Subtitled arXiv, 2021.
In 2018, NSF funded an effort to pilot a Cyberinfrastructure Center of Excellence (CI CoE or Center) that would serve the cyberinfrastructure (CI) needs of the NSF Major Facilities (MFs) and large projects with advanced CI architectures. The goal of the CI CoE Pilot project (Pilot) effort was to develop a model and a blueprint for such a CoE by engaging with the MFs, understanding their CI needs, understanding the contributions the MFs are making to the CI community, and exploring opportunities for building a broader CI community. This document summarizes the results of community engagements conducted during the first two years of the project and describes the identified CI needs of the MFs. To better understand MFs' CI, the Pilot has developed and validated a model of the MF data lifecycle that follows the data generation and management within a facility and gained an understanding of how this model captures the fundamental stages that the facilities' data passes through from the scientific instruments to the principal investigators and their teams, to the broader collaborations and the public. The Pilot also aimed to understand what CI workforce development challenges the MFs face while designing, constructing, and operating their CI and what solutions they are exploring and adopting within their projects. Based on the needs of the MFs in the data lifecycle and workforce development areas, this document outlines a blueprint for a CI CoE that will learn about and share the CI solutions designed, developed, and/or adopted by the MFs, provide expertise to the largest NSF projects with advanced and complex CI architectures, and foster a …
A. A. Gooch, S. Petruzza, A. Gyulassy, G. Scorzelli, V. Pascucci, L. Rantham, W. Adcock, C. Coopmans. Lessons learned towards the immediate delivery of massive aerial imagery to farmers and crop consultants, In Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping VI, Vol. 11747, International Society for Optics and Photonics, pp. 22 -- 34. 2021.
In this paper, we document lessons learned from using ViSOAR Ag Explorer™ in the fields of Arkansas and Utah in the 2018-2020 growing seasons. Our insights come from creating software with fast reading and writing of 2D aerial image mosaics for platform-agnostic collaborative analytics and visualization. We currently enable stitching in the field on a laptop without the need for an internet connection. The full resolution result is then available for instant streaming visualization and analytics via Python scripting. While our software, ViSOAR Ag Explorer™ removes the time and labor software bottleneck in processing large aerial surveys, enabling a cost-effective process to deliver actionable information to farmers, we learned valuable lessons with regard to the acquisition, storage, viewing, analysis, and planning stages of aerial data surveys. Additionally, with the ultimate goal of stitching thousands of images in minutes on board a UAV at the time of data capture, we performed preliminary tests for on-board, real-time stitching and analysis on USU AggieAir sUAS using lightweight computational resources. This system is able to create a 2D map while flying and allow interactive exploration of the full resolution data as soon as the platform has landed or has access to a network. This capability further speeds up the assessment process on the field and opens opportunities for new real-time photogrammetry applications. Flying and imaging over 1500-2000 acres per week provides up-to-date maps that give crop consultants a much broader scope of the field in general as well as providing a better view into planting and field preparation than could be observed from field level. Ultimately, our software and hardware could provide a much better understanding of weed presence and intensity or lack thereof.
X. Huang, P. Klacansky, S. Petruzza, A. Gyulassy, P.T. Bremer, V. Pascucci. Distributed merge forest: a new fast and scalable approach for topological analysis at scale, In Proceedings of the ACM International Conference on Supercomputing, pp. 367-377. 2021.
Topological analysis is used in several domains to identify and characterize important features in scientific data, and is now one of the established classes of techniques of proven practical use in scientific computing. The growth in parallelism and problem size tackled by modern simulations poses a particular challenge for these approaches. Fundamentally, the global encoding of topological features necessitates inter process communication that limits their scaling. In this paper, we extend a new topological paradigm to the case of distributed computing, where the construction of a global merge tree is replaced by a distributed data structure, the merge forest, trading slower individual queries on the structure for faster end-to-end performance and scaling. Empirically, the queries that are most negatively affected also tend to have limited practical use. Our experimental results demonstrate the scalability of both the merge forest construction and the parallel queries needed in scientific workflows, and contrast this scalability with the two established alternatives that construct variations of a global tree.
Z. Li, H. Menon, K. Mohror, P. T. Bremer, Y. Livant, V. Pascucci. Understanding a program's resiliency through error propagation, In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM, pp. 362-373. 2021.
Aggressive technology scaling trends have worsened the transient fault problem in high-performance computing (HPC) systems. Some faults are benign, but others can lead to silent data corruption (SDC), which represents a serious problem; a fault introducing an error that is not readily detected nto an HPC simulation. Due to the insidious nature of SDCs, researchers have worked to understand their impact on applications. Previous studies have relied on expensive fault injection campaigns with uniform sampling to provide overall SDC rates, but this solution does not provide any feedback on the code regions without samples.
T. McDonald, R. Shrestha, X. Yi, H. Bhatia, D. Chen, D. Goswami, V. Pascucci, T. Turbyville, P‐T Bremer. Leveraging Topological Events in Tracking Graphs for Understanding Particle Diffusion, In Computer Graphics Forum, Vol. 40, No. 3, pp. 251-262. 2021.
Single particle tracking (SPT) of fluorescent molecules provides significant insights into the diffusion and relative motion of tagged proteins and other structures of interest in biology. However, despite the latest advances in high-resolution microscopy, individual particles are typically not distinguished from clusters of particles. This lack of resolution obscures potential evidence for how merging and splitting of particles affect their diffusion and any implications on the biological environment. The particle tracks are typically decomposed into individual segments at observed merge and split events, and analysis is performed without knowing the true count of particles in the resulting segments. Here, we address the challenges in analyzing particle tracks in the context of cancer biology. In particular, we study the tracks of KRAS protein, which is implicated in nearly 20% of all human cancers, and whose clustering and aggregation have been linked to the signaling pathway leading to uncontrolled cell growth. We present a new analysis approach for particle tracks by representing them as tracking graphs and using topological events – merging and splitting, to disambiguate the tracks. Using this analysis, we infer a lower bound on the count of particles as they cluster and create conditional distributions of diffusion speeds before and after merge and split events. Using thousands of time-steps of simulated and in-vitro SPT data, we demonstrate the efficacy of our method, as it offers the biologists a new, detailed look into the relationship between KRAS clustering and diffusion speeds.
We present a Python-based renderer built on NVIDIA's OptiX ray tracing engine and the OptiX AI denoiser, designed to generate high-quality synthetic images for research in computer vision and deep learning. Our tool enables the description and manipulation of complex dynamic 3D scenes containing object meshes, materials, textures, lighting, volumetric data (e.g., smoke), and backgrounds. Metadata, such as 2D/3D bounding boxes, segmentation masks, depth maps, normal maps, material properties, and optical flow vectors, can also be generated. In this work, we discuss design goals, architecture, and performance. We demonstrate the use of data generated by path tracing for training an object detector and pose estimator, showing improved performance in sim-to-real transfer in situations that are difficult for traditional raster-based renderers. We offer this tool as an easy-to-use, performant, high-quality renderer for advancing research in synthetic data generation and deep learning.
A. Venkat, A. Gyulassy, G. Kosiba, A. Maiti, H. Reinstein, R. Gee, P.-T. Bremer, V. Pascucci. Towards replacing physical testing of granular materials with a Topology-based Model, Subtitled arXiv preprint arXiv:2109.08777, 2021.
In the study of packed granular materials, the performance of a sample (e.g., the detonation of a high-energy explosive) often correlates to measurements of a fluid flowing through it. The "effective surface area," the surface area accessible to the airflow, is typically measured using a permeametry apparatus that relates the flow conductance to the permeable surface area via the Carman-Kozeny equation. This equation allows calculating the flow rate of a fluid flowing through the granules packed in the sample for a given pressure drop. However, Carman-Kozeny makes inherent assumptions about tunnel shapes and flow paths that may not accurately hold in situations where the particles possess a wide distribution in shapes, sizes, and aspect ratios, as is true with many powdered systems of technological and commercial interest. To address this challenge, we replicate these measurements virtually on micro-CT images of the powdered material, introducing a new Pore Network Model based on the skeleton of the Morse-Smale complex. Pores are identified as basins of the complex, their incidence encodes adjacency, and the conductivity of the capillary between them is computed from the cross-section at their interface. We build and solve a resistive network to compute an approximate laminar fluid flow through the pore structure. We provide two means of estimating flow-permeable surface area: (i) by direct computation of conductivity, and (ii) by identifying dead-ends in the flow coupled with isosurface extraction and the application of the Carman-Kozeny equation, with the aim of establishing consistency over a range of particle shapes, sizes, porosity levels, and void distribution patterns.
H. Childs, S. D. Ahern, J. Ahrens, A. C. Bauer, J. Bennett, E. W. Bethel, P. Bremer, E. Brugger, J. Cottam, M. Dorier, S. Dutta, J. M. Favre, T. Fogal, S. Frey, C. Garth, B. Geveci, W. F. Godoy, C. D. Hansen, C. Harrison, B. Hentschel, J. Insley, C. R. Johnson, S. Klasky, A. Knoll, J. Kress, M. Larsen, J. Lofstead, K. Ma, P. Malakar, J. Meredith, K. Moreland, P. Navratil, P. O’Leary, M. Parashar, V. Pascucci, J. Patchett, T. Peterka, S. Petruzza, N. Podhorszki, D. Pugmire, M. Rasquin, S. Rizzi, D. H. Rogers, S. Sane, F. Sauer, R. Sisneros, H. Shen, W. Usher, R. Vickery, V. Vishwanath, I. Wald, R. Wang, G. H. Weber, B. Whitlock, M. Wolf, H. Yu, S. B. Ziegeler. A Terminology for In Situ Visualization and Analysis Systems, In International Journal of High Performance Computing Applications, Vol. 34, No. 6, pp. 676–691. 2020.
The term “in situ processing” has evolved over the last decade to mean both a specific strategy for visualizing and analyzing data and an umbrella term for a processing paradigm. The resulting confusion makes it difficult for visualization and analysis scientists to communicate with each other and with their stakeholders. To address this problem, a group of over fifty experts convened with the goal of standardizing terminology. This paper summarizes their findings and proposes a new terminology for describing in situ systems. An important finding from this group was that in situ systems are best described via multiple, distinct axes: integration type, proximity, access, division of execution, operation controls, and output type. This paper discusses these axes, evaluates existing systems within the axes, and explores how currently used terms relate to the axes.
L. Cinquini, S. Petruzza, Jason J. Boutte, S. Ames, G. Abdulla, V. Balaji, R. Ferraro, A. Radhakrishnan, L. Carriere, T. Maxwell, G. Scorzelli, V. Pascucci. Distributed Resources for the Earth System Grid Advanced Management (DREAM), Final Report, 2020.
The DREAM project was funded more than 3 years ago to design and implement a next-generation ESGF (Earth System Grid Federation ) architecture which would be suitable for managing and accessing data and services resources on a distributed and scalable environment. In particular, the project intended to focus on the computing and visualization capabilities of the stack, which at the time were rather primitive. At the beginning, the team had the general notion that a better ESGF architecture could be built by modularizing each component, and redefining its interaction with other components by defining and exposing a well defined API. Although this was still the high level principle that guided the work, the DREAM project was able to accomplish its goals by leveraging new practices in IT that started just about 3 or 4 years ago: the advent of containerization technologies (specifically, Docker), the development of frameworks to manage containers at scale (Docker Swarm and Kubernetes), and their application to the commercial Cloud. Thanks to these new technologies, DREAM was able to improve the ESGF architecture (including its computing and visualization services) to a level of deployability and scalability beyond the original expectations.
Machine learning and other Artifical Intelligenece technologies (all indicated in the following as AI) used within a modern, smart cyberinfrastructure have become critical new avenues for discovery and validation in data-driven science and engineering disciplines of all kinds. We can expect many landmark discoveries and new lines of productive research to be enabled through AI analysis of the rapidly growing treasure trove of scientific data. AI-based techniques have been applied in many fields of science and engineering, including remote sensing, cosmology, energy, cancer research, IT systems management, and machine design and control, but the lack of proper integration with the current NSF-supported cyberinfrastructure is limiting their potential. Recent events due to the COVID-19 pandemic have highlighted how cyberinfrastructure is a crucial enabler of modern research, with massive simulations and data management capabilities [8-10], but these events have also emphasized how the lack of proper integration with AI technology remains a major limiting factor for the advancement of science and engineering, especially when any kind of rapid response is needed.
A. Gyulassy, P.-T. Bremer, V. Pascucci. Shared-Memory Parallel Computation of Morse-Smale Complexes with Improved Accuracy, In IEEE Transactions on Visualization and Computer Graphics, Vol. 25, No. 1, IEEE, pp. 1183--1192. Jan, 2019.
Topological techniques have proven to be a powerful tool in the analysis and visualization of large-scale scientific data. In particular, the Morse-Smale complex and its various components provide a rich framework for robust feature definition and computation. Consequently, there now exist a number of approaches to compute Morse-Smale complexes for large-scale data in parallel. However, existing techniques are based on discrete concepts which produce the correct topological structure but are known to introduce grid artifacts in the resulting geometry. Here, we present a new approach that combines parallel streamline computation with combinatorial methods to construct a high-quality discrete Morse-Smale complex. In addition to being invariant to the orientation of the underlying grid, this algorithm allows users to selectively build a subset of features using high-quality geometry. In particular, a user may specifically select which ascending/descending manifolds are reconstructed with improved accuracy, focusing computational effort where it matters for subsequent analysis. This approach computes Morse-Smale complexes for larger data than previously feasible with significant speedups. We demonstrate and validate our approach using several examples from a variety of different scientific domains, and evaluate the performance of our method.
D. Hoang, P. Klacansky, H. Bhatia, P.-T. Bremer, P. Lindstrom, V. Pascucci. A Study of the Trade-off Between Reducing Precision and Reducing Resolution for Data Analysis and Visualization, In IEEE Transactions on Visualization and Computer Graphics, Vol. 25, No. 1, IEEE, pp. 1193--1203. Jan, 2019.
There currently exist two dominant strategies to reduce data sizes in analysis and visualization: reducing the precision of the data, e.g., through quantization, or reducing its resolution, e.g., by subsampling. Both have advantages and disadvantages and both face fundamental limits at which the reduced information ceases to be useful. The paper explores the additional gains that could be achieved by combining both strategies. In particular, we present a common framework that allows us to study the trade-off in reducing precision and/or resolution in a principled manner. We represent data reduction schemes as progressive streams of bits and study how various bit orderings such as by resolution, by precision, etc., impact the resulting approximation error across a variety of data sets as well as analysis tasks. Furthermore, we compute streams that are optimized for different tasks to serve as lower bounds on the achievable error. Scientific data management systems can use the results presented in this paper as guidance on how to store and stream data to make efficient use of the limited storage and bandwidth in practice.
W. Usher, I. Wald, J. Amstutz, J. Gunther, C. Brownlee, V. Pascucci. Scalable Ray Tracing Using the Distributed FrameBuffer, In Eurographics Conference on Visualization (EuroVis) 2019, Vol. 38, No. 3, 2019.
Image- and data-parallel rendering across multiple nodes on high-performance computing systems is widely used in visualization to provide higher frame rates, support large data sets, and render data in situ. Specifically for in situ visualization, reducing bottlenecks incurred by the visualization and compositing is of key concern to reduce the overall simulation runtime. Moreover, prior algorithms have been designed to support either image- or data-parallel rendering and impose restrictions on the data distribution, requiring different implementations for each configuration. In this paper, we introduce the Distributed FrameBuffer, an asynchronous image-processing framework for multi-node rendering. We demonstrate that our approach achieves performance superior to the state of the art for common use cases, while providing the flexibility to support a wide range of parallel rendering algorithms and data distributions. By building on this framework, we extend the open-source ray tracing library OSPRay with a data-distributed API, enabling its use in data-distributed and in situ visualization applications.
H. Bhatia, A.G. Gyulassy, V. Lordi, J.E. Pask, V. Pascucci, P.T. Bremer. TopoMS: Comprehensive topological exploration for molecular and condensed‐matter systems, In Journal of Computational Chemistry, Vol. 39, No. 16, Wiley, pp. 936--952. March, 2018.
We introduce TopoMS, a computational tool enabling detailed topological analysis of molecular and condensed‐matter systems, including the computation of atomic volumes and charges through the quantum theory of atoms in molecules, as well as the complete molecular graph. With roots in techniques from computational topology, and using a shared‐memory parallel approach, TopoMS provides scalable, numerically robust, and topologically consistent analysis. TopoMS can be used as a command‐line tool or with a GUI (graphical user interface), where the latter also enables an interactive exploration of the molecular graph. This paper presents algorithmic details of TopoMS and compares it with state‐of‐the‐art tools: Bader charge analysis v1.0 (Arnaldsson et al., 01/11/17) and molecular graph extraction using Critic2 (Otero‐de‐la‐Roza et al., Comput. Phys. Commun. 2014, 185, 1007). TopoMS not only combines the functionality of these individual codes but also demonstrates up to 4× performance gain on a standard laptop, faster convergence to fine‐grid solution, robustness against lattice bias, and topological consistency. TopoMS is released publicly under BSD License. © 2018 Wiley Periodicals, Inc.
Inspired by the works of Forman on discrete Morse theory, which is a combinatorial adaptation to cell complexes of classical Morse theory on manifolds, we introduce a discrete analogue of the stratified Morse theory of Goresky and MacPherson (1988). We describe the basics of this theory and prove fundamental theorems relating the topology of a general simplicial complex with the critical simplices of a discrete stratified Morse function on the complex. We also provide an algorithm that constructs a discrete stratified Morse function out of an arbitrary function defined on a finite simplicial complex; this is different from simply constructing a discrete Morse function on such a complex. We borrow Forman's idea of a "user's guide," where we give simple examples to convey the utility of our theory.