BIGDATA project uses gaming hardware to tackle radio data challenges


Large physics experiments are often developed in remote areas: the vast plains of Argentina, a mountainside in Mexico, or deep in the ice at the South Pole in Antarctica. Constructing them is a formidable challenge, and so is collecting and analyzing the data they generate.

Physicists from the University of Wisconsin–Madison and Ohio State University have teamed up to take on a different type of challenge: big data. With funding from the National Science Foundation’s BIGDATA initiative, they will explore ways to better analyze, sort, and transmit data from the Askaryan Radio Array at the South Pole and from the Hydrogen Structure Array in Xinjiang, China.

The three-year project, titled “Advancing real-time data processing and reduction in radio astronomical detectors,” uses graphics processing units, or GPUs, designed for computer games, to analyze and sort physics data. Although it focuses on two remote projects, current and future radio telescopes could benefit from the algorithms and hardware developed as part of the BIGDATA project.

“Modern radio experiments produce huge amounts of data, and the field is rapidly developing,” said UW–Madison physics professor Albrecht Karle of the Wisconsin IceCube Particle Astrophysics Center (WIPAC), principal investigator on the BIGDATA award and the Askaryan Radio Array (ARA). “The projects have very similar technical challenges, and we bring people with different expertise together.”

Karle and Amy Connolly, a professor from the Center for Cosmology and Astroparticle Physics at Ohio State University, will lead efforts in development and testing for ARA, a radio neutrino detector. UW–Madison physics professor Peter Timbie is the lead from the Hydrogen Structure Array, or HSA, a radio experiment designed to map the large-scale structure of the universe.

Radio experiments are well suited for data management research. Faint radio signals observed by each antenna are digitized and cross-correlated with data from other nearby antennae. The antennae also view a broad area of the sky. The combination of a large viewing area and the need to compare signals from multiple antennae means that huge amounts of digital data are created and must be analyzed. ARA is projected to generate 1,000 terabytes (one petabyte) of data per year.
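To illustrate what cross-correlating two antenna signals means, here is a minimal Python sketch. It is not the project's algorithm: the synthetic pulse, the noise level, the 7-sample delay, and the function names `cross_correlate` and `best_lag` are all assumptions made for illustration. The sketch slides one digitized signal past the other and reports the shift at which they match best.

```python
import math
import random


def cross_correlate(a, b, max_lag):
    """Return {lag: sum over i of a[i] * b[i + lag]} for lags in [-max_lag, max_lag]."""
    n = len(a)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += a[i] * b[j]
        result[lag] = total
    return result


def best_lag(a, b, max_lag):
    """Return the lag at which the two signals are most strongly correlated."""
    corr = cross_correlate(a, b, max_lag)
    return max(corr, key=corr.get)


# Synthetic data (assumed, not real detector output): a short oscillating
# pulse arrives at antenna A, and at antenna B a few samples later.
rng = random.Random(0)
n_samples = 256
true_delay = 7
noise_sigma = 0.02

pulse = [math.exp(-((i - 100) / 3.0) ** 2) * math.sin(i) for i in range(n_samples)]
antenna_a = [p + rng.gauss(0.0, noise_sigma) for p in pulse]
antenna_b = [
    (pulse[i - true_delay] if i >= true_delay else 0.0) + rng.gauss(0.0, noise_sigma)
    for i in range(n_samples)
]

estimated_delay = best_lag(antenna_a, antenna_b, max_lag=20)
print("estimated delay in samples:", estimated_delay)
```

The nested loop does work proportional to the number of samples times the number of lags for every pair of antennae. Arithmetic this repetitive is the kind of workload GPUs handle well.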

The BIGDATA project will use GPU clusters as much as possible to compress and manipulate data. The eventual objective is to extract the essential scientific information, such as the path of a particle or a map of particle sources in the universe. That information may correspond to less than a billionth of the original data set.
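A short back-of-the-envelope calculation in Python gives a sense of that scale. It takes the 1,000 terabytes per year projected for ARA and applies a reduction to exactly one billionth. It assumes decimal terabytes (10^12 bytes) and a 365-day year, and the variable names are illustrative.

```python
TERABYTE = 10**12                      # decimal terabyte, in bytes (assumption)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 365-day year (assumption)

raw_bytes_per_year = 1_000 * TERABYTE  # ARA's projected yearly volume
reduction_factor = 1e-9                # "a billionth" of the original data set

# Average rate at which raw data would accumulate, in megabytes per second.
average_rate_mb_per_s = raw_bytes_per_year / SECONDS_PER_YEAR / 1e6

# Size of the essential scientific information after reduction, in bytes.
reduced_bytes_per_year = raw_bytes_per_year * reduction_factor

print(f"average raw rate: {average_rate_mb_per_s:.1f} MB/s")
print(f"reduced data per year: {reduced_bytes_per_year / 1e6:.1f} MB")
```

Under these assumptions, a year of raw data arrives at a few tens of megabytes per second on average, while the distilled result is on the order of a single megabyte.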

“We are taking advantage of a technology that already exists and can be used in a variety of applications. It will help us solve real issues that our projects are facing,” said Karle.

The first year of the project will focus on developing specialized algorithms that cross-correlate data from the individual antennae. In years two and three, mobile GPU-based data reduction computers will be hand-carried to the experiment sites for testing.