The SKA is an ambitious project to construct the world’s largest radio telescope and enable transformational science and discoveries impossible with current facilities. Built over two sites in Australia and Africa, it will, when complete, provide over a million square metres of collecting area through many thousands of connected radio antennas. The SKA is currently foreseen to be constructed in two phases. Under this phased rollout, more mature technologies will be used early in the project to secure the first wave of scientific discoveries at the earliest opportunity, and the telescope will then be upgraded during the second phase with new technology currently under development. The first phase of the project, SKA1, represents a €650M investment, and European member states, together with other countries around the world, are leading partners in its construction. SKA1 is currently in the pre-construction phase, in which the design and specifications are being finalized, with construction slated to start toward the end of 2018 and first science operations in the early 2020s.
With this timeline, the next few years will be crucial in preparing to support this first SKA science. Based on current projections, the SKA Observatory, once its first phase is operational, is expected to produce an archive of standard data products with a growth rate on the order of 50 to 300 petabytes per year. Although the challenges associated with populating and maintaining the SKA science archive are already impressive, these data products actually represent only the first part of the full science extraction chain. Further processing and subsequent science extraction by the community will require a significant research infrastructure providing capacity in networking, storage, computing, and expertise. The AENEAS project represents an opportunity to pursue the design, deployment, and operation of the necessary research infrastructure for SKA science at a European level and in close coordination with the SKA project, the host countries, and other international partners. Ultimately, our ambition is to ensure the astronomy community has the resources it will need to achieve the truly transformational science potential of the SKA.
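To give a feel for the scale, the projected archive growth can be converted into an average sustained ingest rate. The sketch below is a back-of-the-envelope illustration only. It assumes decimal petabytes (10^15 bytes), a 365.25-day year, and perfectly uniform growth over the year, none of which is specified by the projections above. The function and constant names are ours.

```python
# Back-of-the-envelope conversion of archive growth into a sustained data rate.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year
BYTES_PER_PB = 10**15                  # assumption: decimal petabyte


def sustained_rate_gbps(pb_per_year: float) -> float:
    """Average rate, in gigabits per second, needed to absorb pb_per_year."""
    bits_per_year = pb_per_year * BYTES_PER_PB * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    for growth in (50, 300):  # projected range in petabytes per year
        rate = sustained_rate_gbps(growth)
        print(f"{growth} PB/year ~ {rate:.1f} Gbit/s sustained")
```

Under these assumptions the projected range corresponds to roughly 13 to 76 Gbit/s of continuous ingest, before any allowance for peaks, replication, or redistribution to users.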
Figure 1: An artist’s conception of the two types of radio telescope arrays to be constructed during Phase 1 of the SKA. During this first phase, thousands of radio dishes and stations of phased-array dipoles will be deployed across the deserts of South Africa and Australia, respectively.
SKA as a Big Data Project
The first phase of the SKA Observatory will include telescopes located in both South Africa and Australia and will feature both a high-frequency interferometric array of 15-meter reflecting dishes (SKA1-MID) and a large collection of individual dipole antennas (SKA1-LOW) inspired by the LOFAR approach (see Figure 1). SKA1-LOW in Australia will consist of nearly 130,000 antennas distributed over roughly 500 stations and will operate at frequencies between 50 MHz and 350 MHz. SKA1-MID will be located in South Africa and will consist of 200 dishes equipped with a suite of receivers to cover the 350 MHz – 14 GHz range of the radio spectrum. Signals from the individual telescopes and antenna stations will be transported to a central processing facility on-site in each hosting country, with a dedicated high-bandwidth connection to initial science processing and archiving centres in Perth (AU) and Cape Town (SA).
The SKA will transmit high volumes of data through its dedicated network and intelligently reduce these data to a manageable size in near real time. With data rates of over 1 petabit per second from the dishes and 10 petabits per second from the low-frequency phased arrays, the total data rates when SKA1 is complete and starting operations (between about 2020 and 2023) are expected to exceed the total global internet traffic at present-day rates.
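The degree of data reduction implied by these figures can be estimated by comparing the combined raw rate with the 50 to 300 petabytes per year projected for the archive of standard data products. The sketch below is illustrative only. It treats the quoted raw rates as exact (the dish figure is stated as a lower bound), and it assumes continuous observing, decimal prefixes, and a 365.25-day year. All names are ours.

```python
# Rough ratio between raw antenna data rates and the archived data rate.
PETA = 10**15
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

RAW_MID_BITS_PER_S = 1 * PETA    # "over 1 petabit per second" from the dishes
RAW_LOW_BITS_PER_S = 10 * PETA   # 10 petabits per second from the phased arrays


def reduction_factor(archive_pb_per_year: float) -> float:
    """Ratio of the combined raw rate to the average archive ingest rate."""
    raw_bits_per_s = RAW_MID_BITS_PER_S + RAW_LOW_BITS_PER_S
    archived_bits_per_s = archive_pb_per_year * PETA * 8 / SECONDS_PER_YEAR
    return raw_bits_per_s / archived_bits_per_s


if __name__ == "__main__":
    for growth in (50, 300):  # projected archive growth in PB/year
        factor = reduction_factor(growth)
        print(f"{growth} PB/year archive: raw/archived ~ {factor:.1e}")
```

Under these assumptions the raw signals must be reduced by a factor of roughly 10^5 to 10^6 before they reach the archive, which is why the reduction has to happen in near real time rather than after storage.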
Correspondingly, the processing power that the SKA will need to handle this volume of data will be comparable to that of the largest computers in the world in the early 2020s – systems that are at least ten times the size of today’s biggest machines. The computational processing requirements for the full SKA Phase 1 system are predicted to be of the order of 300 petaflops – about 10 times the performance of the world’s current fastest supercomputer. This level of performance will require the development of innovative management for the ICT infrastructure to ensure sustained, optimal performance throughout the expected SKA lifetime, simultaneously driving and benefiting from the growth in capability provided by the ICT industry.
A range of innovative software will also be required both before and after construction, with much of the pre-construction development being re-usable after construction for monitoring system performance and the impact of component upgrades. With reduced science data products that will still run to hundreds of petabytes per year in size, enabling access to the data for the science community will present a further major challenge. It is currently envisioned that these products will be delivered over intercontinental networks to SKA Regional Centres (SRCs) distributed across the globe, through which scientists in the international community will access them (see Figure 2).
The SKA has been widely identified as one of the major “Big Data” challenges for the next decade. The technical challenges in computing, storage, networking, and analytics required to deploy a research infrastructure capable of supporting European SKA science are also attractive to the IT community, and have much wider applicability in both academic and commercial contexts. A distributed and federated European SRC, therefore, can provide a platform for a European and nationally focused partnership with industry for the continued development of these core technologies and hence a clear route to delivering impact and return.
Figure 2: Illustration of the proposed federated network of SKA Regional Centres (SRCs) distributed around the world. These SRCs will provide access to the accumulated SKA science data for communities in different regions and globally. The proposed European Science Data Centre (ESDC) would serve as the European hub in such a network and support the European SKA community.