LHC Computing (Draft version)
Getting to grips with the Grid

The year 2002 was a turning point for the construction of a computing Grid for the LHC. The LHC Computing Grid (LCG) project was officially launched, with a mission to integrate thousands of computers at dozens of participating centres worldwide into a global computing resource. This technological tour de force will rely on software being developed in the European DataGrid (EDG) project, which is led by CERN and is the largest software development project ever funded by the EU. It will also benefit from hardware developments initiated in the CERN openlab for DataGrid applications, a novel partnership between CERN and industry.

The Grid may well be the computer buzzword of the decade. Not since the World Wide Web was developed at CERN, over ten years ago, has a new networking technology held so much promise for both science and society. Once again, CERN is set to play a leading role in making the technology a reality.

The name Grid was coined in analogy with the way geographically distributed power stations supply power seamlessly to the electrical grid. The philosophy of the Grid is to provide vast amounts of computer power at the click of a mouse, by linking geographically distributed computers and developing software to run this network of computers as though it were a monolithic resource. Whereas the Web gives access to distributed information, the Grid does the same for distributed processing power and storage capacity.

There are many varieties of Grid technology. In the commercial arena, Grids that harness the combined power of many workstations within a single organisation are already common. Another popular Grid-like technology is the screensaver application, such as SETI@home, which uses spare time on PCs to analyse scientific data. However, CERN’s objective is altogether more ambitious. The amount of data that will pour out of the LHC experiments will be of the order of 10 petabytes a year, the equivalent of over 10 million CD-ROMs. Storing this data in a distributed fashion, and making it easily accessible to thousands of scientists around the world, is one of the major challenges for the LCG project.
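
As a rough check on the CD-ROM comparison, the arithmetic can be sketched as follows. The disc capacity (about 700 MB) and the decimal definition of a petabyte are assumptions made for illustration; the article states neither, and the function name is hypothetical.

```python
# Back-of-the-envelope check of "10 petabytes is over 10 million CD-ROMs".
# Assumptions (not from the article): 1 PB = 10**15 bytes, 1 CD-ROM = 700 MB.
PETABYTE = 10**15
CD_ROM_BYTES = 700 * 10**6


def cd_roms_needed(data_bytes, disc_bytes=CD_ROM_BYTES):
    """Return the number of discs needed to hold data_bytes, rounded up."""
    return -(-data_bytes // disc_bytes)  # ceiling division on integers


annual_lhc_bytes = 10 * PETABYTE
discs = cd_roms_needed(annual_lhc_bytes)
print(f"{discs:,} CD-ROMs per year")
```

Under these assumptions the result is roughly 14 million discs, which is consistent with the "over 10 million" figure; a smaller assumed disc capacity would push the count higher.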

The LCG project will pool the power of thousands of computers. In 2002, LCG began rapidly gearing up for this challenge, with over 50 computer scientists and engineers from partner research centres around the world joining over the year. The focus in 2002 was on defining the stringent data storage and processing requirements of the experiments. For example, a key technical requirement is to ensure data "persistency", which is how the Grid maintains data availability at all times, even as the underlying network of computers evolves.

The European dimension

One of the challenges of building a Grid is that the software needed to keep it ticking over - the middleware - barely exists. In engineering terms, it is rather like trying to build a suspension bridge before the technology for steel cables has been fully developed. CERN is not alone in facing this challenge. Other disciplines, such as bioinformatics and Earth observation, are also contemplating huge increases in computing and storage requirements, demanding similar technology. This is why CERN, together with a host of leading European research centres, took the initiative for the European DataGrid (EDG) project, to develop a testbed for Grid technologies.

EDG builds on a software toolkit for Grid technology known as Globus, developed in the US, as well as other software packages, and uses these to build a functioning Grid testbed. The project brings together over 100 computer engineers, who have already generated over 300 000 lines of code. In 2002, EDG middleware managed to connect computing resources at some 20 major centres. In collaboration with the LHC experiments CMS and ATLAS, a number of demanding computational challenges were successfully completed. This proved that many components of the EDG software are ready for use in the LCG project.

The success of EDG has generated strong support for a follow-up effort to build a permanent European Grid infrastructure that can serve a broad spectrum of applications reliably and continuously. Providing such a service will require a much larger effort than setting up the current testbed. CERN has therefore established a pan-European consortium to build a production Grid infrastructure, in the context of the EU 6th Framework Programme. The potential benefits of such an infrastructure for Europe will extend well beyond the LHC, to nearly all areas of science and commerce where large amounts of data must be managed in a distributed way. However, the benefits of Grid technology are not just regional but global. Reflecting this, CERN has been playing an active role in the Global Grid Forum, which helps to set worldwide standards in this rapidly evolving field.

The year 2002 also saw the launch of several other EU-funded Grid projects in which CERN plays a significant role. CERN is leading DataTag, which ensures high-speed links and middleware compatibility between Grids in Europe and the US. CERN is also involved in CrossGrid, a project that aims to extend EDG functionality to advanced applications such as real-time simulations. The GRACE project develops a higher layer of software for Grids, including concepts such as semantic Grids, which provide contextual meaning to information stored on the Grid. CERN is also a partner in MammoGrid, a project dedicated to building a Grid for hospitals to share and analyse mammograms in an effort to improve breast cancer treatment. Finally, GridStart aims to coordinate the efforts of nine major Grid initiatives in Europe, including EDG, and to disseminate information about the benefits of Grid technology to industry and society.

Open for industry

In 2002, HP joined Intel and Enterasys Networks in the CERN openlab for DataGrid applications. This partnership has launched an ambitious project called CERN opencluster, which combines 64-bit processor technology from Intel, computer clusters from HP, and a 10 gigabit/s switching environment from Enterasys Networks. The objective is to build a cluster based on technologies that are well beyond the cutting edge of what is available on the market today.

The CERN openlab partnership allows CERN to peer into the technological crystal ball and test technologies that may well be commercially competitive when the LHC is up and running. The industrial partners view this as a great opportunity to develop and test new technologies, which are still far from the market, under the rigorous and demanding conditions that CERN's advanced computing environment provides. In particular, the CERN opencluster will be linked to the EDG testbed, to see how these new technologies perform in a Grid environment. The results will also be closely monitored by the LCG project, to determine how the new technologies fit into the project’s future technology roadmap.

The CERN openlab provides the LHC with a vital source of industrial sponsorship for long-term technology development. The equipment for the CERN opencluster, as well as funding for some of the researchers to develop it, is provided by the industrial partners as part of CERN openlab membership requirements. The concept has proved very popular, with other major computer and software manufacturers eager to join.

François Grey

Last update: Thursday, 26. January 2012 13:12


Copyright CERN