
CERN openlab for DataGrid applications

Contribution to IT department Annual Report 2003

F. Fluckiger, S. Jarp

 

The CERN openlab for DataGrid applications is a framework for evaluating and integrating cutting-edge technologies or services in partnership with industry, focusing on potential solutions for the LHC Computing Grid (LCG). The openlab invites members of industry to join and contribute systems, resources or services, and to carry out with CERN large-scale, high-performance evaluations of their solutions in an advanced integrated environment.

 

In a nutshell, the major achievements in 2003 were: the successful incorporation of two new partners, IBM and Oracle; the consolidation and expansion of the opencluster (a powerful compute and storage farm); the start of the gridification of the opencluster; the 10 Gbps challenge, in which very high transfer rates were achieved over LAN and WAN distances (the latter in collaboration with other groups); the organization of three thematic workshops, including one on Total Cost of Ownership; the creation of a new, lighter category of sponsors, called contributors; and the implementation of the openlab student programme, which brought 11 students to CERN in the summer.

 

Management

The project is formally led by the IT Department Head, seconded by the Associate Head, the Chief Technology Officer and the Communication and Development Officer (the latter function will be provided in 2004 by the departmental Strategy and Communication unit).

 

Industrial Sponsors

The year 2003 started with three sponsors: Enterasys Networks (contributing high bit-rate network equipment), Hewlett-Packard (computer servers and fellows) and Intel Corporation (64-bit processor technology and 10 Gbps Network Interface Cards). In March 2003, IBM joined the openlab (to contribute hardware and software disk storage solutions), followed by Oracle Corporation (to contribute Grid technology and fellows).

 

The annual Board of Sponsors meeting was successfully held on 13 June, and the annual report was issued on that occasion. In addition, three thematic workshops were organized (on Storage and Data Management, Fabric Management, and Total Cost of Ownership). On the latter topic (TCO), a position paper establishing the facts and figures was produced.

 

In order to permit the time-limited incorporation of sponsors fulfilling specific technical missions, the concept of contributor was devised and proposed to the existing sponsors. Contributor status (as opposed to the partner status of existing sponsors) implies a lower financial commitment and correspondingly lesser benefits in terms of influence and exposure.

 

Technical progress

The openlab is constructing the opencluster, a pilot compute and storage farm based on HP's dual-processor machines, Intel's Itanium Processor Family (IPF) processors, Enterasys's 10 Gbps switches, IBM's StorageTank system and Oracle's 10g Grid solution.

 

In 2003, the opencluster was first expanded with 32 servers (RX2600) equipped with second-generation IPF processors (1 GHz) and running Red Hat Linux Enterprise Server 2.1 with openAFS and LSF. In October, 16 servers equipped with third-generation IPF processors (1.3 GHz) were added. The cluster is complemented by seven development systems.

 

The concept of an openlab technical challenge (where tangible objectives are jointly targeted by some or all of the partners) was proposed to the sponsors. The first instantiation was the 10 Gbps Challenge, a common effort by Enterasys, HP, Intel and CERN. In a first experiment, two Linux-based HP computers with 1 GHz IPF processors, directly connected back-to-back through 10 GbE Network Interface Cards over a 10 km fibre, reached 5.7 Gbps in a single-stream memory-to-memory transfer. To extend the tests over WAN distances, collaborations took place with the DataTag project and the ATLAS DAQ group. Using openlab IPF-based servers as end-systems, DataTag and Caltech established a new Internet2 land-speed world record. Extensive tests with Enterasys's ER16 router demonstrated that 10 Gbps rates could only be achieved through multiple parallel streams. An upgrade strategy, including the use of Enterasys's new N7 devices in 2004, was agreed between Enterasys and CERN.

 

On the storage front, a 30 TB disk subsystem (six meta-data servers and eight disk servers) was installed, using IBM's StorageTank solution. Performance tests will be conducted in 2004.

 

The porting of physics applications (in collaboration with EP/SFT) and of CERN systems to IPF continued in 2003, including CASTOR, CLHEP, GEANT4 and ROOT. Other groups also ported their applications (including ALIROOT by the ALICE collaboration and CMSIM by CMS US). Results of scalability tests with PROOF were reported at the CHEP2003 conference. As another example of collaboration with other groups, twenty of the IPF servers were used by ALICE for their 5th Data Challenge.

 

The gridification effort culminated in the porting of the LCG middleware (based on VDT and EDG). After some difficulties, porting was almost complete at the end of the year. HP Labs' SmartFrog monitoring system was evaluated; as the first results are promising, the effort will continue in 2004.

 

A full technical report is annexed.

 

Dissemination and Development activities

In addition to the thematic workshops organized in the framework of the technical programme, two papers were published in the proceedings of the CHEP2003 conference, one article was published in the CERN Courier, and three joint press releases were issued. The openlab also hosted at CERN two meetings of the First Tuesday Suisse Romande series, with active participation from openlab partners.

 

Following a series of meetings, a document exploring the possibilities for development of the openlab in the field of security was produced.

 

Based on a pilot programme run in 2002, a CERN openlab student programme was run in the summer of 2003, involving 11 students from seven European countries. Four of these students contributed directly to the opencluster activity; the others worked on Athena and on the development of the Grid Café web site. The latter was successfully demonstrated at the Telecom 2003 exhibition and at the SIS-Forum, part of the World Summit on the Information Society event.

 

Resources

The openlab integrates technical and managerial efforts from several IT groups: ADC (Technical Management; opencluster via two fellows who joined in 2003; StorageTank); CS (10 GbE networking); DB (Oracle 10g); DI (Project management, communication).

 


Annex: Detailed technical report

 

Basic systems

After receiving nine “development” systems in the last part of 2002, the openlab Itanium Processor Family (IPF) cluster was expanded with 32 “production” servers at the beginning of the year. These are HP RX2600 Integrity servers equipped with Intel’s second-generation IPF processors running at 1 GHz.

The chosen 64-bit software stack consisted of Red Hat’s Linux Enterprise Server 2.1 (beta version) with openAFS for file access and the Load Sharing Facility (LSF) for batch control. The installation process was fully aligned with CERN’s standard procedures for installing and maintaining Linux on the standard 32-bit PC systems.

Several compilers, most notably the GNU and the Intel compilers for Itanium, were installed and updated at regular intervals.

In October, 16 additional servers were added to the cluster. These contained Intel's third-generation Itanium processors, running at 1.3 GHz. Furthermore, two systems (in workstation chassis) running at 1.5 GHz were installed and used to obtain the best possible benchmarking results. An agreement was reached with HP and Intel that should allow all the installed systems to be upgraded to 1.5 GHz early in 2004.

 

10 Gbit NIC testing

In a relatively simple experiment involving Intel's 10 Gbit Ethernet cards in two servers connected back-to-back, we obtained record-breaking speeds. In a memory-to-memory test using CERN's GENSINK test program, a single stream was measured at 5.7 Gbps (using large frames). This caught the attention of both the DataTag project and the ATLAS DAQ group, and both groups "borrowed" two Itanium systems in order to saturate their transatlantic lines at 10 Gbps. Both groups demonstrated their set-ups, at Telecom 2003 in October and at RSIS in December. DataTag set new Internet2 land-speed records, using IPv4, in tests carried out between Geneva and Caltech.
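
To make the nature of this test concrete, the sketch below shows a minimal single-stream, memory-to-memory TCP sender in C++. It is illustrative only: the actual GENSINK program is not reproduced here, and the peer address, port and transfer size are hypothetical.

    // Minimal sketch of a single-stream memory-to-memory TCP sender,
    // in the spirit of (but not identical to) CERN's GENSINK test.
    // The peer address, port and 10 GiB transfer size are illustrative.
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <cstdio>
    #include <ctime>
    #include <vector>

    int main(int argc, char **argv) {
        const char *host = argc > 1 ? argv[1] : "10.0.0.2";  // hypothetical peer
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(5001);                         // hypothetical port
        inet_pton(AF_INET, host, &addr.sin_addr);
        if (connect(sock, (sockaddr *)&addr, sizeof addr) < 0) {
            perror("connect");
            return 1;
        }
        std::vector<char> buf(1 << 20);       // 1 MiB in-memory payload
        const long long target = 10LL << 30;  // stop after 10 GiB
        long long sent = 0;
        timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        while (sent < target) {
            ssize_t n = write(sock, buf.data(), buf.size());
            if (n <= 0) break;                // peer closed or error
            sent += n;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        std::printf("%.2f Gbps single stream\n", sent * 8 / s / 1e9);
        close(sock);
        return 0;
    }

Paired with a receiver that simply reads and discards the data, such a test measures raw end-to-end TCP throughput with no disk involvement, which is why it isolates the NIC, the driver and the TCP stack.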

In the openlab, where a third Enterasys ER16 router was installed, 10 Gbps LAN tests were also run. The results showed that these routers could only reach speeds close to 10 Gbps when aggregating traffic; single-stream traffic between two high-speed servers remained limited to 1 Gbps due to the trunk-based design of these routers. Enterasys and CERN have agreed on an upgrade strategy for the openlab networking equipment that should allow great improvements in 2004. In a first phase (towards the end of the year), CERN installed four N7 high-speed switches that should allow more than 300 connections at 1 Gbps and half a dozen at 10 Gbps (with improved throughput capabilities).

 

StorageTank

IBM joined the CERN openlab in April and, as a result, we installed a large disk subsystem consisting of six meta-data servers and eight disk servers with almost 30 TB of storage capacity. Intensive tests of this iSCSI-based disk system were carried out, and we believe that the first Data Challenges can be launched early in 2004 to evaluate the performance of this novel approach to disk storage.
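
As an illustration of the kind of measurement such Data Challenges typically start from, here is a minimal sequential-read throughput probe in C++. The mount point and block size are hypothetical; this is a sketch, not the actual 2004 test plan.

    // Hedged sketch: time a large sequential read and report MB/s.
    // The path under a hypothetical StorageTank mount point and the
    // 8 MiB block size are illustrative choices, not the real setup.
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <ctime>
    #include <vector>

    int main(int argc, char **argv) {
        const char *path = argc > 1 ? argv[1] : "/storagetank/testfile";
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }
        std::vector<char> buf(8 << 20);  // 8 MiB blocks
        long long total = 0;
        timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        ssize_t n;
        while ((n = read(fd, buf.data(), buf.size())) > 0)
            total += n;                  // count bytes until end of file
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        std::printf("%lld bytes read at %.1f MB/s\n", total, total / s / 1e6);
        close(fd);
        return 0;
    }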

 

Application porting, benchmarks, scalability tests

Early in the year, the porting of CERN's data-management package, CASTOR, was carried out successfully. Additionally, several of CERN's key software packages, such as CLHEP, GEANT4 and ROOT, were ported to the Itanium systems in collaboration with EP/SFT staff. The entire ALIROOT framework (including a "private" port of CERNLIB) was also ported within the ALICE collaboration. The CMS simulation program, CMSIM, was ported by CMS US, and the entire reconstruction framework, ORCA, is in the process of being ported. The latter requires both the SEAL and POOL packages (supplemented by a large number of external packages) to be ported as well, so the verification effort is more complex.

The scalability of parallel ROOT queries (via PROOF) was tested on the 32-node cluster and found to be linear. This encouraging result was reported at CHEP2003, and the aim is to use 80-100 nodes next year for an expanded scalability test.
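
For readers unfamiliar with PROOF, the sketch below shows what such a parallel query looks like as a ROOT macro. It assumes a ROOT build with PROOF enabled and uses a later API (TProof::Open) rather than the 2003 one; the master name, dataset, tree name and selector are all hypothetical.

    // Hedged sketch of a PROOF query; all names are hypothetical.
    #include <TChain.h>
    #include <TProof.h>
    #include <TStopwatch.h>
    #include <cstdio>

    void proof_scaling() {
        TProof *p = TProof::Open("opencluster-master");   // connect to the cluster
        TChain chain("Events");                           // hypothetical tree name
        chain.Add("root://opencluster//data/run*.root");  // hypothetical dataset
        chain.SetProof();                     // route Process() through PROOF
        TStopwatch sw;                        // starts timing on construction
        chain.Process("MySelector.C+");       // hypothetical TSelector
        sw.Stop();
        std::printf("wall time: %.1f s on %d workers\n",
                    sw.RealTime(), p->GetParallel());
    }

Repeating the same query while varying the number of active workers, and checking that the wall time falls in proportion, is essentially what a linearity test of this kind amounts to.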

In a joint activity with Intel, two LHC applications (ROOT and GEANT4) were used to scrutinize the quality of the code generation of Intel's C++ compiler. Dozens of code snippets (such as the application of a rotation matrix) were used in the effort in order to understand how the compiler could best optimize the CERN applications.
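
The snippet below is a hedged example of the kind of kernel examined in such a study (illustrative, not the exact snippet used): applying a 3x3 rotation matrix to a 3-vector, a pattern that occurs in hot loops throughout HEP code.

    // Illustrative kernel: a 3x3 rotation applied to a 3-vector.
    // On IPF, a good compiler should keep the nine multiply-adds in
    // registers and schedule them to exploit the wide instruction bundles.
    struct Vec3 { double x, y, z; };

    struct Rot3 {
        double m[3][3];
        Vec3 apply(const Vec3 &v) const {
            return { m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z,
                     m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z,
                     m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z };
        }
    };

Comparing the assembly that different compilers emit for a kernel like this (register use, multiply-add fusion, instruction scheduling) is what such a code-generation study examines.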

On the 1.5 GHz systems, ROOT version 3.10 was benchmarked at about 1000 ROOTmarks when using the Intel compiler and aggressive optimization. A GEANT4 example was submitted for inclusion in SPEC2004 (to increase the number of C++ applications inside this benchmark suite). The final acceptance will be known by the middle of next year.

 

GRID porting

In the summer, an ambitious effort to port the LCG middleware, based on VDT and EDG, was started. The effort was made unnecessarily complex by the way the original software had been generated (with a large number of interdependencies and very complex generation procedures), but by the end of the year all but one RPM (from EDG/WP1) had been converted and several intermediate tests had been run successfully. The plan is now to generate IPF-based Worker Nodes (WN) early in 2004 and gradually enhance the software, so that IPF-based Computing Elements (CE) and Storage Elements (SE) can also be deployed.

 

SmartFrog

HP Labs' SmartFrog (Smart Framework for Object Groups) was evaluated. Despite initially receiving only a beta version, we were quickly able to generate a demo that started and monitored a remote Web service. A technical student will strengthen the SmartFrog effort in 2004.

 

ALICE Data Challenge V

Twenty of the IPF servers were "lent" to ALICE for their 5th Data Challenge. The entire DAQ software, including the GDC (Global Data Collector) environment, was transferred to the IPF architecture and ran flawlessly throughout the exercise.

 

Fellows/Summer students

Two fellows, largely funded by HP, started working in the openlab in April. In July and August, the openlab team hosted four summer students, who worked on IPF compilers, 10 Gbps networking, StorageTank testing, and GRID porting.

 

 

 

 
 

