Cyberinfrastructure for ICME
Team Members: T. Haupt, R. Carino
The primary focus of the Integrated Computational Materials Engineering (ICME) vision is establishing a knowledge base, accessible to the community at large, for solving a wide range of problems in materials science, applied mechanics, and engineering. Building this knowledge base requires collecting experimental data that describe phenomena at different scales (exploratory experiments, calibration of material models, and validation of models), performing simulations at different scales (atomic, molecular, dislocation, crystal-plasticity, macro-scale FEA), and linking all of this information to determine structure-property relationships, thereby leading to new concepts and the design of new materials. The knowledge base is expected to push the edge of materials science and solid mechanics by supporting the development and validation of new methods, particularly in multiscale modeling, which requires multidisciplinary expertise. It is further expected to be used for engineering design optimization and to support workforce training, including enhancing academic curricula at the graduate level.
It follows that managing the ICME knowledge base is the principal rationale and objective for establishing a virtual organization (VO). Management entails gathering, developing, integrating, and disseminating experimental data, material models, and computational tools, as well as supporting their use for material and product design. Consequently, the Engineering Virtual Organization for CyberDesign (EVOCD, http://icme.hpc.msstate.edu) is dedicated to the accumulation of the “intellectual capital” pertaining to ICME. It is the organization’s capital that attracts community participation in the organization. Three aspects of the process of accumulating capital are critical to creating a relevant organization: (1) protection of intellectual property, (2) quality assurance of information, and (3) the management of complexity.
In the competitive environment in which materials research and development is performed, protection of intellectual property is imperative. Information meant for public consumption must be clearly attributed to its creators, whereas innovative research may require that the exchange of some information (e.g., pre-publication or proprietary data) be restricted to a narrow group of collaborators. The VO must support the former and enforce the latter. Quality assurance of the information must include its pedigree and its validation, followed by approval by either a curator or a peer-review process. The management of complexity implies that the information must be easily navigable through intuitive interfaces, while the complexity of the underlying infrastructure remains hidden from the end user. Furthermore, the information must be understandable to students and directly and efficiently accessible to practitioners.
Notably, many other established VOs facilitate widespread collaborative efforts to process huge datasets (petabytes of satellite images, collider data, astronomical observations, etc.) and require a network of petaflop-range supercomputers to solve specific grand-challenge problems. In this sense, EVOCD is different: it is a cyberspace where the participants can solve their own scientific and engineering problems. At the same time, EVOCD complements the efforts of nanoHub, 3D Material Atlas, MatDL, and NIST Data Gateway; when linked with these portals, it will become part of the global ICME cyberinfrastructure.
EVOCD has been developed with the primary goal of accumulating and protecting the intellectual property generated by the participants of the organization. The portal provides a powerful means of accruing and exchanging community knowledge, as well as access to repositories of experimental data, material models, and computational tools at different length scales, which together exploit the integrative nature of ICME. To achieve this goal, EVOCD comprises four primary functional components that form the foundation of the VO: (i) Knowledge Management; (ii) Repository of Codes; (iii) Repository of Data; (iv) Online Calibration Tools.
Knowledge Management
Knowledge management has been achieved by applying an “architecture of participation” as advocated and implemented by Web 2.0 concepts and technologies. Tools like Wiki lead to the creation of a collective (read: peer-reviewed) knowledge database that is always up-to-date with a structure that spontaneously evolves to reflect the current state of the art. Therefore, we have chosen Wiki as the mechanism for community-driven knowledge management.
The Wiki has become the façade for the EVOCD portal to accumulate the knowledge pertaining to ICME. The Wiki captures the knowledge about different classes of materials (metals, ceramics, polymers, and others), material models at various length scales, and design issues, from process and performance models, to optimization under uncertainty, to bio-inspired design. In addition, the Wiki provides direct access to resources, such as data and code repositories.
The intellectual property is protected by configuring the Wiki server to restrict the creation and editing of pages to registered users verified by their email addresses. As a result, all contributions are uniquely attributed to their authors. Following the model introduced by Wikipedia, the quality of contributions is guaranteed by the Web 2.0 process and further monitored by the Wiki editors.
Repository of Codes
ICME applies computational methods to materials science, applied mechanics, and engineering. A significant part of the knowledge is therefore captured as software artifacts: implementations of material models, as well as simulation, modeling, and optimization codes. Realization of ICME thus depends critically on the capability to gather and disseminate information about these software components, which are, in turn, an essential part of the VO’s intellectual capital. Consequently, EVOCD serves as the repository of open-source codes contributed by the EVOCD participants. Each code is accompanied by documentation (installation instructions, user manual, theoretical background, and examples). In addition to the open-source material models, the repository provides tutorials and examples for popular third-party codes, both commercial (such as ABAQUS) and open source (such as LAMMPS). The repository of codes complements the knowledge captured in the Wiki, enabling the EVOCD user to reproduce the results reported there.
The intellectual property is further protected by restricting access to the actual SVN repository. Only individually approved contributors have the privilege of committing new revisions. The contributed codes are made available to the general public through a read-only SVN mirror that serves as the back end for the Web SVN client (the open-source ViewVC). All codes downloaded through the ViewVC client are subject to Mississippi State University (MSU) policies and disclaimers. Because of these arguably restrictive policies, many codes listed and documented in the EVOCD repository are available from other locations specified in the repository, typically the websites of their developers or vendors. This is the beginning of the “supply chain” envisioned as the foundation of the global cyberinfrastructure for ICME. The quality of the codes is assured by the fact that they have been used to generate the results described in the EVOCD Wiki.
Repository of Data
Experimental data are another critical component of the intellectual capital captured by EVOCD. At this time, EVOCD focuses on force-displacement, stress-strain, strain-life (fatigue), and materials characterization data, such as images of microstructure, all of which complement the data repositories offered by other ICME cyberinfrastructure participants, e.g., 3D Material Atlas. These data types are significant because they are necessary for the development of Internal State Variable (ISV) material models used in hierarchical multiscale modeling. The ISV-based models are described in detail in the Wiki pages, and the codes that implement them are available from the repository.
Here, the intellectual property is protected at two levels. At the first level, as with the Wiki and the repository of codes, only registered users are allowed to contribute. At the second level, the data repository is under access control. To this end, each data request is augmented with SAML-based credentials that are checked against the custom-developed Community Authorization Server (CAS). This authorization mechanism allows each user to create a group and invite selected users to join it. Only the members of this group are permitted (subject to CAS authorization) to upload data to the group folder. The group moderator (the group creator, or a group member appointed by the group creator) decides whether to keep the data private, i.e., visible only to the group members, or to make the data “public” by granting read-only access to all users. This group-based access control mechanism is used to exchange restricted-access data, an essential tool for collaborations within EVOCD.
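The group rules described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the CAS implementation: the `Group` class and all of its method names are hypothetical, the SAML credential check is not modeled at all, and the rule that only the creator may invite users is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical stand-in for an EVOCD group and its data folder."""
    name: str
    creator: str
    members: set = field(default_factory=set)
    moderators: set = field(default_factory=set)
    public: bool = False  # data stay private until a moderator decides otherwise
    folder: dict = field(default_factory=dict)

    def __post_init__(self):
        # The creator is automatically a member and the first moderator.
        self.members.add(self.creator)
        self.moderators.add(self.creator)

    def invite(self, inviter, user):
        """Add a user to the group (assumption: only the creator may invite)."""
        if inviter != self.creator:
            raise PermissionError(f"{inviter} cannot invite users to {self.name}")
        self.members.add(user)

    def appoint_moderator(self, appointer, member):
        """The creator may appoint an existing member as a moderator."""
        if appointer != self.creator:
            raise PermissionError(f"{appointer} cannot appoint moderators")
        if member not in self.members:
            raise ValueError(f"{member} is not a member of {self.name}")
        self.moderators.add(member)

    def upload(self, user, filename, payload):
        """Only group members may write to the group folder."""
        if user not in self.members:
            raise PermissionError(f"{user} cannot upload to {self.name}")
        self.folder[filename] = payload

    def set_public(self, user, public):
        """Only a moderator decides whether the data are private or public."""
        if user not in self.moderators:
            raise PermissionError(f"{user} is not a moderator of {self.name}")
        self.public = public

    def can_read(self, user):
        """Members can always read; everyone else only if the data are public."""
        return self.public or user in self.members
```

In this sketch a non-member is refused on upload regardless of visibility, which mirrors the read-only nature of “public” access described above.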
The issue of data quality is addressed in several ways. First, metadata are provided to reveal the pedigree of the data. The information in a metadata record that pertains to data quality is generated automatically from the user session and the mandatory header of the data file. The system automatically rejects the data if any critical information is missing (e.g., initial temperature or strain rate for stress-strain data). Most of the publicly available data in the repository have been published in professional journals, and thus verified by a peer-review process, and are described in the Wiki pages. Non-published data are typically hidden from the general public (group data) and require verification by the group members. Finally, data generated by students are subject to approval by a curator, most often an academic advisor, and are therefore stored in a private group folder. The assurance of data quality is an example of standardizing the organizational patterns of interactions between the participants defining the organization.
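The automatic rejection step might look like the following sketch. The actual EVOCD header format is not specified here, so the `# key: value` layout, the field names `initial_temperature` and `strain_rate`, and every function name below are assumptions chosen only to illustrate the idea of building a metadata record from the session and the file header.

```python
# Critical header fields per data type. Only the stress-strain example from
# the text is listed; the key spellings are hypothetical.
REQUIRED_FIELDS = {
    "stress-strain": ("initial_temperature", "strain_rate"),
}


class MissingMetadataError(ValueError):
    """Raised when a data file lacks critical header information."""


def parse_header(text):
    """Collect '# key: value' lines from the top of a data file."""
    header = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        key, sep, value = line[1:].partition(":")
        if sep and value.strip():
            header[key.strip().lower()] = value.strip()
    return header


def build_metadata_record(text, data_type, session_user):
    """Return a metadata record, or reject the file if fields are missing."""
    if data_type not in REQUIRED_FIELDS:
        raise ValueError(f"unsupported data type: {data_type}")
    header = parse_header(text)
    missing = [name for name in REQUIRED_FIELDS[data_type] if name not in header]
    if missing:
        raise MissingMetadataError("missing header fields: " + ", ".join(missing))
    record = dict(header)
    record["data_type"] = data_type
    record["contributor"] = session_user  # taken from the user session
    return record
```

A file whose header omits the strain rate would raise `MissingMetadataError` before anything is stored, so incomplete records never reach the repository in this sketch.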
Online Calibration Tools
Model calibration is the derivation, from experimental data, of the material constants used by a particular material model; this capability is yet another distinguishing feature of EVOCD. Currently, EVOCD provides three online models for calibration: Plasticity-Damage, Multistage Fatigue, and Thermoplastic; there is also an Image Analysis tool for material characterization. The models are contributed to EVOCD by MSU researchers, are available to all users, and have their quality scrutinized by the community at large. In addition to offering an intuitive user interface, the tools are functionally integrated with the data repository to facilitate their use; a selected data set can therefore be loaded seamlessly into a tool, even if doing so requires data format translation. This defines two important patterns of use possible with EVOCD: (1) the user uploads experimental data, performs model calibration, and saves the material constants in the data repository; (2) the user searches for the constants of a particular model of a particular material and retrieves the constants for further analysis, typically to use them in numerical simulations, such as finite element analysis using ABAQUS, LS-Dyna, or other software.
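To make the idea of calibration concrete, the sketch below fits a deliberately simple model to stress-strain data and then walks through the two usage patterns. It does not reproduce any of the EVOCD models: the power-law hardening relation (stress = K * strain**n) is swapped in purely for illustration, and the function names and the in-memory `constants_store` standing in for the data repository are hypothetical.

```python
import math


def calibrate_power_law(plastic_strains, stresses):
    """Least-squares fit of stress = K * strain**n in log-log space.

    All strains and stresses must be positive, since logarithms are taken.
    """
    if len(plastic_strains) != len(stresses) or len(stresses) < 2:
        raise ValueError("need at least two matching (strain, stress) points")
    xs = [math.log(e) for e in plastic_strains]
    ys = [math.log(s) for s in stresses]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("strain values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx                       # slope of the log-log line
    k = math.exp(y_mean - n * x_mean)   # intercept gives log(K)
    return {"K": k, "n": n}


# In-memory stand-in for the data repository, keyed by (material, model).
constants_store = {}


def save_constants(material, model, constants):
    """Pattern (1): store calibrated constants for later reuse."""
    constants_store[(material, model)] = dict(constants)


def find_constants(material, model):
    """Pattern (2): look up stored constants; returns None if absent."""
    return constants_store.get((material, model))
```

A user following pattern (1) would call `calibrate_power_law` on uploaded data and pass the result to `save_constants`; a user following pattern (2) would call `find_constants` and feed the returned values into a simulation input deck.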