Greetings from the iPlant Collaborative! With this e-newsletter, we are pleased to launch our updated website and a refreshed logo. Our website (www.iplantcollaborative.org) has been redesigned to better present iPlant’s informational materials; eventually it will become the platform for community collaborations and discovery environments. The new logo provides a stronger visual identity for iPlant as the project moves into a new phase of cyberinfrastructure development. Please feel free to submit comments on the new website or logo to feedback@iplantcollaborative.org.
Status Report: iPlant Genotype-to-Phenotype Grand Challenge Kickoff Meeting
By Steve Goff, iPlant Project Director, Steve Welch, iPG2P GCT Lead, and Martha Narro, iPlant EOT Director
The planning group for iPlant’s second Grand Challenge Project, iPlant Genotype to Phenotype (iPG2P), met in Chicago on July 27th–28th, 2009, to establish working groups with distinct focal areas and to develop high-level implementation plans for the sub-projects arising from these focal areas. The group comprised 16 community collaborators, including Steve Welch (Kansas State University), Ruth Grene (Virginia Tech), and Tom Brutnell (Cornell) as leads of Grand Challenge Workshops held in late 2008, and eight iPlant faculty and staff members, including Matt Vaughn (Cold Spring Harbor Laboratory), iPlant’s Scientific Lead for the Engagement Team working with this Grand Challenge project. Dan Kliebenstein (UC Davis) served as the plant science community facilitator.
Dan Stanzione, iPlant’s co-PI and Deputy Director of the Texas Advanced Computing Center (TACC) at The University of Texas at Austin, served as the Cyberinfrastructure and High-Performance Computing expert at the kickoff meeting. Recognizing that the effort to accurately connect genotypes and phenotypes will require high-performance computing resources, he has arranged for at least one million dedicated CPU hours on one of the TACC computer clusters (note that TACC has several large computer clusters uniquely networked into a very high-efficiency system; see http://www.tacc.utexas.edu/resources/hpc/).
To further refine the CI requirements for addressing this Grand Challenge, five iPG2P Working Groups were established to create detailed technical plans for the selected sub-projects. iPlant’s goal is to create cyberinfrastructure and related computational tools that are broadly applicable to plant science research, and the five working groups illustrate this approach. The groups are listed in rough order of data flow through the CI, with the first three having a heavy data orientation and the last two zeroing in on analysis and prediction.
- NextGen Sequencing: this Working Group will create pipeline tools allowing the efficient use of next generation sequencing data by members of the plant science research community interested in genotype-to-phenotype relationships. Tom Brutnell will lead this group with Steve Rounsley (iPlant) serving as the co-lead.
- Data Integration: this Working Group will deal with the infrastructure necessary to combine/overlay existing data to permit deeper insights into biological mechanisms, generation of hypotheses, evaluation of models, and various other practical applications. Doreen Ware (iPlant) will lead this group with Chris Jordan (TACC) serving as the co-lead.
- Visual Analytics: this group will focus on the use of modern visualization approaches to enhance extraction of knowledge from large data analysis and/or modeling efforts. Ruth Grene will lead this group with Greg Abram (TACC) serving as the co-lead.
- Inferential Tools: this Working Group will address statistically-based tools for use in inferring genotype-to-phenotype relationships ranging from marker-trait associations to links in biochemical or signal transduction pathways and/or protein interaction networks. Dan Kliebenstein will lead this group with Ed Buckler (Cornell) serving as the co-lead.
- Integrated Modeling Framework: this Working Group will adapt and/or develop modeling frameworks and tools to support the construction, parameter and confidence estimation, sensitivity analysis, verification testing, and utilization of models. Chris Myers (Cornell) will lead this group with Jeff White (USDA-ARS) serving as the co-lead.
Leaders and co-leaders of each iPG2P Working Group were nominated by plant science community participants attending the kickoff meeting and comprise the project steering committee. Likewise, Working Group members were selected by these participants. Both meeting participants and appropriate community members not present were considered in the nominations and selections. Final membership in these Working Groups is now being decided, and each Working Group will be tasked with generating project details to develop the appropriate CI. Community members participating in these Working Groups will likely rotate in and out of the groups based on interest and availability.
The iPG2P Grand Challenge team is particularly interested in phenology (such as flowering time), drought stress, photosynthesis, and applying genotype-to-phenotype results in the context of plant breeding. Some of the Working Groups’ efforts, especially in data integration and visualization analysis, will have synergy with the iPlant Tree of Life (iPToL) Grand Challenge Project and will be coordinated with those efforts. Wherever possible, iPlant efforts will consider how developments in support of the plant science research community can also benefit and work together with humanitarian research and development projects globally.
Certain efforts within the iPG2P project, e.g., statistical analysis capability, are fundamental to making progress in associating genotypes with phenotypes. It was also recognized that “data integration” is an enormous effort and success in data integration will need to be made with a relatively small set of the most important and relevant data sets. Therefore, tools should be flexible and allow for virtual integration where possible, storage of intermediate analysis results, and retrieval of those results for multiple applications. Modular design is important, especially in the modeling framework. The community at large needs to drive data standardization to allow efficient data integration since this will be too big an effort for any single group. Rather than force data standards on the community, iPlant will provide tools that allow existing and new standards, as appropriate, to be adopted.
iPlant’s Director of Education, Outreach, and Training (EOT), Martha Narro, led a discussion on the EOT opportunities that could be leveraged within the context of the iPG2P Grand Challenge. The Grand Challenge team members discussed a number of possible EOT projects, a few of which are recounted below.
Two phenology projects that would engage students in plant science research were described. One would involve students in a distributed phenotype screening effort using a model organism for grasses, Brachypodium distachyon (http://en.wikipedia.org/wiki/Brachypodium_distachyon). Since this would be the first attempt at a systematic, large-scale screen of a mutagenized B. distachyon population, students would have the opportunity to discover phenotypes of interest to researchers. The second phenology project the iPG2P group discussed would leverage the National Phenology Network’s (http://www.usanpn.org/) Project Budburst (http://www.windows.ucar.edu/citizen_science/budburst/index.php). Educators, students, and the public would monitor the timing of various plant life cycle events in species of interest to the iPG2P group. A recently developed iPhone application could be used to upload image and GPS data directly to the National Phenology Network database. Students participating in either phenology project could ask questions, share data, and discuss their results with researchers through an iPlant Discovery Environment.
Another education outreach project would involve using a simulation to engage students in learning about photosynthesis and plant productivity. The simulation would enable students to input parameters choosing from resources such as existing U.S. soil maps or historical climate data. A prototype simulation that was tested with high school students was well received, but needs further development.
Growers are also an important stakeholder group for iPG2P outreach and training. The iPG2P team discussed the need for Discovery Environment tools designed to enable growers to make informed decisions based on knowledge of the regional climate forecast for a growing season coupled with crop and cultivar data. The Southeast Climate Consortium’s AgroClimate project (http://agroclimate.org/), a tool that serves growers in the Southeastern United States, could be leveraged to provide a similar service to growers in other parts of the country.
Follow these links for more information on the progress of the iPG2P Grand Challenge Team (click here), as well as the Tree-of-Life Grand Challenge Team (click here).
NSF Grant Awarded for Simple Semantic Web Architecture and Protocol
By Damian Gessler, iPlant Semantic Web Architect
iPlant Semantic Web Architect Damian Gessler has been awarded a two-year, $761,000 National Science Foundation (NSF) grant in semantic web services. Specifically, the work will expand the use of ontologies from their current form as static knowledge management structures into dynamic vocabularies for the semantic web; advance our capabilities in context-sensitive semantic searching; and expand our Education, Outreach, and Training in semantic web services.
The vision of the semantic web is ambitious. Today, the web exists as billions of web pages ('documents' in the language of the web). Currently, there is no systematic, non-arbitrary way to tag information in documents with computer-discernible meaning. For example, "72F" could be the temperature in Fahrenheit on one web page, an airline seat on another, a technical section header on a third, and so forth. The vision of the semantic web is to associate data with meaning such that computer programs could assist in generating a more informative, productive web experience. As just one of many examples, in the grand vision of the semantic web you could have an appointment calendar that linked the date and the city you will be visiting with the weather, your hotel information, local current attractions, and so on. In biology, it could link genomic data with the phylogenetic, the evolutionary, the proteomic, the metabolic, and so forth. The semantic web excels where data connectivity is elusive; where connections may be unknown at design time and contributed by multiple, independent actors; and where value and context are subject to change.
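To make the "72F" example concrete, here is a minimal sketch of how the same characters can carry different, machine-readable meanings once they are stored as subject-predicate-object statements. The identifiers and vocabulary terms (`ex:TemperatureReading`, `ex:AirlineSeat`, and so on) are hypothetical and chosen only for illustration; they do not come from any real ontology or from SSWAP.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate, and an object."""
    subject: str
    predicate: str
    obj: str


# Two resources that would both display as "72F" on a plain web page.
# All names below are made up for this sketch.
TRIPLES = [
    Triple("weather:austin-today", "rdf:type", "ex:TemperatureReading"),
    Triple("weather:austin-today", "ex:value", "72"),
    Triple("weather:austin-today", "ex:unit", "ex:DegreeFahrenheit"),
    Triple("flight:seat-assignment", "rdf:type", "ex:AirlineSeat"),
    Triple("flight:seat-assignment", "ex:row", "72"),
    Triple("flight:seat-assignment", "ex:seatLetter", "F"),
]


def subjects_of_type(triples, type_name):
    """Return the sorted subjects declared to be of the given type."""
    return sorted(
        {t.subject for t in triples
         if t.predicate == "rdf:type" and t.obj == type_name}
    )


def describe(triples, subject):
    """Collect every predicate/object pair stated about one subject."""
    return {t.predicate: t.obj for t in triples if t.subject == subject}


if __name__ == "__main__":
    # A program can now ask for temperatures without matching on "72F".
    for subject in subjects_of_type(TRIPLES, "ex:TemperatureReading"):
        print(subject, describe(TRIPLES, subject))
```

Because the meaning is stated explicitly rather than inferred from the text, a program looking for temperature readings never confuses them with seat assignments, even though both would render identically on a page.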
Web services recognize that much of what is done deep within monolithic computer programs could be more efficiently deployed if made easily available as discrete services over the web. Web services specify a series of protocols that allow one to stitch together disparate functionality (services) under a common mechanism of invocation and extraction. Web services excel where data connectivity is structured; where value can be identified at design time; and in areas where controlled availability, security, and hidden context contribute important qualities.
The semantic web and web services both deliver important functionality. But they exist as separate and disjoint technologies. The semantic web lacks formal web service protocols just as web services lack the explicit semantics and formal logic of the semantic web. The NSF award to Gessler funds research in a novel hybrid approach that integrates aspects of the semantic web and web services into a single semantic web services protocol and architecture called SSWAP. SSWAP (pronounced “swap”) is an acronym for Simple Semantic Web Architecture and Protocol. For more information about SSWAP, visit http://sswap.info, http://en.wikipedia.org/wiki/SSWAP, and http://www.biomedcentral.com/1471-2105/10/309/abstract.
A Meeting of the Minds: iPlant Developers Summit at TACC
By Dan Stanzione, iPlant Co-PI, and Sonya Lowry, iPlant Sr. Software Engineer

The first iPlant cyberinfrastructure (CI) team all-hands meeting was held in September in Austin, Texas, the only place 'weird' enough to handle such an event! This meeting brought together nearly two dozen iPlant staff based at Cold Spring Harbor Laboratory, the University of Arizona, and our newest partner, the Texas Advanced Computing Center (TACC) at The University of Texas at Austin. TACC, which joined iPlant when Co-PI Dan Stanzione became TACC’s Deputy Director in July, is one of the nation’s premier centers for advanced computing and a leading partner for the National Science Foundation’s TeraGrid. TACC’s mission is to enable discoveries that advance science and society through the application of advanced computing technologies, and the iPlant Collaborative certainly provides a phenomenal opportunity to leverage TACC’s mission, systems, and staff for maximum impact in the scientific community.
The face-to-face meeting of the CI team allowed members across sites to get to know one another and develop the working relationships needed to build the quality software that will address the needs of the plant biology community and solve Grand Challenge questions. To collectively answer the essential question of "What tools do the Grand Challenge teams need and how do we build them," iPlant’s software developers, experts in user requirements, semantic web architecture, visualization tools, and data integration, together with the Engagement Teams' scientific leads and project managers, spent a day and a half examining how quality improvement techniques impact the development of production software, the development process itself, collaboration and communication techniques, and the formulation of the user stories on which software is engineered.
The two days in Austin were not all work and no play, however; the team also toured TACC's world-class hardware facilities available to the project, such as Ranger, the world’s #8 supercomputer; Longhorn, the world's largest remote visualization system; Stallion, the world's highest resolution tiled display; and Corral, the petabyte storage facility. Said Sonya Lowry, iPlant’s Senior Software Engineer, "This meet-and-greet allowed us to strengthen relationships and best practices, making us more effective in using our own collaboration tools in the development process."
Organizational Changes at iPlant
After more than two years at the helm of the iPlant Collaborative, Rich Jorgensen recently announced his intention to step down from his current role and focus his energies on scholarly research. Jorgensen has been involved with iPlant since its conception nearly 3 years ago, and has taken iPlant from an idea to the robust, community-driven organization it is today. Jorgensen’s contributions over this time have been centered on keeping iPlant focused on its central tenet of being a cyberinfrastructure organization "by, for and of the community." Said Jorgensen, "Now that iPlant has been successful in achieving the major goals of its first phase of community and team building and is transitioning into a distinct, new phase of developing cyberinfrastructure for the biological sciences, new leadership is appropriate for the long term success of the Collaborative. This has truly been one of the most challenging and most worthwhile projects with which I have ever been associated, and it truly has been a privilege to have been able to work with such a talented and dedicated team."
The iPlant Faculty Advisory Committee and Board of Directors, in close conjunction with the National Science Foundation, have initiated a process that will ensure a smooth leadership transition and no loss of momentum. Steve Goff and Dan Stanzione, the remaining two members of the iPlant Executive Committee, will serve as Co-Directors of the project moving forward. The complementary expertise of the Co-Directors is representative of iPlant’s continued commitment to be a collaborative effort between the biological and computing communities. In addition, the large number of collaborating plant biologists and computational scientists working on the community-driven Grand Challenge Teams and the Scientific Opportunities Team will provide tremendously valuable cross-disciplinary input. Goff and Stanzione will continue to work closely with these leaders from the community.