Subproject 4:
Analysis of Marine Genomic/Environmental Data

The fundamental idea of this research is to integrate genomic data from environmental shotgun sequencing with more standard biological and oceanographic data, with a view towards understanding how the genomic characteristics of marine organisms interact with environmental factors. The characterization and analysis of spatial and temporal variability will require integrating genomic data with a disparate array of complementary data on the marine environment. Appropriate methods of statistical analysis will depend on the nature of the genomic data available (e.g., presence/absence, diversity and/or relative abundance), as well as on the number and types of available oceanographic measurements and their associated time and space scales. Available marine data include various physical, biological, and chemical variables measured as time series from fixed mooring arrays and as depth profiles from instrument drops, as well as spatial fields from satellites and outputs from numerical ocean models.
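
As an illustration of the three kinds of genomic summaries named above, the following minimal sketch derives presence/absence, relative abundance and a diversity index from a table of read counts. It is a sketch under stated assumptions, not a description of the project's pipeline: the samples-by-taxa count layout and the function name summarize_counts are hypothetical, and the Shannon index is used only as one common choice of diversity measure.

```python
import numpy as np

def summarize_counts(counts):
    """Summarize a count table (rows = samples, columns = taxa).

    The table layout is an assumption made for illustration only.
    Returns presence/absence, relative abundance, and Shannon diversity.
    """
    counts = np.asarray(counts, dtype=float)

    # Presence/absence: a taxon is present if at least one read was assigned to it.
    presence = counts > 0

    # Relative abundance: each row divided by its total; empty samples stay at zero.
    totals = counts.sum(axis=1, keepdims=True)
    rel = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    # Shannon index H = -sum(p * ln p), with the convention 0 * ln 0 = 0.
    logp = np.log(rel, out=np.zeros_like(rel), where=rel > 0)
    shannon = -(rel * logp).sum(axis=1)

    return presence, rel, shannon

# Illustrative (made-up) counts for two samples and three taxa.
presence, rel, shannon = summarize_counts([[10, 10, 0], [5, 0, 0]])
```

Each of the three outputs calls for a different statistical treatment (binary, compositional and continuous, respectively), which is why the choice of summary drives the choice of method.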

The diversity in sampling scales will also be a challenge at the sites being considered (coastal, continental shelf and deep ocean). The appropriate methods for spatial and temporal data analysis will depend on the sampling rates and on the types of genomic and marine environmental data used for predictive purposes, and the methodology will vary according to the scientific questions posed. The overall modeling scheme must be sufficiently general to accommodate binary, qualitative or quantitative outcomes and predictors, and must allow for the modeling of short or long series at different scales, with non-uniform sampling rates. The methods must also be able to incorporate mechanistic models of environmental processes.
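
One simple instance of these requirements can be sketched as follows: an irregularly sampled environmental series is interpolated onto the times of the genomic samples, and a binary (presence/absence) outcome is then related to it by logistic regression. This is only a sketch under assumptions: linear interpolation and a single-predictor logistic model stand in for whatever methods are eventually chosen, the function names align_to_samples and fit_logistic are hypothetical, and all numbers are made up for illustration.

```python
import numpy as np

def align_to_samples(sample_times, series_times, series_values):
    """Linearly interpolate an irregularly sampled series onto the sample times."""
    return np.interp(sample_times, series_times, series_values)

def fit_logistic(x, y, n_iter=25, jitter=1e-8):
    """Fit a logistic regression with intercept by Newton's method (IRLS).

    Assumes the outcome is not perfectly separated by the predictor.
    Returns the coefficient vector [intercept, slope].
    """
    X = np.column_stack([np.ones(len(x)), x])
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        w = p * (1.0 - p)                     # IRLS weights
        grad = X.T @ (y - p)                  # score vector
        # Small diagonal jitter keeps the linear solve numerically stable.
        hess = X.T @ (X * w[:, None]) + jitter * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Made-up mooring temperatures at non-uniform times (days, degrees C).
mooring_t = np.array([0.0, 1.0, 2.5, 4.0, 7.0, 10.0])
mooring_temp = 10.0 + mooring_t

# Made-up genomic sampling times and presence/absence of one taxon.
sample_t = np.array([0.5, 2.0, 3.0, 5.0, 6.0, 8.0, 9.0, 9.5])
present = np.array([0, 0, 1, 0, 1, 1, 0, 1])

temp_at_samples = align_to_samples(sample_t, mooring_t, mooring_temp)
beta = fit_logistic(temp_at_samples, present)
```

Replacing the logistic link with another member of the generalized linear model family would cover count or continuous outcomes in the same framework, and the interpolation step could be replaced by the output of a mechanistic ocean model evaluated at the sample times.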