4 Data integration
4.1 Species characteristics
To develop a comprehensive national timing windows dataset, we integrate multiple datasets covering various aspects of fish traits and phenology. This integration ensures that life history characteristics, ecological interactions, and temporal patterns relevant to timing windows assessments are well-represented. By harmonizing data across sources, we establish a structured foundation for evaluating the seasonal and ecological constraints of freshwater fish in Canada.
4.1.1 Data Sources
The datasets used in this integration provide a diverse range of attributes, from taxonomic classifications to physiological tolerances and habitat preferences. Below is an overview of the key data sources:
Dataset | Description | Reference |
---|---|---|
Ontario Freshwater Fishes Life History Database | Covers 161 species in Ontario, including 43 life history traits such as habitat, spawning season, thermal regime, fecundity, and lifespan. | Ontario Freshwater Fishes |
FishPass Database | Focuses on biological attributes influencing fish passage, including morphology, physiology, phenology, and behavior for 220 species. | FishPass |
North American Freshwater Migratory Fish Database (NAFMFD) | Provides migration data for 1,241 species, detailing migratory behaviors across North America. | NAFMFD |
Roberge et al. (2002) | Documents stream habitat requirements for 86 fish species across different life stages in British Columbia and Yukon. | Roberge et al. (2002) |
Dahlke et al. (2020) | Compiles experimental and imputed thermal tolerance data, thermal safety margins, and responsiveness for multiple species and life stages. | Dahlke et al. (2020) |
FishBase | Global database providing extensive taxonomic, ecological, and biological data, including growth, diet, reproduction, and distribution. | FishBase |
4.2 Fish Traits Integration
4.2.1 Rationale
Understanding fish traits is fundamental for evaluating species-specific responses to environmental changes, including their vulnerability to anthropogenic stressors and climate variability. By structuring data into thematic tables, we facilitate the identification of key ecological attributes that influence timing windows. These traits encompass habitat preferences, reproductive strategies, morphological adaptations, and physiological tolerances, among others. The integration of these datasets into a national framework ensures consistency and accessibility for environmental assessments and conservation planning.
4.2.2 Thematic Trait Tables
The following tables outline the thematic structuring of fish traits, incorporating data from multiple sources:
Category | Table Name | Output File | Source(s) |
---|---|---|---|
Habitat (Adult) | habitat_adult | habitat_adult.csv | FishBase, FishPass, Ontario Freshwater Fishes, Roberge (2002) |
Habitat (Juvenile) | habitat_juvenile | habitat_juvenile.csv | FishBase, FishPass, Ontario Freshwater Fishes, Roberge (2002) |
Habitat (Spawning) | habitat_spawning | habitat_spawning.csv | FishBase, FishPass, Ontario Freshwater Fishes, Roberge (2002) |
Habitat (YOY) | habitat_yoy | habitat_yoy.csv | FishBase, FishPass, Ontario Freshwater Fishes, Roberge (2002) |
Food Items | food_items | food_items.csv | FishBase |
Spawning | spawning | spawning.csv | FishBase, FishPass, Ontario Freshwater Fishes |
Eggs | eggs | eggs.csv | FishBase |
Larvae | larvae | larvae.csv | FishBase |
Migration | migration | migration.csv | FishBase, FishPass, NAFMFD |
Morphology | morphology | morphology.csv | FishPass, FishBase, Ontario Freshwater Fishes |
Tolerance (Adult) | tolerance_adult | tolerance_adult.csv | Dahlke (2020) |
Tolerance (Embryo) | tolerance_embryo | tolerance_embryo.csv | Dahlke (2020) |
Tolerance (Larvae) | tolerance_larvae | tolerance_larvae.csv | Dahlke (2020) |
Tolerance (Spawner) | tolerance_spawner | tolerance_spawner.csv | Dahlke (2020) |
Taxonomy | taxonomy | taxonomy.csv | FishBase |
Picture | picture | picture.csv | FishBase |
Swimming Behavior | swimming | swimming.csv | FishBase |
4.2.3 Data Processing and Standardization
Once loaded, the data undergoes multiple transformations to harmonize species identifiers, format variables consistently, and extract relevant information for each category.
4.2.3.1 Species Identification
Some datasets lack a `species_id` field. These are standardized by:
- Extracting species names and joining with a reference list of species (`spList`).
- Ensuring consistent taxonomic names across sources.
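The name-based join described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the column names (`scientific_name`, `trait`), the example rows and ID values, and the helpers `normalize_name` and `attach_species_id` are all hypothetical. The source only states that species names are joined to the reference list `spList`.

```python
# Hypothetical stand-in for the reference list spList (IDs are made up).
sp_list = [
    {"species_id": 1, "scientific_name": "Salvelinus fontinalis"},
    {"species_id": 2, "scientific_name": "Esox lucius"},
]


def normalize_name(name):
    """Collapse whitespace and case so spelling variants compare equal."""
    return " ".join(name.split()).lower()


def attach_species_id(records, reference):
    """Add species_id to each record by matching on the normalized name.

    Records without a match get species_id None so they can be reviewed
    manually instead of being dropped silently.
    """
    lookup = {
        normalize_name(ref["scientific_name"]): ref["species_id"]
        for ref in reference
    }
    return [
        {**rec, "species_id": lookup.get(normalize_name(rec["scientific_name"]))}
        for rec in records
    ]


# Hypothetical source table that lacks a species_id field.
raw = [
    {"scientific_name": "  esox  LUCIUS ", "trait": "example value"},
    {"scientific_name": "Unlisted species", "trait": "example value"},
]
joined = attach_species_id(raw, sp_list)
```

Keeping unmatched rows with a `None` identifier is one possible design choice; it makes gaps in the reference list visible before the thematic tables are built.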
4.2.3.2 Data Extraction and Formatting
- Habitat:
  - Extracted from multiple sources (FishBase, FishPass, Roberge, and Ontario Freshwater Fishes).
  - Life-stage-specific habitat information is categorized (`adult`, `juvenile`, `spawning`, `YOY`).
- Food Items:
  - Taken from FishBase and standardized.
- Spawning:
  - Data on fecundity, spawning cycles, and seasonal cues are extracted and formatted.
  - Spawning months are converted into binary monthly indicators.
- Eggs & Larvae:
  - Egg development parameters (FishBase) are directly used.
  - Larval duration and environmental requirements are processed from FishBase.
- Migration:
  - Different migration types (`anadromous`, `potamodromous`, `diadromous`) are extracted from FishBase, FishPass, and NAFMFD.
- Morphology:
  - Merged from FishBase, FishPass, and Ontario Freshwater Fishes.
  - Body shape, length, and physiological characteristics are included.
- Tolerance:
  - Data from Dahlke (2020) is split by life stage (`adult`, `embryo`, `larvae`, `spawner`).
- Taxonomy:
  - Extracted from the Freshwater Fish Canada dataset and separated into genus, species, family, and order.
- Pictures:
  - Image URLs are retrieved from FishBase.
- Swimming Behavior:
  - Extracted from FishBase to categorize locomotion patterns.
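To illustrate the conversion of spawning months into binary monthly indicators mentioned in the list above, here is a small sketch. It assumes spawning months arrive as month numbers (1 to 12) and that the output uses one column per month named by a three-letter abbreviation. Both the input format and the column names are assumptions, as is the helper name `months_to_indicators`.

```python
# Assumed column names for the twelve indicator fields.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def months_to_indicators(spawning_months):
    """Map an iterable of month numbers (1-12) to a dict of 0/1 flags."""
    present = set(spawning_months)
    invalid = present - set(range(1, 13))
    if invalid:
        raise ValueError(f"invalid month numbers: {sorted(invalid)}")
    return {name: int(i in present) for i, name in enumerate(MONTHS, start=1)}


# A species recorded as spawning from April through June.
row = months_to_indicators([4, 5, 6])
# row["apr"], row["may"], row["jun"] are 1; every other month is 0.
```

A wide 0/1 layout like this makes it straightforward to overlay several species or life stages on the same monthly grid.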
4.3 Fish Phenology Integration
4.3.1 Rationale
Phenology, or the timing of biological events, is a critical factor in determining species-specific timing windows for migration, spawning, and early life stages. By integrating data from multiple sources, we systematically capture the seasonal variations in key life processes. This allows for the identification of species-specific windows when fish are most vulnerable to environmental stressors or anthropogenic impacts.
4.3.2 Thematic Phenology Tables
The following tables capture species phenology, incorporating data from multiple sources:
Category | Table Name | Temporal Dimension | Source(s) |
---|---|---|---|
Migration | migration_timing | Seasonal movement periods (e.g., spring vs. fall migration) | FishBase, FishPass, NAFMFD, Roberge (2002) |
Spawning | spawning_timing | Monthly spawning presence, peak spawning months | FishBase, FishPass, Ontario Freshwater Fishes, Roberge (2002) |
Larvae | larvae_timing | Seasonal larval presence (last spawning month + next month) | FishBase, Roberge (2002) |
These tables provide a structured approach to analyzing seasonal life cycle events and their ecological implications for freshwater fish in Canada.
4.3.3 Integration Process
The integration of fish phenology data follows a structured workflow:
- Data Collection & Preprocessing
  - Input data is gathered from multiple sources (FishBase, FishPass, NAFMFD, Ontario Freshwater Fishes, Roberge 2002).
  - Species IDs are standardized, and datasets lacking direct species identifiers are joined to a reference species list.
- Spawning Timing Calculation
  - The spawning table compiles monthly spawning presence using available data.
  - If multiple records exist, North American sources are prioritized; otherwise, values are aggregated.
- Migration Timing Inference
  - Migration timing is inferred based on spawning months and species migration category.
  - If a species is migratory (`anadromous`, `potamodromous`, etc.), migration occurs one or more months before spawning.
- Larvae Timing Determination
  - The last observed spawning month is identified per species.
  - Larvae presence is assigned to the last spawning month and the following month.
- Final Data Export
  - Processed data is structured into three output files: `spawning.csv`, `migration.csv`, `larvae.csv`.
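The source-priority rule in the Spawning Timing Calculation step can be sketched as below. Several details are assumptions made for illustration only: which sources count as North American (here, everything except FishBase), the record layout `(species_id, source, months)`, and the reading of "aggregated" as the union of months across records. The function name `combine_spawning` is hypothetical.

```python
from collections import defaultdict

# Assumption: these sources are treated as North American; FishBase is global.
NORTH_AMERICAN = {"FishPass", "NAFMFD", "Ontario Freshwater Fishes", "Roberge (2002)"}


def combine_spawning(records):
    """Combine spawning months per species from (species_id, source, months).

    If any record for a species comes from a North American source, only
    those records are used; otherwise all records are used. The selected
    records are merged as the union of their months (an assumption).
    """
    by_species = defaultdict(list)
    for species_id, source, months in records:
        by_species[species_id].append((source, set(months)))

    combined = {}
    for species_id, entries in by_species.items():
        preferred = [m for source, m in entries if source in NORTH_AMERICAN]
        chosen = preferred if preferred else [m for _, m in entries]
        combined[species_id] = sorted(set().union(*chosen))
    return combined


# Made-up records for two hypothetical species IDs.
records = [
    (1, "FishBase", [3, 4, 5]),
    (1, "FishPass", [5, 6]),
    (2, "FishBase", [9, 10]),
]
spawning = combine_spawning(records)
# Species 1 keeps only the FishPass months; species 2 falls back to FishBase.
```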
This workflow ensures that the timing windows for key life processes are systematically captured and formatted for ecological analysis.
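The migration and larvae steps of the workflow can be sketched as simple month arithmetic. This is an illustration under stated assumptions, not the project's implementation: the "last" and "first" spawning months are taken as the largest and smallest month numbers (so spawning seasons that span the year-end are not handled), the migration lead is a parameter `lead_months` defaulting to 1, and the set of migratory categories is assumed. All function names are hypothetical.

```python
# Assumption: categories treated as migratory for the inference step.
MIGRATORY = {"anadromous", "potamodromous", "diadromous"}


def shift_month(month, offset):
    """Shift a month number (1-12) by offset months, wrapping around the year."""
    return (month - 1 + offset) % 12 + 1


def larvae_months(spawning_months):
    """Larvae presence: the last spawning month plus the following month."""
    if not spawning_months:
        return []
    last = max(spawning_months)
    return [last, shift_month(last, 1)]


def migration_months(spawning_months, migration_type, lead_months=1):
    """Migration months: the lead_months months before the first spawning month.

    Returns an empty list for non-migratory species or missing spawning data.
    """
    if not spawning_months or migration_type not in MIGRATORY:
        return []
    first = min(spawning_months)
    return [shift_month(first, -k) for k in range(lead_months, 0, -1)]


spring_spawner = [4, 5, 6]
larvae = larvae_months(spring_spawner)                              # [6, 7]
migration = migration_months(spring_spawner, "anadromous", 2)       # [2, 3]
```

Exposing the lead as a parameter reflects the text's "one or more months before spawning"; the appropriate value per species is not specified in the source.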