2026-02-26
inSileco & ArcticNet (since 2023)
RDM, Data & Open Science
RDM: Research Data Management
GBIF: Global Biodiversity Information Facility
A modern LLM is typically trained with 1x10^13 two-byte tokens, which is 2x10^13 bytes. (Yann LeCun, X, 2024)
We need to take good care of all collected datasets
Metadata?
Metadata standards vs Data standards
Metadata standards describe the data itself (context & discovery), while data standards define how the data is structured. Together, they ensure interoperability and reuse.
Dublin Core?
Example Record
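As a rough illustration of what a Dublin Core record can look like, the sketch below serializes a handful of the 15 core elements to XML. Every value is a placeholder, and the `metadata` wrapper element and the `to_dublin_core_xml` helper are assumptions made for this example, not part of any repository's required schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical dataset description: all values below are placeholders.
record = {
    "title": "Lake C field notes, 2025 season",
    "creator": "Example Researcher",
    "subject": "limnology",
    "description": "Placeholder description of a field dataset.",
    "date": "2025-03-01",
    "type": "Dataset",
    "format": "text/csv",
    "identifier": "doi:10.xxxx/placeholder",
    "language": "en",
    "rights": "CC BY 4.0",
}


def to_dublin_core_xml(fields):
    """Serialize a dict of Dublin Core elements to an XML string."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper name is an assumption
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dublin_core_xml(record)
print(xml_text)
```

A record like this can be shared even when the data themselves cannot, since it describes the dataset without exposing its contents.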
Data repositories
Choose a repository that is:
(F) Findable
(A) Accessible
(I) Interoperable
(R) Reusable
Goals:
Tip
OCUL = Ontario Council of University Libraries
In Canada, Borealis is a national instance of the Dataverse repository hosted by OCUL’s Scholars Portal at the University of Toronto.
Clearly state the license in your metadata, README, or repository record
Important
You may not always be able to share your data but you can always share your metadata.
(C) Collective Benefit
(A) Authority to Control
(R) Responsibility
(E) Ethics
Goals:
DMP: Data Management Plan
IRP: Interdisciplinary Research Program
In other words: what is expected of you as a researcher
Network: balance autonomy & coordination
Researchers: manage and document project data responsibly
Building your Data Management Plan
A Data Management Plan (DMP) is a formal document, typically 1-2 pages long, that outlines how data will be handled during and after a project.
Benefits:
Answer these questions with substance and you will have a complete DMP:
DMP Tips
Equip researchers with concrete steps to manage data responsibly, efficiently, and in line with network & funder expectations.
By the end, you should know which steps to take to prepare and update an adequate Data Management Plan.
Let us create a DMP together. Let’s start here.

Data collection
Documentation & metadata
Storage & protection
Data Analysis
Preservation & archiving
Sharing & reuse
Legal & ethics

Guiding Questions

Quality Assurance / Quality Control (QA/QC)

Organization & naming
Separate name elements with _ or - (e.g., projectA_samples_2025-03-01_v1.csv)
Do & Don’t
✅ lakeC_fieldnotes_2025-03-01_v2.csv
❌ data latest & updated.xlsx
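The naming convention shown in the examples above can be checked automatically. The sketch below assumes a `project_content_YYYY-MM-DD_vN.ext` layout inferred from the two example names; the pattern and the `check_filename` helper are hypothetical and should be adapted to whatever convention your project actually adopts.

```python
import re
from datetime import date

# Assumed layout: <project>_<content>_<YYYY-MM-DD>_v<N>.<ext>
PATTERN = re.compile(
    r"(?P<project>[A-Za-z0-9]+)_(?P<content>[A-Za-z0-9]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_v(?P<version>\d+)\.(?P<ext>[a-z0-9]+)"
)


def check_filename(name):
    """Return True if the name follows the assumed convention."""
    match = PATTERN.fullmatch(name)
    if match is None:
        return False
    try:
        # Reject impossible dates such as 2025-13-01.
        date.fromisoformat(match.group("date"))
    except ValueError:
        return False
    return True


for candidate in [
    "lakeC_fieldnotes_2025-03-01_v2.csv",
    "data latest & updated.xlsx",
]:
    status = "ok" if check_filename(candidate) else "rename"
    print(f"{status}: {candidate}")
```

Running a check like this before depositing files catches spaces, special characters, and missing dates or versions early, when renaming is still cheap.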


Guiding Questions

ArcticNet’s requirements
ArcticNet’s commitment
Role:
Initiatives


Guiding Questions

Good Practices
Special Considerations


Guiding Questions


Good Practices
Builds trust, efficiency, and long-term usability of results


Guiding Questions
Goal: ensure your data remain usable and accessible well beyond the project

Good Practices
Special Considerations

Some notes on file formats
Avoid Proprietary & Unsuitable Formats
Rule of thumb: if a file requires special software, or might lose information when saved, it’s not a good archival format.
Preferred Open Formats by Data Type
Tabular data ➡️ CSV, Parquet
Spatial data ➡️ GeoPackage, GeoTIFF, NetCDF
Images ➡️ TIFF (uncompressed), PNG
Audio / Video ➡️ WAV, MP4 (H.264 codec)
Text / Documents ➡️ TXT, PDF/A, XML, JSON
Metadata ➡️ XML, JSON, standardized schemas (e.g., ISO 19115, Darwin Core)
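A minimal sketch of how the list above could be turned into a pre-archiving check. The extension mapping and the `is_preferred` helper are assumptions made for illustration: a file extension is only a hint, and it cannot confirm details such as PDF/A conformance, TIFF compression, or the codec inside an MP4.

```python
from pathlib import Path

# Extensions assumed to correspond to the preferred formats listed above.
PREFERRED = {
    "tabular": {".csv", ".parquet"},
    "spatial": {".gpkg", ".tif", ".tiff", ".nc"},
    "image": {".tif", ".tiff", ".png"},
    "audio_video": {".wav", ".mp4"},
    "text": {".txt", ".pdf", ".xml", ".json"},
    "metadata": {".xml", ".json"},
}


def is_preferred(filename, data_type):
    """Return True if the file extension is in the preferred set."""
    extension = Path(filename).suffix.lower()
    return extension in PREFERRED[data_type]


files = [
    ("lakeC_fieldnotes_2025-03-01_v2.csv", "tabular"),
    ("data latest & updated.xlsx", "tabular"),
]
for filename, data_type in files:
    verdict = "keep" if is_preferred(filename, data_type) else "convert"
    print(f"{verdict}: {filename}")
```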
Choose formats that are:
ArcticNet’s requirements
ArcticNet’s guidance


Guiding Questions
Goal: make data available in a way that is clear, usable, and responsible

Good Practices
Special Considerations
ArcticNet’s requirements
ArcticNet’s guidance


Guiding Questions

Good Practices
Special Considerations
Emerging Trends & Opportunities
ArcticNet Data Hub 
Regional Assessment Hub 
Thank you!