Better science for future us
“Better science for future us” is science that is more efficient, reproducible, open, inclusive, and kind. There is a growing number of examples of better science in environmental and Earth science, and beyond, including one from the Ocean Health Index team: “Our path to better science in less time using open data science tools” (Lowndes et al. 2017).
Here we also introduce the Pathways concept that teams will develop throughout the Champions program. The Pathway is based on Table 1 in Lowndes et al. 2017. It helps teams deliberately identify their data workflow practices and the next steps toward more efficient and open practices in terms of reproducibility, collaboration, communication, and culture.
Pathways to better science in less time
Figure 1 of Lowndes et al. 2017 shows that open data science tools increased the ease of reproducibility and the ease of collaboration for the Ocean Health Index (OHI) team. But it was not the tools alone - it was the process the team co-created and prioritized.
Create space
Creating space means committing to regular synchronous meetings where team members learn from and teach each other.
A critical first step was prioritizing time, which included getting buy-in and showcasing the work. The team could then focus that time on:
- Getting comfortable talking about data/workflows
- Building trust (to share imperfect work)
- Recognizing that what we invest incrementally will have large dividends in the future
Create place
Creating place is critical for asynchronous collaboration. It is a place for code, shared practices, resources, and conversations. Critically, this involves making sure that everyone on the team is comfortable contributing through these channels, which means being comfortable with both the technology and the culture of the team.
Asynchronous collaboration often requires some form of version control so the team can tell which version of the documentation, data, code, graphs, etc., is current. Specific places depend on the tools and platforms used in a given organization or research group. These can include GitHub Organizations, Repositories, Issues, Projects; Google Drive Folders, Docs, Spreadsheets, Slides, Calendars; Slack Organizations and Channels; Teams, SharePoint; JupyterHubs; etc.
Find the common
By creating space and place, teams find the common workflows, tools, and skills that they already have, as well as those they need to do their work.
Documentation was a key part of this. Writing documentation “for nobody” is very hard, and it is a huge task. We prioritized documentation based on Onboarding and Offboarding: first for our future selves, and then for future us.
Shifting incrementally
Shifting workflows takes time, particularly because it is most often done while also meeting existing deliverables and deadlines. It requires changing behaviors and habits, which is messy and depends heavily on the trust built within the team.
Reproducibility & communication enabled by open tooling
R Markdown and Quarto let teams reimagine data analysis and communication. They combine analyses and figures in a single document that is rendered to the reporting output of your choice.
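As a minimal sketch of what combining analyses and figures in one document can look like, the Python snippet below builds the text of a small Quarto document containing one embedded Python chunk. The function name `build_qmd`, the title, and the score values are hypothetical and do not come from the original material; the chunk also assumes matplotlib is installed wherever the document is rendered.

```python
# Build the chunk fence indirectly so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def build_qmd(title: str, values: list[float]) -> str:
    """Return the text of a minimal Quarto document (hypothetical helper).

    The document holds a YAML header, one sentence of prose, and one
    Python chunk that plots the given values when the file is rendered.
    """
    lines = [
        "---",
        f'title: "{title}"',
        "format: html",
        "---",
        "",
        "The figure below is regenerated from the data every time this report is rendered.",
        "",
        f"{FENCE}{{python}}",
        "#| echo: false",  # hide the chunk's code in the rendered report
        "import matplotlib.pyplot as plt",
        "",
        f"values = {values!r}",  # illustrative data written into the chunk
        "plt.plot(values)",
        'plt.ylabel("Score")',
        "plt.show()",
        FENCE,
        "",
    ]
    return "\n".join(lines)


# Made-up values, used only to show the shape of the output.
qmd_text = build_qmd("Hypothetical scores", [71.0, 73.5, 72.0])
print(qmd_text)
```

Saving the printed text as a `.qmd` file (for example `report.qmd`, a made-up name) and running `quarto render report.qmd` should produce an HTML report in which the prose and the figure come from the same source file. In practice the data would typically be read from a file inside the chunk rather than written into it.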
Impact of shifting to open science
Here are a few examples that showcase what is possible, and what is already being done, in environmental science.
- Regime Shifts in R & Data Science within the BC Public Service Observations from the field - Stephanie Hazlitt, Government of British Columbia, slides from CascadiaRconf keynote
- NMFSReports: Easily write NOAA reports and tech memos in R Markdown! - Emily Markowitz, NOAA Alaska Fisheries Science Center, slides from CascadiaRconf talk
- Tampa Bay Estuary Program
  - Automated reporting in Tampa Bay with open science (blog) - Marcus Beck, Tampa Bay Estuary Program, Openscapes blog
  - TBEP’s Data Management Workflow and open science cake
  - Coordinated monitoring of the Piney Point wastewater discharge into Tampa Bay: Data synthesis and reporting, 2023. Florida Scientist, 86(2), pp.288-300 - Beck, M.W., Burke, M.C., Raulerson, G.E., Scolaro, S., Sherwood, E.T. and Whalen, J.
Further resources
Not so standard deviation podcast
Parker & Peng, http://nssdeviations.com. Great discussions about data concepts and how they play out “in the wild”. Start with Episode 9: Spreadsheet drama.
Practical computing for biologists
Haddock & Dunn, http://practicalcomputing.org. Software and computing concepts, using tools already on your computer. Start with Chapter 2: Regular expressions.