Chapter 3 Openscapes Mindset

Please see accompanying slides as this chapter is built out more.

Open data science tools, practices, & communities exist, and they are powerful, empowering, and game-changing for science. They enable us to do better science in less time. They are like the Force from Star Wars:

  • More powerful than you ever imagined
  • Enable you to broaden the scope of the questions you can ask
  • You can be a Jedi to others: pass forward what you’ve learned
  • You can join & build diverse communities of allies (not all allies are Jedi)

We can harness this power for science more broadly. We can create the culture that we want to be a part of – towards kinder science. We can do this if we:

  • Redefine collaborators & community
    • Think like a team: Future You, Future Us
    • Beyond your own discipline & online
  • Reimagine challenges
    • Expect there is a better way
    • Achieve with confidence, agency, & community
    • You’re not alone, it’s not too late

We are here because I know these files are on your computer — we all have them.

data_final_final.xls
data_final_usethis.xls
...
thesis_v16_new_ch1.docx
thesis_v16.docx
...

And we also send and receive emails with subject lines like:

Re:FWD:Fwd:Data question
Re:Sending again with the correct version

We are going to talk about how to make the data experience better, for you, your lab, your department, and beyond.

3.1 Data science as a discipline

Alternative title: “Data science is a thing”.

No matter what your study system or your question, to Do Your Science you will need to get your data into analytical software, wrangle it (tidy and transform), and make sense of it visually and with models. Most importantly, tidy your data first; don’t build your whole analysis around whatever weird format your data arrived in. We’ll talk about tidy data in more detail another day.
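
As a small preview of what tidying means, here is a minimal sketch in plain Python (the same idea applies in R). The table, the column names, and the `tidy` helper are all hypothetical, made up for illustration. It reshapes a "wide" table, with one column per year, into a "long" one where each row is a single observation.

```python
# Hypothetical "wide" table: one row per site, one column per year.
wide_rows = [
    {"site": "A", "2019": 12, "2020": 15},
    {"site": "B", "2019": 7, "2020": 9},
]


def tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows into long rows: one observation per row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every long row instead
            long_rows.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return long_rows


# Each output row now holds exactly one site, one year, and one count.
tidy_rows = tidy(wide_rows, "site", "year", "count")
for row in tidy_rows:
    print(row)
```

Once the data are in this shape, filtering, summarizing, and plotting by year or by site no longer depend on how the original spreadsheet happened to be laid out.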

R for Data Science

3.1.1 There are concepts, theory, and tools for thinking about and working with data

Just as a field like chemistry has concepts for things like molecules, theory for how they work, and tools for studying them, so does data science, for data.

3.1.2 Emphasis on communication

It is incredible what is possible on the communication front. Watch the one-minute video “What is RMarkdown?” to blow your mind.

3.1.3 Not just for “big data”

3.1.4 Your study system is not unique when it comes to data

Think about your data separately from your study system. Don’t confound them or it will be really hard to ask for help.

Expect there is a way to do what you want to do.

This will help you find commonalities and unite you with other lab members and beyond.

3.1.5 Distinguish data questions from research questions, learn how to ask for help

3.2 Open data science tools exist

3.2.1 Tools to match data science theory

Wickham 2017

3.2.2 They exist to streamline working with data

3.2.3 And they are developed by actual people – nice people!

3.2.4 My advice

3.2.4.1 Expect there is a better way

If you’re making the same plot 10 times, stop.

Don’t confound data science with your science. Expect that someone has had your problem before or done what you want to do.
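
One common "better way" for the repeated-plot problem is to write the plotting step once as a function and then loop over your groups. The sketch below is only an illustration in plain Python: the `text_bar_chart` helper and the site counts are hypothetical, and a text chart stands in for whatever plotting library you actually use.

```python
def text_bar_chart(title, values, width=20):
    """Return a plain-text bar chart for a dict of label -> non-negative number."""
    largest = max(values.values())
    lines = [title]
    for label, value in values.items():
        # Scale each bar relative to the largest value in this chart.
        bar_length = round(width * value / largest) if largest else 0
        lines.append(f"{label:>6} | {'#' * bar_length} {value}")
    return "\n".join(lines)


# Hypothetical counts per year for three sites.
site_counts = {
    "site_A": {"2019": 12, "2020": 15},
    "site_B": {"2019": 7, "2020": 9},
    "site_C": {"2019": 20, "2020": 18},
}

# One definition, many plots: no copy-pasting the plotting code per site.
charts = {site: text_bar_chart(site, counts) for site, counts in site_counts.items()}
for chart in charts.values():
    print(chart)
    print()
```

If the chart needs to change, for example a different width or label format, the change happens in one place and every plot picks it up.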

3.2.4.2 Divorce your science question from the data science question

Focus on the operations for the data, not your hypothesis.

3.2.4.3 Google your question (ask for help)

Articulate it, and identify useful solutions: look for trusted URLs and recent dates.

3.3 Open as a way to work

3.3.1 Open science as a way to be more efficient and streamlined

Open science is not an added ask at publication to share your data.

It’s not only about sharing data. It’s about how you work, who you include, and the tools that you use.

3.3.2 External memory (personal and collective)

Easier on/offboarding

3.3.3 Find solutions faster – learn to talk about your data

3.3.4 Build confidence – skills are transferable beyond your science

3.3.5 Be empathic and inclusive – grow a network of allies

3.4 Lab members as a team

Science is collaborative, not heads down, elbows out.

3.4.1 Focus on what unites lab members, not what sets them apart

3.4.2 Think of the lab horizontally as skillsets & needs instead of vertically as science bins

The skills you have when you come to the lab should not determine how you will be able to Do Science. Instead, have shared practices in the lab, and paths to onboard new people to work that way as well.

3.5 Learn with collaborators and community (redefined)

Communities for learning, teaching, and mentorship.

3.5.1 Helps overcome isolation, self-taught bad practices, apprehension

Stevens et al. 2018

Your most important collaborator is Future You

I cannot emphasize this enough. Work now so that you can succeed later, whether that’s this afternoon or 4 years from now.

3.5.2 Communities beyond the colleagues in your field

3.5.3 Learn from, with, & for others

3.6 The internet as an underleveraged tool for science

3.6.1 Twitter for learning

Follow selectively, listen & learn (e.g. #rstats)