Despite the collaborative nature of scientific research, a key component – data analysis – can be a lonely burden. Often performed by researchers who lack formal training in data science and open science, such analyses are efforts that scientists must undertake alone, reinventing the wheel as they do so.
Furthermore, when we become faculty members, lecturers and project managers, we may feel unqualified to establish more responsible data practices, and unsupported in the effort, despite the growing need. We found a sustainable approach to instituting more responsible data practices in our research groups through Openscapes, a mentorship programme run by the National Center for Ecological Analysis and Synthesis (NCEAS) in Santa Barbara, California, and originally funded by the open-source software company Mozilla in Mountain View, California. Openscapes has helped us supercharge our research, and we have advice on how others can ignite change in their teams.
As we originally understood it, open science had little relevance or benefit to our daily research – mainly because it was unclear how to apply it. We interpreted the concept narrowly, as only sharing data at publication, and we assumed that data science applied only to big data and machine learning. Existing software tools that could automate data analysis seemed out of reach as we quietly handcrafted our own approaches to writing code and analysing data.
Now, we have reframed data analysis as a collaborative effort rather than an individual burden. As a team, we discuss our data challenges regularly, starting with the hope that better approaches and tools exist and that we can find them together.
Our idea of open data science combines R developer Hadley Wickham’s definition of data science – “turn[ing] raw data into understanding” – with open-science tools and practices, such as the use of collaborative version-control platforms for code and project management.
Empowered with our new approach, we are instilling such practices in our groups by creating workflows that facilitate reproducibility and data sharing, and streamline code organization and collaboration. All of our approaches are centered around an ‘open’ ethos.
This change requires a shift in mindset, as well as investment in skill development and team-building. Here are three ways research groups can get started, along with a plan for kick-starting this change in ten weeks (see ‘The ten-week plan for open data science’).
1. Normalize Data Discussions
Create digital and physical spaces where group members – despite having different research questions and expertise – feel comfortable discussing data challenges and seeking, offering and accepting guidance from each other. Regularly scheduling data-centric meetings demonstrates that this is a priority and fosters a more open culture; naming these meetings can give them value and identity. (NCEAS’ Ocean Health Index team calls these meetings Seaside Chats.)
You don’t have to be an expert to start a conversation about data in your research group. However, you do need to be comfortable letting group members learn from, with and for each other – which means encouraging them to engage with coding communities online and in person.
For example, they can follow #rstats discussions on Twitter, participate in or organize in-person coding clubs and ‘hacky hours’, or contribute documentation and tutorials to open-source projects. Encouraging horizontal leadership within your research group is key to seeding better data practices and evolving with the ‘software-scape’.
2. Identify and Address Shared Needs
Start by discussing the software and workflows that group members use for reproducibility, collaboration and communication. For example, what software is used for data analysis, data collection and documentation? How do members share data and methods, and request feedback? And how do members learn to use these tools?
Once your team’s needs are identified, charting the way forward requires agreeing on shared priorities within the research group, such as organizing scripts, improving metadata and building skill sets. Skill-building opportunities can include online tutorials and videos, workshops, skill-sharing meetings and university courses. The goal is not perfection, but incremental improvement through attainable goals.