This is us: Maike Kleemeyer
Research data management sounds like a lot of work to many people, but it actually offers enormous potential for better collaboration, sustainable data structures and genuine time savings. In this interview, Maike Kleemeyer, Research Data Management Coordinator, explains how personal frustration with data chaos turned her into an advocate for clear structures, and why well-thought-out research data management pays off not only for Open Science but also for one's own research success.
What motivated you to get involved in Research Data Management (RDM)?
Maike Kleemeyer: Looking back, it was probably the moment when I started working with data from other projects without knowing how it was organised. I found it extremely tedious to wade through unfamiliar file structures and constantly adapt my analysis scripts. Fortunately, I was mainly dealing with MR data, for which a standard had been established with BIDS (Brain Imaging Data Structure). It was a bit of a learning curve at first, but after that I never had to rewrite scripts or decipher file structures again – I could start working on the content straight away. That was so convincing that I got more and more involved in RDM.
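To give a flavour of what such a standard buys you in practice, here is a minimal sketch (the dataset path and task name are hypothetical): because BIDS fixes the folder layout and file naming, one generic script can walk any BIDS-formatted dataset without being rewritten.

```python
from pathlib import Path

# Hypothetical dataset location; any BIDS dataset uses the same naming scheme,
# e.g. sub-01/func/sub-01_task-rest_bold.nii.gz
dataset = Path("/data/my_bids_dataset")

# The same loop works on any BIDS dataset, so analysis scripts stay reusable.
for bold_file in sorted(dataset.glob("sub-*/func/*_task-rest_bold.nii.gz")):
    subject = bold_file.name.split("_")[0]  # e.g. "sub-01"
    print(f"Found resting-state run for {subject}: {bold_file.name}")
```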
You are the Research Data Management Coordinator at the institute. Can you briefly explain what that entails?
Maike Kleemeyer: Research Data Management refers to the handling of research data throughout its entire life cycle, from data organisation and naming, description and documentation to storage, data protection and publication. This makes RDM an important cornerstone for Open Science, because having a well-defined data management concept from the outset minimises the effort of publishing (high-quality) data. In addition, the publication of underlying data is increasingly required when submitting manuscripts.
What are your daily tasks?
Maike Kleemeyer: My role as RDM coordinator involves developing and implementing a cross-institutional RDM strategy, i.e. where and for how long data is stored, how it is organised and documented, and how its reuse can be enabled. Together with colleagues, I am currently working on finalising an RDM policy and a guide for implementation, which will be presented soon. The study registration is one part of this strategy. It provides a central overview of the institute's data-collecting studies and automatically creates study folders where all study material (not just data) is stored. This enables me to support researchers throughout their specific data life cycle, helping them to do the right things at the right time so that they don't end up with a huge amount of work. So, the emails that may annoy some people are actually intended to help reduce the effort in the long term. Of course, I am always happy to advise researchers on RDM issues in person or by email.
What challenges do you face in implementing a cross-institutional RDM strategy?
Maike Kleemeyer: As much as I try to emphasise the advantages of well-defined RDM, for many researchers it is a time-consuming nuisance that distracts them from their actual research. One challenge is therefore to strike a reasonable balance: not overloading researchers with RDM requirements while still obtaining the necessary information.
Another challenge lies in the heterogeneous nature of our institute's research, which is reflected in the associated research data. An MR study with multiple measurement points has completely different needs and requirements for RDM than a one-time online survey. The strategy must therefore be abstract enough to accommodate the heterogeneous data and concrete enough to actually have an effect. In addition, some of the research areas have formulated their own requirements, which the cross-institutional RDM strategy must accommodate, because changing existing processes must be well justified.
How do you raise awareness among researchers about the responsible handling of research data?
Maike Kleemeyer: The advantages of good data management usually only become apparent when you look into someone else's data folder. However, this often only happens when a research career is already quite advanced. That's why I teamed up with colleagues from the Max Planck Digital Library (MPDL) and Ludwig Maximilian University of Munich (LMU) to design a very hands-on workshop that deliberately brings this experience forward. This is often the most effective way to make researchers aware of the importance of responsible data handling.
However, some researchers, especially those at early career stages, are already very aware of the issue and want to publish their data in the spirit of Open Science. For them, effective support is more important than raising awareness.
Are there any best practices or recommendations you would give researchers for documenting and publishing their data?
Maike Kleemeyer: If I reveal all that here, no one will need to come to my workshop. But joking aside, these topics are indeed central components of my workshops.
If I had to name the most important basics that are essential for the meaningful and sustainable reuse of research data, it would probably be these four points:
1. Assigning a licence (usually CC0 or CC BY)
2. Using open, sustainable file formats (e.g. CSV, TSV)
3. Creating a README file that introduces the dataset and its research context
4. Creating a codebook that defines the specific contents of the dataset
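As a small illustration of points 2 and 4 (the variable names are made up for the example), a dataset and its codebook can both be written as plain TSV files, for instance with pandas in Python:

```python
import pandas as pd

# Hypothetical toy dataset with two variables.
data = pd.DataFrame({
    "participant_id": ["p01", "p02", "p03"],
    "reaction_time_ms": [512, 487, 530],
})

# Point 2: open, sustainable file format (tab-separated values).
data.to_csv("dataset.tsv", sep="\t", index=False)

# Point 4: a codebook defining each variable in the dataset.
codebook = pd.DataFrame({
    "variable": ["participant_id", "reaction_time_ms"],
    "description": [
        "Pseudonymised participant identifier",
        "Mean reaction time in the experimental task",
    ],
    "unit": ["", "milliseconds"],
})
codebook.to_csv("codebook.tsv", sep="\t", index=False)
```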
Was there a particularly inspiring experience?
Maike Kleemeyer: During my postdoc, I had the pleasure of participating in the Berlin Aging Study II as part of the Lifebrain Consortium. The consortium had already agreed to use BIDS as the common underlying data standard. Thanks to this uniform structure, we were able to work in interdisciplinary teams to develop analysis pipelines that worked for everyone. This form of smooth, productive collaboration was a real eye-opener for me – and has had a lasting impact on my understanding of collaborative research.
Are there any specific skills or knowledge you would like to learn or deepen in the future, and why?
Maike Kleemeyer: Every now and then, I find that my basic programming skills quickly reach their limits. In particular, when helping researchers reorganise their data (for example, from wide format to long format), I would like to be able to provide them quickly with the relevant lines of code in R or Python. I am looking forward to deepening my knowledge in this area through targeted workshops.
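For illustration, a wide-to-long reshape of the kind she mentions takes only a few lines in Python with pandas (the column names here are hypothetical):

```python
import pandas as pd

# Hypothetical wide-format data: one row per participant,
# one column per measurement wave.
wide = pd.DataFrame({
    "participant_id": ["p01", "p02"],
    "score_wave1": [12, 15],
    "score_wave2": [14, 13],
})

# Reshape to long format: one row per participant and wave.
long = wide.melt(
    id_vars="participant_id",
    var_name="wave",
    value_name="score",
)
# Turn "score_wave1" / "score_wave2" into the wave number 1 / 2.
long["wave"] = long["wave"].str.replace("score_wave", "", regex=False).astype(int)
print(long)
```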



