‘Python Essentials for GIS Learners’: a targeted FAIR research workshop by TU Delft’s Digital Competence Centre

  • Are you interested in learning to program with Python?
  • Are you currently working with data using geographic information systems (GIS) or interested in doing so?
  • Would you like to apply computational thinking and data analysis tools to your research?

These were questions posed to members of the Historical GIS (HGIS) and Delft Digital Humanities (DDH) communities within the Faculty of Architecture and the Built Environment to invite their participation in the ‘Python Essentials for GIS Learners’ workshop. The workshop was coordinated by the TU Delft Digital Competence Centre (DCC) from 15th to 17th March.

This 3-day online workshop was designed and delivered by members of the DCC as part of a collaborative project with researchers Carola Hein, Thomas van den Brink and Yvonne van Mil. The fundamental aim was to support skill building in FAIR data, software, and research workflows.

Using simple geospatial datasets as examples, workshop instructors Ashley Cryan and Jose Urra Llanusa taught fundamental concepts of programming with Python, version control with Git, working with the command line, and how tools for each skill can be used to find, explore, produce, and share FAIR geospatial data. 

Specifically, participants learned how to: navigate files using the Bash (Unix) shell; interact with the Python ecosystem (Anaconda, JupyterLab) and popular libraries for data analysis and visualisation; work with the Python console in QGIS; implement version control with Git; and practice social coding with GitHub. Hands-on group activities were used to demonstrate how these tools are useful to automate and augment components of a GIS-based research workflow. 

The workshop was designed for complete beginners in programming, with the goal of inspiring and empowering participants to apply these practices in the context of their own research. Thirteen active participants and two observers tuned in to the online workshop for an engaging and fun learning experience that we believe achieved our goal!

Most participants in the workshop had extensive experience working with desktop applications for GIS like ArcGIS and QGIS. During the workshop, the group practiced working programmatically with geospatial data in QGIS using the Python console and code editor.
Python offers some powerful libraries for creating beautiful maps, but to understand how they work “under the hood” we started with the fundamentals! Participants were introduced to basic concepts of object-oriented programming to create a map of cities in the Netherlands from a Python script using the Turtle graphics module.
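To give a flavour of that exercise, here is a minimal sketch of how a city map might be drawn with Turtle. It is not the workshop's actual lesson code: the `City` class, the `draw_map` function, the bounding box and the rounded city coordinates are all assumptions made for illustration.

```python
# Approximate (longitude, latitude) pairs in decimal degrees.
# Rounded, hand-typed values for illustration only.
CITIES = {
    "Amsterdam": (4.90, 52.37),
    "Rotterdam": (4.48, 51.92),
    "Delft": (4.36, 52.01),
    "Utrecht": (5.12, 52.09),
    "Groningen": (6.57, 53.22),
    "Maastricht": (5.69, 50.85),
}

# Rough bounding box around the Netherlands (an assumption for this sketch).
LON_MIN, LON_MAX = 3.0, 7.5
LAT_MIN, LAT_MAX = 50.5, 53.7


class City:
    """A named point that knows how to place itself on a canvas."""

    def __init__(self, name, lon, lat):
        self.name = name
        self.lon = lon
        self.lat = lat

    def to_canvas(self, width=400, height=400):
        """Linearly scale lon/lat to x/y, with (0, 0) at the canvas centre."""
        x = (self.lon - LON_MIN) / (LON_MAX - LON_MIN) * width - width / 2
        y = (self.lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * height - height / 2
        return x, y


def draw_map(cities):
    """Draw a dot and a label for each city. Opens a graphics window."""
    import turtle  # imported here because it needs a graphical display

    pen = turtle.Turtle()
    pen.hideturtle()
    pen.penup()
    for city in cities:
        pen.goto(*city.to_canvas())
        pen.dot(8, "red")
        pen.write(city.name, font=("Arial", 10, "normal"))
    turtle.done()


# Build one City object per entry in the dictionary.
cities = [City(name, lon, lat) for name, (lon, lat) in CITIES.items()]
```

Calling `draw_map(cities)` on a machine with a display should open a window with one labelled dot per city. The simple linear scaling ignores map projections, which is acceptable for a first sketch but not for accurate cartography.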
Participants took part in a hands-on coding session in a Jupyter Notebook to explore and plot data from the Natural Earth Airports dataset, using fundamental Python libraries for data analysis: numpy, matplotlib and pandas. 
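A session like this might start along the following lines. This is a hedged sketch rather than the notebook used in the workshop: the three rows are hand-typed approximate values, not an extract of the Natural Earth Airports dataset, and the column names and the `plot_airports` helper are assumptions.

```python
import io

import pandas as pd

# Hand-typed sample rows with approximate coordinates (illustrative only).
# The real dataset would normally be loaded from a file instead.
SAMPLE_CSV = """name,iata_code,latitude,longitude
Amsterdam Schiphol,AMS,52.31,4.76
Rotterdam The Hague,RTM,51.96,4.44
Eindhoven,EIN,51.45,5.37
"""

# Load the table and take a first look at it.
airports = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(airports.head())
print(airports[["latitude", "longitude"]].describe())

# Find the name of the airport with the highest latitude.
northernmost = airports.loc[airports["latitude"].idxmax(), "name"]


def plot_airports(frame):
    """Scatter-plot airports by longitude/latitude and label each point."""
    import matplotlib.pyplot as plt  # imported here so loading works without it

    ax = frame.plot.scatter(x="longitude", y="latitude")
    for row in frame.itertuples():
        ax.annotate(row.iata_code, (row.longitude, row.latitude))
    ax.set_title("Sample airports")
    plt.show()
```

In a Jupyter Notebook, running `plot_airports(airports)` in a new cell would display the scatter plot inline.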

Creating the workshop

The materials used in the workshop were a compilation of original lessons and external sources. During the workshop design phase, we as instructors discovered many excellent resources that could be adapted to produce a cohesive custom learning experience. Lessons were created based on the Unix Shell and Plotting and Programming in Python lessons from Software Carpentry, the Turtle Graphics Map repository by Andrea Cleland and the Gizmo “Can you speak Python?” challenges repository by Marijn van Vliet.

The central platform for the workshop was a website used to host the lesson material and serve as an interactive online educational resource. The website, which is publicly available for reuse, was created using Jupyter Book and hosted with GitHub Pages. All lesson materials were collaboratively developed and are now hosted in the DCC organization on GitHub, written in Jupyter Notebooks and markdown files to showcase code- and text-based examples. Interactive group discussion between participants was facilitated by Jupyter Book’s ‘utterances’ feature, a web commenting system that uses GitHub Issues to store and manage comments for effective follow-up and lively discussions.

The workshop was advertised using Eventbrite and promoted within the HGIS and DDH communities with the help of the faculty Data Steward, Diana Popa, and Data Steward Coordinator, Yan Wang. The group size was capped at 15 participants, and all places, plus a waiting list of 10, were filled within four days of publishing the event page – an encouraging signal of the demand for the skills this workshop targeted.

Measuring impact

Pre- and post-workshop surveys were used to measure the impact that this experience had on participants’ confidence and ability to apply fundamental programming tools and concepts. 

Participants were asked to rate their agreement with ten statements, both before and after the workshop, on a scale from 0 (“strongly disagree”) to 10 (“strongly agree”). These statements were:

  1. I feel capable of writing a small program, script, or macro to address a problem in my own work.
  2. I know how and where to search for answers to my technical questions online.
  3. I feel capable of using scripting and automation in my data analysis.
  4. I understand why and how to use a version control system like Git to track changes to my own files.
  5. I feel capable of collaboratively writing and sharing code with others.
  6. I think that programming skills can improve my research process.
  7. I feel capable of using an online repository like GitHub to search for others’ code and publish code from my own projects.
  8. Basic programming skills and data literacy should be taught as part of a University curriculum.
  9. I understand the benefit of using metadata/documentation to enrich and describe my research data outputs according to my domain standards.
  10. I believe that PhD students can benefit from workshops like this on programming basics at the start of their PhD.

For each statement, the mean aggregate score was higher after participation in the workshop – in some cases substantially – indicating that participants benefited from the learning experience. Because participants were beginners at programming, the pre-workshop mean scores were fairly low (around 2); after the workshop, they increased considerably. The greatest improvement was observed for statement 5, “I feel capable of collaboratively writing and sharing code with others”, which increased by 4.65 points (from a pre-workshop mean score of 0.25 to a post-workshop mean score of 4.9). These results are encouraging to say the least!
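The comparison itself is simple arithmetic. As a minimal sketch (the `mean_change` function is a hypothetical helper, not the analysis script we used), the change in mean score for a statement can be computed like this, using only the two means reported above for statement 5:

```python
from statistics import mean


def mean_change(pre_scores, post_scores):
    """Return (pre mean, post mean, difference) for one survey statement."""
    pre_mean = mean(pre_scores)
    post_mean = mean(post_scores)
    return pre_mean, post_mean, post_mean - pre_mean


# Means reported in the text for statement 5.
reported_pre, reported_post = 0.25, 4.9

# Rounded to two decimals to avoid floating-point noise in the difference.
improvement = round(reported_post - reported_pre, 2)
print(f"Statement 5 improved by {improvement} points")
```

Given the individual 0–10 ratings for a statement, `mean_change(pre_scores, post_scores)` would return both means and their difference in one call.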

Through the post-workshop survey, participants also highlighted the workshop’s role in inspiring them to continue building their Python skills; helping them understand how tools for working with code can be used to support different phases of a research project; and giving them a better appreciation of the time commitment and practice necessary to develop programming skills.

Quotes from participant feedback included: 

“The teaching style was fun, interactive, and a nice change of pace from standard “lectures”. I liked how informal it was, and how we could ask questions as we progressed through the lessons.”

“Great moderation and feedback! Very accessible and useful walkthrough with QGIS! Excellent intro to shell scripting. Excellent answers to questions raised during the workshop, both in the comment section of the website and during the live sessions.”

Opportunities and lessons learned

There are clear opportunities for future engagement through training using this workshop material and implications for the value of offering targeted workshops for discipline-specific research groups.

One important finding is the interest and ability of workshop participants to become instructors for this workshop in the future. Six of the thirteen active participants indicated either “yes” or “maybe” to the post-workshop survey question: Would you be interested in teaching this material to others to support competency building in your own faculty or research group?

Integration with the Open Science Community Delft is also a logical next step, given that the participants in this workshop were already interested in Open Science topics and are now equipped with essential FAIR data and code skills. The possibility for past participants to join future workshops (including TU Delft Software and Data Carpentry workshops) as helpers and contribute to new educational material is open and exciting!

The format used to run this workshop was well-received and could be emulated in the future. Each day was divided into online teaching sessions of 3 hours in the morning followed by offline self-study sessions in the afternoon. This allowed participants to join group discussions and demonstrations in the morning and, afterwards, work independently on prepared exercises related to the morning’s material. Instructors were available for “Zoom Office Hours” each afternoon to provide participants with help and answer questions via GitHub with the assistance of geospatial data expert and DCC research software engineer, Manuel Garcia Alvarez.

The majority of participants found the format of this workshop to be a good balance of teaching time and independent exercises. About 30% of participants thought the workshop was too short and that more than three days of training would have been preferable. 

Our experience validated that this workshop could be effectively conducted for this group size with just two instructors. However, whilst the teaching experience was fun and manageable, an important observation was the prevalence of programming environment-related errors, which were difficult to troubleshoot online. We were able to solve most issues during the online teaching session. In some cases, small groups with similar issues were asked to remain online after the teaching session to resolve technical issues. Such technical issues caused some understandable frustration among participants, and could potentially be avoided in future workshops by creating a shared environment with JupyterHub in which to run the workshop. This would help to reduce friction for beginners and create a smooth experience for participants and instructors alike.

Lastly, several PhD and master’s student participants inquired about receiving ECTS credits, which could not be provided for this particular workshop due to development time constraints. In the future, we recommend providing participants with 2 ECTS for workshops of this kind if at all possible. This would further incentivize participants to devote time to learning and help them feel that their accomplishments are recognized by the graduate school, promoting a high-quality learning experience for all.

Plans for the future

Running this ‘Python Essentials for GIS Learners’ workshop for the first time was a great experience! In the future, this material can be offered with support and either full or co-instruction from the DCC in connection with research groups working on geospatial research. 

We suggest groups of up to 15 researchers to keep the workshop small and manageable for two instructors. The workshop material (lessons, exercises and instructor notes) can be reused by anyone, even independently, via the workshop website.

Since September 2020, the DCC has been actively involved in supporting a number of long-term projects across all faculties at TU Delft. With a mission to help researchers develop skills in applying the FAIR principles to their research activities and reach their data management and software development goals, this team of Data Managers and Research Software Engineers is poised to offer targeted training. 

Get in touch!

If you’re interested in developing a workshop on a topic related to FAIR data and software or have questions about how to make your own research outputs FAIR, please get in touch with the DCC by emailing: dcc@tudelft.nl.

Written by Ashley Cryan and Jose Urra Llanusa (TU Delft DCC)
Edited by Connie Clare (4TU.ResearchData)

Cover image by Gerd Altmann from Pixabay
