Our compute cluster is down for maintenance, so obviously instead of working on my advancement exam presentation, I’m gonna write a blog post.
Today I taught a 1-hour lesson on “Cleaning data” for graduate students in UCSD’s BIOM262: Quantitative Methods in Genetics class. Here is the IPython notebook that I used. While designing the lesson, I used concepts I learned from Software Carpentry’s Training for instructors, such as:
During the workshop, I again implemented things we learned at Software Carpentry, such as working in pairs and using sticky notes to indicate progress. Overall, I thought it went quite well, with students engaged most of the time. Here are the results of what went well and what could be improved.
Here are some of the things I thought went well.
Everyone worked in pairs. Before we split into pairs, I surveyed the class to check for previous experience with Python, IPython Notebook, and pandas. Then, the person with less experience did all the typing (aka “driving” in pair-programming parlance), and the more experienced person wasn’t allowed to type. We did this at the Software Carpentry training during a GitHub exercise. I paired with someone who was less experienced with GitHub, and it was a good experience for me to have to sit on my hands and talk through every step. This setup was very useful because the more experienced person had to be explicit and describe every step, and the less experienced person automatically had a mentor to ask “dumb” questions, without having to grab an instructor. An additional benefit is that when working in pairs, you’re held more accountable, so you can’t just do some exercises and then switch to your email or Facebook, because your partner is right there.
Used pink and blue stickies to show progress. Pink was “still working” and blue was “done.” This made it easy to scan the room and get a sense of the progress. We used this at Software Carpentry as well, but it was definitely more helpful as an instructor than as a student, because you get an implicit sense of progress without having to check in personally with everyone.
Here’s what the students said they liked.
The instructions for getting set up (downloading an IPython notebook, having to cd and navigate to that directory, and starting up an IPython notebook) were confusing and need to be rewritten or completely reworked. Maybe give them a wget or similar command? Then they’ll be directly in the folder where the .ipynb was downloaded, and won’t have to do any navigation. This took up a good 15 minutes at the beginning that could have been used on programming instead.
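One way the "wget or similar command" idea could look, as a minimal sketch in Python so it behaves the same on Windows, Mac, and Linux. The function name fetch_notebook and the example URL in the usage note are made up for illustration; they are not the actual lesson materials.

```python
import urllib.request
from pathlib import Path


def fetch_notebook(url, dest_dir="."):
    """Download the file at `url` into `dest_dir` and return its local path.

    Hypothetical helper: the name and arguments are assumptions for this
    sketch, not part of the original lesson.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Use the last piece of the URL as the local file name.
    filename = url.rstrip("/").rsplit("/", 1)[-1]
    target = dest_dir / filename
    urllib.request.urlretrieve(url, str(target))
    return target
```

Students would call something like fetch_notebook("https://example.com/cleaning_data.ipynb") (a placeholder URL) from the folder where they want the notebook to land, then start IPython Notebook from that same folder, with no extra navigation.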
Windows was a much larger impediment than I thought. The first part of the lesson was performing Unix commands such as head, tail, wc, and searching for a command to count the number of columns in a file (ends up being either awk or a combo of head and wc). I didn’t plan for this, or test the lesson on a Windows machine, but half the class was using Windows. So that’s definitely something I will plan for in the future. I felt really bad.
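For next time, one possible fallback is to have rough Python stand-ins for those commands that run the same on any operating system. This is only a sketch: the function names are my own, the files are assumed to be UTF-8 text, and the column count assumes a tab-separated file, which may not match the lesson's actual data.

```python
from collections import deque
from itertools import islice


def head(path, n=10):
    """Return the first n lines of a text file, like `head -n`."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


def tail(path, n=10):
    """Return the last n lines of a text file, like `tail -n`."""
    with open(path, encoding="utf-8") as handle:
        # A deque with maxlen keeps only the most recent n lines.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


def count_lines(path):
    """Count lines, similar to `wc -l` (a final line without a newline still counts)."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def count_columns(path, sep="\t"):
    """Count fields in the first line, assuming `sep` separates the columns."""
    first = head(path, 1)
    return len(first[0].split(sep)) if first else 0
```

These could be pasted into the first notebook cell so that Windows users can follow along, for example with head("data.tsv", 5) or count_columns("data.tsv"), where data.tsv is a placeholder file name.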
Here’s what the students said could be improved.
Windows :(