- One day with python fundamentals with a best practices twist (TDD).
- The second day with a more biological data analysis focus.
Should electronics be repairable?
The question seems to ask for a resounding “of course” in my mind, but today I woke up to the following video by Louis Rossmann about the right to repair bill and how legislators are trying to counter trade secret lobbyists. If you care about your right to repair your electronics, either by yourself or through a professional, take some time to hear what Louis has to say as a talented independent electronics repairman:
Rossmann mentions how stupid it would be to not be able to repair your own electronics and share how you did so with the rest of the world, as he does with Apple products. That thought resonated with me, especially when I recently decided to look into my faulty drone as a hobby repair.
So if you don’t care much about that bill and you are a tech geek like me, take a look at how I fixed my ~$300 drone by fixing a software glitch in one of its motors’ microcontrollers. Then please think again about how that bill would affect us all and raise your voice.
One motor not responding
A couple of years ago I was showing my drone to a friend when it started to wobble mid-air and crashed. I could not get it to fly again, yet nothing seemed wrong with the motors or anything else. When re-connecting the battery, all propellers twitched (initialization sequence) except one:
How annoying is that? Everything seems to work except one (upon visual inspection) undamaged motor? Why?
Connecting to the drone (via telnet) and seeing the logs:
0.970751 NULL 6 909390645 BLC call for motor 1
1.970768 NULL 6 909390645 BLC motor 1 flash & start FAILED
2.030722 NULL 6 909390645 BLC call for motor 2
2.142945 NULL 6 909390645 BLC motor 2 soft version 1.43, hard version 3.0, supplier 1.1, lot number 11/10, FVT1 17/11/10
2.200720 NULL 6 909390645 BLC call for motor 3
2.312925 NULL 6 909390645 BLC motor 3 soft version 1.43, hard version 3.0, supplier 1.1, lot number 11/10, FVT1 17/11/10
2.370718 NULL 6 909390645 BLC call for motor 4
2.482919 NULL 6 909390645 BLC motor 4 soft version 1.43, hard version 3.0, supplier 1.1, lot number 11/10, FVT1 17/11/10
2.510740 NULL 6 0 BLC motor 1 dead
2.510942 NULL 6 0 BLC reflash required, perform off/on cycle
(...)
1.005742 NULL 6 909260344 BLC call for motor 1
1.026300 NULL 6 -1096575148 BLC start flash
1.816389 NULL 6 -1096575148 BLC flash done
1.816557 NULL 6 -1 BLC verify
1.835135 NULL 6 -1 BLC verify FAILED - page 0
(...)
!!! Emergency landing from /home/aferran/[...]/version/[...]/Soft/Build/../../Soft/Toy//Os/elinux/Control/motors.c:1593. Reason is Motors have not been initialized correctly
I performed the off/on cycle by reconnecting the battery several times, with no luck. Reflash required… but how? There are no instructions for doing that reflash, just a power off/on cycle, which clearly does not work in my case.
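A minimal sketch of how such a log could be summarized per motor. The line pattern is only inferred from the excerpt above (it is not a documented format), and the function name is made up for illustration:

```python
import re

# A few lines copied from the log excerpt above.
LOG = """\
0.970751 NULL 6 909390645 BLC call for motor 1
1.970768 NULL 6 909390645 BLC motor 1 flash & start FAILED
2.030722 NULL 6 909390645 BLC call for motor 2
2.142945 NULL 6 909390645 BLC motor 2 soft version 1.43, hard version 3.0, supplier 1.1, lot number 11/10, FVT1 17/11/10
2.510740 NULL 6 0 BLC motor 1 dead
"""

# Assumed pattern "BLC motor <n> <message>", inferred from the excerpt only.
MOTOR_LINE = re.compile(r"BLC motor (\d+) (.*)$")


def motor_status(log_text):
    """Return {motor_number: "ok" | "failed"} for every motor that reported."""
    status = {}
    for line in log_text.splitlines():
        match = MOTOR_LINE.search(line)
        if not match:
            continue  # e.g. "BLC call for motor 1" carries no result
        motor, message = int(match.group(1)), match.group(2)
        if "FAILED" in message or message == "dead":
            status[motor] = "failed"
        elif message.startswith("soft version"):
            # A motor that answers with its version is treated as healthy,
            # unless a failure was already recorded for it.
            status.setdefault(motor, "ok")
    return status


print(motor_status(LOG))
```

With a full telnet capture pasted into LOG, this gives a quick view of which motor boards answered and which did not.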
At this point, all the manufacturer tells you is to buy a completely new motor, worth almost $50 plus shipping:
But I insist, nothing seems wrong with the motor itself, nor with the few DMC3021LSD MOSFETs around the motor board, so it clearly looks like a software issue with the ATmega8A microcontroller present on each of the 4 motor boards. The microcontroller datasheet states that it should retain data for 20 years and withstand 10,000 flash (100,000 EEPROM) write/erase cycles. I definitely did not use it for that long, nor that many times, so it got me curious: what if I could just fix it myself while I still have the right to?
Not bothering opening my drone
So at this point, many people would either buy that motor or throw that dead toy to a pile of e-waste.
First, before reaching for the screwdriver, let’s learn a bit about how that gadget is built without opening it, via FCCID.io. There’s a great block diagram which details all its ins and outs:
There are also external and internal photos showing what the different boards and components look like.
See those MISO/MOSI/SCK and RESET test points in the motor pinout? Those can be used to communicate with our confused AVR microcontroller.
Time to wire up a couple of motors to a Raspberry Pi
Using the GPIO pins of a Raspberry Pi 1 as a microcontroller programmer, with AVRdude running on it (just a plain apt-get install avrdude away on a recent Raspbian), we can read and write the contents of the faulty motor board:
The pins must be given to AVRdude as the GPIO numbers exposed by the Linux sysfs interface, not as positions on the physical header. There are tons of diagrams available online showing how those are laid out on the different Raspberry Pi versions, so pick your favorite GPIO pins and tell AVRdude accordingly via its configuration file:
programmer
  id = "motor_1";
  desc = "Use the Linux sysfs interface to bitbang GPIO lines";
  type = "linuxgpio";
  reset = 25;
  sck = 11;
  mosi = 10;
  miso = 9;
;

programmer
  id = "motor_2";
  desc = "Use the Linux sysfs interface to bitbang GPIO lines";
  type = "linuxgpio";
  reset = 14;
  sck = 4;
  mosi = 3;
  miso = 2;
;
If all goes well and it’s properly wired, you should get this from avrdude:
$ avrdude -p m8 -C /etc/avrdude.conf -c motor_2 -v
(…)
avrdude: AVR device initialized and ready to accept instructions

Reading ################################################## 100% 0.00s

avrdude: Device signature = 0x1e9307 (probably m8)
avrdude: safemode: hfuse reads as DC
avrdude: safemode: hfuse reads as DC
avrdude: safemode: Fuses OK (E:FF, H:DC, L:E4)

avrdude done. Thank you.
Then it’s just a matter of running avrdude to read the contents of the flash, eeprom, fuse and lock bits:
# avrdude -p m8 -C /etc/avrdude.conf -c motor_2 -U flash:r:flash.hex:i -U eeprom:r:eeprom.hex:i
# avrdude -p m8 -C /etc/avrdude.conf -c motor_2 -U lock:r:lock.hex:i -U hfuse:r:hfuse.hex:i -U lfuse:r:lfuse.hex:i
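To avoid typing those -U arguments by hand for each motor, the command line can be generated. This is only a sketch: the helper name and defaults are made up, while the options (-p, -C, -c, -U memory:r:file:i) are the same ones used in the commands above.

```python
import shlex

# Memory names as they appear in the avrdude commands above.
MEMORIES = ["flash", "eeprom", "lock", "hfuse", "lfuse"]


def dump_command(programmer, part="m8", config="/etc/avrdude.conf"):
    """Build one avrdude invocation that reads every memory into <name>.hex."""
    cmd = ["avrdude", "-p", part, "-C", config, "-c", programmer]
    for memory in MEMORIES:
        # -U <memory>:r:<file>:i reads <memory> into <file> as Intel HEX.
        cmd += ["-U", f"{memory}:r:{memory}.hex:i"]
    return cmd


cmd = dump_command("motor_2")
print(shlex.join(cmd))
# On the Pi itself this could then be executed with:
#   subprocess.run(cmd, check=True)
```

Running it from a separate directory per motor (motor1/, motor2/) keeps the dumps apart so they can be compared afterwards.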
How can we tell the dead from the living? Using UNIX diff.
Diffing the motor’s bits
Why did I connect two motors, that is one healthy and one “dead”, instead of just the faulty one?
As I learned from bioinformatics, comparing NORMAL vs TUMOUR tissue can reveal useful insights about biology. After this really stretched yet handy analogy, which I should probably be embarrassed about, let’s see what I found:
$ diff -u motor1/eeprom.hex motor2/eeprom.hex
--- motor2/eeprom.hex 2016-06-06 08:47:26.815120557 +0000
+++ motor1/eeprom.hex 2016-06-06 09:35:17.664976258 +0000
@@ -1,4 +1,4 @@
-:20000000AC8A0001018A0A110BFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0F
+:20000000FF030001010B0A120B0AFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB6
 :20002000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE0
 :20004000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC0
 :20006000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFA0
diff -u motor1/lock.hex motor2/lock.hex
--- motor1/lock.hex 2016-06-06 10:00:38.532028704 +0000
+++ motor2/lock.hex 2016-06-06 09:09:24.738241220 +0000
@@ -1,2 +1,2 @@
-:0100000003FC
+:010000002FD0
 :00000001FF
So the EEPROM holds information that I have no time to reverse engineer now (motor coil timing calibration? total flight hours?… no clue).
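For anyone who does want to start on that, a first step could be listing which bytes differ. A small sketch using the first ten data bytes of the two EEPROM records from the diff (the rest of the record is all FF in both); the helper name is made up:

```python
# First ten data bytes of each EEPROM record, copied from the diff above.
# "minus" and "plus" follow the -/+ markers of the diff output.
minus = bytes.fromhex("AC8A0001018A0A110BFF")
plus = bytes.fromhex("FF030001010B0A120B0A")


def differing_offsets(a, b):
    """Return (offset, byte_a, byte_b) for each position where a and b differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


for offset, x, y in differing_offsets(minus, plus):
    print(f"offset 0x{offset:02X}: {x:02X} vs {y:02X}")
```

Five of the first ten bytes differ, which at least narrows down where any per-motor data would have to live.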
On the other hand, the lock bits got me interested. As Alexander and Boris say in their AVR workshops: “when in doubt, look at the datasheet!”.
So the datasheet states the following about lock bits near table 86 on page 215:
The ATmega8 provides six Lock Bits which can be left unprogrammed (“1”) or can be programmed (“0”) to obtain the additional features listed in Table 86. The Lock Bits can only be erased to “1” with the Chip Erase command.
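To see which of those six bits are programmed in each lock.hex, the Intel HEX records from the diff can be decoded. This is a sketch under one assumption: that the lock byte is laid out, from bit 5 down to bit 0, as BLB12, BLB11, BLB02, BLB01, LB2, LB1. Check that against the datasheet before relying on it. The function names are made up.

```python
# Assumed bit layout of the ATmega8 lock byte (bit 5 down to bit 0);
# verify against the datasheet table before relying on it.
LOCK_BIT_NAMES = ["BLB12", "BLB11", "BLB02", "BLB01", "LB2", "LB1"]


def parse_record(line):
    """Parse one Intel HEX record and return (record_type, address, data)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("Intel HEX records start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5 or len(raw) != raw[0] + 5:
        raise ValueError("record length does not match its byte count")
    # All bytes, checksum included, must sum to zero modulo 256.
    if sum(raw) & 0xFF != 0:
        raise ValueError("checksum mismatch")
    address = int.from_bytes(raw[1:3], "big")
    return raw[3], address, raw[4:-1]


def programmed_lock_bits(lock_byte):
    """Names of the lock bits that read as 0, i.e. are programmed."""
    return [name
            for bit, name in zip(range(5, -1, -1), LOCK_BIT_NAMES)
            if not (lock_byte >> bit) & 1]


# Records copied from the lock.hex diff above.
for name, record in [("motor1", ":0100000003FC"), ("motor2", ":010000002FD0")]:
    _, _, data = parse_record(record)
    print(name, f"0x{data[0]:02X}", programmed_lock_bits(data[0]))
```

Under that assumed layout, motor1/lock.hex (0x03) has all four boot lock bits programmed, while motor2/lock.hex (0x2F) has only BLB11 programmed.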
Alright, so what if we just erase the chip with avrdude’s -e option?
avrdude -p m8 -C /etc/avrdude.conf -c motor_1 -e
And then reflash the flash back?:
avrdude -p m8 -C /etc/avrdude.conf -c motor_1 -U flash:w:flash.hex:i
To be fair, those locks might be enforced when the firmware detects that there’s a serious mechanical issue with the rotor, which defaults to cutout/shutdown if there’s something wrong with it, preventing worse damage involving burned MOSFETs, destroyed motor coils, etc…
But since there are no further specs nor documentation from the manufacturer about this topic other than "Motors have not been initialized correctly"… how should I know, even if I want to?
In any case, that’s it, I just saved the environment and $50 by unlocking incorrectly software-locked hardware!
I have left out a few bits on how I debugged this issue, as well as some follow-up reverse engineering work Hugo Perquin did on his blog. About reverse engineering, I might present some work at the first Radare2 conference.
But anyways, I hope to have raised some awareness about the right to repair while entertaining some nerds like me ;)
Organizing the workshop
Subject: Any interest in putting together a workshop for Stockholm this summer?
After getting the green light from WWCRC, my current employer, it did not take long to add Oxana Sachenkova to the team and start planning the logistics, lessons and official PhD-level university credits, and to raise some money to support the event.
That is how the idea to do Carpentry with Software and Data came to fruition by the end of November 2015:
After innumerable emails, talks and commits, the event was in the works. The Swedish national bioinformatics communities BILS and WABI also supported us. We would like to thank both organisations for their financial support.
Day 1: Software Carpentry
For day one, Olav ran some interactive Python console sessions showing what basic Python data structures and control mechanisms look like.
Following up, Radovan prepared excellent TDD lessons, inspired by three sources:
The already mentioned Python Koans.
Some ideas borrowed from the BioPython’s comprehensive testsuite.
A late addition from an upcoming SWC TDD lesson, released just a few days before our workshop.
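For readers unfamiliar with the test-first style, here is a minimal sketch of what such an exercise can look like. It is not taken from the workshop material; gc_content and its tests are hypothetical examples.

```python
def gc_content(sequence):
    """Fraction of G and C bases in a DNA sequence (hypothetical example)."""
    if not sequence:
        raise ValueError("empty sequence")
    sequence = sequence.upper()
    return sum(base in "GC" for base in sequence) / len(sequence)


# In TDD the tests below are written first; they fail until gc_content
# is implemented well enough to satisfy them.
def test_half_gc():
    assert gc_content("ATGC") == 0.5


def test_lowercase_is_accepted():
    assert gc_content("ggcc") == 1.0


def test_empty_sequence_is_rejected():
    try:
        gc_content("")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an empty sequence")


# A test runner such as pytest would discover these; here they are called directly.
for test in (test_half_gc, test_lowercase_is_accepted, test_empty_sequence_is_rejected):
    test()
print("all tests passed")
```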
While the infamous installation problem is still an issue, students managed to follow the lessons. They hit the typical Python installation issues, most of which were solved by a proper installation of Miniconda.
The SWC installation tests mostly distracted students, since packages not used in the workshop were flagged as uninstalled/failed (e.g. EasyMercurial). In general I perceived that students were getting overwhelmed by too much information from the SWC default guides and stopped following and reading the instructions early.
We need more TL;DR’s in software and data carpentry. Perhaps starting by the workshop template.
Day 2: BioData, Jupyter Notebooks, Pandas and Machine Learning
The morning is dedicated to introducing students to Pandas dataframe operations with Ethan White’s python-ecology dataset. Due to time constraints the merging and concatenation of dataframes is not covered, only pointed out in the lesson. Now the students have enough knowledge to follow up on Oxana’s Gene Expression dataset:
It comes with exercises for those students willing to earn Swedish university credits. After some glitches with Python 2 vs Python 3 Jupyter notebooks, students get to know how to analyze data from the FANTOM5 consortium.
After getting some expression heatmaps and good insight from Oxana, Ahmed KachKach, currently interning at Spotify AB machine learning division, delights the audience with a detailed analysis of a toy dataset on breast cancer by using an extremely well documented introduction to machine learning notebook.
In order to explain PCA graphically to students, Ahmed uses an excellent web visualization to illustrate how variable decomposition/projection works in PCA.
Right after that machine learning introduction, I show how one can enact reproducible (and interactive!) notebooks via the mybinder.org service by exploring a small scikit-allel dataset. Furthermore, more visualization techniques are shown via my current explorations of HivePlots as an alternative way of visualizing structural genomic variations in cancer samples.
On top of that, I had a talk prepared about structural variations processed with bcbio, but in the interest of time, I saved it for another event :)
Last but not least, Mikael Huss goes through a fantastic notebook showing some gene expression prediction techniques and clever feature engineering from his current efforts at WABI.
Thoughts and comments
Planned ahead of time, this workshop was a sustained effort to bring instructors and people together, and I am glad it worked.
A surprising early realization from this workshop is how high the demand for these courses can be: only a few minutes after announcing the event, we had around 40 individuals interested and signing up. The numbers changed over time due to cancellations, but we managed to run the workshop with 35+ participants.
Regarding attendance, thanks to link shorteners in our announcement emails and on Twitter, we could track the “funnel” of students, from those who showed interest all the way down to those who were committed to actually show up and complete the courses.
Actual feedback from students
In our post-assessment polls we got an average rating of 8 out of 10 on “General satisfaction with the workshop”. Here are some selected comments:
I learned a lot of things these two days and the workshop really made me more motivated to use pandas next day in the office. :D
Overall it was a very nicely arranged and well prepared workshop. My only suggestion would be to simplify a bit the exercises for day 2 (perhaps by introducing some intermediary steps between two problems). Thanks for arranging such a nice workshop.
great event, I’ll recommend it further, should be on regular (annual?) basis.
And also some things to look after in the future:
I enjoyed particularly the first day, particularly the list of challenges/exercises that looked quite overwhelming at first but turned out to be manageable. Also very much appreciated: collection of ideas and questions on etherpad, post-its to request help. It would have been even better with little stricter time management.
such as clearly stating the (minimum) requirements to attend day 2 (intermediate/advanced):
My Python knowlegde was not high enough to follow. exercise.py was nice to learn Python, but didn’t help to learn testing process (I was stuck with the exercises). Other exercise had too difficult instructions. The python introduction was at a very basic level but the tasks were at intermediate or above level. This needs adjustment.
or again, not putting too much material in one day, no matter how exciting it sounds at first while preparing the lessons:
The first day was great (11/10). Intro to Python was too basic for me, but I understand it was necessary for some participants of the workshop. Intro to Git, test-driven development etc. was very well performed and I learned a lot. Second day was pretty good (7/10), but too hurried. I feel that there were too many things squeezed into the schedule. The visualization lab had a good premise, but also suffered from too little time.
HDF5 and Spark do not play well with each other…
… so what if we just use HDF5 for sharing (import/export) and Spark for the rest? That’s what Jeremy, Cyrille and I wondered while sitting in the Janelia Farm labs… well, actually in its bar with our laptops, but now you have the context ;)
I am attending a neuro meeting at the fantastic Janelia Farm facilities to see how experts in electrophysiology and computer science, among other fields, decide on a common format to express recordings of neuronal activity and the surrounding experimental metadata.
The mandate and outline of NWB has a clear mission, timeline and particular steps:
- August 2014: Project Start.
- Phase 1: Identify use cases and evaluation criteria.
- Phase 2: Select/assemble most promising approaches and develop data format and test it.
- Phase 3: Test and fine-tune it.
- July 2015: Project ends.
Now, brace for impact. Here’s a small list of common e-phys file formats that were created by different labs:
For a more nuanced view of some of the main data formats, please have a look at the considerations for developing a standard for storing e-phys data in HDF5 and the NWB data and file formats summary.
Wouldn’t it be a massive win to choose a single data format and not fall into the traditional academic mantra that states: “different formats are good for different things”? Or even worse, create yet another competing standard?
How many of those formats are actually used in research publications? Which is the one seeing the most adoption in academic literature so far?
Why shouldn’t we just choose the top N that hold the most mindshare for the greater good (reproducibility, data sharing, interoperability)?
Let’s see if we can find a fix for this e-phys Babel.
Several labs describe and present their custom ephys formats. Most of them have a fairly large overlap on attributes, structure and features. With varying specifications, the labs seem to revolve around HDF5, a hierarchical file format that stores all the attributes of the experiments, from images to timeseries, in varying degrees of complexity.
It is interesting to see how, being an event-processing problem at its core, there are very few mentions of industry and opensource event processing frameworks:
Software developers in the room recommend exposing a strongly typed API that deals with the raw data attributes via an intermediate representation, instead of having to change the HDF5 container at every specification change or experimental novelty. This idea resonates quite well with the NEO format approach. An additional problem that arises with internal representations is keeping track of provenance, since encapsulation might hide processing details that are needed to follow an experiment step by step.
It seems to me that e-phys recording can be approached as a large scale logging problem, therefore:
- Using a framework that aggregates events at scale is crucial to guarantee a smooth and fast data analysis experience. That is, including slicing by data recording sessions or any other criteria that the data (neuro)scientist decides.
- Leaving the internal (intermediate) representation of data in point 1 untouched is the most convenient approach, especially when HDF5 does not play well with modern parallel frameworks.
- Exporting data from point 1 as HDF5 for sharing, given that it is the most popular container within this science niche, seems reasonable (to me, at least).
- Writing importers/exporters (serializers) from Thunder to HDF5 seems like an interesting Hackathon challenge. Adopting KWIK, already used by many, as a particular specification could be interesting w.r.t interoperability.
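The points above can be sketched in a few lines of plain Python: a typed intermediate representation, slicing by session with the step recorded as provenance, and a separate export step. Every class, field and function name here is hypothetical. A real exporter would write the resulting hierarchy into HDF5 groups and datasets rather than return a nested dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """One recorded sample; all field names are hypothetical."""
    session: str
    channel: int
    timestamp: float  # seconds since the start of the session
    value: float


@dataclass
class Recording:
    """Intermediate representation: typed events plus a provenance trail."""
    events: List[Event] = field(default_factory=list)
    provenance: List[str] = field(default_factory=list)

    def slice_by_session(self, session: str) -> "Recording":
        """Return a new Recording for one session, logging the step taken."""
        kept = [e for e in self.events if e.session == session]
        return Recording(kept, self.provenance + [f"slice_by_session({session!r})"])


Layout = Dict[str, Dict[str, List[Tuple[float, float]]]]


def to_export_layout(recording: Recording) -> Layout:
    """Group events as session -> channel -> [(timestamp, value)].

    Stand-in for an HDF5 exporter: the nesting mirrors groups and datasets.
    """
    layout: Layout = {}
    for e in sorted(recording.events, key=lambda e: e.timestamp):
        channels = layout.setdefault(e.session, {})
        channels.setdefault(f"channel_{e.channel}", []).append((e.timestamp, e.value))
    return layout


rec = Recording([
    Event("s1", 0, 0.002, -61.0),
    Event("s1", 0, 0.001, -65.0),
    Event("s2", 1, 0.001, -70.0),
])
s1 = rec.slice_by_session("s1")
print(s1.provenance)
print(to_export_layout(s1))
```

Keeping the export as a function over the intermediate representation means a change in the sharing format only touches the exporter, not the analysis code.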
Software carpentry and learning
I am going through the 11th iteration of software carpentry for instructors and I am quite happy about the way Greg Wilson conducts his bi-weekly calls and focuses on evidence-based learning techniques.
At first my expectations from this course were those of simply learning how to teach specific software carpentry lessons, that is, traditional master classes on software engineering, tools and accompanying cookbooks.
During the first session I quickly realized that his approach goes far beyond that.
Greg is attacking the very roots of the learning process in order to provide a solid teaching base, ultimately offering robust, research-based, (scientific) software literacy to the world.
How motivation works
So I told my students on the first day of class, “This is a very difficult course. You will need to work harder than you have ever worked in a course and still a third of you will not pass”
Which I heard a lot of times myself and somewhat learned to ignore during my university days. Perhaps unsurprisingly, those words were followed by unintended consequences:
But to my surprise, they slacked off even more than in previous semesters (…) their test performance was the worst it had been for many semesters.
Using several research papers as a foundation to understand those situations, How Learning Works concludes that:
Limited chances of passing may fuel preexisting negative perceptions about the course, compromise her students’ expectations for success, and undermine their motivation to do the work necessary to succeed.
Again, I’ve seen this happen time after time in different academic contexts over the years.
This week’s assignment
So in this week’s course session, Greg asked us to describe a personal experience during our education where we saw a similar situation.
Sometime during my high school years, Iñaki, my physics teacher said something like:
If you don’t put more effort on physics, I think you should consider fixing motorbikes on FP instead.
FP (Formación Profesional) is the Spanish educational route for those who want to skip university and pursue a more applied professional degree. FP is often mocked by Spanish society (some teachers too) and regarded as a “lower paid” and “less honorable” means of earning a living than going for more traditional academic degrees.
I believe Iñaki wanted me to succeed in physics and meant well, targeting my self-esteem as a way to push me harder. Looking back, I see that it was a bad move that effectively demotivated me. Although I did pass, I did not enjoy the subject as I should have, didn’t learn it as thoroughly and, therefore, didn’t earn higher scores.
I tend to obsess over topics I like. Curiosity keeps me awake at night; it’s like an unconscious autopilot that drives me towards higher understanding. As I discover and dig deeper into subjects I want to learn more about, I completely lose track of time. In my experience, frictionless, smooth learning almost invariably results in well-crystallized knowledge and high scores.
Later on, while doing my computational biology master’s degree in Sweden, I rediscovered (bio)physics while diving into the incredible world of ion channels and biomedicine in a very different learning environment.
I loved it.