Note: this is part 2 of a two-part series I wrote about the hackathon. The first article lives here.
A few weekends ago I went to my first hackathon, the Clean Energy Data Science Challenge organized by the State Department and Booz Allen Hamilton. It was hosted at the Galvanize campus in San Francisco. The hackathon asked participants to use data to map out where people in Myanmar don't have access to electricity and where the best places would be to develop minigrids, which are small local grids not connected to the main grid.
In this post, I'll discuss my impressions of the event and what I learned for future hackathons that can hopefully help some other first-timers out there. For more about how my team approached the problem and what our solution was, check out this post.

(A team photo from the hackathon shown above)
Expectations meet reality
I went to the hackathon because I'm passionate about clean energy and wanted to get a feel for what data science might look like in this industry. I was also eager to learn some techniques from more experienced data scientists. Finally, I wanted to see what a hackathon was like and hoped I could make some meaningful contribution.
It was different from what I expected. I definitely learned a lot about the energy situation in Myanmar and the potential for off-grid renewables. Unfortunately, I didn't learn much that was new about data science because there was a dearth of experienced data scientists at the event - or at least I didn't run into them. During the networking session on Friday night, when we were encouraged to form teams, I talked with almost a dozen participants, but none of them had professional experience in data science. It was only at the very end of the weekend, during the presentations, that I saw some people who clearly had professional data science experience, and it seemed that they had come in pre-formed teams. I think it was positive that the event encouraged people without data science experience to attend, particularly those who had domain knowledge but not technical skills, but the imbalance made it difficult for people like me who were hoping to learn from more experienced practitioners.
Forming a team
Having been unable to find more seasoned data scientists to form a team with, I shopped around Saturday morning in search of a team that at least seemed organized. I ended up joining a group of seven others who had already laid out some initial plans on a whiteboard, and they were happy to have someone with experience in data cleaning. It turned out that a little less than half the team was French, which was a great bonus for me since I had recently returned from working in Paris and was eager to practice my French. My teammates had diverse backgrounds: two graduate students in environmental planning, a couple of programmers and some project managers at a large energy company. I was the only one with professional experience in data science, which meant I handled a lot of the data cleaning and wrangling. It also meant that I had only myself and Google to turn to when questions came up, which was great practice in self-reliance but not so great for the learning experience I was hoping for.
Lessons learned
I was pretty amazed at what some people accomplished at the end of the hackathon. The winning group had focused on organizing and visualizing the data that was available in an interactive webapp that allowed you to see detailed data for each township. Only a few teams tried to make models, and those that did had a pretty hard time given the lack of labeled training data. I was a little disappointed by that, but it was still a good exercise and I was able to learn some lessons for my next hackathon:
Try to form a team ahead of time
Despite my attempts to network at the event to find a team, I wasn't able to find more experienced data scientists to learn from. Some of this may have had to do with the particular nature of this event, which was open to people of all backgrounds and had no entry fee, but next time I will try to do some networking online and begin to form a team ahead of time. My roommate, who has done hackathons in the past, said that people sometimes make groups on Facebook to find teams in advance, which is something I'll definitely look into. Sometimes events let you see who's attending in advance, and doing some LinkedIn networking based on the guest list could be valuable for future hackathons.
Get data into a database and plan how to combine data sources
The organizers of the event provided tons of data, but it was all in different formats and in different locations. While it wasn't really an option with my group, since most people didn't have any experience working with databases, I think it would have been much easier to select the data we were going to use and load it into a SQL or MongoDB database. That way we wouldn't have had to pass CSV files around from person to person, and we could have combined and aggregated data sources more easily. We should also have thought in advance about how the data each of us was working on would eventually be joined together, which might have saved time when we were trying to match different spellings of township names.
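To make this concrete, here is a minimal sketch of what I have in mind, not what my team actually built. It assumes two small sources that spell township names differently, cleans the names, fuzzy-matches them against one canonical list, and loads everything into an in-memory SQLite database so the final join is a single query. The helper names (normalize, match_township), the table layout, the 0.8 similarity cutoff and all the sample rows are invented for illustration; none of them come from the hackathon data.

```python
import difflib
import sqlite3


def normalize(name):
    """Lowercase a name and drop spaces and punctuation so variants compare cleanly."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_township(name, canonical, cutoff=0.8):
    """Return the closest canonical spelling, or None if nothing is similar enough."""
    lookup = {normalize(c): c for c in canonical}
    hits = difflib.get_close_matches(normalize(name), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


# Made-up sample rows standing in for two differently formatted sources.
population = [("Hlaingtharya", 1000), ("Myitkyina", 2000)]
electrified = [("Hlaing Thar Yar", 0.5), ("Myitkyina ", 0.25), ("Example Town", 0.75)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (township TEXT PRIMARY KEY, people INTEGER)")
conn.execute("CREATE TABLE electrification (township TEXT, share REAL)")
conn.executemany("INSERT INTO population VALUES (?, ?)", population)

# Treat the population table's spellings as the canonical ones.
canonical = [name for name, _ in population]
unmatched = []  # names to review by hand instead of guessing
for raw_name, share in electrified:
    matched = match_township(raw_name, canonical)
    if matched is None:
        unmatched.append(raw_name)
    else:
        conn.execute("INSERT INTO electrification VALUES (?, ?)", (matched, share))

# With one agreed spelling per township, combining sources is a plain join.
rows = conn.execute(
    "SELECT p.township, p.people, e.share "
    "FROM population p JOIN electrification e ON e.township = p.township "
    "ORDER BY p.township"
).fetchall()
print(rows)
print("Needs manual review:", unmatched)
```

Keeping the unmatched names in a separate list is deliberate: a fuzzy match that is silently wrong is worse than a row someone has to check by hand. The cutoff would need tuning against the real township lists.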
Focus on a smaller, more specific goal
My team decided to tackle the most difficult of the challenges presented at the hackathon and to give a comprehensive solution. This might be a good tactic if you have one of the stronger teams at the competition, with lots of experienced members, but I think we would have gained from narrowing our task a little and doing a more thorough job on a smaller goal. Our solution ended up looking similar to those of a lot of other teams. I think that if we had decided to focus on just one aspect of the challenge, say how to model demand across villages, we could have come up with something more useful and impressive at the end.
All in all I'm proud of what my team accomplished given our level of experience. It was the first hackathon for most of us. I'm looking forward to applying what I learned to a future hackathon.
