Carl Shan

Week 1: It’s not about the data, it’s about the problems you’re solving.

8 June 2014 - Chicago

This is the first in a series of posts chronicling my reflections on participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.

This summer I am participating in the 2014 Data Science for Social Good Fellowship at the University of Chicago.

The Data Science for Social Good Fellowship is a summer fellowship sponsored by the Eric and Wendy Schmidt Foundation. It selects a cohort of individuals typically with quantitative or social science backgrounds to work in teams to tackle problems brought to them by non-profits, governments and NGOs.

The inaugural program was conceived of and put together by former Chief Scientist for the Obama 2012 campaign, Rayid Ghani. DSSG launched in the summer of 2013, bringing 36 Fellows together to spend three months working in Chicago.

The 2014 Fellowship, which I participated in, expanded its class to 48 Fellows, drawing a diverse set of graduate students, recent college graduates and working professionals coming from both the hard and social sciences.

The 2014 Fellowship had a total of 12 projects sponsored by a varied set of partners. The projects ranged from detecting collusion to evaluating how best to allocate city funds.

I’ll be able to provide more background and context on my own project as the fellowship progresses.

My Background

How I ended up as a Fellow is a combination of my academic and personal experience.

I graduated with a bachelor’s in statistics from UC Berkeley in 2013 while also taking a smattering of coursework in computer and information science.

While developing my knowledge of statistical and computational techniques, I also developed a desire to do socially meaningful work.

During my first semester in college, I volunteered as a mentor to fifth graders from a local elementary school. What I encountered when I first began volunteering shocked me: I had attended a high-achieving math and science powerhouse high school with a 99% graduation rate; however, the students I mentored came from circumstances that made it immensely difficult for them to enjoy the same type of opportunities that had fallen into my lap as a student. Over 40% of the students in the school were on free or reduced lunch.

Throughout the course of my volunteering, it became more and more clear to me that the zip code you were born into could more or less determine whether you were able to live a full and rich life.

This led me to ponder the role of education as the mechanism for both increasing social mobility and providing life-altering opportunities.

As a result, through college I worked in a variety of different educational organizations, hoping to increase my exposure to the field such that I could eventually improve life outcomes for students who, through the lottery of birth, did not receive the same lucky breaks I did.

During this time, I recognized that my technical background provided a unique advantage in creating social change: many individuals drawn to the education or non-profit fields typically do not come from statistics or computer science. This fact, combined with the rising importance of data in augmenting human decisions, suggested to me that by mastering the techniques and tools of a data wrangler, I could perhaps increase my individual leverage in impacting the progress of human civilization.

This all sounds rather high-minded, lofty, and abstract. Writing it makes me feel mildly embarrassed in the same way that I imagine self-aware salespeople feel when they use phrases like “synergistic partnership” or “paradigm shifting” – the phrases are so ambiguous and overused that they become self-mocking caricatures rather than signifying meaningful ideas.

Yet it would be disingenuous of me to omit my motivations simply because they sound cheesy. It also seems imprudent to be unwilling to discuss beliefs that sound naive, yet are nevertheless ones I hold to be true.

Coming into the Chicago Fellowship, I hope to develop these skills in a hyper-applied context by working on the exact problems I would want to be working on even post-Fellowship.

The First Week

I landed in Chicago the weekend before the Fellowship began. Flying in, I was slightly apprehensive – I did not feel like I had received an adequate amount of communication regarding program logistics such as what type of project I was assigned to, who my team members were, and an outline of the summer.

In fact, I had only been emailed details regarding these things the day before my flight.

However, once I began my first day as a Fellow I was happily surprised at the level of thought and organization that had been put into the summer.

In the morning of the first day, all the Fellows wrote down different things they were hoping to get out of the summer on sticky notes and arranged them on an empty wall. Examples included swimming in Lake Michigan, learning how to use cloud computing, and sampling Chicago city cuisine.

One particularly eye-catching sticky read “Learn what the hell data science is!”

Values and Culture

Afterwards, the program staff explained to us the key values of the Fellowship, which I’ve listed below along with my own personal interpretation of each:

I was very happy to see these cultural tenets explicitly stated. These values lined up very much with my own.


On the first day, it was explained to us that while we were expected to work hard and produce useful output on our projects, it was not the goal of the program to actually build the most robust piece of software that solves our partners’ problem.

Rather, the goal of the program was focused more around (a) helping Fellows learn how to use various techniques and tools and (b) developing each Fellow’s personal interests working towards social good, open government and open science.

This surprised me, as one of my personal goals is to actually build and deploy a maintainable system for the partners I’m working with. Although I can see the sense in moderating my goal to being simply one of learning and growing, I’m reluctant to dial down my ambitions simply for the sake of feasibility.

Not only do I think that it’s reasonable to build out some usable and maintainable software or tool in the summer, I also personally think gunning hard for this goal would be the most productive way to learn.

The Fellows then got a chance to introduce themselves and intermingle. I learned that of the 48 Fellows, 33 were PhD or Masters students, 12 were recent undergraduates and 3 were working professionals.

Although the majority of Fellows seemed to come from quantitative or technical backgrounds, there were also Fellows with backgrounds in architecture, psychology and public health.

This was nice to see, as I subscribe to the belief that impact is produced through a combination of valuable skills and domain expertise. The Fellows with math, statistics or computer science backgrounds are the arsenal with which the social scientists could aim at the correct targets.

We also had a variety of social events, including a picnic near the beach, a kickoff party on the rooftop of the Civic Opera Building and a boat tour of the city of Chicago. I was very glad to see these social events planned; I believe that great work can only be done within a strong scaffold of tightly-formed relationships, and all of these activities gave me a glimpse into the personal lives and thoughts of each of the Fellows and mentors.

Overall, I ended the first week feeling fairly optimistic about the trajectory of the coming weeks.


The Good

Impact-oriented fellows: I’ve been involved in a variety of different internships and programs, and at many of them I’ve felt out of place. At these organizations, my co-workers were not interested in the same set of ideas I was hoping to discuss, and didn’t share many common values or worldviews.

I was very happy to see that this was not true with the Fellowship. Many of the Fellows I spoke with shared common passions for social issues, and were interested in discussing ideas ranging from how to best optimize for making social impact to how to accelerate one’s rate of growth.

Openness to growth and feedback: The program directors seemed very open and receptive to feedback, creating a Hackpad document where Fellows could contribute ideas for how to improve the experience as well as holding a formal feedback session. This contributed towards building a culture of transparency.

Collaborative work setting: The workspace is spacious, open (encouraging collaboration and conversation) and there’s a culture of casual collegiality between everyone.

The Bad

Direction-less events: There was a planned event that did not seem to directly serve a purpose: a series of non-profits that were not partners came in and gave lengthy talks.

I wasn’t sure as to whether these problems were ones I should be invested in helping solve, and with a single exception, the speakers didn’t share any useful mental models or suggestions for how to tackle our own projects.

I admit that this issue may stem from the fact that the purpose of this event was simply not communicated well to the Fellows, giving an impression of aimlessness.

The Ugly

No wifi: The wifi had not been installed when the Fellows arrived. As a result, we had to connect to DePaul University’s network (the building we worked in is owned by DePaul), causing traffic overload and nearly unusable connection speeds. This proved to be quite a bottleneck for teams looking to get set up quickly.

No data: A number of partners had not yet shared data with their respective teams by the end of the week. Legal agreements were still in review, and processes were still being navigated. This had also happened in the previous year, although I was told that it was much worse then. Even so, this is a pretty serious showstopper. After all, it’s hard to do data science without data.

Untargeted workshops: The workshops should have been more segmented based on background – some workshops had students coming in from both the high and low end of the experience spectrum. Workshops also ran overtime, and some Fellows were unable to attend a few of them because an instructor had to leave to catch a flight.

In the future, I think it would be useful to plan multiple workshops for Fellows of different experience levels, and to enforce time constraints.


It’s important to balance everything I’ve said against the fact that DSSG is still only in its second year of existence. Growing pains are still being worked out, and the level of thought and dedication that I’ve already seen from the Fellowship organizers assures me that my criticisms will not hold true next year.

I write posts about data science applied to social causes. If you want to be notified when my next reflection is published, subscribe by clicking here.