Data Science Project Scoping


Yesterday, my boss gave us a fun exercise: research approaches to scoping data science projects. I knew immediately that I’d love the exercise, since this is something I have been grappling with ever since I started working on data science projects.

Why project scoping?

When I started working at axialHealthcare, my first project was communicated to me verbally. As the data scientist, I then sketched out a plan for the project myself. Without a clear business ask, I found myself revising that sketch from time to time, and sometimes abandoning the project altogether when leadership was not satisfied with the preliminary results. At that point, I realized how important it is to scope out a project upfront with the business stakeholders.

Later, when I started leading data science projects, I fully embraced the idea of interviewing key stakeholders to sketch out the project scope. Then, throughout the implementation of the project, I kept revisiting the scope to make sure we were on track. Note that the point is not to confine ourselves rigidly to the scope; the scope is always up for revision if we feel it is necessary.

What did I find?

After some googling, here is a summary of what I found.

Necessary items to include in a project scope:

  1. Deliverables. What are the deliverables?

  2. Business use case. How would the deliverables inform current or future actions?

  3. Data. What data do we have and what data do we need?

  4. Resources. What resources do we have? E.g., reference table support, engineering support.

  5. Analysis plan.

  6. Project timeline.

  7. Validation. What are some metrics and procedures to validate the result?
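One way to keep these items in front of the team is to write the scope down as a small structured record and check which parts are still blank. The following is only a minimal sketch in Python: the class name `ProjectScope`, its field names, the split of "Data" into two fields, and the example values are all illustrative assumptions, not something taken from the references.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ProjectScope:
    """One field per scope item from the list above (names are illustrative)."""
    deliverables: List[str] = field(default_factory=list)
    business_use_case: str = ""
    data_available: List[str] = field(default_factory=list)  # data we have
    data_needed: List[str] = field(default_factory=list)     # data we still need
    resources: List[str] = field(default_factory=list)
    analysis_plan: str = ""
    timeline: str = ""
    validation: List[str] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """Return the names of scope items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical draft scope; the values are made up for illustration only.
draft = ProjectScope(
    deliverables=["summary report"],
    business_use_case="inform a decision the stakeholders already plan to make",
    data_available=["an existing internal table"],
)

# Lists the scope items that still need to be discussed with stakeholders.
print(draft.missing_items())
```

Revisiting the scope during the project then amounts to updating the record and rerunning the check, which fits the idea that the scope stays open to revision.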

“A well-scoped project ideally has a set of actions that the organization is taking that can now be better informed using data science.” https://dsapp.uchicago.edu/home/resources/data-science-project-scoping-guide/

Admittedly, there is no one-size-fits-all scope for data science projects, but I have found that having a scope to hold onto throughout the project keeps us from straying and wasting effort.

References I used

A good example of a data science project scope: https://canvas.harvard.edu/courses/20897/files/3539194/download?wrap=1

12 tips on defining the scope of big data projects: http://www.ingrammicroadvisor.com/data-center/12-tips-on-defining-the-scope-of-big-data-projects

Thinking with Data by Max Shron, Chapter 1. Scoping: Why Before How. https://www.oreilly.com/library/view/thinking-with-data/9781491949757/ch01.html

Data Science Project Scoping Guide: https://dsapp.uchicago.edu/home/resources/data-science-project-scoping-guide/

Advice on reducing scope: https://www.kdnuggets.com/2016/10/tunkelang-reduce-scope.html