Posted on: December 3, 2020 | 3 min read

Top Processes to Enact for Effective Data Science

With the strategy and talent in place, your data science team can begin the core work of generating statistical models that predict the future or prescribe the best possible action.

Organizing your team around the right processes can lead to:

  1. Increased likelihood of analytics adoption
  2. Organizational trust in model outputs
  3. Clear standards for easy model revisions

Use Case Management

Your ROI depends on tackling the use cases with the most impact. There should be a straightforward process for identifying and prioritizing data science use cases so that your data scientists can focus on high-value work.

Use cases may come from executives, operational SMEs, or data scientists themselves. The whole organization should know how to send ideas to the data science team for consideration, and the ideas should form a backlog of potential projects.

Tip: Assign a value to each use case your team develops and compare the level of effort to the benefits gained. It can be intimidating to select the single best use case to pursue, especially when multiple opinions are involved. A well-documented backlog pays off, though, especially after your first successful implementation.
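One simple way to compare effort with benefit is to rank the backlog by a value-to-effort ratio. The sketch below is only an illustration: the `UseCase` class, the `prioritize` function, and the figures in the sample backlog are hypothetical, and a real team would likely weigh other factors such as risk and data availability.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UseCase:
    name: str
    estimated_value: float   # expected benefit, in any consistent unit
    estimated_effort: float  # expected cost to deliver, in the same unit

    @property
    def value_to_effort(self) -> float:
        return self.estimated_value / self.estimated_effort


def prioritize(backlog: List[UseCase]) -> List[UseCase]:
    """Return the backlog sorted from highest to lowest value-to-effort ratio."""
    return sorted(backlog, key=lambda uc: uc.value_to_effort, reverse=True)


# Hypothetical backlog entries; the figures are placeholders, not real data.
backlog = [
    UseCase("churn prediction", estimated_value=500_000, estimated_effort=100_000),
    UseCase("demand forecasting", estimated_value=900_000, estimated_effort=300_000),
    UseCase("invoice anomaly detection", estimated_value=120_000, estimated_effort=20_000),
]

ranked = prioritize(backlog)
for uc in ranked:
    print(f"{uc.name}: {uc.value_to_effort:.1f}")
```

A ratio like this is a starting point for discussion rather than a final answer; it mainly makes the trade-offs visible when several stakeholders are involved.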

 

Model Development

You can use an industry-standard approach to drive model development, like Gartner's Analytic Process, CRISP-DM, or Microsoft's Team Data Science Process. For each model, the development and tuning process should be documented in detail. Documentation should include an analysis of inputs, the expected outputs, the different models considered, and a history of the design decisions and adjustments made throughout development.

(Image source: UpX Academy)

Building a predictive model usually requires subjective decisions. Ensuring that the data scientist makes sound assumptions and includes all relevant data is critical to creating a useful and accurate model. Each model should be independently reviewed to ensure the documentation standards are met and the model is well designed. Some industries, like banking, even have regulatory requirements governing the independent validation of models. If you have a large data science practice, you may dedicate a team to model validation. If your team is smaller, institute a peer review system so that all models are scrutinized before they are considered for business use.

Following the independent validation, models should be approved by a governing board that includes executive sponsors. An inventory of active and retired models should be maintained to reflect the board's decisions. Internal audit should regularly review the full validation and approval process to ensure it functions appropriately.
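A model inventory can start as a very small data structure. The following is a minimal sketch, assuming an in-memory registry; the `ModelRecord` and `ModelInventory` names, their fields, and the sample entries are all hypothetical, and a real inventory would usually live in a database or a model registry tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class ModelRecord:
    name: str
    owner: str
    approved_on: date
    status: str = "active"              # "active" or "retired"
    retired_on: Optional[date] = None


class ModelInventory:
    """Tracks which models the governing board has approved or retired."""

    def __init__(self) -> None:
        self._records: Dict[str, ModelRecord] = {}

    def approve(self, name: str, owner: str, on: date) -> None:
        # Record a board approval as a new active model.
        self._records[name] = ModelRecord(name=name, owner=owner, approved_on=on)

    def retire(self, name: str, on: date) -> None:
        # Keep the record for audit purposes, but mark it as retired.
        record = self._records[name]
        record.status = "retired"
        record.retired_on = on

    def active(self) -> List[str]:
        return sorted(r.name for r in self._records.values() if r.status == "active")


# Hypothetical entries for illustration only.
inventory = ModelInventory()
inventory.approve("churn-v1", owner="analytics", on=date(2020, 1, 15))
inventory.approve("forecast-v2", owner="operations", on=date(2020, 6, 1))
inventory.retire("churn-v1", on=date(2020, 11, 30))
print(inventory.active())
```

Retired models stay in the registry instead of being deleted, so an audit can still see when each decision was made.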

Model Integration

When the model is ready, users will want predictions to blend seamlessly into their current reporting and operations. Where possible, levels of confidence should be reported alongside predictions to help users evaluate the accuracy of the predictions they receive and to increase adoption.
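There are many ways to attach a level of confidence to a prediction. One simple option, sketched below, wraps a point prediction in an interval derived from the model's errors on a held-out data set. The `prediction_interval` function and its parameters are hypothetical names, and the approach assumes that future errors will resemble the held-out ones.

```python
import statistics
from typing import List, Tuple


def prediction_interval(
    point: float, holdout_residuals: List[float], level: float = 0.9
) -> Tuple[float, float]:
    """Wrap a point prediction in an interval based on held-out residuals.

    Residuals are (actual - predicted) on data the model was not trained on.
    """
    # Tail size on each side, rounded to the nearest whole percentile.
    lower_pct = round((1 - level) / 2 * 100)
    if not 1 <= lower_pct <= 49:
        raise ValueError("level must map to a percentile between 1 and 49, e.g. 0.8 or 0.9")
    upper_pct = 100 - lower_pct

    # With n=100, quantiles() returns the 1st through 99th percentile cut points.
    cuts = statistics.quantiles(holdout_residuals, n=100, method="inclusive")
    return point + cuts[lower_pct - 1], point + cuts[upper_pct - 1]


# Hypothetical residuals and prediction, for illustration only.
residuals = [float(r) for r in range(-50, 51)]
low, high = prediction_interval(200.0, residuals, level=0.9)
print(f"Prediction: 200.0 (90% interval: {low:.1f} to {high:.1f})")
```

Reporting the range next to the number gives users a direct sense of how much to rely on each prediction.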

Model Monitoring & Maintenance

Models may degrade over time as new influencing factors come into play or as the underlying data shifts. They need to be reviewed periodically to confirm that they retain their predictive power. In addition to scheduled reviews, thresholds may be established to trigger an alert if the model inputs or outputs are abnormal. As machine learning becomes more common, well-built models can continually self-adjust over time. However, these models should still be subject to monitoring to ensure they continue to provide value. The review board should retire models that no longer serve their purpose and record those changes in the model inventory.
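As one example of a threshold-based alert, the sketch below compares the recent distribution of a model input (or output) against a baseline using the Population Stability Index (PSI), a drift measure commonly used in banking. This is one possible technique, not a prescribed one; the function names, the equal-width binning, and the 0.25 threshold are assumptions to adjust for each model.

```python
import math
from typing import List, Sequence

ALERT_THRESHOLD = 0.25  # rule-of-thumb value; an assumption to tune per model


def population_stability_index(
    baseline: Sequence[float], recent: Sequence[float], bins: int = 10
) -> float:
    """Compare two samples of one variable using equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when the baseline is constant

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    expected = proportions(baseline)
    actual = proportions(recent)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def needs_review(
    baseline: Sequence[float], recent: Sequence[float], threshold: float = ALERT_THRESHOLD
) -> bool:
    """Return True when the shift between samples exceeds the alert threshold."""
    return population_stability_index(baseline, recent) > threshold


# Hypothetical data: a baseline sample and a shifted copy of it.
baseline_sample = [i / 100 for i in range(1000)]
shifted_sample = [v + 5 for v in baseline_sample]
print(needs_review(baseline_sample, baseline_sample))
print(needs_review(baseline_sample, shifted_sample))
```

A check like this can run on a schedule between formal reviews, so that the review board hears about abnormal inputs or outputs before the next periodic assessment.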

"In many ways, the data science process is a process of trial and error. Sometimes it is a manual effort, and sometimes tools like AutoML can make some of the trial and error for you. It is an iterative process of testing new features, new algorithms, new parameters resulting in the best model for your data and for your business needs." – Alex Hagen, Senior Data Science Consultant, CCG

Written by CCG, an organization in Tampa, Florida, that helps companies become more insights-driven, solve complex challenges and accelerate growth through industry-specific data and analytics solutions.

Topic(s): Data Science