I have put together a list of data science projects I have worked on and their business impacts. This highlights the depth and breadth of my consulting experience. It may also be a useful supplement to my resume and a reference for discussions.
List of projects with synopses and key topics.
Boston Consulting Group
- Lighthouse: BCG feature store
- Development of a scalable, high-frequency data and feature store for BCG case teams
- XFN teaming, leadership, workstream development
- Payroll contract pricing
- Assess contract pricing for new and existing customers to determine churn impact and reduce pricing cycles
- XFN teaming, pricing, data management, infra, ML, client training, pilot
- Wholesaler pricing
- Promo effect attribution and optimization for over $1 billion annual spend
- promotions, pricing, big data, ML, pilot
- BCG risk and best practice ring-fence
- Stakeholder engagement to assess project risk and ensure high quality deliverables
- infra, workstream development
- Survey data predictive value
- Demonstrate survey data value as an employment rate leading indicator
- ML, data management
- COVID employment recovery research (3 weeks)
- Estimate employment rate recovery at COVID onset in April 2020
- Consumer goods - Order replenishment (3 weeks)
- Analyze historical purchase behavior to provide customer recommendations for item replenishment
- High end fashion retailer
- $52 million personalization pilot to experiment on many products and sales offers.
- personalization, product analytics, big data, client training, pilot
Evolve Research
- Chatbot for survey engagement
- Development of survey chatbot to improve respondent engagement and research insight to support new business development
- ML, NLP, infra, leadership, workstream development
- Text analytics capability
- Development of text classification and sentiment scoring capability to support new business development
- ML, NLP, infra, workstream development
- Government infrastructure end-user research
- Measuring end-user satisfaction with a large-scale infrastructure project affecting all Australians
- ML, infra
Lighthouse: BCG feature store
Development of a scalable, high-frequency data and feature store for BCG case teams.
Lighthouse empowers BCG case teams to access curated, third-party data sources. Several case teams have used the data to great effect. I contributed to the goals of the project through XFN teaming, leadership, and workstream development.
Payroll contract pricing
Analysis of customer contract pricing to determine churn impact and reduce pricing cycles.
We helped a payroll provider assess pricing to maximize profit and reduce churn. We analyzed which factors are most important in determining contract pricing. This insight resulted in fewer pricing cycles, speeding up the contract process. Our results also indicated that customers were not very price sensitive, which gave the client confidence in approaching customer pricing negotiations and improved profitability. I contributed to this project by:
- Providing a robust analytical pricing recommendation, using ML and explainable AI principles. This included searching over 2000 possible features to determine the most important 15.
- Working with the team to integrate the pricing and churn models, showing the impact of price changes on churn rate.
- Leading the data management, coding, and infrastructure solution for the project. This enabled us to deliver a quality codebase with version control, configurability, and documentation.
- XFN teaming
- Worked with several cross-functional (XFN) partners, including strategy-focused clients, BCG consultants and partners, technical data science clients, and the DS team and DS lead.
- Sought to understand XFN perspectives and motivations. We needed to prove the data science process to non-technical stakeholders, so we were transparent about the setup process and about when initial results would be ready. We supported their use cases by designing a configurable codebase for fast analytics turnaround.
- Pricing
- It was difficult to get a true picture of customer pricing and profitability due to variability in charge rates, bonus payments, and pay periods.
- We created a robust data engineering pipeline to ingest, clean, and analyze pricing over time.
- Data management
- We received over 50 different data sources in a variety of formats from the client, covering areas such as payroll, insurance, and customer information.
- I implemented a data manifest process to standardize knowledge capture. This helped the team dig into different areas and provide useful summaries. The standard captured ownership, ingestion method, a data dictionary, and each source's use in models.
- Infra
- We needed to work on the client's VM due to sensitive data, without the ability to transfer any code or data.
- I set up a standard workflow to support best-practice software development, including codebase version control, environment management, and Python packaging. This afforded us repeatability in our analysis and ease of deployment to the client.
- ML
- We needed to analyze all available client data while providing an explainable model.
- I proposed a three-step approach: fit a random forest model to find the predictive ceiling and most relevant features; use the top features in an interpretable linear regression model; and tweak features, such as applying log transformations, to reduce the predictive gap.
- Client training
- The client was not used to working with Python applications, and knowledge transfer was needed for the data preparation and modeling approach.
- We held weekly update and collaboration sessions with the client to discuss results. Toward project delivery, we held daily training sessions with the client's data science team, preparing Jupyter notebook tutorials to guide them through a familiar interface.
- Pilot
- We prepared project analysis outputs to support an upcoming pilot. We needed to prepare the client strategy and DS team to launch the pilot.
- We prepared an Excel simulator with configurable controls, giving the strategy team insightful tools to guide pricing decisions. As mentioned under client training, we provided deep knowledge transfer to the team.
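The three-step modeling approach described above can be sketched as follows. This is a minimal illustration on synthetic data, not the client model: the feature names, counts, and thresholds are placeholders.

```python
# Sketch of the approach: a random forest sets the predictive ceiling
# and ranks features; the top features then feed an interpretable
# linear regression. All data here is synthetic and illustrative.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame(rng.normal(size=(n, 20)), columns=[f"f{i}" for i in range(20)])
y = 3 * X["f0"] - 2 * X["f1"] + rng.normal(scale=0.5, size=n)  # synthetic price target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: random forest gives the predictive ceiling and a feature ranking.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
ceiling = rf.score(X_test, y_test)
top = list(X.columns[np.argsort(rf.feature_importances_)[::-1][:5]])

# Step 2: interpretable linear model restricted to the top features.
lr = LinearRegression().fit(X_train[top], y_train)

# Step 3: compare against the ceiling; feature tweaks (e.g. log
# transformations) would aim to close any remaining gap.
gap = ceiling - lr.score(X_test[top], y_test)
print(f"RF ceiling R^2={ceiling:.2f}, top features={top}, gap={gap:.2f}")
```

On real pricing data the gap narrows as transformed features recover the non-linearities the forest exploited.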
Wholesaler pricing
Promo effect attribution and optimization for over $1 billion annual spend.
We helped a B2B distribution company increase sales volume. We created models to attribute promotion performance and expected secondary effects, allowing us to identify the best and worst performing promotions. This guided decision making toward effective promotions and maximizing sales volume. I contributed to this project across the promotions, pricing, big data, ML, and pilot workstreams.
BCG risk and best practice ring-fence
Stakeholder engagement to assess project risk and ensure high quality deliverables.
The team's goal was to ensure that all active projects had completed a risk assessment, so that every project was delivered to quality standards with no risk to BCG. During my time on this team, I helped course correct a few projects across infrastructure and workstream development.
Survey data predictive value
Demonstrate survey data value as an employment rate leading indicator.
The aim of this project was to work with a political polling data vendor to assess and recommend additional revenue streams for their data. We successfully demonstrated a valuable use case (unemployment forecasting), culminating in ongoing use of the data for BCG clients. I contributed to the project through ML modeling and data management.
High end fashion retailer
$52 million personalization pilot to experiment on many products and sales offers.
The overall goal of the project was to prove the effectiveness of personalized sales offers in increasing revenue via percentage sales lift. Overall, the project was a great success, and we showed significant lift above the client's BAU (about 25%). I contributed across personalization, product analytics, big data, client training, and the pilot.
Chatbot for survey engagement
Development of survey chatbot to improve respondent engagement and research insight to support new business development.
ML, NLP, infra, leadership, workstream development
- Goal
- Build a market research survey chatbot to increase respondent engagement and feedback quality while holding guardrail metrics steady, highlighting the company as a key market research innovator and a competitive advantage.
- Impact
- We created a full chatbot solution with success and guardrail metrics. Quality of information increased, as measured by the explainability of overall satisfaction and the quantity of text feedback, while survey completion and unsubscribe rates held steady.
- The chatbot solution used NLP-based models (keywords, sentiment, classification), state machines, and a Docker-based API to serve chatbot requests to users.
- Challenges
- Identifying metrics and quantifying improvement was hard
- Technology challenges early in my career
- Skills and learnings
- Learned API development and state machine frameworks
- Worked with junior analysts to create effective keyword-based hierarchies
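The state machine and keyword routing described above can be sketched roughly as below. The states, keywords, and prompts are hypothetical stand-ins, not the production design.

```python
# Minimal sketch of a keyword-driven survey chatbot state machine.
# Each state holds a prompt and keyword routes to the next state;
# unmatched input falls through to a default. Illustrative only.

STATES = {
    "ask_satisfaction": {
        "prompt": "How satisfied were you with the service?",
        "routes": {"bad": "probe_negative", "good": "thank_you"},
        "default": "thank_you",
    },
    "probe_negative": {
        "prompt": "Sorry to hear that. What went wrong?",
        "routes": {},
        "default": "thank_you",
    },
    "thank_you": {"prompt": "Thanks for your feedback!", "routes": {}, "default": None},
}

def step(state: str, user_text: str) -> str:
    """Route to the next state by keyword match, else the default."""
    node = STATES[state]
    text = user_text.lower()
    for keyword, next_state in node["routes"].items():
        if keyword in text:
            return next_state
    return node["default"]

state = "ask_satisfaction"
state = step(state, "It was pretty bad, honestly")  # routes to probe_negative
print(STATES[state]["prompt"])
```

In production the transition logic would sit behind an API (the project used a Docker-based service), with NLP models rather than bare keyword matching driving the routing.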
Text analytics capability
Development of text classification and sentiment scoring capability to support new business development.
ML, NLP, infra, workstream development
- Goal
- Open-ended text responses in market research surveys are a rich source of insights, but they require slow, expensive, manual review. I designed a text classification and sentiment scoring process to compile insights quickly, accurately, and at scale.
- Impact
- We successfully piloted the text classification process with a client, enabling them to save $30k per year on text analytics costs.
- We sold several projects based on this capability to other clients, with new revenue of $120k in the first year and $300k in the second.
- We provided scalable sentiment scoring analysis on all projects as a client value-add.
- Challenges
- Technical modeling challenges with small categories, complex feedback, and unstructured data
- Developing a process that was transparent about accuracy, trade-offs, and value
- Skills and learnings
- Learned strong Python and machine learning fundamentals and advanced text analytics techniques
- Worked with the business owner to productize and sell the innovation to clients
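A text classification pipeline of the kind described above can be sketched as follows. The responses, labels, and model choice here are synthetic illustrations, not the capability's actual data or architecture.

```python
# Sketch of survey text classification: TF-IDF features feeding a
# logistic regression classifier to tag open-ended responses by theme.
# Training examples and labels below are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

responses = [
    "the staff were friendly and helpful",
    "great service, very helpful team",
    "delivery was late and the box was damaged",
    "item arrived broken, shipping took weeks",
]
labels = ["service", "service", "delivery", "delivery"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(responses, labels)

prediction = clf.predict(["helpful and friendly support"])[0]
print(prediction)
```

At production scale the same pipeline shape applies; the work sits in building reliable label hierarchies, handling small categories, and being transparent about accuracy trade-offs.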
Government infrastructure end-user research
Measuring end-user satisfaction with a large-scale infrastructure project affecting all Australians.
ML, infra, PMO
- Goal
- Assist the client in understanding customer satisfaction with their service and the root causes of issues across segments
- Impact
- We surveyed millions of Australian consumers on their infrastructure satisfaction at several points in the lifecycle, including installation and 3 and 6 months into usage.
- Used a robust data science approach to estimate the root causes of customer satisfaction across several segments
- Challenges
- Handling a large amount of data and complex weighting requirements
- Creating automated data ingestion and analysis processes to support early-week reporting
- Creating infrastructure to handle hundreds of linear regressions and random forest based feature importance
- Skills and learnings
- Python and pandas
- Ability to break down data and analytical problems into executable code
- Project management
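The per-segment driver analysis above (many linear regressions plus random forest feature importance) can be sketched like this. The segments, drivers, and data are synthetic placeholders; the real analysis ran hundreds of such models over survey data.

```python
# Sketch of per-segment satisfaction driver analysis: for each segment,
# fit a linear regression for interpretable coefficients and a random
# forest to rank drivers by importance. Data is synthetic.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
drivers = ["install_time", "speed", "reliability", "support"]
df = pd.DataFrame(rng.normal(size=(600, 4)), columns=drivers)
df["segment"] = rng.choice(["metro", "regional", "rural"], size=600)
df["satisfaction"] = (
    2 * df["reliability"] + df["support"] + rng.normal(scale=0.3, size=600)
)

results = {}
for segment, grp in df.groupby("segment"):
    X, y = grp[drivers], grp["satisfaction"]
    lr = LinearRegression().fit(X, y)  # interpretable driver coefficients
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    results[segment] = {
        "coefs": dict(zip(drivers, lr.coef_.round(2))),
        "top_driver": drivers[int(np.argmax(rf.feature_importances_))],
    }

for segment, res in results.items():
    print(segment, res["top_driver"], res["coefs"])
```

Looping a model-fitting function over `groupby` segments like this is what made it practical to automate hundreds of regressions for weekly reporting.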