When Data Science meets business and legal requirements.

Customer retention is a key business challenge, especially when making tough decisions like price increases. One company raised its subscription prices every year but lacked insight into how those increases affected customer loyalty and which price point would maximize profits. To address this challenge, they enlisted our team of predictive modeling experts.

This is never as simple as splitting a dataset into training and test sets and fitting a model. Legal restrictions prevented us from using a conventional churn model to target customers for price increases. Instead, we had to work with non-discriminatory customer characteristics only.

Approach

We began this project with a precise problem statement: How can we determine the optimal price increase for a client's subscription while minimizing the risk of churn?

A zero-risk assumption is clearly flawed: customers are price-sensitive, particularly in today's inflationary environment. We started by analyzing the impact of previous price increases. For this event study, we needed to define a clear time window. After consulting with the client, we agreed to examine customer churn within 4 months following each price increase, focusing specifically on Cancellation Requests (CR) submitted during a 2-month period.
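As a rough illustration of the event-study labeling, here is a minimal sketch using only the standard library. The function name `is_churn_event`, the 60-day approximation of the 2-month period, and the choice to start the window on the day of the price increase are all assumptions made for illustration; the article does not specify how the 2-month CR period sits inside the 4-month observation window.

```python
from datetime import date, timedelta

# Assumption: "2 months" is approximated as 60 days, counted from the
# day of the price increase. The real project may have defined it differently.
CR_WINDOW_DAYS = 60

def is_churn_event(increase_date, cr_date, window_days=CR_WINDOW_DAYS):
    """Return True if a cancellation request (CR) falls inside the window
    that opens on the price-increase date."""
    if cr_date is None:  # the customer never submitted a CR
        return False
    window_end = increase_date + timedelta(days=window_days)
    return increase_date <= cr_date <= window_end

# Hypothetical customers around a price increase on 1 January 2023
increase = date(2023, 1, 1)
print(is_churn_event(increase, date(2023, 2, 15)))  # CR inside the window
print(is_churn_event(increase, date(2023, 6, 1)))   # CR too late to count
print(is_churn_event(increase, None))               # no CR at all
```

Keeping the window length as a parameter makes it easy to check how sensitive the churn labels are to the window that was agreed with the client.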

Time window view

After completing this task, we built a comprehensive dataset containing diverse characteristics drawn from the historical and current behavioral patterns of the client's customers. This dataset captured the cumulative effects of previous price increases and included external variables like the inflation rate. The final dataset was notably unbalanced, with a churn rate of only 1.5%.

Predictions?

Based on the data analysis from the previous phase, we developed two separate predictive models, one for private customers and one for businesses, because the analysis showed that these groups behaved in distinctly different ways. For both segments, we chose LightGBM (LGBM) regressors for feature selection and predictions. Among the many available algorithms, we selected LGBM for two key reasons:

The dataset is unbalanced.
The data is transactional.

Rather than delve into the model selection details, let's focus on the prediction results. They were neither exceptional nor poor—simply adequate. This performance wasn't concerning since we couldn't legally use a predictive model anyway. What truly mattered was the feature selection.

Probability of churn, distributions for various price increases.

To determine sensitivity classes, we analyzed the variability in predicted churn probability between 0% and 10% price increases, simulating churn probabilities for increases from 1% to 10%. As shown in the figure below, we identified the low-risk group (highlighted in pink). These customers demonstrated strong loyalty: even with a 10% price increase (our maximum tested value), they maintained a churn probability below 55%. Based on client-specified thresholds, we identified three distinct groups:

Low sensitivity: 47% of customers / 0.7% churn rate
Medium sensitivity: 32% of customers / 1% churn rate
High sensitivity: 21% of customers / 2.4% churn rate

The high-sensitivity group is roughly 3.5 times as likely to churn as the low-sensitivity group (2.4% versus 0.7%), a notable distinction given our unbalanced dataset.
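The simulation step can be sketched as follows. Everything here is hypothetical: `toy_predict` stands in for the trained model, and the cut-offs `low_cut` and `high_cut` are placeholders for the client-specified thresholds, which the article does not disclose.

```python
def churn_spread(predict, customer, increases=range(0, 11)):
    """Spread of predicted churn probability over simulated
    price increases (expressed in percent, 0% to 10%)."""
    probs = [predict(customer, pct) for pct in increases]
    return max(probs) - min(probs)

def sensitivity_class(spread, low_cut=0.01, high_cut=0.03):
    """Map a spread to a class. The cut-offs are illustrative placeholders."""
    if spread < low_cut:
        return "low"
    if spread < high_cut:
        return "medium"
    return "high"

def toy_predict(customer, pct):
    """Stand-in for a trained model: churn rises linearly with the increase."""
    return min(1.0, customer["base"] + customer["slope"] * pct)

# Hypothetical customers with different reactions to price
customers = {
    "loyal": {"base": 0.005, "slope": 0.0005},
    "average": {"base": 0.01, "slope": 0.002},
    "fragile": {"base": 0.01, "slope": 0.005},
}
for name, customer in customers.items():
    spread = churn_spread(toy_predict, customer)
    print(name, sensitivity_class(spread))
```

Classifying on the spread rather than on the absolute probability separates customers who react strongly to price from customers who are likely to churn regardless of price.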

Selecting low-risk customers.

Finding a way: Multi-information Criteria

You may have noticed a tension: we trained a predictive algorithm, yet we can't legally use its predictions to determine price increases. Here's our solution: we ranked all our constructed features by how well they distinguish between the low-, medium-, and high-risk groups.

For example, let's look at how the feature 'Number of uses last month' relates to these three customer classes:

Number of uses the month before the price increase, and sensitivity to price increase.

We applied the Multi-information Criteria to find the best combination of features that would capture data variance while eliminating redundancy. For practical business purposes, we selected ten features and combined them in groups of three, creating 120 possible combinations.
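The combinatorics are easy to verify, and the ranking idea can be sketched in a few lines. Note the substitution: the article does not define its Multi-information Criteria, so this sketch uses plain mutual information between the joint value of a feature triple and the sensitivity class as a stand-in. The feature names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def rank_triples(features, labels):
    """Score every 3-feature combination by how much its joint value
    reveals about the class label, best first."""
    scores = {}
    for triple in combinations(sorted(features), 3):
        joint = list(zip(*(features[name] for name in triple)))
        scores[triple] = mutual_information(joint, labels)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Ten features taken three at a time give 120 candidate combinations
print(len(list(combinations(range(10), 3))))

# Hypothetical discretized features for six customers
labels = ["low", "low", "medium", "medium", "high", "high"]
features = {
    "uses_last_month": [9, 9, 4, 4, 0, 0],
    "tenure_band": [2, 2, 2, 1, 1, 1],
    "plan_type": [0, 1, 0, 1, 0, 1],
    "region": [1, 1, 1, 1, 1, 1],
}
best_triple, best_score = rank_triples(features, labels)[0]
print(best_triple, round(best_score, 3))
```

With so few observations the joint estimate is biased upward, so a real ranking would need enough customers per cell, or a correction, before the scores could be trusted.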

Solution & Results

We implemented the three-feature combination that best distinguished between Low, Medium, and High-risk groups. For instance, customers with zero usage consistently fell into the high-risk category. This approach enabled us to create groups based purely on user characteristics and analyze their historical churn rates. Each group also showed distinct price sensitivity levels, which we measured through linear regression.
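The per-group price sensitivity can be illustrated with an ordinary least-squares fit of churn rate against price increase. The observations in `history` are invented for the example and are not the project's data; the slope is read as the change in churn rate, in percentage points, per additional percent of price increase.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history per group: (price increase in %, churn rate in %)
history = {
    "low": [(2, 0.5), (4, 0.6), (6, 0.7)],
    "high": [(2, 1.6), (4, 2.0), (6, 2.4)],
}
for group, points in history.items():
    increases, churn_rates = zip(*points)
    slope, intercept = fit_line(increases, churn_rates)
    print(f"{group}: slope={slope:.3f}, intercept={intercept:.3f}")
```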

Our final solution empowers the business team to select three features from a pool of ten to form sensitivity groups. The tool generates detailed reports for each group, displaying projected churn rates and recommended price increases. It calculates both the churn risk and expected profit increase from price adjustments, then determines the overall expected payoff by comparing these factors. We delivered to our client a detailed list of recommended price increases, backed by historical data showing their impact on customer retention. We also provided guidance on the optimal combination of user characteristics for effective group segmentation.
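One way to picture the payoff comparison is a single-period revenue model. This is a simplified sketch under stated assumptions, not the delivered tool: it assumes churn grows linearly with the price increase, ignores costs and customer lifetime value, and uses hypothetical names (`expected_payoff`, `best_increase`) and numbers throughout.

```python
def expected_payoff(n_customers, price, increase_pct, base_churn, slope):
    """Expected revenue change for one period versus keeping the price.
    base_churn is a fraction; slope is the extra churn fraction per
    percent of price increase (an assumed linear relationship)."""
    churn_raise = min(1.0, base_churn + slope * increase_pct)
    revenue_keep = n_customers * price * (1 - base_churn)
    revenue_raise = (n_customers * price * (1 + increase_pct / 100)
                     * (1 - churn_raise))
    return revenue_raise - revenue_keep

def best_increase(n_customers, price, base_churn, slope,
                  candidates=range(0, 11)):
    """Candidate increase (in %) with the highest expected payoff."""
    return max(candidates,
               key=lambda pct: expected_payoff(n_customers, price, pct,
                                               base_churn, slope))

# Hypothetical groups: an insensitive one and a very sensitive one
print(best_increase(1000, 20.0, base_churn=0.007, slope=0.0))
print(best_increase(1000, 20.0, base_churn=0.01, slope=0.05))
```

In this toy setup the insensitive group supports the largest tested increase, while for the very sensitive group the lost subscriptions outweigh the extra revenue and the best choice is no increase at all.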

A final word

Though straightforward from a business perspective, this project presented unique challenges. While technical aspects like data analysis and predictive modeling are our forte, the real challenge lay in navigating legal constraints. Our team had to distill complex customer behavior into just three distinctive features—all while maintaining utmost accuracy. It was unfamiliar territory for us, but that's exactly what made it exciting!

The Churnover Chronicles: how price hikes turn customers into butterflies
