Net Promoter Score (NPS) Calculation Using Python

Python is widely used for analyzing and visualizing data such as survey results and other records. When we draw conclusions from survey data, we often rely on an average: we compute the mean of all responses and reason around that value. An average hides how the responses are distributed, though, so other metrics are also used. One such metric is the Net Promoter Score (NPS), which summarizes the distribution of ratings more clearly than a plain average. In this article, we calculate and chart NPS using Python.

What is Net Promoter Score?

The traditional approach summarizes ratings with an average. For the NP score, we instead collect ratings on a scale of 0 to 10 and divide the respondents into three buckets based on their score.

The top two scores, 9 and 10, identify Promoters; 7 and 8 identify Passives; and scores of 6 or lower identify Detractors. To understand this more clearly, think of a product rating. If users rate the product 9 or 10, they are satisfied with it and may promote it.

However, if they rate it as 7 or 8, they are just okay with the product, and they neither promote it nor degrade it. If they rate it as 6 or lower, they are unsatisfied with the product and likely to speak badly about it.

Once we have the ratings on this scale, we can calculate the NPS using the following formula:

NPS = (number of Promoters - number of Detractors) / (total number of responses) × 100

The result of the above equation ranges from -100 to 100. The lower the number, the worse the result. Now that you understand the concept, let's see it on some sample data.
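As a quick illustration of the formula, here is a minimal sketch in plain Python. The function name net_promoter_score and the sample ratings are made up for this example; they are not part of the dataset built later in the article.

```python
def net_promoter_score(scores):
    """Return the NPS on the -100..100 scale for an iterable of 0-10 ratings."""
    scores = list(scores)
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)    # ratings of 9 or 10
    detractors = sum(1 for s in scores if s <= 6)   # ratings of 0 through 6
    # Passives (7 and 8) only count toward the total number of responses.
    return 100 * (promoters - detractors) / len(scores)

# Hypothetical ratings: 5 promoters, 2 passives, 3 detractors
sample = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
print(net_promoter_score(sample))  # (5 - 3) / 10 * 100 = 20.0
```

Note that passives lower the score indirectly: they add to the denominator without adding to the numerator.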

Creating Sample Data

The dataset we will use is created with the following properties:

The dataset contains scores ranging from 0 to 10.
Most of the scores are 9 and 10.
Scores of 7 and 8 are the second most common, and scores of 6 or lower are the least common.
These scores are distributed among random countries and traveler types, such as business or leisure. This will help us analyze some trends.

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

df1 = pd.DataFrame(np.random.randint(9,11,size=(1000, 1)), columns=['How likely are you to reccomend the product?']) #promoters
df2 = pd.DataFrame(np.random.randint(7,9,size=(400, 1)), columns=['How likely are you to reccomend the product?']) #passives
df3 = pd.DataFrame(np.random.randint(0,7,size=(100, 1)), columns=['How likely are you to reccomend the product?']) #detractors

df = pd.concat([df1,df2,df3], ignore_index=True)

df['Country Number'] = np.random.randint(1, 6, df.shape[0]) #assigning a random number (1-5) that is later mapped to a country
df['Traveler Type Number'] = np.random.randint(1, 3, df.shape[0]) #assigning a random number (1-2) that is later mapped to a traveler type

#Function to assign a country name
def country_name(x):
    if x['Country Number'] == 1:
        return 'United States'
    elif x['Country Number'] == 2:
        return 'Canada'
    elif x['Country Number'] == 3:
        return 'Mexico'
    elif x['Country Number'] == 4:
        return 'France'
    elif x['Country Number'] == 5:
        return 'Spain'
    else:
        pass

#Function to assign a traveler type
def traveler_type(x):
    if x['Traveler Type Number'] == 1:
        return 'Business'
    elif x['Traveler Type Number'] == 2:
        return 'Leisure'
    else:
        pass

#apply the function to the numbered columns
df['Country'] = df.apply(country_name, axis=1)
df['Traveler Type'] = df.apply(traveler_type, axis=1)

df[['How likely are you to reccomend the product?', 'Country', 'Traveler Type']] #view to remove the random number columns for country and traveler type

In the above code, we first imported the required libraries. After that, we created a dataframe of 1000 rows containing the integers 9 and 10. Then, we created another dataframe of 400 rows containing the integers 7 and 8.

After that, we created a third dataframe of 100 rows containing integers from 0 to 6. Once we are done creating the dataframes, we merge them into one dataframe using the concat() function. Then we created two columns named "Country Number" and "Traveler Type Number" and assigned a random integer to each row in them. Then we created two functions named country_name() and traveler_type() to convert those numbers into their corresponding names. In the end, we displayed the dataframe with only the desired columns.
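The number-to-name conversion can also be written more compactly with a dictionary and Series.map(). The sketch below is an alternative to the row-wise functions, not the article's method; the names demo and country_names are chosen for illustration, and a small standalone dataframe is used so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Small standalone frame with random country numbers from 1 to 5
rng = np.random.default_rng(seed=0)
demo = pd.DataFrame({'Country Number': rng.integers(1, 6, size=8)})

# Lookup table from number to country name (same pairs as country_name())
country_names = {
    1: 'United States',
    2: 'Canada',
    3: 'Mexico',
    4: 'France',
    5: 'Spain',
}

# map() looks up every value in the dictionary; unknown numbers become NaN
demo['Country'] = demo['Country Number'].map(country_names)
print(demo)
```

Because map() works on the whole column at once, it avoids calling a Python function per row the way df.apply(..., axis=1) does.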

Melting Dataframe

melted_df = pd.melt(frame = df, id_vars = ['Country','Traveler Type'], value_vars = ['How likely are you to reccomend the product?'],value_name='Score', var_name = 'Question' )

melted_df = melted_df.dropna()

melted_df['Score'] = pd.to_numeric(melted_df['Score'])
melted_df

After building the dataset, we reshape the dataframe to make it easier to analyze. The melt() function moves the values of the survey question column into a "Score" column and stores the question text in a "Question" column for each row. Then we drop rows with missing values using the dropna() method and convert the scores to numeric values with to_numeric().
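To see what melt() does in isolation, here is a small sketch on a two-row dataframe. The frame wide and its values are invented for this example only.

```python
import pandas as pd

# Toy "wide" frame: one column per survey question
wide = pd.DataFrame({
    'Country': ['Canada', 'Spain'],
    'Traveler Type': ['Business', 'Leisure'],
    'How likely are you to recommend the product?': [9, 6],
})

# Reshape so each row holds one (question, score) pair
long = pd.melt(
    frame=wide,
    id_vars=['Country', 'Traveler Type'],
    value_vars=['How likely are you to recommend the product?'],
    var_name='Question',
    value_name='Score',
)
print(long)
```

With a single question the row count stays the same, but the layout now extends to surveys with several questions: each extra question column would add one more row per respondent.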

Categorizing Score

def nps_bucket(x):
    if x > 8:
        bucket = 'promoter'
    elif x > 6:
        bucket = 'passive'
    elif x >= 0:
        bucket = 'detractor'
    else:
        bucket = 'no score'
    return bucket

melted_df['nps_bucket'] = melted_df['Score'].apply(nps_bucket)

melted_df

Once the dataset is ready, it's time to categorize each score as promoter, passive, or detractor. For that, we created a function named nps_bucket() and applied it to the "Score" column, storing the result in a new "nps_bucket" column.
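The same bucketing can be sketched without a custom function by using pandas.cut(), which assigns each value to a labelled interval. This is an alternative shown for comparison; the bin edges below are chosen to match the 0-6, 7-8, and 9-10 ranges, and the sample scores are made up.

```python
import pandas as pd

scores = pd.Series([0, 6, 7, 8, 9, 10])

# Intervals are right-inclusive: (-1, 6], (6, 8], (8, 10]
buckets = pd.cut(
    scores,
    bins=[-1, 6, 8, 10],
    labels=['detractor', 'passive', 'promoter'],
)
print(buckets.tolist())
```

One difference from nps_bucket(): values outside 0 to 10 become NaN here instead of receiving the 'no score' label.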

Calculating Net Promoter Score (NPS) in Python

Once we are done with categorizing the data, we will calculate the NPS for each combination of country and traveler type.

grouped_df = melted_df.groupby(['Country','Traveler Type','Question'])['nps_bucket'].apply(lambda x: (x.str.contains('promoter').sum() - x.str.contains('detractor').sum()) / (x.str.contains('promoter').sum() + x.str.contains('passive').sum() + x.str.contains('detractor').sum())).reset_index()

grouped_df_sorted = grouped_df.sort_values(by='nps_bucket', ascending=True)
grouped_df_sorted

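We use the groupby() function, which groups the data based on 'Country', 'Traveler Type', and 'Question'. Then we calculate the NPS for each group as the number of promoters minus the number of detractors, divided by the total number of scored responses. The result is stored as a fraction between -1 and 1 in the column that is still named 'nps_bucket'; it is multiplied by 100 later, when the chart labels are drawn.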
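An equivalent way to think about the per-group calculation is to give each promoter a weight of +1, each passive 0, and each detractor -1, and then take the mean per group. The sketch below illustrates this on a tiny invented dataframe; the names responses and weights are assumptions for the example, and it groups by country only to keep the output short.

```python
import pandas as pd

# Tiny invented sample: four responses per country
responses = pd.DataFrame({
    'Country': ['Canada'] * 4 + ['Spain'] * 4,
    'nps_bucket': ['promoter', 'promoter', 'passive', 'detractor',
                   'promoter', 'detractor', 'detractor', 'passive'],
})

# +1 for promoters, 0 for passives, -1 for detractors
weights = {'promoter': 1, 'passive': 0, 'detractor': -1}

# Mean weight per group equals (promoters - detractors) / total
nps = responses['nps_bucket'].map(weights).groupby(responses['Country']).mean() * 100
print(nps)
```

Buckets missing from the weights dictionary, such as 'no score', map to NaN and are skipped by mean(), so they do not count toward the total.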

Plotting the Data

It is time to chart the data using the seaborn and matplotlib libraries.

sns.set_style("whitegrid")
sns.set_context("poster", font_scale = 1)
f, ax = plt.subplots(figsize=(15,7))

sns.barplot(data = grouped_df_sorted, x = 'nps_bucket',y='Country',hue='Traveler Type',ax=ax)
ax.set(ylabel='',xlabel='', title = 'NPS Score by Country and Traveler Type')
ax.set_xlim(0,1)
ax.xaxis.set_major_formatter(plt.NullFormatter())
ax.legend()

#data labels
for p in ax.patches:
    ax.annotate("{:.0f}".format(p.get_width()*100),
                (p.get_width(), p.get_y()),
                va='center', 
                xytext=(-35, -18), #offset in points so that the labels sit inside the bars
                textcoords='offset points', 
                color = 'white')
    
plt.tight_layout()
plt.savefig('NPS by Country.png')
plt.show()

Figure: NPS score by country and traveler type

In the above chart, each bar shows the net promoter score for one combination of country and traveler type. The chart tells us how likely travelers in each group are to recommend the product to people they know.

Python NPS in Django

The django-nps package for Django works similarly to the approach described above. However, we do not need to write this much code to use it. We use the net_promoter_score() function to calculate the score. Let's see it.

Installation:

pip install django-nps

 UserScore.objects.filter(timestamp__month=12).net_promoter_score()

Conclusion

In this article, we learned how to calculate the Net Promoter Score for a dataset. We also saw how the scores are distributed across groups and how to chart them.