Sprocket Central Pty,Ltd-Data Analytics Using Tableau.

Thaker Namrata
4 min readJul 10, 2020

This is the demonstration of my work during the virtual internship at KPMG provided by Insidesherpa.

Sprocket has approached KPMG’s Lighthouse and Innovation Team, and as a consultant, I help the client ensure that their data is ready for analysis, as well as giving them a structure for the analysis process and finally, presenting our findings and recommendations.

Sprocket Central Pty Ltd , a medium size bikes and cycling accessories organisation, needs help with its customer and transactions data. The organisation has a large datasets relating to its customers, but their team is unsure how to effectively analyse it to help optimize its marketing strategy.

Three datasets are given:

1.Customer Demographic

2.Customer Addresses

3.Transactions data in the past 3 months

These internship comprises of three modules:

1.Data Quality Assessment

2.Data Insights

3.Data Insights and Presentation

1) Data quality Assessment:

Value Inconsistency

Inconsistent values such as use of ‘M’ or ‘F’ for gender instead of ‘Male’ or ‘Female’. To mitigate this issue, there can be a setup that automatically inputs the complete word when the initials are entered. In addition, the method of data collection for this attribute can be a multiple choice-type.

Unknown or blank values in gender, occupation and first and last names. This could be an issue when looking up a customer by name because of ambiguity. To mitigate this issue, there is a need to ask the customer more information such as industry or occupation to identify them.

Error in data input such as birth year of 1843, which falls outside reasonable value; misspelling of ‘Female’ in a particular input. These can be solved by doing a allowable value check and error checks.

Transactions Dataset

Approved transactions with missing values for product line, product class, brand and product size with product id of 0. If these are proven not to be actual transactions, they may be removed from the dataset.

Missing values for online order attribute.

Inconsistent product ID number with similar product brand, line, class, size. There is a need to figure what product ID stands for, whether it varies for certain traits such as color to know whether product ID should remain or be corrected accordingly.

Customer Address

Inconsistent values entered for the ‘state’ attribute such as ‘NSW’ and ‘New South Wales’ or ‘VIC’ and ‘Victoria’. Can be solved by setting up automatic input/ correction. Data collection can be done in multiple choices to select to ensure consistency.

Questionable house/ lot number of ‘0’ in many addresses. Setup of allowable value check.

Questionable large house/ lot number in many addresses such as ‘93405’ or ‘02663’. Use of automatic address generator with validity check to mitigate this issue.

2) Data Insights:

2.1)Data Exploration:

Age Distribution:

q1: Under 25; q2: 25 to 48; q3: 48 to 59; q4: Over 59

Here we can see that our new customers are majorly from the age group of 25–48 years.

Their is a very big drop of customers from the age of group 48–59 years.

Well the percentage of people under 25 have not changed.

2.2)Model Development:

Bike purchases in three years:

Percentage of new customers are mostly females.

So we should advertise more towards female customers.

2.3)Interpretation:

Job industry category:

Our new customers are mostly from manufacturing and finance sector.

We should try to advertise in such a way so that other sector may relate it and get attracted.

Car purchases:

NSW car should be consider since the number of customer do not own car is more than that own.

Qld is own more so we need to find the flaws in that and should implement in our model so that customer gets attracted to the bike features more.

3)Presentation through Dashboard:

Hope you have found this work useful and informative.

--

--