Customer Analytics In Large Organisations Assignment Sample


In this report, a three-year dataset is analyzed with the R programming language in order to determine sales and predict regular customers. Several analyses are carried out and the data is visualized. The data covers three years, 2013, 2014, and 2015, with a single dataset provided for each day: in total there are 1073 CSV files, which contain the details of the customers, their orders, and the quantities they purchased from the supermarket. For the analysis, three CSV files for the same date were taken. Data visualization is used to meet the research criteria and the requirements of the research question, and the transactional information for these three years is analyzed to develop suitable results.

Dataset description

In this research, customer analysis is carried out on transactional datasets from a supermarket. The data provides date-wise transaction information for three years, with one dataset for each day. From this large collection, nine datasets, three for each year, were chosen for the entire analysis. As a first step, the first three days of data were taken from the collection. Holiday details are missing from the datasets. All processing was done in the R language, where linear regression is performed to meet the requirements (Kumari and Yadav, 2018).

The datasets cover the supermarket's transactions for 2013, 2014, and 2015. The dataset for 2 January 2013 is described here in detail. It records the sales date, time, receipt number, customer number, and other transactional details, together with total sales and total records, and it indicates whether each commodity carries an offer. Barcode details are given for each product. Every daily dataset contains the same information about transactions, customers, and products of the supermarket.

The datasets contain both quantitative and qualitative values, that is, numerical and string information. Not all of the details are in a proper form and some information is imperfect, so the data must be preprocessed in R and converted into a suitable form before the analysis can be run. The analysis shows that most commodities do not carry any offer and only very few do. The 2014 dataset, which records offers by type of commodity, shows the same pattern.

In this research, the total number of purchases and the total money spent on each purchase are calculated. Regression analysis is carried out on the dataset to develop a suitable result (Maulud and Abdulazeez, 2020). Clustering analysis and neural network analysis were also executed. All of this work was done on the RStudio platform.
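The per-purchase totals mentioned above can be sketched in base R as follows. This is a minimal illustration, not the code used in the report: the data frame is a made-up toy, and the column names `receipt_no`, `quantity`, and `sale_amount` are assumed stand-ins for the real CSV headers. One receipt is treated as one purchase.

```r
# Toy transactions; values and column names are hypothetical.
transactions <- data.frame(
  receipt_no  = c("R1", "R1", "R2", "R3", "R3", "R3"),
  quantity    = c(2, 1, 5, 1, 1, 3),
  sale_amount = c(4.50, 2.00, 12.50, 1.20, 3.30, 9.00),
  stringsAsFactors = FALSE
)

# Sum the quantity and the money spent for each receipt.
per_purchase <- aggregate(
  cbind(quantity, sale_amount) ~ receipt_no,
  data = transactions,
  FUN = sum
)

# The number of distinct receipts is the total number of purchases.
n_purchases <- nrow(per_purchase)
print(per_purchase)
```

Grouping by customer number instead of receipt number would give totals per customer, if that is the unit of interest.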


Discussion of the Research Question Supported by Required Numerical Outputs

To meet the requirements, linear regression is performed using all of the selected datasets, and an accuracy score is evaluated (Liu et al., 2019). The research questions are answered through data visualization of the supermarket datasets for all three years, and regression analysis is carried out on the target column offer based on the value “total-sale-amount-inclusive GST”. The resulting model can be used to calculate higher profit rates for the organization.

Tables and Data Visualizations

Data visualization transforms data into a more understandable format and helps identify outliers, trends, and patterns among a set of data attributes. In this process, different charts, graphs, and mapped data were generated to describe the patterns in these large data attributes. The visualizations were produced after merging all the data into one data frame. Various packages were imported to carry out the work.

The above figure describes the merging process of the datasets. Datasets for three different years are provided, each holding one day of transactional details. From this large collection, nine datasets were chosen for regression analysis. The figure also shows the datasets being read in the RStudio platform.

The above figure describes the data concatenation, where all three large datasets are merged into a single dataset. Visualization is then carried out on this merged dataset to provide a comparative analysis of the data attributes.
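The reading and concatenation steps described above might look like the following base R sketch. It writes three tiny demonstration files to a temporary folder so that it runs on its own. The file names, the folder name, and the columns `sale_date`, `receipt_no`, and `quantity` are assumptions for illustration, not the real supermarket files.

```r
# Create three small demo CSV files (hypothetical names and columns).
tmp_dir <- file.path(tempdir(), "daily_sales_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (day in c("2013-01-02", "2014-01-02", "2015-01-02")) {
  demo <- data.frame(sale_date = day,
                     receipt_no = c("R1", "R2"),
                     quantity = c(1, 3))
  write.csv(demo, file.path(tmp_dir, paste0("sales_", day, ".csv")),
            row.names = FALSE)
}

# Read every matching CSV and stack the rows into one data frame.
csv_files <- list.files(tmp_dir, pattern = "^sales_.*\\.csv$", full.names = TRUE)
daily <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)
merged <- do.call(rbind, daily)

# Derive the year from the date so the three years can be compared.
merged$year <- as.integer(substr(merged$sale_date, 1, 4))
print(table(merged$year))
```

`rbind` requires every file to have the same columns, which matches the report's statement that each daily dataset contains the same information.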

Data preprocessing is an important step in this analysis. It consists of dropping null values and converting the data to numerical form using R code. The "is.null" command is used to check for null values in the datasets.
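A minimal sketch of this preprocessing is given below, on made-up data with assumed column names. One caveat: in base R, `is.null()` tests whether an object is `NULL`, while `is.na()` flags missing entries inside a column. The sketch uses `is.na()` and `na.omit()` on the assumption that missing cells are what the check is meant to find.

```r
# Toy raw data as it might arrive from a CSV: numbers stored as text,
# with one NA and one empty string (all values are hypothetical).
raw <- data.frame(
  receipt_no  = c("R1", "R2", "R3", "R4"),
  quantity    = c("2", "3", NA, "1"),
  sale_amount = c("4.50", "", "7.25", "1.00"),
  stringsAsFactors = FALSE
)

# Treat empty strings as missing, then convert the text columns to numbers.
raw$sale_amount[raw$sale_amount == ""] <- NA
raw$quantity <- as.numeric(raw$quantity)
raw$sale_amount <- as.numeric(raw$sale_amount)

# Count missing values per column, then drop every incomplete row.
missing_per_column <- colSums(is.na(raw))
clean <- na.omit(raw)
print(missing_per_column)
print(clean)
```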

The above figure shows the merged data, with the values listed under their respective attributes after concatenation.

The above figure shows the conversion of categorical values from strings to numerical values, which produces better visualization at the time of prediction. The offer column in the large dataset was transformed in this way.
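The string-to-number conversion of the offer column could be done as in the sketch below. The labels "Yes" and "No" are assumptions, since the report does not state how the offer column is coded, and `offer_flag` and `offer_code` are hypothetical names.

```r
# Toy offer column with assumed "Yes"/"No" labels.
sales <- data.frame(
  offer = c("No", "Yes", "No", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Option 1: a 0/1 indicator, convenient as a regression variable.
sales$offer_flag <- ifelse(sales$offer == "Yes", 1L, 0L)

# Option 2: integer codes from a factor with an explicit level order.
sales$offer_code <- as.integer(factor(sales$offer, levels = c("No", "Yes")))

print(table(sales$offer_flag))
```

Fixing the level order explicitly keeps the codes stable even if one label happens to be absent from a particular daily file.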

The above figure displays the histogram generated from the 2014 transactional dataset.

This histogram is based on the three transactional datasets. It is a customer-based histogram developed using R code.
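A histogram of this kind can be drawn with base R's `hist()`. The sketch below uses made-up spend values per customer. The variable name, the bin width, and the axis labels are illustrative assumptions, not taken from the report's figures.

```r
# Hypothetical total spend per customer.
customer_spend <- c(5, 12, 18, 22, 25, 31, 38, 44, 47, 52)

# Draw the histogram with bins of width 10 and keep the bin counts.
h <- hist(
  customer_spend,
  breaks = seq(0, 60, by = 10),
  main = "Spend per customer (toy data)",
  xlab = "Total spend",
  col = "grey"
)

# The returned object holds the counts that the bars display.
print(h$counts)
```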

The above figure shows the total sale amount across all three years, based on the three transactional datasets.

This scatter plot was obtained from the data using the offer and the quantity.

The above figure shows the scatter plot for which all the data for the year 2014 is taken into consideration and evaluated.

The above figure shows the mean, median, and maximum values generated from these datasets. These values are used to develop the later results.
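The summary values named above can be computed directly in base R. The sketch uses a short made-up vector, and `sale_amount` is an assumed name for the sales column.

```r
# Hypothetical sale amounts.
sale_amount <- c(3.5, 8.0, 12.0, 4.5, 22.0)

# Collect the three summary values in one named vector.
stats <- c(
  mean   = mean(sale_amount),
  median = median(sale_amount),
  max    = max(sale_amount)
)
print(stats)
```

For a whole data frame, `summary()` reports the same quantities for every numeric column at once.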

In this figure, the actual price is compared with the predicted price.

The above figure shows the linear regression executed on the RStudio platform using the transactional dataset.

The above figure shows the linear regression graph, in which a red regression line shows how the prediction changes. The graph maps the relationship between the two variables.
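A linear regression with a red fitted line can be sketched with `lm()` as below. The data is made up, and `offer_flag` and `sale_amount` are assumed names. The direction of the model is also an assumption: the report names the offer column as the target, while this sketch predicts the sale amount from the offer flag so that actual and predicted amounts can be compared. Swapping the two sides of the formula reverses the direction.

```r
# Toy data: a 0/1 offer indicator and a sale amount (hypothetical values).
sales <- data.frame(
  offer_flag  = c(0, 0, 0, 1, 1, 1, 0, 1),
  sale_amount = c(10, 12, 11, 15, 17, 16, 9, 18)
)

# Fit a simple linear regression of sale amount on the offer flag.
fit <- lm(sale_amount ~ offer_flag, data = sales)

# Compare actual and predicted values and summarise the error.
sales$predicted <- predict(fit, newdata = sales)
rmse <- sqrt(mean((sales$sale_amount - sales$predicted)^2))
r_squared <- summary(fit)$r.squared

# Scatter plot with the fitted regression line in red.
plot(sales$offer_flag, sales$sale_amount,
     xlab = "Offer flag", ylab = "Sale amount")
abline(fit, col = "red")

print(coef(fit))
```

With a 0/1 predictor, the intercept is the mean amount without an offer and the slope is the difference between the two group means.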


Conclusion

In this project, a market analysis is carried out on the data of a large organization, and all of the data is evaluated in the R programming language. The regression between the offer and the total sales amount is mapped through a graphical representation. Trends, outliers, and patterns are identified from the data visualizations. The data for 2013, 2014, and 2015 is processed and visualized, and regression analysis is performed to obtain the accuracy and predictions for the customer rate of the supermarket.


References

Maulud, D. and Abdulazeez, A.M., 2020. A review on linear regression comprehensive in machine learning. Journal of Applied Science and Technology Trends, 1(4), pp.140-147.

Kumari, K. and Yadav, S., 2018. Linear regression analysis study. Journal of the Practice of Cardiovascular Sciences, 4(1), p.33.

Liu, S., Lu, M., Li, H. and Zuo, Y., 2019. Prediction of gene expression patterns with a generalized linear regression model. Frontiers in Genetics, 10, p.120.
