Annalisa Occhipinti Page | 1
Big Data and Business Intelligence
Dr Annalisa Occhipinti
Business Intelligence Solution and Report
15th January 2021
Deadline Time: 4:00pm
Library Support for Academic Skills
Did you know you can book an individual 30-minute tutorial in the Learning Hub with an adviser to help you with
your academic skills, writing or numeracy? Or that there are lots of really useful workshops available to help you
with your studies and assessments? Have a look at the [email protected] workshops for more details.
FULL DETAILS OF THE ASSIGNMENT ARE ATTACHED
INCLUDING MARKING & GRADING CRITERIA
Central Assignments Office (Middlesbrough Tower M2.08) Notes:
• All work (including CDs etc) needs to be secured in a plastic envelope or a folder and clearly marked with the
student name, number and module title.
• An Assignment Front Sheet should be fully completed before the work is submitted.
• When an extension has been granted, a fully completed and signed Extension form must be submitted to the
Online Submission Notes:
• Please follow carefully the instructions given on the Assignment Specification
• When an extension has been granted, a fully completed and signed Extension form must be submitted to the
“Business Intelligence (BI) is the use of computing technologies, applications, and practices for the
collection, integration, analysis, and presentation of business information. Business Intelligence
solutions provide current, historical, and predictive views of internally structured data for products
and departments by establishing more effective decision-making and strategic operational insights.”
For this In-Course Assessment (ICA) you are required to design and implement a Business
Intelligence (BI) Solution from an industry-based dataset using Microsoft Power BI.
You will submit a final written Report* to present your BI design and solution.
*Report: The term "report" is used to denote a Word document, but you may utilise PowerPoint to
present part of the design and implementation of your BI Solution with supporting artefacts.
You are required to submit your ICA to the submission link on Blackboard by the due date.
ICA Requirements & Logistics
This is an individual assessment, and you are required to produce a Business
Intelligence Report covering a BI Design and Solution for an industry-based dataset.
The report should consist mainly of images or screenshots, with wording kept to bullet
points or brief summaries. The report should be about 1200 words and include the
following two sections:
Section 1: Business Intelligence Design
Students will be required to identify the Business Intelligence Scope and outline the BI
Questions and BI Data Source Description. In addition, the student should demonstrate
the ability to pre-process data, in terms of both data cleansing and data modelling,
and to prepare a dataset for data analytics and data visualisation.
1a: BI Questions and BI Data Source Description
1b: BI Data Pre-Processing and Data Cleansing
1c: BI Data Modelling via Star Schema
This will assess the student’s ability to demonstrate data integration and analysis skills
required to support business intelligence solutions using Power BI (assesses Learning Outcomes
1, 2 and 3).
Section 1 should include annotated screenshots of the BI design, data cleansing and the
data modelling of the dataset in Power BI.
The annotations should be bullet points or a basic instruction set that allows your
BI design, data cleansing and data modelling to be replicated in Power BI.
You should aim to have a cleansed dataset loaded within MS Power BI so you can
undertake the development of a BI analytical and visualisation solution as outlined in
Section 2.
Section 2: Business Intelligence Solution Report
This section of the report will be used to present the completed data analytics and data
visualisation phases of the Power BI solution. Students will have to report the
implementation of a Business Intelligence solution for the chosen dataset that addresses
the business questions raised in Section 1.
This will assess the student’s ability to demonstrate data analytics, visualisation and
reporting skills required to support any Business Intelligence solution (assesses Learning
Outcomes 1, 3, 4, 5, 6 and 7).
Ongoing review meetings with individual students will provide the opportunity for
formative feedback as they develop their solutions to support Business Intelligence.
ICA Section 1 – Business Intelligence Design
Section 1 of the written report must include a description of the dataset selected. This includes the
data source (provided as a link), database name, table names and column names. A screenshot of the
data in CSV or Excel format must also be included.
Your report must also cover the rationale behind the choice of your dataset. For example, the
reasons why you selected that specific dataset: What is the main focus of your BI project?
Which specific features are you going to focus on? Will this dataset help you in developing
specific business skills? These questions will define the main direction of your BI project.
Provide the Business Intelligence Scope for your dataset. Identify the Data Source, Data Descriptions
and the BI requirements. In addition, demonstrate the ability to load the dataset into Power BI and
undertake data pre-processing in terms of data cleansing and data modelling.
Essentially, you are demonstrating the ability to prepare a dataset for BI data analytics and data
visualisation as required in section 2.
You may choose a Case Study/Dataset from the list at the end of this document or choose your
own dataset from the industry, research or government community. More information about
this will be given in lessons.
Section 1: Business Intelligence Design – Section details
This section of the report should address the following:
1a: BI Data Source Description and BI Requirements
Data Source: What are the sources of the data, or where does the data originate from? Provide a
description of the dataset selected. This includes the data source (provided as a link),
database name, table names and column names. A screenshot of the data in CSV or Excel
format must also be included.
BI Requirements: The BI Questions will help determine the Key Performance Indicators (KPIs) for
analytics and visuals. This is intended as a high-level scope of the initial discovery in order to
present an understanding of the following:
• What are the sources for the data or where does the data originate from?
• General descriptors of the data.
• What business processes are most critical to measure – KPI?
• What kinds of business questions/problems are you trying to answer/solve?
• Who are the key user groups of this data/report?
• Why is this information needed? (Think about the broader process)
1b: BI Data Pre-Processing or Data Cleansing
• A description of the data pre-processing steps. This will include any steps performed
to cleanse your data, such as removing NAs, renaming columns, changing data types,
removing errors, removing columns, merging tables etc.
• You should include screenshots of your Power BI project in your presentation to
illustrate the effect of each pre-processing step.
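For the ICA these cleansing steps are carried out in Power BI (Power Query). Purely as an illustration of what each step does to the data, the following Python sketch applies some of them (removing columns, renaming columns, removing NAs, changing data types and removing errors) to a small made-up table. The column names, the values and the `cleanse` helper are hypothetical and are not part of the assignment.

```python
# Illustrative only: in the ICA these steps are performed in Power BI,
# not in Python. All column names and values below are made up.
raw_rows = [
    {"Cust Name": "Alice", "order_total": "120.50", "Notes": "n/a"},
    {"Cust Name": "Bob", "order_total": "NA", "Notes": ""},
    {"Cust Name": "Carol", "order_total": "error", "Notes": "vip"},
    {"Cust Name": "Dave", "order_total": "80", "Notes": ""},
]

RENAMES = {"Cust Name": "CustomerName", "order_total": "OrderTotal"}
DROP_COLUMNS = {"Notes"}
MISSING = {"", "NA", "N/A", "null"}


def cleanse(rows):
    """Return the cleansed rows plus a count of rows removed by each step."""
    log = {"missing": 0, "errors": 0}
    cleaned = []
    for row in rows:
        # Remove unwanted columns, then rename the ones that remain.
        row = {RENAMES.get(k, k): v for k, v in row.items() if k not in DROP_COLUMNS}
        # Remove rows that contain a missing value (the "removing NAs" step).
        if any(str(v).strip() in MISSING for v in row.values()):
            log["missing"] += 1
            continue
        # Change data type from text to number; rows that fail are errors.
        try:
            row["OrderTotal"] = float(row["OrderTotal"])
        except ValueError:
            log["errors"] += 1
            continue
        cleaned.append(row)
    return cleaned, log


cleaned_rows, step_log = cleanse(raw_rows)
print(cleaned_rows)
print(step_log)
```

Keeping a count per step mirrors what the screenshots are meant to show: the effect of each pre-processing step on the dataset.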
1c: BI Data Modelling via Star Schema – Facts and Dimensions.
• Using your BI requirements list or BI questions identify suitable data from your dataset and
present as Star Schema Facts and Dimensions. Commonly used dimensions are people,
products, location, date and time or demographics.
• A description of the Business Intelligence data modelling process. This will include the
description of all the steps performed to develop a well-structured Star Schema Facts
and dimensions data model, such as working with multiple tables, creating
relationships, modifying relationships or data normalisation.
• You should include screenshots of the Star Schema data model with facts and
dimensions from your Power BI project to illustrate the effect of the data modelling process.
• Measures and Calculated Columns – The core of the dimensional model and data
elements that can be summed, averaged, or mathematically manipulated.
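As a rough illustration of the idea behind a Star Schema, the Python sketch below splits one flat table into a fact table and two dimension tables linked by surrogate keys, then aggregates the facts by a dimension. In the ICA this is done with tables and relationships in Power BI; the table names, columns, data and the `build_dimension` helper here are all hypothetical.

```python
# Illustrative only: a flat sales table (made-up data) split into one fact
# table and two dimension tables, linked by surrogate keys.
flat_sales = [
    {"product": "Bike", "city": "Leeds", "quantity": 2, "unit_price": 300.0},
    {"product": "Helmet", "city": "Leeds", "quantity": 1, "unit_price": 40.0},
    {"product": "Bike", "city": "York", "quantity": 1, "unit_price": 300.0},
]


def build_dimension(rows, column):
    """Map each distinct value of `column` to a surrogate key (1, 2, ...)."""
    keys = {}
    for row in rows:
        keys.setdefault(row[column], len(keys) + 1)
    return keys


dim_product = build_dimension(flat_sales, "product")
dim_city = build_dimension(flat_sales, "city")

# The fact table keeps only the keys and the numeric values.
fact_sales = [
    {
        "product_key": dim_product[r["product"]],
        "city_key": dim_city[r["city"]],
        "quantity": r["quantity"],
        "unit_price": r["unit_price"],
    }
    for r in flat_sales
]

# Facts are aggregated and sliced by a dimension through its key.
units_by_city = {
    name: sum(f["quantity"] for f in fact_sales if f["city_key"] == key)
    for name, key in dim_city.items()
}
print(units_by_city)
```

Descriptive attributes (product, city) live once in the dimensions, while the fact table holds the values that can be summed or averaged, which is the split the bullets above describe.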
ICA Section 2 – Business Intelligence Solution
Using your design from Section 1, demonstrate your ability to build a Business
Intelligence Solution and present your dashboard analytics, visualisation, and findings in
a Technical Report. Since this part is the actual business report, it is preferable to
submit this section as a Word document.
The BI Solution report must include the following:
• Title page: This will include the title of your report and project, and your details.
• Executive Summary: This is a condensed version of your report. To help busy people
understand what the problem is, the executive summary includes the key findings and
your recommendations. It is common practice to include one or two charts from the
o Introduction: This will illustrate the questions you are addressing in your report and
will give some information about the data collected. Include a screenshot of
the data model with relationships and provide a short description of the
dataset used. This section can be shorter than usual considering that you might
be working with sample data that are not related to a real business.
o Findings based on analysis and evaluation: This section will cover the key findings
based on analysis and evaluation. This is the most important section of your ICA.
This section must include:
(1) data analysis steps (either using M language or DAX formulae) to add
calculated columns and measures
(2) your Power BI visuals with the description of the type of data you are
displaying and why you are using such metrics. Screenshots and a
description of each Power BI visual must be included.
(3) a description of the Power BI dashboard (full collection of visuals) and
how the content of the Power BI pages is organised.
(4) the key findings from parts (1), (2) and (3).
• Conclusions and Recommendations Summarise your report and provide some
recommendations based on your findings.
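Item (1) above asks for both calculated columns and measures. The distinction can be sketched outside Power BI as follows: a calculated column is evaluated once per row and stored with the table, whereas a measure is an aggregation evaluated on demand over whichever rows the current filter selects. This Python sketch is an analogy only, not DAX or M; the data and the `total_sales` function are hypothetical.

```python
# Illustrative analogy only (not DAX or M); the data is made up.
sales = [
    {"region": "North", "quantity": 3, "unit_price": 10.0},
    {"region": "North", "quantity": 1, "unit_price": 25.0},
    {"region": "South", "quantity": 4, "unit_price": 10.0},
]

# "Calculated column": evaluated once per row and stored with the row.
for row in sales:
    row["line_total"] = row["quantity"] * row["unit_price"]


def total_sales(rows, region=None):
    """"Measure": aggregated on demand over the rows that pass the filter."""
    return sum(r["line_total"] for r in rows if region is None or r["region"] == region)


overall = total_sales(sales)
north_only = total_sales(sales, "North")
print(overall, north_only)
```

The same measure returns a different result when the filter changes, which is why measures suit visuals that users can slice, while calculated columns suit values needed on every row.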
You will need to submit a single zip folder containing the following:
• The complete report (either in PDF or Word format).
• The Power BI project file.
• Any Excel or CSV file used to import the data.
A submission link will be available via Blackboard under the Assessments link.
You must submit your files by the due date reported on the front page of this document.
Use the following naming convention studentnumber_lastname.firstname.zip (e.g.
Please also make sure that submitted documentation has been tested and verified and is
not corrupted in any way which would prevent access for marking.
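If you prefer to script the packaging, the following Python sketch builds an archive named according to the convention above and then re-opens it to check that it is readable. It is one possible approach, not a required one. The student number, names and file names are placeholders invented for the demonstration, and `archive_name` and `build_submission` are hypothetical helpers.

```python
import tempfile
import zipfile
from pathlib import Path


def archive_name(student_number, last_name, first_name):
    """Build studentnumber_lastname.firstname.zip from its three parts."""
    return f"{student_number}_{last_name}.{first_name}.zip"


def build_submission(files, target_dir, student_number, last_name, first_name):
    """Zip the given files, then verify that the archive can be read back."""
    target = Path(target_dir) / archive_name(student_number, last_name, first_name)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=Path(path).name)
    with zipfile.ZipFile(target) as archive:
        bad_member = archive.testzip()  # None means every member passed its CRC check
        if bad_member is not None:
            raise ValueError(f"Corrupted member in archive: {bad_member}")
        names = archive.namelist()
    return target, names


# Placeholder files and student details, for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    members = []
    for filename in ("report.pdf", "solution.pbix", "data.csv"):
        file_path = tmp_path / filename
        file_path.write_text("placeholder")
        members.append(file_path)
    zip_path, zip_members = build_submission(
        members, tmp_path, "A0000000", "lastname", "firstname"
    )
    zip_name = zip_path.name
    print(zip_name, zip_members)
```

A successful read-back only shows that the archive itself is intact; you should still open the report and the Power BI file from the archive before submitting.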
This assessment has been designed to assess the following learning outcomes:
Personal and Transferable Skills
1. Reflect on and critically appraise own performance and skills development
during the module.
Research, Knowledge and Cognitive Skills
2. Examine and evaluate system level software architectures, tools and techniques
for big data systems.
3. Demonstrate a critical understanding of the issues associated with business
intelligence and big data.
4. Integrate and synthesise diverse concepts and theory on system level software
architectures for big data systems to design the data processing requirements of
a big data system.
5. Research an emerging database technology and communicate the findings in
writing in an academic context.
6. Select and implement appropriate BI tools and evaluate the results of their
application to a given scenario.
7. Autonomously plan, design and implement a big data system to meet an
enterprise’s information requirements and business rules for big data.
Element 1: Business Intelligence Design Assessment Guideline Criteria (30 marks)
GRADE Characteristics of Response
[professional | outstanding]
The BI Design report meets Microsoft Developer Network (MSDN) and Industry
standards. The design is outstanding and meets ALL learning outcomes at an
outstanding level. Clear demonstration and outstanding awareness of DW and BI
needs for industry. All design steps clearly identified and of outstanding quality.
%: 50 – 69
The BRS report meets most of MSDN and Industry standards. The design is
exemplary and meets MOST learning outcomes at an exemplary level. Clear
demonstration and exemplary awareness of DW and BI needs for industry. Most
design steps clearly identified and of exemplary quality.
Generally, you needed to add more depth or detail to one or more sections.
You could also have considered providing more complex dimensions to your fact
<50 Marks: (0-11) [inadequate]
The submitted work was insufficiently well-developed for this level of study and could be
improved substantially by using more thoughtful inspection and use of more design
requirements and/or solutions to support BI.
Element 1: Detailed assessment criteria (30 marks)
The following detailed assessment criteria have been provided to help you check that you
have included the required elements in your ICA.
Parts Checklist
Section 1: Business Intelligence Design (3 passes/30 points)
A) Data Source Description and Business Questions (10 points)
• A description of the dataset is included. What is the dataset about?
• The data reported in each table (columns) is described
• A screenshot of the data is included
The following questions have been answered:
• Why did you select this specific dataset?
• Will this dataset help you in developing specific business skills?
• What questions do you seek to answer with your BI project?
• Which specific features are you going to focus on?
• Does this dataset address the Big Data problem?
B) Data Pre-Processing and Data Cleansing (10 points)
Evidence of the data pre-processing steps. Include screenshots of your Power BI project in
your presentation to illustrate the effect of each pre-processing step.
Is there any evidence of steps performed to cleanse the data? For example:
• Removing NAs
• Renaming columns
• Changing data types
• Removing errors
• Removing columns
• Merging tables etc.
C) Data Modelling – Star Schema Facts and Dimensions (10 points)
A description of the data modelling process. You should include screenshots of the data
model from your Power BI project to illustrate the effect of the data modelling process.
If your database is already well-structured and does not need any modification, show the
steps in this section by deleting at least one relationship and then showing that you can
perform the steps for adding relationships in Power BI. This will meet the first target in
this section (creating new relationships). However, you will not receive the full 10 points
in this case. Hence, we suggest you show all three steps if possible.
There is evidence of some steps performed to develop a well-structured data model, such as:
• Creating new relationships
• Modifying relationships
• Splitting tables to normalise data
Element 2: Business Intelligence Solution Assessment Guideline Criteria (70 marks)
GRADE Characteristics of Response
%: 70-100 Marks: (49-70) 70%+ [professional | outstanding]
DW-BI Solution and interpretation of the data was undertaken using a dashboard approach
employing a suitable software environment and graphical presentation to highlight critical
elements in the data. The dashboard provided some dynamic data manipulation features and
employed a variety of supporting statistical calculations and modelling. Supporting reports
were very well designed and developed and accompanied by a well-developed action agenda
supporting the DW-BI Solution.
%: 50 – 69 Marks: (35-48) [exemplary]
DW-BI Solution of the data was undertaken using some simple statistical processing and
calculation, but it could be significantly enhanced with more in-depth statistical DW-BI
Solution and by using a suitable software tool. Results of the DW-BI Solution and proposals
for a responsive action could also be more considered. Supporting reports were acceptable
but would benefit from more considered and detailed content.
<50 Marks: (0-34) [inadequate]
The submitted work was insufficiently well-developed for this level of study and could be
improved substantially by using more thoughtful inspection and use of more DW & BI tools.
Element 2: Detailed assessment criteria
The following detailed assessment criteria have been provided to help you check that you
have included the required elements in your ICA.
Section 2: Business Intelligence Solution (7 passes/70 points)
D) Title page, Executive Summary and Introduction (10 points)
• The project has a title page with the title of the report, project and your details
• The executive summary provides a clear summary of the report
• The key findings are presented
• At least one chart is reported
• The recommendations are included
• The questions addressed in the report are included
• There is a brief description of the data used in the model
• A screenshot of the data model is provided
E) Findings based on analysis and evaluation – 1 (20 points)
M language and DAX Expressions
• At least three different formulae have been used in the analysis, in either DAX or M
language. You do not need to replicate the same formula in both DAX and M language, but
both languages must be used. You could, for example, write one formula in DAX and two in
M language, or vice versa.
• There is a clear understanding of how the formulae have been used. This is shown by
providing a clear written explanation of how the applied formulae work.
• At least one new calculated column has been added to the model
• At least one new measure has been added to the model
D) Findings based on analysis and evaluation – 2: Power BI Report (20 points)
Visuals, KPIs, Infographics and Animated charts including buttons. Formatting options and
structure of the Dashboard. Ask a Question Tool
• At least three different visuals have been used in Power BI and the screenshots are provided
• There is a proper justification of the use of such visuals
• The description of the data presented in the chart has been reported for at least two charts
• At least one of the previously added calculated columns or measures has been included in a visual
• The key findings from the visuals created are reported
• The structure of the dashboard follows the tips covered in lessons.
• The KPIs are clearly used and visualised in the report.
They have also been used to analyse the data depending on the BI questions addressed in the report.
• Infographics, Animated Charts, KPIs and Buttons have been correctly used.
• The Ask a Question tool has been appropriately used.
E) Data Analytics (Forecast, Analysing trends). Artificial Intelligence and Machine Learning (10 points)
• Data Analytics and Artificial Intelligence/Machine Learning tools have been included in
the final dashboard/report to further investigate the data.
• The results of the Data Analytics, Artificial Intelligence/Machine Learning charts have
been described in detail, showing a good understanding of the applied tools.
F) Conclusions and Recommendations (10 points)
• A summary of the report is provided
• Recommendations based on findings have been included.
• Personal conclusions and challenges have been included
The rubric used for your final marking can be accessed here:
https://liveteesac-my.sharepoint.com/:w:/g/personal/a_occhipinti_tees_ac_uk/EWoPh9M9789Bqm-iOlr7QDUBhdbrRcPIb3Zw8j9p6aG6wQ?e=oP3c6c
Business Intelligence Dataset
For your ICA, you can use a dataset of your choice. In the module, you will be introduced to
different data sources. Some example databases are listed below.
• AdventureWorks Database (you can find this database on Blackboard under the Resources tab
in the left menu.)
• Northwind Database (to access this dataset, click Get Data in Power BI, then look for the
OData Feed type and insert the link https://services.odata.org/V3/Northwind/Northwind.svc/ -
you can then select the columns that you prefer)
Below are research and government websites where you can also find and select a suitable
dataset to develop a Business Intelligence solution using Power BI.
• Kaggle: https://www.kaggle.com/datasets
• Machine Learning Repository – UCI: http://archive.ics.uci.edu/ml/datasets.php
• GitHub: https://github.com/awesomedata/awesome-public-datasets#healthcare
• Government Data: https://ckan.publishing.service.gov.uk/dataset?res_format=CSV
• GitHub – Second repository: https://vincentarelbundock.github.io/Rdatasets/datasets.html
More links are available under the Resources tab on Blackboard.
Covid 19 Dataset
Coronavirus: Data Analytics and Data Visualisation on Covid 19 Patient Datasets
Why not consider developing a Power BI solution on a Covid 19 dataset for your ICA? You could
develop Power BI data analytics and data visualisation solutions to support the science and
health community.
Overview: Under the directive of the World Health Organization (WHO, https://covid19.who.int/),
many countries are providing the science and research community with a series of Covid 19
patient datasets with cumulative counts of coronavirus cases. We have access to Covid patient
datasets from:
• Wuhan (82k records): https://github.com/globalcitizen/2019-wuhan-coronavirus-data
• New York: https://github.com/nytimes/covid-19-data
• Italy: https://github.com/DavideMagno/ItalianCovidData
• India: https://github.com/someshkar/covid19india-cluster
• UK: https://github.com/tomwhite/covid-19-uk-data
Additional data on images are also being released to the science community:
• Open database of COVID-19 cases with chest X-ray or CT images:
https://github.com/ieee8023/covid-chestxray-dataset
The data provided is used to power the maps and reporting that track the outbreak, and it is
now being made available to the public in response to requests from researchers, scientists
and government officials who would like access to the data to better understand the outbreak.
Utilise your ICA to support the Data Science and Health community by helping to compile
additional time-series Covid 19 data from national or local governments and health
departments to provide a complete record of the ongoing outbreak.
Consult the following links if you would like to review some of the data analytics and data
visuals required:
• Coronavirus Worldometers: https://www.worldometers.info/coronavirus/#countries
• New York Times: https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
• Johns Hopkins University: https://www.jhu.edu/
The Power BI solution you provide for your ICA (the data analytics and data visual models or
dashboard) can be released back to the science, health and research community in order to
progress our understanding of this new virus.