INFT 5028 · Team A · NSCC

NS Health & Population Analytics

An integrated BI solution surfacing regional health patterns, demographic shifts, and forecasted healthcare demand across Nova Scotia.

Course: INFT 5028 Capstone
Institution: NSCC IT Campus
Instructor: Patrick Dolinger
Version: v0.1 · Week 1
01

Project Overview

Nova Scotia is experiencing significant demographic shifts — an aging population, changing socioeconomic conditions, and increasing pressure on its healthcare system. Despite this, there is no consolidated, data-driven view connecting population trends, socioeconomic indicators, and health outcomes across NS regions.

This project addresses that gap. Using publicly available data from Statistics Canada, the NS Open Data Portal, and CIHI, Team A builds an integrated Business Intelligence solution that surfaces regional health patterns, demographic trends, and forecasted healthcare demand — delivered as an interactive Power BI dashboard.

Key analytical questions
Q1 How has Nova Scotia's population age distribution shifted over the past decade?
Q2 Which regions have the highest rates of chronic disease and hospitalization?
Q3 Is there a correlation between socioeconomic indicators (income, education) and health outcomes?
Q4 How do Nova Scotia's health indicators compare to national averages?
Q5 Which demographic segments are expected to drive future healthcare demand?
02

Project Progress

Week 1
Scoping & Setup
  • Project charter
  • GitHub repo
  • Data acquisition
  • Sprint plan
● In Progress
Week 2
Data Wrangling
  • SQL/Python cleaning
  • Power Query ETL
  • Data dictionary
  • Schema design
○ Upcoming
Week 3
Analysis & Modelling
  • Pandas EDA
  • Trend regression
  • Forecast CSV
  • PBI wireframe
○ Upcoming
Week 4
BI Dashboard
  • Power BI build
  • DAX measures
  • Slicers & drill
  • Internal review
○ Upcoming
Week 5
Final Delivery
  • Final report PDF
  • Repo tag v1.0
  • Peer evaluation
  • Reflection
○ Upcoming
03

Business Understanding

CRISP-DM Phase 1

Nova Scotia health authorities and policy planners currently lack a single integrated view of how demographic change relates to health outcomes at the regional level. This project provides that view to support better resource allocation and planning decisions.

Stakeholder | Interest
Government of Nova Scotia | Policy planning, healthcare resource allocation
NS Health Authorities | Operational planning, regional health monitoring
Data Analysts / Researchers | Data-driven insights and modelling
Instructor / Evaluator | Academic assessment
General Public | Accessible insight into regional health and demographic trends
In scope
  • Nova Scotia population demographics data
  • Health indicators (hospital usage, chronic diseases, aging)
  • Data cleaning and preprocessing (Python/SQL)
  • Exploratory Data Analysis (EDA)
  • Dashboard development (Power BI)
  • Basic forecasting or trend analysis
  • Documentation and GitHub repository
Out of scope
  • Real-time hospital system integration
  • Clinical diagnosis or medical predictions
  • Patient-level data
  • Deployment of production-level AI systems
04

Data Understanding

CRISP-DM Phase 2
Source | URL | Coverage
NS Open Data Portal | data.novascotia.ca | NS health authority datasets, demographics
Statistics Canada | www150.statcan.gc.ca | Population, socioeconomic, health survey data
CIHI Open Data | cihi.ca | Hospitalization, wait times, health workforce

Data types collected: population by age/gender, life expectancy by region, chronic disease prevalence rates, hospitalization statistics, income and education indicators.

📄 Data dictionary: /data/data_dictionary.md (Week 2)

05

Data Preparation

CRISP-DM Phase 3

All raw data is stored in /data/raw/ and must not be edited directly. Cleaned outputs are written to /data/cleaned/.

Script | Purpose | Status
scripts/01_cleaning.sql | SQL cleaning, joins, aggregations | Week 2
scripts/02_cleaning.py | Python/pandas cleaning pipeline | Week 2
scripts/03_power_query.md | Power Query M-code documentation | Week 2
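
The raw/cleaned convention could be enforced in a cleaning script roughly as sketched below. This is a minimal, standard-library sketch so that it runs without extra dependencies; the planned pipeline uses pandas. The column names (region, year, population) and the helper names clean_rows and clean_file are hypothetical placeholders, not fields or functions from the actual project.

```python
import csv
from pathlib import Path

# Hypothetical column set; the real fields will come from the data dictionary.
FIELDS = ["region", "year", "population"]


def clean_rows(rows):
    """Yield normalized rows, skipping any with missing or malformed values."""
    for row in rows:
        region = (row.get("region") or "").strip().title()
        year = (row.get("year") or "").strip()
        population = (row.get("population") or "").replace(",", "").strip()
        if not region or not year.isdecimal() or not population.isdecimal():
            continue  # drop incomplete or malformed records
        yield {"region": region, "year": int(year), "population": int(population)}


def clean_file(raw_path, cleaned_dir):
    """Read one raw CSV (opened read-only) and write a cleaned copy to cleaned_dir."""
    raw_path = Path(raw_path)
    cleaned_dir = Path(cleaned_dir)
    cleaned_dir.mkdir(parents=True, exist_ok=True)
    out_path = cleaned_dir / raw_path.name
    with raw_path.open(newline="", encoding="utf-8") as src:
        rows = list(clean_rows(csv.DictReader(src)))
    with out_path.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Calling clean_file("data/raw/some_file.csv", "data/cleaned") would leave the raw file untouched and write the cleaned copy under the same file name; the file name here is only an example.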
Planned star schema
  • Fact table: FactHealthOutcome
  • Dimension: DimRegion
  • Dimension: DimDemographic
  • Dimension: DimDate
  • Dimension: DimCondition
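
One way to prototype the planned star schema is in an in-memory SQLite database, as sketched below. Only the five table names come from the plan; every column name, the case_count measure, and the build_schema helper are assumptions made for illustration, since the actual schema design is scheduled for Week 2.

```python
import sqlite3

# Table names follow the planned schema; all columns are hypothetical.
SCHEMA = """
CREATE TABLE DimRegion      (region_id INTEGER PRIMARY KEY, region_name TEXT NOT NULL);
CREATE TABLE DimDemographic (demographic_id INTEGER PRIMARY KEY, age_group TEXT, gender TEXT);
CREATE TABLE DimDate        (date_id INTEGER PRIMARY KEY, year INTEGER NOT NULL);
CREATE TABLE DimCondition   (condition_id INTEGER PRIMARY KEY, condition_name TEXT NOT NULL);
CREATE TABLE FactHealthOutcome (
    region_id      INTEGER NOT NULL REFERENCES DimRegion(region_id),
    demographic_id INTEGER NOT NULL REFERENCES DimDemographic(demographic_id),
    date_id        INTEGER NOT NULL REFERENCES DimDate(date_id),
    condition_id   INTEGER NOT NULL REFERENCES DimCondition(condition_id),
    case_count     INTEGER
);
"""


def build_schema(conn):
    """Create the dimension and fact tables with foreign keys enforced."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
build_schema(conn)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)
```

Each fact row points at one row in every dimension, which is what lets Power BI slice the same measure by region, demographic group, date, or condition.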
06

Modelling

CRISP-DM Phase 4

📓 Python notebook: /notebooks/analysis.ipynb (Week 3)

Step | Description
EDA | Age group distributions by region, correlation matrix of socioeconomic vs. health variables, seaborn heatmaps
Trend Analysis | Time-series trend fitting per age cohort and health region
Predictive Model | Linear or polynomial regression for a 5–10 year population and healthcare demand forecast
Export | Forecast table exported as CSV for Power BI import
07

Dashboard

CRISP-DM Phase 5 & 6

📊 Power BI file: /powerbi/NS_Health_Analytics.pbix (Week 4)

Page | Key Visuals
Executive Summary | 4 KPI cards: NS Population, Avg Life Expectancy, Top Health Region, YoY Change
Population Trends | Area chart by age group, population pyramid, NS choropleth map
Health Outcomes | Chronic disease bar chart, hospitalization line chart, income vs. health scatter
Demographic Deep Dive | Stacked age/gender bar chart, community-level indicator table
Forecast | Python-generated forecast overlay with confidence interval ribbon
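
The YoY Change card on the Executive Summary page reduces to simple arithmetic: the difference from the previous year divided by the previous year's value. In the dashboard this would be a DAX measure; the Python sketch below only illustrates the calculation. The function name yoy_change_pct and the sample values are hypothetical and synthetic.

```python
def yoy_change_pct(series):
    """Return {year: percent change vs. the previous year} for a {year: value} mapping."""
    changes = {}
    for year in sorted(series):
        previous = series.get(year - 1)
        if previous:  # skips years with no prior value and avoids division by zero
            changes[year] = (series[year] - previous) / previous * 100
    return changes


# Synthetic, illustrative values only (not real data).
population = {2021: 200, 2022: 210, 2023: 189}
for year, pct in yoy_change_pct(population).items():
    print(f"{year}: {pct:+.1f}%")
```

The first year in the series has no predecessor, so it is left out of the result rather than reported as zero.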
08

Repository Structure

ns-health-population-analytics/
│
├── data/
│   ├── raw/                  # Original downloaded datasets (do not edit)
│   ├── cleaned/              # Cleaned, transformed outputs
│   └── data_dictionary.md    # Field definitions and transformation notes
│
├── scripts/
│   ├── 01_cleaning.sql       # SQL cleaning and joins
│   ├── 02_cleaning.py        # Python/pandas cleaning pipeline
│   └── 03_power_query.md     # Power Query M-code documentation
│
├── notebooks/
│   └── analysis.ipynb        # EDA + predictive model + CSV exports
│
├── powerbi/
│   └── NS_Health_Analytics.pbix
│
├── docs/
│   ├── project_charter.docx
│   ├── internal_review_log.md
│   └── final_report.pdf
│
├── .gitignore
└── README.html
SQL / SQLite
Data cleaning, joins, aggregations
Python
pandas, scikit-learn, seaborn — EDA and modelling
Power Query
ETL transformations and reshaping
Power BI
Interactive dashboard and DAX measures
GitHub
Version control for all scripts and docs
Jupyter
Notebook for EDA and forecast model
09

Team

KT
Pham Thi Kim Thanh
Project Manager
Coordination, timeline management, communication
AM
Anne Mwangi
Data Lead
Data sourcing, structure, and schema design
CO
Chukwuma Okoro
SQL / ETL Analyst
Data cleaning and transformation pipelines
JA
Juliana Ayide
Reporting Lead
Power BI dashboard development and DAX
MO
Vacancy
QA / Documentation
Data validation, documentation, review log