Decision Advocacy Challenge

Challenge Overview

Your Mission: Create a concise, punchy Quarto document (aim for 1-5 printed pages) that analyzes historical productivity data from Patrick’s auto shop network. Your report should give Patrick clear, actionable recommendations he can understand and use. Write it so an auto-mechanic can follow it—no jargon, no confusion, no boredom. Then render the document to HTML and deploy it via GitHub Pages.

Key Requirements:

Simplicity: Patrick is an auto-mechanic, not a statistician. Write so he can understand.
Conciseness: 1-5 printed pages. Get to the point.
Beautiful visualizations: Show all 250 data points clearly.
Logical recommendations: Defensible advice based on the data.
Confidence assessment: Explain uncertainty in plain terms. We only have 250 observations (50 per shop, with as few as 5 boss-present days at some shops), and that’s not a lot. Help Patrick understand what range of differences he can expect.

The Business Problem 🎯

Patrick runs a network of 5 auto repair shops. He’s been tracking productivity data (number of cars fixed per day) across all shops, along with whether he (the boss) was present at each shop on each day.

The Core Question: When and where should Patrick schedule his presence to maximize productivity and revenue?

Key Considerations:

Patrick can only be at one shop per day
Different shops may respond differently to his presence
Revenue depends on number of cars fixed (assume a fixed price per car)
Historical patterns may not predict future performance perfectly
Patrick loves visiting his brother, who runs Shop 3

🎯 The Key Insight: Data-Driven Decision Making

The problem: Patrick needs to make scheduling decisions based on data, not intuition. He needs to understand:

Which shops benefit most from his presence?
Are there patterns by shop that suggest different strategies?
What are the revenue and profit implications of different scheduling scenarios?
How confident can he be in these recommendations?

Why this matters: Poor scheduling decisions can lead to:

Wasted time (Patrick at shops that don’t need him)
Lost revenue (shops that need him but don’t get him)

The connection: This is a real-world decision problem where data analysis can directly impact business outcomes. Your job is to make the data tell a clear, actionable story.

The Data 📊

The dataset contains 250 observations of daily productivity across 5 shops:

carsDF.head(10)

Table 1: Overview of the productivity data

	observation	shopID	boss	carsFixed
0	1	1	0	8
1	2	2	0	22
2	3	3	0	32
3	4	4	1	64
4	5	5	0	53
5	6	1	1	21
6	7	2	0	20
7	8	3	0	42
8	9	4	0	31
9	10	5	0	55

Data Dictionary:

observation: Observation number (1-250)
shopID: Shop identifier (1-5)
boss: Binary indicator (0 = boss absent, 1 = boss present)
carsFixed: Number of cars fixed that day

summary = carsDF.groupby(['shopID', 'boss'])['carsFixed'].agg(['count', 'mean', 'std', 'min', 'max']).round(2)
summary

Table 2: Summary statistics by shop and boss presence

		count	mean	std	min	max
shopID	boss
1	0	40	11.05	2.93	5	17
1	1	10	18.00	3.13	13	23
2	0	45	22.13	4.82	11	35
2	1	5	32.20	2.59	29	35
3	0	35	32.46	5.59	23	49
3	1	15	35.40	5.00	25	44
4	0	45	37.02	5.61	26	51
4	1	5	49.80	10.08	39	64
5	0	35	45.51	7.24	32	60
5	1	15	53.73	5.31	45	62
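
As a rough illustration of how Table 2 can be read, the sketch below computes the average "boss lift" (boss-present mean minus boss-absent mean) for each shop, plus a rule-of-thumb range of plus or minus two standard errors. The numbers are copied from Table 2; the helper name `boss_lift` and the two-standard-error rule are assumptions made for this example, not a required method.

```python
import math

# Table 2 values, keyed by shop: (count, mean, std) for boss absent and boss present.
TABLE_2 = {
    1: {"absent": (40, 11.05, 2.93), "present": (10, 18.00, 3.13)},
    2: {"absent": (45, 22.13, 4.82), "present": (5, 32.20, 2.59)},
    3: {"absent": (35, 32.46, 5.59), "present": (15, 35.40, 5.00)},
    4: {"absent": (45, 37.02, 5.61), "present": (5, 49.80, 10.08)},
    5: {"absent": (35, 45.51, 7.24), "present": (15, 53.73, 5.31)},
}


def boss_lift(shop):
    """Return (lift, rough_low, rough_high) in cars per day for one shop."""
    n0, mean0, sd0 = TABLE_2[shop]["absent"]
    n1, mean1, sd1 = TABLE_2[shop]["present"]
    lift = mean1 - mean0
    # Standard error of a difference between two independent averages.
    se = math.sqrt(sd1**2 / n1 + sd0**2 / n0)
    # "Plus or minus two standard errors" is a rule of thumb, not an exact interval,
    # and it is shakiest for Shops 2 and 4, which have only 5 boss-present days each.
    return round(lift, 2), round(lift - 2 * se, 2), round(lift + 2 * se, 2)


for shop in TABLE_2:
    lift, low, high = boss_lift(shop)
    print(f"Shop {shop}: about {lift} more cars/day with the boss (rough range {low} to {high})")
```

On these summary numbers, Shop 4 shows the largest average lift and Shop 3 the smallest, and Shop 3 is the only shop whose rough range includes zero. Treat the ranges as a starting point for the confidence discussion rather than a final answer.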

Challenge Requirements 📋

Minimum Requirements for Any Points on Challenge

Create a Quarto Document: Write a Quarto Markdown file (index.qmd) structured as a professional business report. Your final rendered HTML should be a polished, client-ready document that Patrick can use to make decisions. Important: Your final rendered HTML should contain only your analysis and recommendations; remove all challenge instructions, setup guides, and grading rubrics from the final report.
Render to HTML: You must render the quarto markdown file to HTML.
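GitHub Repository: Create a repository named decAdvocacyChallenge in your GitHub account. Upload your rendered HTML (index.html, plus any supporting files Quarto generates next to it) to the root of this repository, because GitHub Pages will serve the site from there.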
GitHub Pages Setup: Configure the repository as the source of your GitHub Pages site:
- Go to your repository settings (click the “Settings” tab in your GitHub repository)
- Scroll down to the “Pages” section in the left sidebar
- Under “Source”, select “Deploy from a branch”
- Choose “main” branch and “/ (root)” folder
- Click “Save”
- Your site will be available at: https://[your-username].github.io/decAdvocacyChallenge/
- Note: It may take a few minutes for the site to become available after enabling Pages

Getting Started: Repository Setup 🚀

📁 Quick Start

Step 1: Create a new repository named decAdvocacyChallenge in your GitHub account by forking the starter repository at https://github.com/flyaflya/decAdvocacyChallenge.git

Step 2: Clone the repository locally using Cursor (or VS Code)

Step 3: The data file carsFixed.csv is already included in this repository. The code will load it from the local file.

Step 4: You’re ready to start! Modify this index.qmd file and begin your analysis.

💡 Why This Approach?

Benefits:

Real-world application: You’re solving an actual business problem
Data-driven insights: Learn to extract actionable insights from data
Professional presentation: Create a report that a real client would use
Visual storytelling: Master the art of data visualization for decision-making

Grading Rubric 🎓

📊 What You’re Really Being Graded On

This is a report for Patrick, an auto-mechanic, not a statistics professor. Your job is to give him clear, actionable advice he can understand and use. Think of this as a brief consultation report—something he can read in 5 minutes and immediately know what to do.

Report Format:

Conciseness: Aim for 1-5 printed pages. Patrick is busy—get to the point quickly.
Simplicity: Write so an auto-mechanic can understand. Avoid jargon. Use plain language.
Delete All Challenge Instructions: Once you’ve completed your analysis, remove all challenge instructions, setup guides, and grading rubrics from your final rendered HTML. The final report should contain only your analysis, visualizations, and recommendations—nothing else.
Hidden Code: Tell a narrative and visual story, but hide your code in the rendered report (the code stays available in the .qmd source file on GitHub if anyone needs it).
Show All Data Points: When plotting historical data, display all 250 observations (not just summaries). Use transparency, jitter, or other techniques to handle overlapping points.
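
The last two items above (hidden code, all points visible) can be sketched together. In index.qmd the code below would sit inside a `{python}` cell; the `#| echo: false` line is a Quarto cell option that keeps the code out of the rendered HTML while still showing the figure. The function name `plot_all_points`, the colors, and the jitter widths are illustrative choices, not requirements.

```python
#| echo: false
#| warning: false
import numpy as np
import matplotlib.pyplot as plt


def plot_all_points(shop_id, boss, cars_fixed, seed=0):
    """Scatter every observation, nudged sideways so overlapping points stay visible."""
    shop_id = np.asarray(shop_id, dtype=float)
    boss = np.asarray(boss)
    cars_fixed = np.asarray(cars_fixed, dtype=float)

    # Small random horizontal jitter; the seed keeps the chart identical between renders.
    rng = np.random.default_rng(seed)
    jitter = rng.uniform(-0.12, 0.12, size=shop_id.size)

    fig, ax = plt.subplots(figsize=(8, 4.5))
    # One color per boss status, drawn side by side within each shop.
    groups = [(0, -0.18, "tab:gray", "Boss absent"), (1, 0.18, "tab:orange", "Boss present")]
    for flag, offset, color, label in groups:
        mask = boss == flag
        ax.scatter(shop_id[mask] + offset + jitter[mask], cars_fixed[mask],
                   alpha=0.6, s=30, color=color, label=label)

    ticks = np.unique(shop_id)
    ax.set_xticks(ticks)
    ax.set_xticklabels([f"Shop {int(t)}" for t in ticks])
    ax.set_ylabel("Cars fixed per day")
    ax.set_title("Every recorded day, by shop and boss presence")
    ax.legend(frameon=False)
    return fig, ax
```

With the data loaded, a call such as `plot_all_points(carsDF["shopID"], carsDF["boss"], carsDF["carsFixed"])` would draw all 250 observations. Seaborn's `stripplot` is a reasonable alternative if you prefer it.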

What makes a great report:

Beautiful visualizations: Clean, clear charts that tell the story at a glance
Logical recommendations: Defensible advice based on the data, not speculation
Confidence assessment: Explain uncertainty in plain terms. What ranges of differences can Patrick expect? How confident can he be? (Remember: we only have 250 observations, 50 per shop, and that’s not a lot.)
Punchy and focused: Don’t bore Patrick with unnecessary analysis. Focus on what matters for his decision.

What we’re looking for: A report Patrick can read, understand, and act on. If he’s confused or bored, you’ve missed the mark.

What Your Report Should Include

Your report should answer these questions in a clear, concise way:

What does the data show?
- Visualize all 250 data points showing productivity by shop and boss presence
- Which shops benefit most from Patrick’s presence? Which don’t?
What should Patrick do?
- Clear, specific recommendations: Which shops should he prioritize? Why?
- What’s the potential financial impact? (Revenue or profit implications)
How confident can Patrick be?
- Explain uncertainty in plain terms an auto-mechanic can understand
- What ranges of differences can he expect? (We only have 250 observations, 50 per shop; that’s not a huge sample)
- What could go wrong? What assumptions are you making?
What does the future look like?
- Show at least one visualization of projected outcomes under different scheduling scenarios
- Help Patrick see the potential impact of following your recommendations
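
One possible way to build the numbers behind a scenario chart is sketched below. It uses only the Table 2 averages and assumes every shop keeps performing at its historical average, which is a strong assumption (Shop 4's boss-present average rests on just 5 days). The scenario names, the 50-day horizon, and the `price_per_car` argument are placeholders for illustration; the challenge does not specify a price.

```python
# Average cars fixed per day from Table 2: (boss absent, boss present).
MEAN_CARS = {
    1: (11.05, 18.00),
    2: (22.13, 32.20),
    3: (32.46, 35.40),
    4: (37.02, 49.80),
    5: (45.51, 53.73),
}


def projected_cars(days_at_shop):
    """Expected total cars fixed across all shops for a schedule.

    days_at_shop maps shopID -> number of days Patrick spends there.
    He is at exactly one shop per day, so every other shop runs at its
    boss-absent average on that day.
    """
    total_days = sum(days_at_shop.values())
    total = 0.0
    for shop, (absent_mean, present_mean) in MEAN_CARS.items():
        present_days = days_at_shop.get(shop, 0)
        total += present_days * present_mean + (total_days - present_days) * absent_mean
    return total


def projected_revenue(days_at_shop, price_per_car):
    """Revenue under the fixed-price assumption; the caller supplies the price."""
    return projected_cars(days_at_shop) * price_per_car


# Hypothetical schedules, each covering 50 days.
scenarios = {
    "Historical mix": {1: 10, 2: 5, 3: 15, 4: 5, 5: 15},
    "All days at Shop 4": {4: 50},
    "Split Shops 2 and 4": {2: 25, 4: 25},
}
for name, schedule in scenarios.items():
    print(f"{name}: about {projected_cars(schedule):,.0f} cars over 50 days")
```

The resulting totals can feed a simple bar chart, ideally with a range around each bar so Patrick sees that these are estimates, not promises.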

Remember: Less is more. A focused, punchy report that Patrick can understand and act on is better than a long, complex analysis that confuses him.

Technical Implementation Preferences 💡

Setting Up Your Analysis

Use pandas for data manipulation
Use matplotlib and seaborn for visualizations
Use numpy for numerical calculations if needed
Consider using scipy for statistical tests if helpful
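
If a formal test feels like too much for Patrick, a bootstrap range is one alternative that needs only numpy. The sketch below resamples the boss-present and boss-absent days for a single shop and reports a 95% range for the difference in averages. The function name `bootstrap_lift` and its defaults are assumptions for this example.

```python
import numpy as np


def bootstrap_lift(present, absent, n_boot=5000, seed=0):
    """Return (low, high): a 95% percentile bootstrap range for mean(present) - mean(absent)."""
    present = np.asarray(present, dtype=float)
    absent = np.asarray(absent, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample each group with replacement, keeping its original size.
    present_means = rng.choice(present, size=(n_boot, present.size), replace=True).mean(axis=1)
    absent_means = rng.choice(absent, size=(n_boot, absent.size), replace=True).mean(axis=1)
    low, high = np.percentile(present_means - absent_means, [2.5, 97.5])
    return float(low), float(high)
```

For one shop, the inputs would be something like `carsDF.loc[(carsDF["shopID"] == 4) & (carsDF["boss"] == 1), "carsFixed"]` and the matching `boss == 0` rows. With only 5 boss-present days at some shops, the bootstrap range is itself unreliable, which is worth saying plainly in the report.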

Visualization Preferences

Professional Styling: Use consistent colors, clear labels, readable fonts, and informative titles
Show All Points: When displaying historical data, ensure all 250 observations are visible (use transparency, jitter, or other techniques)
Color Coding: Use consistent color schemes (e.g., one color for boss present, another for boss absent)
Interactive Elements: Consider using plotly for interactive visualizations if helpful (optional)

Data Loading

The data file carsFixed.csv is included in your repository. Load it using pandas:

import pandas as pd
carsDF = pd.read_csv("carsFixed.csv")
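
Before analyzing, it may help to confirm the loaded file matches the data dictionary. The check below is a minimal sketch; the function name `check_cars` and the specific rules (250 rows, `boss` in {0, 1}, `shopID` from 1 to 5) are taken from the description above and are assumptions about the file, not guarantees.

```python
import pandas as pd

EXPECTED_COLUMNS = ["observation", "shopID", "boss", "carsFixed"]


def check_cars(df, expected_rows=250):
    """Raise ValueError if the frame does not match the data dictionary; return it otherwise."""
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if len(df) != expected_rows:
        raise ValueError(f"Expected {expected_rows} rows, found {len(df)}")
    if df[EXPECTED_COLUMNS].isna().any().any():
        raise ValueError("Found missing values")
    if not df["boss"].isin([0, 1]).all():
        raise ValueError("boss must be 0 or 1")
    if not df["shopID"].between(1, 5).all():
        raise ValueError("shopID must be between 1 and 5")
    if (df["carsFixed"] < 0).any():
        raise ValueError("carsFixed cannot be negative")
    return df


# Typical use once the CSV is in the working directory:
# carsDF = check_cars(pd.read_csv("carsFixed.csv"))
```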

Submission Checklist ✅

Technical Requirements:

Forked repository decAdvocacyChallenge from the starter repository
Cloned repository locally using Cursor (or VS Code)
Quarto document created with clear narrative (use this index.qmd as a starting point)
Document rendered to HTML successfully
HTML files uploaded to your repository
GitHub Pages enabled and working
Site accessible at https://[your-username].github.io/decAdvocacyChallenge/

Report Quality Requirements:

Conciseness: Report is 1-5 printed pages (not a novel)
Simplicity: Written so an auto-mechanic can understand (no unnecessary jargon)
Beautiful visualizations: Clean, clear charts that tell the story with persuasive titles, meaningful labels, and useful annotations.
All 250 data points visible: At least one visualization of the historical data shows every observation, not just summaries
Logical recommendations: Defensible advice based on the data
Confidence assessment: Uncertainty explained in plain terms Patrick can understand
Future scenarios: At least one visualization showing projected outcomes
Clean final report: All challenge instructions, setup guides, and grading rubrics removed from final HTML
Hidden code: Code is hidden in the final report (it must still be present in the source .qmd file)

Resources

Quarto Markdown: quarto.org/docs/authoring/markdown-basics.html
Quarto Documentation: quarto.org/docs
Python Data Science Handbook: jakevdp.github.io/PythonDataScienceHandbook
Pandas Documentation: pandas.pydata.org/docs
Matplotlib Gallery: matplotlib.org/stable/gallery
Seaborn Tutorial: seaborn.pydata.org/tutorial.html