Realm Map
Choose your quest. Each path teaches a key dimensional modelling concept and includes a visit from the Shapeshifter.
⚔️ Quest 1: Grain Declaration Trial
Thornwood Trading Company needs to know their total sales for January in the North region. Below is the messy dataset they've handed you. Try to calculate the sales manually — you may be surprised by the result!
Rows highlighted in red indicate duplicate orders that will artificially inflate your totals.
To get this right, you need to decide the proper grain for the Sales fact table. Which grain makes sense?
Here's what happens with each grain choice:
- Order grain: loses product level detail and returns total order value only.
- Line item grain: correct – the atomic level, can roll up to orders, and analyze by product.
- Shipment grain: duplicates order value when orders ship in multiple packages.
Line item grain is the right choice. By modelling each line item separately, you retain detail and avoid double counting shipments. You can still get order totals by summing line items.
The CEO now asks: "Who is our biggest customer?"
Here are the customer records:
Before you can answer, you must decide whether C100, C101 and C102 are the same customer. This is the Shapeshifter's challenge: entity resolution.
Are these three rows the same real company?
Practice: The CFO wants to track shipping costs. What should be the grain of the Shipping fact table?
⏳ Quest 2: Slowly Changing Enigma
This quest explores Type 1 and Type 2 slowly changing dimensions. For brevity, this implementation summarises the key concepts; feel free to expand it further.
Customer History
Question
How much revenue did Sarah Miller generate in 2023?
🏰 Quest 3: Conformed Dimension Summit
In a full version of the app you would build a conformed Region and Customer dimension here. Use the Grimoire to review entity resolution techniques.
🔥 Quest 4: Forge of Transformation
The Forge of Transformation guides you through cleansing messy source data and resolving fuzzy duplicates. Due to space, the interactive sub‑quest has been summarised; check the Grimoire for details.
⭐ Quest 5: Star Schema Smithy
This quest would normally include a drag‑and‑drop schema builder for the Starfall University case. Here, consider which dimensions you would need: Student, Course, Instructor, Date, and perhaps a Master Student identifier.
🌉 Quest 6: Bridge Table Trial
Here you would build a bridge table to correctly allocate sales across multiple authors. Beware of duplicate authors – the Shapeshifter lurks here too.
🔮 Shapeshifter's Grimoire
Entity resolution is the art of determining when two records refer to the same real‑world entity. Below are key techniques and concepts. Use this knowledge throughout your quests.
Techniques
- Exact matching: compare natural keys directly (customer_id).
- Normalized matching: upper‑case, trim whitespace, remove punctuation before comparing.
- Fuzzy string matching: algorithms like Levenshtein distance or Jaro‑Winkler for spelling differences.
- Phonetic matching: algorithms like Soundex or Metaphone for names that sound alike.
- Domain knowledge: email domains, addresses, phone numbers help confirm matches.
Concepts
- Match confidence scores: quantify the probability that two records refer to the same entity.
- Threshold tuning: adjust match thresholds to balance false positives and false negatives.
- Survivorship rules: determine which source attribute values become the “golden record.”
- Golden record: the canonical representation of an entity after resolution.
- Ongoing process: entity resolution is not one‑and‑done. New data may merge or split existing entities.
📚 Lexicon
A glossary of terms used in dimensional modelling and entity resolution.
- Grain
- The level of detail represented by a row in a fact table.
- Fact Table
- A table containing measurable, numeric data about a business process.
- Dimension Table
- A table containing descriptive attributes that give context to facts.
- Slowly Changing Dimension (SCD)
- A technique for tracking how dimension attributes change over time.
- Conformed Dimension
- A dimension that is shared across multiple fact tables to enable cross‑process analysis.
- Bridge Table
- A table used to resolve many‑to‑many relationships between a fact and a dimension.
- Entity Resolution
- The process of determining when two or more records refer to the same real‑world entity.
🏆 Achievements
Your badges for accomplishments in the academy.
- 🏅 Grain Master: complete Quest 1.
- 🏅 Keeper of History: complete Quest 2.
- 🏅 Realm Unifier: complete Quest 3.
- 🏅 Master Smith: complete Quest 4.
- 🏅 Star Forger: complete Quest 5.
- 🏅 Bridge Builder: complete Quest 6.
- 🏆 Dragon Tamer: complete all quests.
⚙️ Settings
Adjust your preferences and manage your progress.