Analyzing Building Functions by Neighborhood

Duration: 45-60 minutes

Learning Goals

By the end of this tutorial, you will be able to:

Classify buildings by function type (single vs. multi-functional)
Use Field Calculator to create conditional classifications
Perform spatial selection based on location (within/intersect)
Execute spatial joins to append neighborhood attributes
Calculate geometric attributes (area)
Generate statistics by category (count and sum)
Export and visualize aggregated data

Software and Data Required

Software:
- QGIS (version 3.x or higher)
- Web browser with internet access
- RAWGraphs 2.0: https://app.rawgraphs.io/

Data files:
- buildings.gpkg - Rotterdam building footprints with function data
- buurten.gpkg - Rotterdam neighborhoods (buurten)

Part 1: Preparing Building Function Data

Step 1.1: Load and Explore Your Data

Start QGIS and create a new project
Set the CRS to EPSG:28992 (Amersfoort / RD New - Dutch national grid)
Load the layers:
- Add buildings.gpkg stored in the part3_data/vector folder
- Add buurten.gpkg (you created this file in the first exercise, using the Data Source Manager)

Figure 1 - Loaded building and neighborhood layers

Explore the building attribute table:
- Right-click buildings layer → Open Attribute Table
- Identify the column containing building functions (e.g., functie, function, or gebruiksdoel)
- Notice how functions are stored:
  - Single function: "woonfunctie" (residential)
  - Multiple functions: "woonfunctie, winkelfunctie" (residential, retail)

Figure 2 - Building attribute table showing function column

Step 1.2: Classify Buildings by Function Type

We’ll create a new field that identifies whether a building has a single function or multiple functions.

Open the attribute table of the buildings layer
Enable editing mode (click the pencil icon or press Ctrl+E)
Open Field Calculator (click the abacus icon or press Ctrl+I)
Configure the new field:
- ☑ Check Create a new field
- Output field name: function_type
- Output field type: Text (string)
- Length: 50
Enter this expression:

CASE 
    WHEN "function_column" IS NULL OR "function_column" = '' THEN 'Unknown'
    WHEN length("function_column") - length(replace("function_column", ',', '')) >= 1 
        THEN 'Multi-functional'
    ELSE trim("function_column")
END

⚠️ Important: Replace "function_column" with your actual field name (e.g., "functie" or "gebruiksdoel")

What this does:
- Counts commas to detect multiple functions
- If 1 or more commas found → labels as “Multi-functional”
- If no commas → uses the single function name
- Handles null/empty values → labels as “Unknown”
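The same logic can be sketched in plain Python to check what you should expect for a few values before running the Field Calculator. This is only an illustration: `classify_function` is a hypothetical helper, not part of QGIS, and the sample strings follow the examples from Step 1.1.

```python
def classify_function(value):
    """Mirror the Field Calculator CASE expression for one attribute value."""
    # NULL or empty string -> 'Unknown'
    if value is None or value == "":
        return "Unknown"
    # Count commas by comparing the length with and without them
    comma_count = len(value) - len(value.replace(",", ""))
    if comma_count >= 1:
        return "Multi-functional"
    # No comma: keep the single function name, without surrounding spaces
    return value.strip()


samples = ["woonfunctie", "woonfunctie, winkelfunctie", "", None]
labels = [classify_function(s) for s in samples]
print(labels)
```

Note that a value containing only spaces is neither NULL nor empty, so both the expression and this sketch return an empty string for it rather than “Unknown”.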

Figure 3 - Field Calculator creating function_type field

Click OK
Save edits (click the save icon)
Toggle off editing mode (click the pencil icon again)
Verify your results:
- Scroll through the attribute table
- Check that buildings with multiple functions are labeled “Multi-functional”
- Check that single-function buildings show their specific function name

Part 2: Spatial Selection and Join

Step 2.1: Select Buildings Within Neighborhoods

We need to identify which buildings fall within neighborhood boundaries.

Go to: Vector → Research Tools → Select by Location
Configure the selection:
- Select features from: buildings
- Where the features: are within (or intersect if you want to include partially overlapping buildings)
- By comparing to features from: buurten
- Modify current selection by: creating new selection

Figure 4 - Select by Location dialog

Understanding the geometric predicates:

- Are within: Only buildings completely inside neighborhood boundaries (recommended for accuracy)
- Intersect: Buildings that touch or overlap boundaries (may include buildings on borders)
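To make the difference between the two predicates concrete, here is a minimal sketch that uses axis-aligned rectangles as stand-ins for building and neighborhood polygons. Real geometries are more complex and QGIS handles them for you; the `Box` type, both functions, and all coordinates are assumptions made for this illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon (simplification)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def is_within(a: Box, b: Box) -> bool:
    """True if rectangle a lies completely inside rectangle b."""
    return (a.xmin >= b.xmin and a.ymin >= b.ymin
            and a.xmax <= b.xmax and a.ymax <= b.ymax)


def intersects(a: Box, b: Box) -> bool:
    """True if rectangles a and b share at least one point."""
    return (a.xmin <= b.xmax and a.xmax >= b.xmin
            and a.ymin <= b.ymax and a.ymax >= b.ymin)


neighborhood = Box(0, 0, 100, 100)
inside = Box(10, 10, 20, 20)       # fully inside
on_border = Box(95, 40, 105, 50)   # crosses the eastern edge
outside = Box(200, 200, 210, 210)  # no contact at all

for name, b in [("inside", inside), ("on_border", on_border), ("outside", outside)]:
    print(name, is_within(b, neighborhood), intersects(b, neighborhood))
```

The building on the border is selected by “intersect” but not by “are within”, which is why the choice of predicate changes how many buildings end up in your selection.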

Click Run
Check your selection:
- Look at the status bar: it shows “X features selected”
- Selected buildings are highlighted in yellow on the map
- In the attribute table, selected rows are highlighted

💡 Tip: If you want to keep only buildings within neighborhoods, you can export the selection as a new layer: Right-click buildings → Export → Save Selected Features As...

Step 2.2: Spatial Join - Append Neighborhood Names

Now we’ll add the neighborhood name to each building record.

Go to: Vector → Data Management Tools → Join Attributes by Location
Configure the spatial join:
- Input layer: buildings (or your selected buildings if exported)
- Join layer: buurten
- Geometric predicate: ☑ within (or intersects)
- Fields to add: Click the ... button
  - ☑ Select only the neighborhood name field (e.g., buurt_naam, BU_NAAM)
  - ☑ You may also want neighborhood code (e.g., BU_CODE)
  - Uncheck other fields to keep the data clean
- Join type: Create separate feature for each matching feature (one-to-many)
- Joined layer: Save as buildings_with_neighborhoods.gpkg

Figure 5 - Join Attributes by Location dialog

Click Run
Verify the join:
- Open the attribute table of the new buildings_with_neighborhoods layer
- Check that the neighborhood name column has been added
- Verify that buildings have the correct neighborhood assigned

⚠️ Troubleshooting:
- If buildings show NULL in the neighborhood field: they may be outside all neighborhood boundaries
- If you see duplicate buildings: check your geometric predicate and join type settings
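Both troubleshooting cases can be reproduced with a small sketch of a one-to-many join. It assumes rectangles instead of real polygons and an “intersects” predicate; `join_one_to_many`, the neighborhood names, and the coordinates are hypothetical and only show where NULLs and duplicates come from.

```python
def boxes_intersect(a, b):
    """a and b are (xmin, ymin, xmax, ymax) tuples."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


def join_one_to_many(buildings, neighborhoods):
    """Return one output row per (building, matching neighborhood) pair.

    Buildings without any match are kept with None, similar to the NULL
    values described in the troubleshooting note.
    """
    rows = []
    for building_id, building_box in buildings.items():
        matches = [name for name, box in neighborhoods.items()
                   if boxes_intersect(building_box, box)]
        if not matches:
            rows.append((building_id, None))
        for name in matches:
            rows.append((building_id, name))
    return rows


# Hypothetical data: two adjacent neighborhoods and three buildings
neighborhoods = {"North": (0, 50, 100, 100), "South": (0, 0, 100, 50)}
buildings = {
    "b1": (10, 60, 20, 70),      # inside North
    "b2": (40, 45, 50, 55),      # straddles the shared boundary
    "b3": (200, 200, 210, 210),  # outside both
}

joined = join_one_to_many(buildings, neighborhoods)
for row in joined:
    print(row)
```

Building `b2` appears twice because it matches both neighborhoods, and `b3` has no neighborhood at all.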

Part 3: Calculate Building Areas

Step 3.1: Add Area Field

Open the attribute table of buildings_with_neighborhoods
Enable editing mode
Open Field Calculator
Configure the area field:
- ☑ Create a new field
- Output field name: area_m2
- Output field type: Decimal number (real)
Enter this expression:

$area

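This calculates the area in square meters, provided the project's area units are set to square meters (check Project → Properties → General). EPSG:28992 uses meters as its map unit, and $area respects the project's ellipsoid and area-unit settings.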

Click OK
Save edits and disable editing mode

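Figure 6 - Calculating building areas

💡 Alternative expressions:
- round($area, 2) - Round to 2 decimal places
- $area / 10000 - Convert to hectares
- area($geometry) - Planimetric area in the units of the layer CRS, ignoring the project's ellipsoid and area-unit settings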
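For intuition about what a planar area calculation does in a projected CRS with meter units, the sketch below applies the shoelace formula to a single rectangular footprint. It is not how QGIS is implemented internally; `polygon_area` and the coordinates are assumptions chosen to resemble RD New values.

```python
def polygon_area(vertices):
    """Planar area of a simple polygon using the shoelace formula.

    vertices: list of (x, y) tuples in a projected CRS with meter units.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical 10 m x 20 m footprint
footprint = [(92000, 437000), (92010, 437000), (92010, 437020), (92000, 437020)]
area_m2 = polygon_area(footprint)
area_ha = area_m2 / 10000  # 1 hectare = 10,000 square meters
print(round(area_m2, 2), area_ha)
```

If the same vertices were given in degrees, the result would be a tiny number with no physical meaning, which is why the projected CRS matters for this step.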

Part 4: Calculate Statistics by Function and Neighborhood

Now we’ll aggregate the data to answer key questions:
- How many buildings of each function type are in each neighborhood?
- What is the total area of each function type per neighborhood?

Step 4.1: Statistics by Categories - Building Count

Go to: Processing Toolbox → Vector Analysis → Statistics by Categories
Configure for building counts:
- Input vector layer: buildings_with_neighborhoods
- Field to calculate statistics on: fid (or any unique ID field)
- Field(s) with categories:
  - Click ... and select both:
    1. buurt_naam (neighborhood name)
    2. function_type (our created classification)
- Statistics to calculate:
  - ☑ Count (this counts buildings)
- Output: building_count_by_function.csv
  
Note: The tool does not let you choose which statistics to calculate. If you leave the field to calculate statistics on empty, it returns only the count.

Figure 7 - Statistics by Categories for counts

Click Run
Review the output:
- Open the CSV in QGIS or a text editor
- You should see columns: buurt_naam, function_type, count
- Each row represents: X buildings of type Y in neighborhood Z
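The count per category pair is conceptually a group-and-count operation. A minimal sketch with hypothetical rows (the neighborhood names and values are invented for illustration):

```python
from collections import Counter

# Hypothetical rows: (buurt_naam, function_type) for each building
rows = [
    ("Centrum", "woonfunctie"),
    ("Centrum", "woonfunctie"),
    ("Centrum", "Multi-functional"),
    ("Noord", "woonfunctie"),
    ("Noord", "kantoorfunctie"),
]

# Each distinct (neighborhood, function type) pair becomes one output row
counts = Counter(rows)
for (buurt, function_type), count in sorted(counts.items()):
    print(buurt, function_type, count)
```

Comparing a few of these counts by hand against the attribute table is a quick way to confirm that the CSV output matches your expectations.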

Step 4.2: Statistics by Categories - Area Sum

Run Statistics by Categories again with different settings:
Configure for area totals:
- Input vector layer: buildings_with_neighborhoods
- Field to calculate statistics on: area_m2
- Field(s) with categories:
  - buurt_naam
  - function_type
- Statistics to calculate:
  - ☑ Sum (total area)
  - ☑ Mean (average building size - optional but useful)
  - ☑ Count (number of buildings - to verify)
- Output: building_area_by_function.csv

Figure 8 - Statistics by Categories for areas

Click Run

Step 4.3: Create a Comprehensive Summary Table (Alternative Method)

For more control, use the Aggregate tool:

Go to: Processing Toolbox → Vector Geometry → Aggregate
Configure:
- Input layer: buildings_with_neighborhoods
- Group by expression: Click ε button and build expression (use your own neighborhood name field, e.g., buurt_naam):

"buurt_naam" || ' - ' || "function_type"

This creates unique groups for each neighborhood-function combination.

Aggregates: Click ... to configure multiple statistics:

Add these aggregations:

Expression	Aggregate	Name	Type	Length	Precision
`"fid"`	`count`	`building_count`	Integer (64 bit)	10	0
`"area_m2"`	`sum`	`total_area_m2`	Decimal (Double)	10	2
`"area_m2"`	`mean`	`avg_area_m2`	Decimal (Double)	10	2
`"area_m2"`	`min`	`min_area_m2`	Decimal (Double)	10	2
`"area_m2"`	`max`	`max_area_m2`	Decimal (Double)	10	2

Also add these to preserve grouping info:

Expression	Aggregate	Name	Type	Length	Precision
`"buurt_naam"`	`first_value`	`neighborhood`	Text (string)	50	0
`"function_type"`	`first_value`	`function`	Text (string)	50	0

Figure 9 - Aggregate tool configuration

Aggregated layer: Save as building_stats_summary.gpkg
Click Run
Review the results:
- Open the attribute table
- You now have comprehensive statistics for each neighborhood-function combination
- Each row = one function type in one neighborhood with count, total area, and size statistics
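The grouping and the five statistics can be sketched in plain Python as follows. The records are hypothetical, and the combined key imitates the group-by expression that joins the neighborhood name and function type with ' - '; the output names follow the aggregate table in this step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (neighborhood name, function_type, area_m2)
records = [
    ("Centrum", "woonfunctie", 100.0),
    ("Centrum", "woonfunctie", 200.0),
    ("Centrum", "Multi-functional", 450.0),
    ("Noord", "woonfunctie", 80.0),
]

# Collect the areas of each neighborhood-function combination
groups = defaultdict(list)
for buurt, function_type, area in records:
    key = f"{buurt} - {function_type}"
    groups[key].append(area)

# One summary row per group
summary = {
    key: {
        "building_count": len(areas),
        "total_area_m2": sum(areas),
        "avg_area_m2": mean(areas),
        "min_area_m2": min(areas),
        "max_area_m2": max(areas),
    }
    for key, areas in groups.items()
}

for key, stats in summary.items():
    print(key, stats)
```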

Part 5: Export and Visualize Data

Step 5.1: Prepare Data for Visualization

Open the attribute table of your statistics layer (building_stats_summary)
Export to CSV:
- Right-click the layer → Export → Save Features As...
- Format: Comma Separated Value [CSV]
- File name: rotterdam_building_stats.csv
- Geometry: Select No geometry (we only need the statistics)
- Select fields to export:
  - ☑ neighborhood
  - ☑ function
  - ☑ building_count
  - ☑ total_area_m2
  - ☑ avg_area_m2
  - Uncheck fid and other unnecessary fields

Figure 10 - Export CSV configuration

Click OK
Verify your export:
- Open the CSV in a text editor or spreadsheet
- Check headers are clear
- Verify data looks correct
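If you prefer to check the export programmatically, the sketch below writes a few hypothetical summary rows with only the selected fields and reads them back. An in-memory buffer stands in for rotterdam_building_stats.csv, and the row values are invented.

```python
import csv
import io

# Fields selected for export, in the order they should appear
fields = ["neighborhood", "function", "building_count",
          "total_area_m2", "avg_area_m2"]

# Hypothetical summary rows, including an fid field we do not want to export
rows = [
    {"fid": 1, "neighborhood": "Centrum", "function": "woonfunctie",
     "building_count": 2, "total_area_m2": 300.0, "avg_area_m2": 150.0},
    {"fid": 2, "neighborhood": "Noord", "function": "woonfunctie",
     "building_count": 1, "total_area_m2": 80.0, "avg_area_m2": 80.0},
]

buffer = io.StringIO(newline="")  # in-memory stand-in for the CSV file
writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)

# Read the data back to verify the header and the number of rows
buffer.seek(0)
reader = csv.DictReader(buffer)
exported = list(reader)
print(reader.fieldnames)
print(len(exported), "rows")
```

Keep in mind that values read back from a CSV are strings, so numeric columns need converting before further calculations.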

Step 5.2: Create Visualizations in RAWGraphs

Visualization 1: Bar Chart - Building Count by Function

Open RAWGraphs: https://app.rawgraphs.io/
Load your data: Upload rotterdam_building_stats.csv
Choose chart: Select Bar chart
Map dimensions:
- Bars: function
- Size: building_count
- Color: neighborhood (optional - will show stacked bars)
Customize:
- Sort by: Value (descending) to show most common functions first
- Orientation: Horizontal (easier to read function names)
- Color scheme: Choose a categorical palette
Export: Download as SVG

Figure 11 - Bar chart of building counts by function
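The bar lengths in this chart correspond to building counts summed per function across all neighborhoods. A minimal sketch of that calculation with hypothetical rows, useful for checking that the chart shows what you expect:

```python
from collections import defaultdict

# Hypothetical rows from the exported CSV:
# (neighborhood, function, building_count)
rows = [
    ("Centrum", "woonfunctie", 120),
    ("Centrum", "Multi-functional", 45),
    ("Noord", "woonfunctie", 200),
    ("Noord", "kantoorfunctie", 30),
]

# Sum the counts per function, ignoring the neighborhood
totals = defaultdict(int)
for _neighborhood, function, count in rows:
    totals[function] += count

# Largest value first, like a descending sort by value
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for function, total in ranked:
    print(function, total)
```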

Visualization 2: Treemap - Area Distribution

Start new chart in RAWGraphs
Choose chart: Select Treemap
Map dimensions:
- Hierarchy:
  1. First level: function
  2. Second level: neighborhood
- Size: total_area_m2
- Color: function
- Label: neighborhood
Customize:
- Adjust padding for readability
- Choose color scheme (qualitative)
- Enable/disable labels based on size
Export: Download as SVG

Figure 12 - Treemap showing area distribution

Visualization 3: Grouped Bar Chart - Compare Neighborhoods

Start new chart
Choose chart: Select Bar chart
Map dimensions:
- X axis: neighborhood
- Y axis: building_count
- Color: function
- Series: Leave empty for grouped bars
Customize:
- Sort by: Total value to rank neighborhoods
- Enable legend
- Adjust colors for function types
Export: Download as SVG

Figure 13 - Grouped bar chart comparing neighborhoods
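A grouped comparison needs the long-format rows arranged per neighborhood. The sketch below pivots hypothetical rows into one dictionary per neighborhood and ranks neighborhoods by total building count; the data and variable names are assumptions for illustration.

```python
# Hypothetical rows from the exported CSV:
# (neighborhood, function, building_count)
rows = [
    ("Centrum", "woonfunctie", 120),
    ("Centrum", "Multi-functional", 45),
    ("Noord", "woonfunctie", 200),
    ("Noord", "kantoorfunctie", 30),
]

# Pivot: neighborhood -> {function: building_count}
pivot = {}
for neighborhood, function, count in rows:
    pivot.setdefault(neighborhood, {})[function] = count

# Rank neighborhoods by their total building count, largest first
ranking = sorted(pivot, key=lambda name: sum(pivot[name].values()), reverse=True)
for name in ranking:
    print(name, pivot[name])
```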

Summary Table: Statistics You’ve Calculated

Statistic	Method	Answers the Question
Building Count by Function	Statistics by Categories	How many buildings of each type in each neighborhood?
Total Area by Function	Statistics by Categories	What’s the total footprint area of each function type in each neighborhood?
Average Building Size	Aggregate tool	What’s the typical building size per function?
Neighborhood Rankings	Sort aggregated data	Which neighborhoods have most buildings/area?

Key QGIS Concepts Used

1. Field Calculator Expressions

Conditional Classification:

CASE 
    WHEN condition THEN result
    ELSE alternative
END

String Functions:
- length() - Count characters
- replace() - Replace text
- trim() - Remove whitespace

2. Spatial Predicates

Predicate	Description	Use Case
within	Feature completely inside	Buildings entirely within neighborhoods
intersects	Features touch or overlap	Include buildings on borders
contains	Opposite of within	Which neighborhood contains building

3. Aggregation Functions

Function	Purpose
`count()`	Count features
`sum()`	Add up values
`mean()`	Calculate average
`min()`/`max()`	Find extremes

Common Issues and Solutions

Issue 1: Buildings missing neighborhood names after join

Cause: Buildings are outside all neighborhood boundaries

Solution:
- Check your spatial selection first
- Verify CRS matches for both layers
- Use “intersects” instead of “within” if buildings are on boundaries
- Inspect the map visually to confirm coverage

Issue 2: Duplicate buildings in results

Cause: Join type is set to “one-to-many” and buildings overlap multiple neighborhoods

Solution:
- Use the “within” predicate (stricter)
- Check for buildings that actually sit on boundaries and decide which neighborhood they should belong to
- Use the “Take attributes of the first matching feature only (one-to-one)” join type

Issue 3: Area values are enormous or tiny

Cause: CRS is in degrees instead of meters, or area unit confusion

Solution:
- Verify your CRS is EPSG:28992 (projected coordinate system)
- Check Project → Properties → CRS
- Reproject layers if needed: Vector → Data Management Tools → Reproject Layer
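One rough way to spot the degrees-versus-meters problem is to look at the coordinate values themselves. The heuristic below is only a sketch: `looks_like_degrees` is a hypothetical helper, the sample coordinates are invented, and it assumes that projected coordinates in your study area fall well outside the longitude/latitude range.

```python
def looks_like_degrees(coords):
    """Heuristic: True if every coordinate fits the longitude/latitude range."""
    return all(-180 <= x <= 180 and -90 <= y <= 90 for x, y in coords)


# Hypothetical vertex coordinates for the same area in two systems
projected_coords = [(92000.0, 437000.0), (92010.0, 437020.0)]  # meters
geographic_coords = [(4.47, 51.92), (4.48, 51.93)]             # degrees

print(looks_like_degrees(projected_coords))
print(looks_like_degrees(geographic_coords))
```

If the check returns True for your building layer, the layer is most likely stored in a geographic CRS and should be reprojected before calculating areas.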

Issue 4: Function names are inconsistent

Cause: Raw data has variations in naming or spelling

Solution: Use Field Calculator to standardize the names:

CASE 
    WHEN "function" ILIKE '%woon%' THEN 'Residential'
    WHEN "function" ILIKE '%winkel%' THEN 'Retail'
    -- etc.
    ELSE "function"  -- keep unmatched values instead of returning NULL
END
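The pattern matching in this expression can be tried out in plain Python before editing the layer. In this sketch `RULES` and `standardize` are hypothetical names; the substring test on a lowercased value imitates the case-insensitive ILIKE '%...%' patterns, and passing unmatched values through unchanged is a choice made for the sketch.

```python
# Ordered (substring, label) pairs; the first match wins, as in CASE ... WHEN
RULES = [
    ("woon", "Residential"),
    ("winkel", "Retail"),
]


def standardize(value):
    """Map a raw function name to a standard label."""
    if value is None:
        return None
    lowered = value.lower()  # case-insensitive comparison
    for needle, label in RULES:
        if needle in lowered:
            return label
    return value  # unmatched values are passed through unchanged


for raw in ["Woonfunctie", "WINKELFUNCTIE", "kantoorfunctie"]:
    print(raw, "->", standardize(raw))
```

Because the first matching rule wins, a value such as "woonfunctie, winkelfunctie" is labeled Residential, so the order of the rules matters when values contain several functions.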

Extension Activities

Temporal Analysis: If you have multi-year building data, compare how building functions change over time
Density Calculations: Calculate building density (buildings per hectare) and function diversity per neighborhood
Advanced Classification: Create more detailed function categories:
- Residential (single-family vs. multi-family)
- Commercial (retail vs. office)
- Create a function_category hierarchy
Spatial Statistics: Calculate nearest neighbor distances for each function type
Dashboard Creation: Combine multiple visualizations with descriptive text in a report
Interactive Mapping: Use QGIS2Web or qgis2threejs to create interactive maps showing statistics

Checklist

Before finishing, verify you have:

☑ Created function_type field classifying single vs. multi-functional buildings
☑ Performed spatial selection of buildings within neighborhoods
☑ Executed spatial join to append neighborhood names
☑ Calculated building areas in square meters
☑ Generated statistics: building count per function per neighborhood
☑ Generated statistics: total area per function per neighborhood
☑ Exported clean CSV with relevant fields only
☑ Created at least 2 different visualizations in RAWGraphs
☑ Saved all intermediate and final outputs in your project folder

Additional Resources

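QGIS Documentation:
- Vector analysis tools: https://docs.qgis.org/latest/en/docs/user_manual/processing_algs/qgis/vectoranalysis.html
- Field calculator functions: https://docs.qgis.org/latest/en/docs/user_manual/working_with_vector/expression.html
- Spatial queries: https://docs.qgis.org/latest/en/docs/user_manual/processing_algs/qgis/vectorselection.html

RAWGraphs:
- Learning resources: https://www.rawgraphs.io/learning
- Chart gallery: https://www.rawgraphs.io/gallery

Data Visualization:
- ColorBrewer (https://colorbrewer2.org/) - Choose appropriate color schemes
- Data Viz Project (https://datavizproject.com/) - Encyclopedia of visualization types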
