Aggregating Land Use Classes to a Regional Grid in Europe
Classification: Intermediate GIS Analysis | Spatial Data Processing | Environmental Assessment | Land Use Analysis
Learning Level: Intermediate
Time Estimate: 3–4 hours
Software Required: QGIS 3.40+ with Dissolve with stats plugin
Author: Updated from original by Claudiu Forgaci (2024)
Overview
This tutorial guides you through aggregating detailed land use classification data to a regular grid at the regional scale across Europe. You will work with Copernicus CLC (Corine Land Cover) data and a 2×2 km reference grid of the Netherlands to identify the dominant land use class in each grid cell, producing a simplified, grid-based representation of the landscape.
Learning Outcomes:
By completing this tutorial, you will be able to:
✅ Import and prepare large vector datasets for regional analysis
✅ Extract Level-1 land use classes from hierarchical CLC classifications
✅ Dissolve polygons by class to reduce geometric complexity
✅ Fix invalid geometries in spatial data
✅ Perform spatial intersection operations between raster-derived and grid geometries
✅ Calculate polygon areas and identify dominant classes within grid cells
✅ Use advanced dissolve and join operations to aggregate statistics by spatial unit
✅ Produce a final grid layer with dominant land use classifications
System Requirements & Setup
Software Installation
QGIS:
- Download version 3.40 or later from https://qgis.org/download/
- Installation instructions vary by operating system
Required QGIS Plugins:
- Dissolve with stats – For dissolving geometries while computing summary statistics
- Go to Plugins → Manage and Install Plugins
- Search for “Dissolve with stats”
- Click Install and restart QGIS
Data Requirements
You will need the following datasets:
- Copernicus CLC2018 – Corine Land Cover classification layer (vector polygons)
- Available from: https://land.copernicus.eu/en/products/corine-land-cover
- Should contain a Code_18 field with Level-3 classification codes
- Netherlands 2×2 km Reference Grid – Regular grid covering the Netherlands
- You can import it from the data folder. Alternatively, to create it, you can follow the tutorial on how to create a fishnet
- Should contain a CellCode field for unique cell identification
Coordinate Reference System:
- Both datasets must be reprojected to EPSG:28992 (Amersfoort / RD New) before analysis
- This ensures accurate area calculations and spatial operations
Spatial Extent
The example in this tutorial covers:
Bounding Box (GDAL format, in EPSG:3035 coordinates):
3,500,000.0 (West), 2,680,000.0 (South)
4,490,000.0 (East), 3,670,000.0 (North)
Workflow Overview
The analysis follows a sequential workflow moving from detailed to aggregated data:
| Phase | Tasks | Output |
|---|---|---|
| Data Preparation | Import and examine CLC data; identify Level-1 codes | Verified layers in EPSG:28992 |
| Classification Simplification | Extract Level-1 classes; identify invalid geometries | 5 major land use classes |
| Geometry Repair | Fix invalid polygons; dissolve by class | Valid, simplified CLC layer |
| Spatial Intersection | Overlay simplified CLC with grid; calculate areas | Intersection layer with areas |
| Statistical Aggregation | Identify dominant class per grid cell; join attributes | Final grid with land use classification |
| Visualization | Style and export results for presentation | Cartographic output |
Background: Understanding CLC Classification Levels
The Copernicus CLC dataset uses a hierarchical classification with three levels:
- Level 1 (5 classes): Broad categories (Artificial surfaces, Agricultural areas, Forest and semi-natural areas, Wetlands, Water bodies)
- Level 2 (15 classes): Intermediate detail (e.g., Urban fabric, Industrial areas, Pastures)
- Level 3 (44 classes): Detailed categories (e.g., Continuous urban fabric, Discontinuous urban fabric)
This tutorial focuses on Level 1 for regional-scale analysis. The first digit of any Level-3 code represents the Level-1 class.
Example: Code 112 (Discontinuous urban fabric) → Level-1 class = 1 (Artificial surfaces)
Step 1: Import and Prepare Data
1.1 Import Layers
Launch QGIS 3.40+
Import the CLC2018 layer:
- Go to Layer → Add Layer → Add Vector Layer
- Navigate to your CLC2018 shapefile or GeoPackage
- Click Open
Import the Dutch 2×2 km grid:
- Go to Layer → Add Layer → Add Vector Layer
- Navigate to the grid layer
- Click Open
Both layers should now appear in the Layers Panel
1.2 Verify Coordinate Reference System
Both layers must be in EPSG:28992 for accurate results.
Check CRS for each layer:
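- Right-click the CLC2018 layer and select Properties → Information; the layer CRS is listed there
- If the CRS is missing or declared incorrectly, right-click the layer and select Layer CRS → Set Layer CRS. This only changes the CRS assigned to the layer and does not transform its coordinates, so it cannot replace reprojecting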
- Repeat for the grid layer
Reproject if necessary:
If either layer is not in EPSG:28992:
- Go to Processing → Toolbox and search for Reproject layer
- Double-click to open the tool
- Input layer: Select the CLC2018 layer
- Target CRS: Select EPSG:28992
- Click Run to reproject
- Repeat for the grid layer if needed
1.3 Examine the CLC Attribute Table
- Right-click the CLC2018 layer and select Open Attribute Table
- Locate the Code_18 column
- This contains Level-3 codes (3 digits, ranging from 111 to 523)
- Note other attribute fields for reference (e.g., OBJECTID, Area_HA)
- Close the attribute table (you’ll modify it in the next step)
Step 2: Extract Level-1 Classification Codes
Level-1 codes are represented by the first digit of the Level-3 code. We’ll create a new field to store this simplified classification.
2.1 Open Field Calculator
- Right-click the CLC2018 layer and select Open Attribute Table
- In the Attribute Table toolbar, click the Pencil icon to enter Edit Mode
- The pencil icon turns yellow, indicating edit mode is active
- In the toolbar, click Field Calculator (calculator icon) or press Ctrl+I
2.2 Create a New Field for Level-1 Code
In the Field Calculator dialog:
Check the Create a new field option
Output field name: Enter Level_1
Output field type: Select Whole number (integer)
Expression: Enter the following formula:
to_int(LEFT(Code_18, 1))
This extracts the first character of the Code_18 field and converts it to an integer.
Explanation:
- LEFT(Code_18, 1) – Extracts the leftmost character from the Code_18 field
- to_int() – Converts the text character to an integer (1, 2, 3, 4, or 5)
- Click OK to apply the formula
2.3 Verify the New Field
- In the Attribute Table, scroll right to find the new Level_1 column
- Review several rows to confirm:
- Code_18 = 112 → Level_1 = 1 ✓
- Code_18 = 311 → Level_1 = 3 ✓
- Code_18 = 512 → Level_1 = 5 ✓
- Click the Pencil icon to exit edit mode
- When prompted, click Save Changes to confirm
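The logic of the Field Calculator expression can also be checked outside QGIS. The following is a minimal Python sketch that mirrors to_int(LEFT(Code_18, 1)); it is not part of the QGIS workflow. The function name level1_from_code and the LEVEL1_NAMES lookup are illustrative names introduced here, and the class names follow the CLC nomenclature.

```python
# Illustrative lookup of CLC Level-1 class names (an assumption for this sketch,
# not a QGIS object or a field in the dataset).
LEVEL1_NAMES = {
    1: "Artificial surfaces",
    2: "Agricultural areas",
    3: "Forest and semi-natural areas",
    4: "Wetlands",
    5: "Water bodies",
}


def level1_from_code(code_18):
    """Mirror to_int(LEFT(Code_18, 1)): first character of the code as an integer."""
    return int(str(code_18).strip()[0])


# The same spot checks as in the verification list above.
for code in ("112", "311", "512"):
    level_1 = level1_from_code(code)
    print(code, "->", level_1, LEVEL1_NAMES[level_1])
```

Converting the input with str() first means the sketch behaves the same whether Code_18 is stored as text or as a number.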
Step 3: Simplify CLC Classification by Dissolving
Dissolving merges all polygons that share the same classification. Grouping by Level-1 class reduces the layer from thousands of polygons across 44 Level-3 classes to 5 multi-part features, one per Level-1 class.
3.1 Check for Invalid Geometries
Before dissolving, we must verify and fix any invalid geometries, which are common in large datasets like CLC.
Run Validity Check:
- Go to Processing → Toolbox
- Search for Check validity
- Double-click to open the tool
Configure the tool:
Input layer: Select CLC2018
Method: Select GEOS
Output valid geometries: Create temporary layer (for inspection only)
Output invalid geometries: Create temporary layer
Output errors: Create temporary layer
Click Run
Interpret Results:
- The tool generates three output layers
- Invalid output and Error output layers show which features have problems
- Review these to understand the nature of invalid geometries (overlaps, self-intersections, etc.)
- Note: These outputs are for inspection only and are not used in later steps. If the check reports invalid features, continue with 3.2; otherwise go to 3.3 and use the original CLC2018 layer as input
3.2 Fix Invalid Geometries
- Go to Processing → Toolbox
- Search for Fix geometries
- Double-click to open the tool
Configure the tool:
- Input layer: Select CLC2018 (the original layer with invalid geometries)
- Output: Click the dropdown and select Save to File (or create a temporary layer)
- If saving to file:
- Click the folder icon to choose a location
- Enter filename: CLC2018_fixed.shp
- Click Run
Result:
- A new layer CLC2018_fixed (or temporary layer) with corrected geometries will be created
- Invalid geometries are repaired using geometric algorithms
- Some topology may be altered, but the spatial extent and coverage remain valid
3.3 Dissolve by Level-1 Class
Now dissolve the fixed CLC layer by the Level_1 field to merge all polygons of the same class.
Open Dissolve Tool:
- Go to Vector → Geoprocessing Tools → Dissolve
- Alternatively, search for Dissolve in the Processing Toolbox
Configure Dissolve:
Input layer: Select CLC2018_fixed (the layer with corrected geometries)
Dissolve field: Click the dropdown and select Level_1
- This will merge all polygons sharing the same Level_1 code
Output: Click the dropdown and select Save to File (recommended for large datasets)
- Choose a location and enter: CLC2018_Level1_dissolved.shp
(Optional) Summary statistics:
- The native Dissolve tool does not aggregate attribute values; each output feature keeps the attributes of one of the merged input features
- No statistics are needed at this stage, because areas are calculated after the intersection in Step 5
Click Run
Result:
- A new layer with exactly 5 features (one per Level-1 class) will be created
- Each feature represents all land of that class type, merged into single multi-part polygons
- The attribute table shows the Level_1 code; the remaining fields are copied from one input feature per class and no longer describe the merged geometry
3.4 Verify the Dissolved Layer
- Open the Attribute Table of the dissolved layer
- You should see exactly 5 rows (one for each Level-1 class):
- 1 = Artificial surfaces
- 2 = Agricultural areas
- 3 = Forest and semi-natural areas
- 4 = Wetlands
- 5 = Water
- Close the attribute table
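Conceptually, dissolving by Level_1 groups every polygon by its class value and merges each group into one feature. The sketch below shows only that grouping step on a handful of attribute records; the geometry merge itself is done by QGIS. All record values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical attribute records: (Code_18, Level_1, Area_HA). Values are invented.
records = [
    ("111", 1, 40.0),
    ("112", 1, 120.5),
    ("211", 2, 800.0),
    ("231", 2, 650.0),
    ("311", 3, 90.0),
    ("411", 4, 30.0),
    ("512", 5, 75.0),
]

# Group the input polygons by Level_1, as the dissolve field does.
groups = defaultdict(list)
for code_18, level_1, area_ha in records:
    groups[level_1].append((code_18, area_ha))

# One output feature per class, summarised here by polygon count and total area.
dissolved = {
    level_1: {"n_polygons": len(items), "area_ha": sum(a for _, a in items)}
    for level_1, items in sorted(groups.items())
}

for level_1, summary in dissolved.items():
    print(level_1, summary)
```

Seven input records collapse to five groups, which is the same reduction you should see in the attribute table of the dissolved layer.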
Step 4: Intersect Dissolved CLC with Grid
Intersection overlays the simplified CLC classes with the regular grid, creating a new layer where each polygon represents the portion of a grid cell covered by one land use class.
4.1 Run Intersection
- Go to Vector → Geoprocessing Tools → Intersection
- Alternatively, search for Intersection in the Processing Toolbox
Configure Intersection:
Input layer: Select the dissolved CLC layer (CLC2018_Level1_dissolved)
Overlay layer: Select the Dutch 2×2 km grid layer
Output: Click the dropdown and select Save to File
- Choose location and enter: CLC_Grid_Intersection.shp
(Optional) Input layer fields to keep: Leave blank to keep all fields from both layers
Click Run
Processing:
- This operation may take several minutes depending on dataset size and complexity
- A progress bar will show processing status
- Once complete, a notification appears: “Finished”
Result:
- A new layer CLC_Grid_Intersection is created
- Each polygon represents the spatial overlap of a grid cell and a CLC Level-1 class
- The attribute table contains fields from both input layers:
- Grid fields: CellCode, grid geometry fields
- CLC fields: Level_1 (the land use class)
4.2 Verify Intersection Results
- Open the Attribute Table of the intersection layer
- Scroll right and confirm you can see:
- CellCode field (from grid)
- Level_1 field (from CLC)
- Note the total number of features: This may be several thousand, as each grid cell can intersect multiple land use classes
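To make the intersection concrete, the sketch below clips a single polygon to one grid cell with the Sutherland-Hodgman algorithm. This is a stand-in chosen for illustration, not the algorithm QGIS uses internally. It assumes a simple polygon given as a list of (x, y) vertices without a repeated closing vertex; the function name and coordinates are hypothetical.

```python
def clip_to_cell(polygon, xmin, ymin, xmax, ymax):
    """Clip a polygon to an axis-aligned grid cell (Sutherland-Hodgman)."""

    def clip(points, inside, intersect):
        # Walk the polygon edges and keep the part on the inner side of one cell edge.
        result = []
        for i, current in enumerate(points):
            previous = points[i - 1]
            if inside(current):
                if not inside(previous):
                    result.append(intersect(previous, current))
                result.append(current)
            elif inside(previous):
                result.append(intersect(previous, current))
        return result

    def cross_x(x):
        # Intersection of segment p-q with the vertical line at x.
        def f(p, q):
            t = (x - p[0]) / (q[0] - p[0])
            return (x, p[1] + t * (q[1] - p[1]))
        return f

    def cross_y(y):
        # Intersection of segment p-q with the horizontal line at y.
        def f(p, q):
            t = (y - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y)
        return f

    cell_edges = [
        (lambda p: p[0] >= xmin, cross_x(xmin)),
        (lambda p: p[0] <= xmax, cross_x(xmax)),
        (lambda p: p[1] >= ymin, cross_y(ymin)),
        (lambda p: p[1] <= ymax, cross_y(ymax)),
    ]
    points = list(polygon)
    for inside, intersect in cell_edges:
        if not points:
            break  # polygon lies entirely outside the cell
        points = clip(points, inside, intersect)
    return points


# Hypothetical land use polygon overlapping the upper-right corner of a 2×2 km cell.
land_use = [(1000, 1000), (3000, 1000), (3000, 3000), (1000, 3000)]
clipped = clip_to_cell(land_use, 0, 0, 2000, 2000)
print(clipped)
```

The clipped result is the 1×1 km square shared by the polygon and the cell, which corresponds to one record in the intersection layer.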
Step 5: Calculate Polygon Areas
To identify the dominant land use class in each grid cell, we must calculate the area of each intersection polygon. The dominant class is the one with the largest area within the cell.
5.1 Add an Area Field
- Right-click the intersection layer and select Open Attribute Table
- Click the Pencil icon to enter Edit Mode
- Click the Field Calculator (calculator icon)
In Field Calculator:
Check Create a new field
Output field name: Enter Area_m2
Output field type: Select Decimal number (real)
Expression: Enter:
$area
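This built-in QGIS function returns the area of each polygon using the project’s ellipsoid and area unit settings (Project → Properties → General). With square meters as the project area unit, the values are in m². For planimetric area in the units of the layer CRS (meters in EPSG:28992), use area($geometry) instead.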
- Click OK
5.2 Verify Area Calculations
- Scroll right in the Attribute Table to see the new Area_m2 column
- Review several values:
- All should be positive numbers
- No value should exceed roughly 4,000,000 m² (the area of one full 2×2 km cell)
- No null or zero values should appear (if they do, there may be invalid geometries)
- Exit Edit Mode:
- Click the Pencil icon
- Click Save Changes when prompted
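For a projected CRS in meters, the planimetric area of a polygon follows directly from its vertex coordinates through the shoelace formula. This is the same planimetric quantity that area($geometry) returns. A minimal Python sketch with hypothetical coordinates:

```python
def polygon_area(vertices):
    """Planimetric area from (x, y) vertices using the shoelace formula."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical pieces, coordinates in meters.
full_cell = [(0, 0), (2000, 0), (2000, 2000), (0, 2000)]
half_cell = [(0, 0), (2000, 0), (2000, 1000), (0, 1000)]

print(polygon_area(full_cell))  # 4000000.0, one full 2×2 km cell in m²
print(polygon_area(half_cell))  # 2000000.0
```

A value above about 4,000,000 in the Area_m2 column therefore points to a problem with the grid, the CRS, or the area units.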
Step 6: Generate Centroids from Polygons
Centroids (center points) of intersection polygons will be used to join attributes back to the original grid.
6.1 Create Centroids
- Go to Vector → Geometry Tools → Centroids
- Alternatively, search for Centroids in the Processing Toolbox
Configure Centroids:
Input layer: Select the intersection layer (CLC_Grid_Intersection)
Output: Click dropdown and select Save to File
- Choose location and enter: CLC_Grid_Intersection_Centroids.shp
Click Run
Result:
- A new point layer is created with one point at the center of each intersection polygon
- The attribute table includes all fields from the input (intersection) layer: CellCode, Level_1, Area_m2, etc.
6.2 Verify Centroid Layer
- In the main canvas, the centroids should appear as points distributed across your study area
- Open the Attribute Table to confirm fields are preserved
- The number of points should equal the number of features in the intersection layer
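The centroid of a simple polygon can be computed from its vertices with the standard area-weighted formula. The sketch below is a plain-Python illustration under that assumption (single-part polygon, no holes); QGIS handles multi-part geometries itself. The function name and coordinates are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * signed area) = sum / (3 * twice the signed area).
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical intersection piece inside the cell spanning 0..2000 m in x and y.
piece = [(1000, 1000), (2000, 1000), (2000, 2000), (1000, 2000)]
centroid = polygon_centroid(piece)
print(centroid)  # (1500.0, 1500.0)
```

Because every intersection piece lies inside a single square cell, its centroid also lies inside that cell, which is what makes the join in the next step work.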
Step 7: Join Centroids to Original Grid
This step transfers attributes from the centroids back to the original grid layer using a spatial join.
7.1 Run Spatial Join
- Go to Processing → Toolbox
- Search for Join Attributes by Location
- Double-click to open
Configure Join:
- Input layer: Select the original Dutch 2×2 km grid layer
- This is the target layer that will receive new attributes
- Join layer: Select the centroids layer (CLC_Grid_Intersection_Centroids)
- This provides the attributes to join
- Geometric predicate: Select Contains (or Intersects)
- This means: for each grid cell, find centroids that fall within it
- Join type: Select Create separate feature for each matching feature (one-to-many)
- This ensures one grid cell can be joined to multiple centroid features; the one-to-one options would keep only a single class per cell and discard the rest
- Output: Click dropdown and select Save to File
- Enter: Grid_with_Centroid_Attributes.shp
- Click Run
Result:
- A new layer is created containing the original grid with added fields from centroids
- Each grid cell may now have multiple records if it intersects multiple land use classes
- Attributes include: CellCode, Level_1, Area_m2, and other fields
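Because the grid is regular, the cell that contains a centroid can also be found arithmetically. The sketch below illustrates the idea behind the Contains test and the one-to-many result; it is not what Join Attributes by Location does internally. The grid origin, the (column, row) index used in place of CellCode, and the sample points are all assumptions.

```python
import math
from collections import defaultdict

CELL_SIZE = 2000.0  # 2×2 km grid, in meters


def cell_index(x, y, origin_x=0.0, origin_y=0.0, cell_size=CELL_SIZE):
    """Return the (column, row) of the grid cell containing the point (x, y)."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return col, row


# Hypothetical centroids: (x, y, Level_1, Area_m2)
centroids = [
    (500.0, 700.0, 2, 2_600_000.0),
    (1500.0, 1500.0, 1, 1_400_000.0),
    (2500.0, 900.0, 5, 4_000_000.0),
]

# One-to-many join: every cell collects all centroids that fall inside it.
joined = defaultdict(list)
for x, y, level_1, area_m2 in centroids:
    joined[cell_index(x, y)].append((level_1, area_m2))

for cell, rows in sorted(joined.items()):
    print(cell, rows)
```

The first cell receives two records and the second cell one, which matches the multiple records per grid cell described in the result above.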
Step 8: Install Dissolve with stats Plugin
This plugin is essential for the next step, as it dissolves features while computing summary statistics (in this case, identifying the maximum area).
8.1 Install Plugin
Go to Plugins → Manage and Install Plugins
In the search bar, type: Dissolve with stats
Click on the plugin in the results list
Click Install plugin
Once installed, a notification appears: “Plugin installed successfully”
Restart QGIS to activate the plugin
8.2 Verify Installation
After restart:
- Go to Processing → Toolbox
- Search for Dissolve with stats
- You should see the plugin’s tool listed
- If not visible, enable it: Plugins → Manage and Install Plugins → check the plugin is enabled
Step 9: Identify Dominant Land Use Class per Grid Cell
Using the Dissolve with stats plugin, we dissolve the joined layer by grid cell and identify which land use class has the maximum area within each cell.
9.1 Open Dissolve with stats
- Go to Processing → Toolbox
- Search for Dissolve with stats
- Double-click to open the tool dialog
Configure Dissolve with stats:
Input layer: Select the joined layer (Grid_with_Centroid_Attributes)
Dissolve field: Click dropdown and select CellCode
- This groups all records by grid cell
Statistics to calculate:
- Locate the Area_m2 field
- For this field, select Max from the function dropdown
- This identifies the largest area (dominant class) per grid cell
Output: Click dropdown and select Save to File
- Enter: Grid_Dominant_Class_Stats.shp
Click Run
Result:
- A new layer is created with one record per grid cell
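- A new field (e.g., Area_m2_max) shows the maximum area value for each cell (shapefile field names are limited to 10 characters, so the name may be truncated)
- This value is the area of the dominant land use class; the class code itself is recovered in Steps 10 and 11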
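The statistic computed in this step is a grouped maximum: one maximum Area_m2 value per CellCode. A minimal Python sketch of that aggregation, using invented cell codes and areas rather than real data:

```python
# Hypothetical joined records: (CellCode, Level_1, Area_m2). Values are invented.
records = [
    ("A1", 1, 1_400_000.0),
    ("A1", 2, 2_600_000.0),
    ("A2", 5, 4_000_000.0),
    ("A3", 2, 1_900_000.0),
    ("A3", 3, 2_100_000.0),
]

# Group by CellCode and keep the largest area seen for each cell.
area_max = {}
for cell_code, _level_1, area_m2 in records:
    area_max[cell_code] = max(area_m2, area_max.get(cell_code, 0.0))

for cell_code, value in area_max.items():
    print(cell_code, value)
```

Note that the result holds only the maximum area per cell, not the class that produced it.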
Step 10: Join Dominant Class Information Back to Grid
We must now join the maximum area information back to the full dataset to filter for only the dominant classes.
10.1 Run Field Value Join
- Go to Processing → Toolbox
- Search for Join Attributes by Field Value
- Double-click to open
Configure Join:
Input layer: Select the layer from Step 7 (Grid_with_Centroid_Attributes)
- This is the layer with all grid cell-land use class combinations
Table field: Select CellCode (in the input layer)
Input layer 2: Select the result from Step 9 (Grid_Dominant_Class_Stats)
- This contains the maximum area values
Table field 2: Select CellCode (matching field in input layer 2)
Join type: Select Take attributes of the first matching feature only (one-to-one)
- Input layer 2 holds exactly one record per grid cell, so every record in the input layer receives the maximum area of its own cell
Output: Click dropdown and select Save to File
- Enter: Grid_with_Max_Area_Info.shp
Click Run
Result:
- A new layer with joined maximum area information
- Each record now has access to the Area_m2_max value for its grid cell
- Records where Area_m2 equals Area_m2_max represent dominant land use classes
Step 11: Filter for Dominant Classes
Now filter the joined layer to keep only the records where the area of the intersection polygon equals the maximum area in that grid cell. These represent the dominant land use classes.
11.1 Apply Attribute Filter
- Right-click the joined layer (Grid_with_Max_Area_Info) and select Filter…
- Alternatively, go to Layer → Filter…
- In the Query Builder dialog:
- Build a filter expression: Area_m2 = Area_m2_max
- (Exact field names may vary; use the field names from your dataset)
- Click OK to apply the filter
Result:
- The layer now displays only features where area equals maximum area
- These represent the dominant land use class in each grid cell
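The join and filter can be sketched in plain Python to show how the class code is recovered from the maximum area. The records are invented, and the dictionary keys reuse the tutorial's field names only for readability.

```python
# Hypothetical records from the spatial join: (CellCode, Level_1, Area_m2).
records = [
    ("A1", 1, 1_400_000.0),
    ("A1", 2, 2_600_000.0),
    ("A2", 5, 4_000_000.0),
    ("A3", 2, 1_900_000.0),
    ("A3", 3, 2_100_000.0),
]

# Maximum area per cell (the output of the Dissolve with stats step).
area_max = {}
for cell_code, _level_1, area_m2 in records:
    area_max[cell_code] = max(area_m2, area_max.get(cell_code, 0.0))

# Field value join: attach the cell maximum to every record of that cell.
joined = [
    {"CellCode": c, "Level_1": lvl, "Area_m2": a, "Area_m2_max": area_max[c]}
    for c, lvl, a in records
]

# Filter: keep the records whose area equals the cell maximum.
dominant = {
    row["CellCode"]: row["Level_1"]
    for row in joined
    if row["Area_m2"] == row["Area_m2_max"]
}

print(dominant)  # {'A1': 2, 'A2': 5, 'A3': 3}
```

The exact comparison is safe in this sketch because the maximum is copied from the same values it is compared against; areas that have been written to a file and read back may differ slightly.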
11.2 Save Filtered Results
Right-click the filtered layer and select Export → Save Features As…
Save as:
- Format: Shapefile (.shp) or GeoPackage (.gpkg)
- Filename: Grid_Dominant_CLC_Final.shp
Leave Save only selected features unchecked; the export of a filtered layer already contains only the features that pass the filter
Click Save
Result:
- A new layer containing only the dominant land use class per grid cell
- This is your final output for mapping and analysis
Step 12: Visualize and Style Results
12.1 Style the Final Grid Layer
Select the final grid layer (Grid_Dominant_CLC_Final) in the Layers Panel
Open Layer Styling (View → Panels → Layer Styling or press F7)
In the Symbology tab, change to Categorized classification
Value: Select Level_1 (the land use class field)
Color ramp: Choose a categorical colour scheme (e.g., Set 1, Pastel1)
Click Classify to generate symbols for each class
Assign meaningful colours:
- Class 1 (Artificial surfaces): Red or grey
- Class 2 (Agricultural areas): Yellow or tan
- Class 3 (Forest and semi-natural areas): Green
- Class 4 (Wetlands): Blue/purple
- Class 5 (Water): Blue
Double-click colours to customize as desired
Click Apply to visualize
12.2 Add Map Elements
In a Print Layout, add:
- Title: “Dominant Land Use Classes – Regional Grid Analysis”
- Legend: Showing the 5 Level-1 classes with colours
- Scale bar: Appropriate for your study region
- North arrow: For geographic orientation
- Source attribution: Credit to Copernicus/EEA
12.3 Export Final Map
- Go to Layout → Export as PDF or Export as Image
- Set resolution to 300 DPI for print quality
- Choose output location and filename
- Click Save
Troubleshooting Guide
| Problem | Cause | Solution |
|---|---|---|
| Dissolve operation fails | Invalid geometries in CLC layer | Run “Check validity” and “Fix geometries” before dissolving. Use the fixed layer for subsequent operations. |
| Intersection produces very large output | High-resolution CLC polygons create many intersections | This is normal. Consider subsetting to a smaller study area if processing is too slow. |
| Area calculations show zero or null values | Geometries are invalid or in wrong CRS | Ensure CRS is EPSG:28992. Run “Fix geometries” if needed. Recalculate area field. |
| Dissolve with stats not found | Plugin not installed or activated | Install via Plugins → Manage and Install Plugins. Restart QGIS. Ensure plugin is enabled. |
| Join produces unexpected results | Mismatched field names or incorrect join type | Verify field names match exactly (case-sensitive). Check join type is “One-to-many”. Inspect sample records. |
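| Dominant class identification is incorrect | Floating-point comparison errors; Area_m2 ≠ Area_m2_max due to rounding | Use Area_m2 >= Area_m2_max * 0.99 in filter to account for rounding errors. Cells where two classes are within 1% of each other will then return two records, so check for duplicate CellCode values afterwards. |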
| Final grid has missing cells | Some grid cells don’t contain any CLC data | This is normal if grid extends beyond CLC coverage. Consider masking to valid areas only. |
| Processing is very slow | Large dataset; insufficient system memory | Reduce study area; work with smaller regions sequentially. Close other applications. Consider using “Create Temporary Layer” rather than saving large intermediate files. |
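The floating-point row in the table involves a trade-off that a short example makes visible. The sketch below compares the 1% relative threshold with a small absolute tolerance on an invented near-tie. The tolerance of 1e-4 m² is an arbitrary choice for illustration, not a recommended value.

```python
import math

# Hypothetical cell with a near-tie between two classes: (Level_1, Area_m2).
cell_records = [(2, 2_000_000.0), (3, 1_990_000.0)]
area_max = max(area for _, area in cell_records)

# Relative 1% threshold, as in the table: both records pass.
loose = [lvl for lvl, area in cell_records if area >= area_max * 0.99]

# Small absolute tolerance: only the true maximum passes.
tight = [
    lvl
    for lvl, area in cell_records
    if math.isclose(area, area_max, rel_tol=0.0, abs_tol=1e-4)
]

print(loose)  # [2, 3]
print(tight)  # [2]
```

A tolerance only needs to absorb rounding noise, so it can be many orders of magnitude smaller than the difference between real land use areas.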
Interpretation Guidelines
Understanding the Results:
- One Level-1 class per grid cell: The final grid shows the single most prevalent land use type in each 2×2 km cell
- Spatial patterns: Visible gradients reveal regional land use structure (urban cores, agricultural zones, forested areas)
- Level of detail: Level-1 classification is suitable for regional planning and policy; Level-2 or Level-3 classes are needed for local detail
Using the Grid for Analysis:
- Statistical summaries: Count grid cells by dominant class; calculate percentages by region
- Change detection: Compare grids from different years (if CLC data available for multiple epochs)
- Integration with other data: Join socioeconomic, demographic, or climate data to explore correlations
- Planning applications: Identify expansion opportunities or conservation priorities
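The statistical summaries mentioned above amount to counting cells per dominant class and converting the counts to percentages. A minimal sketch, assuming the final grid has been read into a mapping from cell code to dominant class; the cell codes and values are invented.

```python
from collections import Counter

# Hypothetical final grid: CellCode -> dominant Level_1 class.
dominant = {
    "A1": 2, "A2": 5, "A3": 3, "A4": 2,
    "A5": 1, "A6": 2, "A7": 3, "A8": 2,
}

# Count grid cells per dominant class.
counts = Counter(dominant.values())
total = sum(counts.values())

# Share of cells per class, in percent.
shares = {lvl: 100.0 * n / total for lvl, n in sorted(counts.items())}

for lvl, pct in shares.items():
    print(lvl, counts[lvl], pct)
```

Since all cells have the same size, the share of cells per class is also the share of grid area assigned to that class.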
Extensions and Advanced Topics
Once you complete this tutorial, consider:
Multi-level Classification: Repeat the workflow using Level-2 or Level-3 classes for finer detail in specific regions
Proportion-based Aggregation: Instead of dominant class, calculate the percentage of each land use type per grid cell
Temporal Analysis: Repeat for CLC2012 and CLC2018 to assess land use change over time
Accuracy Assessment: Compare dominant classes with ground-truth data or high-resolution imagery
Zonal Statistics Alternative: Use QGIS’s Zonal Statistics tool as an alternative workflow (may be faster for large datasets)
Custom Grid Resolutions: Replace the 2×2 km grid with finer (1×1 km) or coarser (10×10 km) grids for different analytical scales
Export for Web Mapping: Convert results to GeoJSON or vector tiles for interactive web-based visualization
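As an illustration of the proportion-based aggregation listed above, the sketch below computes the percentage of each Level-1 class per grid cell from intersection areas. The records are invented, and the percentages are taken relative to the area covered by CLC in each cell, which is an assumption; dividing by the nominal cell area of 4,000,000 m² is an equally valid choice.

```python
from collections import defaultdict

# Hypothetical intersection records: (CellCode, Level_1, Area_m2).
records = [
    ("A1", 1, 1_000_000.0),
    ("A1", 2, 3_000_000.0),
    ("A2", 5, 4_000_000.0),
]

# Total covered area per cell.
cell_total = defaultdict(float)
for cell_code, _level_1, area_m2 in records:
    cell_total[cell_code] += area_m2

# Percentage of each class within its cell.
shares = defaultdict(dict)
for cell_code, level_1, area_m2 in records:
    shares[cell_code][level_1] = 100.0 * area_m2 / cell_total[cell_code]

for cell_code, by_class in shares.items():
    print(cell_code, by_class)
```

Unlike the dominant-class grid, this keeps the information about mixed cells, at the cost of one value per class and cell instead of a single label.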
References & Resources
Data Sources:
- Copernicus CLC2018: https://land.copernicus.eu/en/products/corine-land-cover
- CLC User Manual: https://land.copernicus.eu/en/technical-library/clc-product-user-manual/
QGIS Documentation:
- Vector Geoprocessing Tools: https://docs.qgis.org/latest/en/docs/user_manual/processing_algs/qgis/vectorgeoprocessing.html
- Processing Toolbox: https://docs.qgis.org/latest/en/docs/user_manual/processing/toolbox.html
- Dissolve with stats Plugin: Check plugin repository or GitHub
Related Tutorials:
- QGIS official training manual: https://docs.qgis.org/latest/en/docs/training_manual/
- European Environment Agency GIS resources: https://www.eea.europa.eu/data-and-maps
Quick Reference: Key Tools
| Task | Tool Path | Purpose |
|---|---|---|
| Add layer | Layer → Add Layer | Import CLC and grid data |
| Edit attributes | Right-click layer → Open Attribute Table | Create Level-1 field |
| Field calculator | Attribute Table → Field Calculator | Extract Level-1 code; calculate area |
| Check validity | Processing → Check validity | Identify invalid geometries |
| Fix geometries | Processing → Fix geometries | Repair invalid polygons |
| Dissolve | Vector → Geoprocessing Tools → Dissolve | Merge polygons by class |
| Intersection | Vector → Geoprocessing Tools → Intersection | Overlay CLC with grid |
| Centroids | Vector → Geometry Tools → Centroids | Create center points |
| Spatial join | Processing → Join Attributes by Location | Link grid to centroids |
| Dissolve with stats | Processing → Dissolve with stats | Find maximum area per cell |
| Field value join | Processing → Join Attributes by Field Value | Link dominant class info |
| Filter | Layer → Filter | Select dominant classes only |
Appendix: Understanding CLC Level Codes
Level-1 Classes (First Digit):
| Code | Class Name | Characteristics |
|---|---|---|
| 1 | Artificial surfaces | Urban, industrial, mining, construction |
| 2 | Agricultural areas | Arable land, permanent crops, pastures |
| 3 | Forest and semi-natural areas | Forests, herbaceous vegetation, sparse vegetation |
| 4 | Wetlands | Inland and coastal wetlands |
| 5 | Water | Rivers, lakes, coastal water |
Example Level-3 Codes:
- 111 = Continuous urban fabric
- 112 = Discontinuous urban fabric
- 211 = Non-irrigated arable land
- 231 = Pastures
- 311 = Broad-leaved forest
- 312 = Coniferous forest
- 511 = Water courses
- 512 = Water bodies
Document Version: 1.0 (Updated February 2026)
QGIS Version: 3.40+
Difficulty Level: Intermediate
Expected Time: 3–4 hours
Data Scale: Regional (Europe)
Output Scale: 2×2 km grid cells
Feedback & Support
If you encounter issues or have questions:
- Review the Troubleshooting Guide in this document
- Check QGIS documentation at https://docs.qgis.org/
- Consult Copernicus and EEA documentation for data-specific questions
- Contact your instructor or GIS support team