Field-based vegetation mapping is important for environmental assessments. The area covered by a species is often estimated visually within a reference frame, but such assessments are prone to observer bias and high variability.
We developed a deep learning pipeline based on YOLOv8 models to segment species and estimate the percentage cover (%) of Vaccinium myrtillus (blueberry) and Vaccinium vitis-idaea (lingonberry), two key understory species in boreal forests. We used 138 nadir, downward-looking images of the forest floor captured over the 50 × 50 cm vegetation sub-plots assessed within National Forest Inventory (NFI) plots. First, we trained a bounding-box detection model to locate the reference frame and crop each image to the area assessed in the field. Second, we trained an instance segmentation model to classify species. Third, we flattened the class values into a semantic raster and estimated the species-specific cover by pixel counting.
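To make the three steps concrete, the sketch below shows one plausible implementation of the crop-segment-count sequence, assuming the ultralytics YOLOv8 Python API. The weight file names, the assumption of one detected frame per image, and the class indexing are hypothetical placeholders, not the exact models or settings described above.

```python
# Minimal sketch of the pipeline, assuming the ultralytics YOLOv8 API.
# Weight file names and class setup are hypothetical placeholders.
import numpy as np
from ultralytics import YOLO

frame_detector = YOLO("frame_detector.pt")        # bounding-box model for the 50 x 50 cm frame
species_segmenter = YOLO("species_segmenter.pt")  # instance segmentation model for the species

def estimate_cover(image_path: str) -> dict:
    # Step 1: detect the reference frame and crop the image to it
    # (here we simply assume the first detected box is the frame).
    det = frame_detector(image_path)[0]
    x1, y1, x2, y2 = det.boxes.xyxy[0].int().tolist()
    crop = det.orig_img[y1:y2, x1:x2]

    # Step 2: segment species instances inside the cropped area.
    seg = species_segmenter(crop)[0]

    # Step 3: flatten instance masks into one semantic raster per class
    # and estimate cover as the fraction of pixels occupied.
    cover = {name: 0.0 for name in seg.names.values()}
    if seg.masks is not None:
        classes = seg.boxes.cls.int().tolist()
        masks = seg.masks.data.cpu().numpy()  # (n_instances, H, W) binary masks
        for cls_id, name in seg.names.items():
            union = np.zeros(masks.shape[1:], dtype=bool)
            for mask, cls in zip(masks, classes):
                if cls == cls_id:
                    union |= mask.astype(bool)
            cover[name] = 100.0 * union.sum() / union.size  # percent of crop area
    return cover

print(estimate_cover("subplot_001.jpg"))  # e.g. {'blueberry': 12.4, 'lingonberry': 3.1}
```

Taking the per-class union of instance masks before counting is what "flattening" amounts to in practice: it prevents overlapping instances of the same species from being counted twice.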
We evaluated our method against an independent test set of 156 images and found a root mean squared error (RMSE) of 8.82% for blueberry and 3.49% for lingonberry, with no substantial systematic errors. An additional comparison with ocular estimates made by various field workers for the same plots showed that the model estimates fell within the range of the field workers' estimates in 8 of 9 cases for blueberry and 7 of 9 cases for lingonberry.
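For reference, the RMSE above follows the standard definition over the $n$ test images (the symbols $\hat{c}_i$ and $c_i$, the model-estimated and field-recorded cover for image $i$, are our notation, not the paper's):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{c}_i - c_i\right)^2}$$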
The developed method shows promise in reducing observer bias and variability in vegetation surveys, improving their consistency while substantially reducing the time needed for species-specific cover estimation. This is particularly beneficial for repeated measurements and for monitoring vegetation cover dynamics. However, because the method relies on RGB data, it is limited to estimating the cover of visible species, i.e., those not obscured by overlapping vegetation. Expanding the method to a broader range of cover classes (e.g. grasses, rocks, logs) or species could automate the capture of crucial information from widely available ground-based images, enhancing our ability to characterize a broader range of ecosystems.