Plant Segmentation Final Project for CS 639 Computer Vision

Marianne Bjorner and Carter Sifferman

Github Repository / Project Proposal / Midterm Report / Final Presentation / Final Presentation Video

Introduction / Motivation

Current methods used to assess plant growth and phenotype in botany experiments are time-consuming and laborious. Replacing manual plant measurements with image segmentation techniques would increase automation and reduce the need for destructive sampling, thus decreasing labor and cost.

Of those algorithms available, many have been developed and tested on datasets available from the European Conference on Computer Vision’s (ECCV) workshop on Computer Vision Problems in Plant Phenotyping (CVPPP). The dataset comprises overhead images of two species (of genera Arabidopsis and Nicotiana), commonly used as model organisms for botany and horticulture experiments. Given this limited dataset, we wanted to expand the range of available plant and image types to evaluate the differential performance of segmentation algorithms.

Goals

Existing approaches to the plant and leaf segmentation problem proposed by CVPPP include:

Outside the realm of CVPPP, plant leaf segmentation has also been explored in the context of plant identification. It has been integrated into apps such as LeafSnap, in which individual leaf photos are used for species identification.

CVPPP2017 Dataset

The CVPPP2017 Dataset is a collection of overhead images of Arabidopsis and Tobacco plants, both of which are leafy, green dicotyledons with a rosette structure, meaning they grow in a circular pattern. Because the two species share these features, the CVPPP dataset has little variability and is insufficient to test the broader applicability of phenotyping algorithms.

Our Dataset

This vacuum in available plant image data led us to develop our own dataset composed of more challenging plant photos. Some features of these plants and photos include:

This dataset comprises 68 overhead plant images and their annotated binary mask counterparts. The binary masks were created with Photoshop and GIMP. Each image is 500x500 pixels, and a tags.json file is included to differentiate plant and image features, which is useful for testing an algorithm against a subset of image categories.

Implementing Existing Algorithms

PlantVision

PlantVision was developed as Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos. It tackles the problem of plant leaf segmentation and counting from fluorescent images, with the option of counting leaves over multiple frames to create a tracked “video” of plant growth.

The original PlantVision algorithm applies multiple filters (e.g. Gaussian followed by a 2-D median filter) before creating an edge image, and then applies chamfer matching using leaf templates that match the rosette leaf structures of the arabidopsis and tobacco plants in the input images.
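
The chamfer matching step can be illustrated with a small sketch. This is not the PlantVision implementation: the function name chamfer_score_map, the brute-force search over placements, and the square outline standing in for a leaf template are all assumptions made for illustration. The idea is that a distance transform of the edge image gives, for every pixel, the distance to the nearest edge, so a template placement scores well when its outline pixels all land on or near image edges.

```python
import numpy as np
from scipy import ndimage

def chamfer_score_map(edge_image, template):
    """Mean distance from template edge pixels to the nearest image edge,
    evaluated at every placement of the template inside the image.
    Lower scores mean a better match."""
    edge_image = np.asarray(edge_image, dtype=bool)
    template = np.asarray(template, dtype=bool)
    # Distance from every pixel to the nearest edge pixel (0 on the edges).
    dist = ndimage.distance_transform_edt(~edge_image)
    ty, tx = np.nonzero(template)
    h, w = template.shape
    H, W = edge_image.shape
    scores = np.full((H - h + 1, W - w + 1), np.inf)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            scores[y, x] = dist[ty + y, tx + x].mean()
    return scores

def square_outline(size):
    """Hypothetical stand-in for a leaf template: a square outline."""
    outline = np.zeros((size, size), dtype=bool)
    outline[0, :] = outline[-1, :] = True
    outline[:, 0] = outline[:, -1] = True
    return outline

template = square_outline(5)
edges = np.zeros((12, 12), dtype=bool)
edges[3:8, 4:9] = template          # place the shape at row 3, column 4
scores = chamfer_score_map(edges, template)
best = np.unravel_index(scores.argmin(), scores.shape)
```

Here best recovers the placement (row 3, column 4) with a score of zero. A practical matcher would also search over template rotations and scales, which this sketch omits.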

Difficulties Encountered / Caveats

Because this was developed for fluorescent images of plants, RGB inputs of images needed heavy alteration, and the inbuilt thresholding mechanisms of the algorithm required changes as well. Despite these changes, the algorithm managed to recover few leaves from RGB images, in many cases resulting in blank binary masks.

These changes are reflected in both the image input and entrance file of the original PlantVision algorithm.

Our Segmentation Approaches

Green Channel Thresholding

A green channel thresholding algorithm was implemented as a baseline. This approach rests on a simple assumption (hey, plants are green!), and therefore plant pixels can be separated from the background based on how green they are.

Figure: RGB image, ground truth binary mask, and green threshold result for image A018 from the CVPPP2017 dataset.
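
The baseline can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's code: the function name green_threshold_mask and the cutoff of 128 are assumptions, and the image is assumed to be an array of shape (height, width, 3) in RGB order.

```python
import numpy as np

def green_threshold_mask(rgb, threshold=128):
    """Return a boolean mask that is True where the green channel
    exceeds the threshold (assumed cutoff, 8-bit scale)."""
    rgb = np.asarray(rgb)
    return rgb[..., 1] > threshold

# One row of three pixels: leaf green, soil brown, near-white.
image = np.array([[[30, 200, 40], [120, 90, 60], [250, 250, 250]]],
                 dtype=np.uint8)
mask = green_threshold_mask(image)
```

The leaf pixel is kept and the soil pixel is rejected, but the near-white pixel is also kept because its green channel is high, which is one way a single-channel threshold produces false positives.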

Difficulties Encountered / Caveats

Many obvious pitfalls occur when classifying leaves solely using a single green channel’s values. This approach fails readily. Green backgrounds and bright colors result in false positives. Additionally, this approach does not work on images of non-green plants.

K-means Clustering

K-means clustering partitions datapoints, or in this case pixels, into k clusters. It iteratively reassigns each pixel to its nearest cluster center and updates the centers, reducing the within-cluster variance until convergence is reached.

Figure: RGB image, ground truth binary mask, intermediate k-means clustering result, and final k-means clustering binary mask for image ID 0418 from the CVPPP2017 dataset.
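
The clustering step can be sketched with a plain NumPy version of Lloyd's algorithm. Several choices here are assumptions for illustration and are not taken from the project code: the function names, the deterministic farthest-point initialization, and the rule for turning clusters into a binary mask, which picks the cluster whose center has the largest excess green (2G - R - B).

```python
import numpy as np

def kmeans_pixels(rgb, k=2, n_iter=20):
    """Cluster the pixels of an RGB image by color.
    Returns a (height, width) label image and the k cluster centers."""
    rgb = np.asarray(rgb, dtype=float)
    pixels = rgb.reshape(-1, 3)
    # Farthest-point initialization (assumed): start from the first pixel,
    # then repeatedly add the pixel farthest from all chosen centers.
    centers = [pixels[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(pixels - c, axis=1) for c in centers], axis=0)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assignment step: nearest center for every pixel.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return labels.reshape(rgb.shape[:2]), centers

def plant_mask_from_clusters(labels, centers):
    """Assumed rule: the plant is the cluster with the greenest center."""
    excess_green = 2 * centers[:, 1] - centers[:, 0] - centers[:, 2]
    return labels == excess_green.argmax()

# Synthetic 4x4 image: brown background with a 2x2 green patch.
image = np.tile(np.array([120, 80, 40], dtype=np.uint8), (4, 4, 1))
image[1:3, 1:3] = [40, 160, 50]
labels, centers = kmeans_pixels(image, k=2)
mask = plant_mask_from_clusters(labels, centers)
```

On real photos a larger k is usually needed, and deciding which clusters count as plant becomes the hard part of the method.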

Difficulties Encountered / Caveats

Because k-means clustering relies only on the RGB values of pixels, its results approach those of the per-pixel logistic regression method as k increases.

Per-Pixel Logistic Regression

Python package sklearn’s LinearRegression tool was used to train a model on CVPPP2017 data. It recovered the following equation:

(Equation image: logistic regression equation recovered)

This model also uses the values of the RGB channels, applying a negative weight to the red and blue channels and a positive weight to the green channel. If a pixel's score meets a threshold, the pixel is classified as a plant pixel; otherwise it is classified as background.

Figure: RGB image, ground truth binary mask, and per-pixel logistic regression result for image ID 0418 from the CVPPP2017 dataset.
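
A per-pixel classifier of this kind can be sketched as follows. The text above names sklearn's LinearRegression, while this sketch assumes sklearn's LogisticRegression to match the section title, so it should be read as an illustration rather than the project's training code. The function names, the scaling of channels to [0, 1], the 0.5 probability threshold, and the synthetic training image are also assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_pixel_classifier(images, masks):
    """Fit a classifier that maps one pixel's (R, G, B) to plant/background."""
    X = np.concatenate([np.asarray(im, dtype=float).reshape(-1, 3) / 255.0
                        for im in images])
    y = np.concatenate([np.asarray(m).reshape(-1).astype(int) for m in masks])
    model = LogisticRegression()
    model.fit(X, y)
    return model

def predict_mask(model, rgb, threshold=0.5):
    """Classify every pixel and reshape the result into a binary mask."""
    rgb = np.asarray(rgb, dtype=float)
    probs = model.predict_proba(rgb.reshape(-1, 3) / 255.0)[:, 1]
    return (probs >= threshold).reshape(rgb.shape[:2])

# Synthetic training image: left half leaf green, right half soil brown,
# with Gaussian noise added to every channel.
rng = np.random.default_rng(0)
truth = np.zeros((20, 20), dtype=bool)
truth[:, :10] = True
colors = np.where(truth[..., None], [40, 160, 50], [120, 80, 40])
image = np.clip(colors + rng.normal(0, 10, size=colors.shape), 0, 255)

model = fit_pixel_classifier([image], [truth])
pred = predict_mask(model, image)
red_weight, green_weight, blue_weight = model.coef_[0]
```

On this synthetic data the fitted model puts a positive weight on green and a negative weight on red, the same sign pattern described above for those two channels.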

Difficulties Encountered / Caveats

Some photos had background noise, where false positives were recovered in the soil surrounding the plant.

Figure: RGB image, ground truth binary mask, and per-pixel logistic regression result for image ID 020 from the CVPPP2017 dataset.

Smoothed and Denoised Per-Pixel Regression

To address the noise in resulting masks of the per-pixel logistic regression, binary masks were post-processed in an attempt to remove background noise.

Figure: RGB image, ground truth binary mask, and smoothed and denoised per-pixel logistic regression result for image ID 0418 from the CVPPP2017 dataset.
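
The text does not say which post-processing operations were used, so the following is only one plausible sketch: a median filter to smooth the mask, followed by removal of small connected components. The function name clean_mask, the 3x3 filter size, and the minimum area of 20 pixels are all assumptions.

```python
import numpy as np
from scipy import ndimage

def clean_mask(mask, median_size=3, min_area=20):
    """Smooth a binary mask and drop small connected components."""
    mask = np.asarray(mask, dtype=bool)
    # Median filtering removes isolated pixels and rounds sharp corners.
    smoothed = ndimage.median_filter(mask.astype(np.uint8),
                                     size=median_size).astype(bool)
    # Label connected regions and keep only those that are large enough.
    labels, n = ndimage.label(smoothed)
    areas = np.bincount(labels.ravel(), minlength=n + 1)
    keep = areas >= min_area
    keep[0] = False  # label 0 is the background
    return keep[labels]

# Noisy mask: one 8x8 plant region plus two isolated false positives.
noisy = np.zeros((20, 20), dtype=bool)
noisy[5:13, 5:13] = True
noisy[2, 16] = True
noisy[17, 3] = True
cleaned = clean_mask(noisy)
```

The isolated pixels are removed while the large region survives. The median filter also trims the corners of the region, a small example of how smoothing can alter the shape of a correct mask.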

Difficulties Encountered / Caveats

In some cases, smoothing and denoising resulted in strange artifacts, and led to a higher number of falsely identified plant pixels.

Results

We analyzed our results by assigning a Jaccard Index to each image. The Jaccard Index, also known as intersection over union, compares the binary mask output of each algorithm to the ground truth mask: the number of pixels marked as plant in both masks is divided by the number of pixels marked as plant in either mask.

Calculations of the Jaccard Index and Dice Coefficient are driven by evalutate_segmentation.py
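
Both metrics can be written in a few lines of NumPy. This is a minimal sketch and not the contents of the evaluation script: the function names and the convention of returning 1.0 when both masks are empty are assumptions.

```python
import numpy as np

def jaccard_index(pred, truth):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return 1.0 if union == 0 else intersection / union

def dice_coefficient(pred, truth):
    """Twice the intersection divided by the total size of both masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 1.0 if total == 0 else 2 * intersection / total

pred = np.array([[1, 1], [0, 0]])
truth = np.array([[1, 0], [1, 0]])
j = jaccard_index(pred, truth)     # 1 shared pixel / 3 pixels in the union
d = dice_coefficient(pred, truth)  # 2 * 1 / (2 + 2)
```

For the same pair of masks the Dice Coefficient is never lower than the Jaccard Index, so the two should not be compared directly across reports.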

Jaccard Index by Method and Dataset

Method                          CVPPP2017*   Our Dataset
Green Threshold                 0.31         0.32
Per-Pixel Regression            0.75         0.56
Per-Pixel Regression + Smooth   0.85         0.66
K-Means Clustering              0.73         0.45

* Links lead to result graphs

Conclusions / Future Work

Our methods focus on pixelwise classification to segment plants from their backgrounds. This can be extended and refined through additional manipulation of the input images and binary masks. However, it is important to acknowledge that segmentation performance tends to increase as more parameters are given to the classifier. Additional local parameters could include brightness, texture, or position. Others, which are specific to plant structure, include correlation to a template, proximity to an edge, angle between leaves, or distance between leaves. While the present algorithm could easily be extended to track phenotypes related to the size and greenness of the plant (a proxy for health and chlorophyll content), leaf number is another datapoint of interest.

With such additional parameters, individual leaf segmentation and tracking would also be possible. Of the algorithms we implemented, k-means clustering was the most promising, as the clustering performed is analogous to leaf segmentation. Of course, there are other methods used for image segmentation, such as Mean Shift.

Another approach would be to use feature detection methods to segment leaves, such as SIFT.

The current state of the art includes high-performing algorithms that commonly achieve a Jaccard Index above 0.95. These commonly utilize a convolutional neural network, such as that in Kuznichov et al.’s Data Augmentation for Leaf Segmentation and Counting Tasks in Rosette Plants. This approach used the matterport implementation of Mask R-CNN, which is available on Github.

References

1 Computer Vision Problems in Plant Phenotyping

2 PlantVision Github Repository

3 Yin X, Liu X, Chen J, and Kramer DM. Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos. 2018. IEEE Transactions on Pattern Analysis and Machine Intelligence.

4 Kuznichov D, Zvirin A, Honen Y, and Kimmel R. Data Augmentation for Leaf Segmentation and Counting Tasks in Rosette Plants. 2019. The Computer Vision Foundation.

5 Matterport Github Repository for Mask R-CNN

6 Pape JM and Klukas C. 3-D Histogram-Based Segmentation and Leaf Detection for Rosette Plants. 2015. ECCV 2014. Springer. pp. 61-74.

7 LeafSnap application webpage

8 CVPPP2017 Dataset

9 Ute Kramer. The Natural History of Model Organisms: Planting molecular functions in an ecological context with Arabidopsis thaliana. 2015.

10 Shibayama M, Sakamoto T, Takada E, Inoue A, Morita K, Yamaguchi T, Takahashi W, and Kimura A. Estimating Rice Leaf Greenness (SPAD) Using Fixed-Point Continuous Observations of Visible Red and Near Infrared Narrow-Band Digital Images. 2012. Plant Production Science. pp. 293-309.

11 He K, Gkioxari G, Dollár P, and Girshick R. Mask R-CNN. 2017. IEEE.

Resources:

1 Github Repository

2 Project Proposal

3 Midterm Report

4 Final Presentation

5 Final Presentation Video

6 CVPPP2017 LSC Dataset

7 Our Dataset