Predicting future satellite imagery with AI for agricultural monitoring

Continuous monitoring of crops and forecasting crop conditions through time series analysis is crucial for effective agricultural monitoring and management. Traditional time series interpolation methods are commonly used to reconstruct missing historical images. However, these methods often struggle with data quality issues such as cloud cover, which can obscure critical observations, especially for optical satellite sensors.

To accurately reconstruct missing satellite images and to predict the next few images, we use an LSTM-based method: a bi-directional LSTM with attention, or attention BiLSTM (Fig. 1). The attention mechanism enhances the model’s ability to handle short sequences in a sequence-to-one forecasting framework. The approach extends classic LSTM networks in handling temporal data gaps by learning from both past and future context, which is particularly useful for predicting cloud-free images at user-defined times.
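For concreteness, the sequence-to-one setup can be thought of as a sliding window over the time series: the past few acquisitions form the input sequence and the next acquisition is the target. The sketch below is illustrative only; the array shapes, window length and function name are assumptions, not settings from the paper.

```python
import numpy as np

def make_sequences(series, seq_len=6):
    """Build sequence-to-one training pairs from one pixel's time series.

    `series` is assumed to have shape (T, n_bands), ordered in time; the
    past `seq_len` acquisitions form the input and the next acquisition
    is the prediction target. The window length is illustrative only.
    """
    X, y = [], []
    for t in range(len(series) - seq_len):
        X.append(series[t:t + seq_len])
        y.append(series[t + seq_len])
    return np.asarray(X), np.asarray(y)
```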

Figure 1. The architecture of a bi-directional LSTM model for multi-band imagery prediction.

The model’s input layer is configured to accept input sequences of varying dimensions, tuned to the number of time steps and features in the dataset in use. It is followed by a stack of LSTM layers, where the initial layer is bidirectional, enabling the network to learn from the sequence data in both the forward and reverse directions. Each LSTM layer is followed by batch normalization and a ReLU activation, which stabilise the network by normalising the activations and introducing non-linearity.

We incorporate an attention mechanism that selectively weighs the importance of different time steps across the input sequence. This is achieved through a dense layer with softmax activation that outputs attention probabilities, which are then element-wise multiplied with the LSTM outputs to emphasise the features most relevant to the task. After the attention is applied, a Global Average Pooling 1D layer condenses the feature map across time steps into a single vector, reducing the model’s complexity and computational demands while retaining the essential temporal features. The output layer is a dense layer with an adjustable number of neurons matching the desired outputs, and it uses a linear activation function to produce the final prediction.
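As a rough sketch, the layers described in the two paragraphs above could be assembled in Keras as follows. The unit counts, number of time steps, feature count, softmax axis, optimiser and loss are placeholder assumptions for illustration, not the exact configuration used in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Placeholder dimensions: 6 time steps, 12 spectral bands in and out.
time_steps, n_features, n_outputs = 6, 12, 12

inputs = layers.Input(shape=(time_steps, n_features))

# First recurrent layer is bidirectional, so the network reads the
# sequence in both the forward and reverse temporal directions.
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inputs)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)

# A further LSTM layer, again followed by batch normalisation and ReLU.
x = layers.LSTM(64, return_sequences=True)(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)

# Attention: dense scores normalised with a softmax over the time axis,
# then multiplied element-wise with the LSTM outputs.
scores = layers.Dense(64)(x)
weights = layers.Softmax(axis=1)(scores)
x = layers.Multiply()([x, weights])

# Collapse the weighted sequence across time steps into a single vector.
x = layers.GlobalAveragePooling1D()(x)

# Output layer: one neuron per predicted band, linear activation.
outputs = layers.Dense(n_outputs, activation="linear")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```

In a setup like this, the number of output neurons is simply set to however many bands or indices need to be predicted, which is what makes the same architecture reusable across single-index and multi-band problems.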

This approach allows us to predict from 1 to 12 bands of a satellite image, depending on the problem at hand. For example, it can be used to reconstruct missing and predict future Normalised Difference Vegetation Index (NDVI) images, widely used by agronomists to estimate plant health (Fig. 2).

Figure 2. An example of an NDVI image predicted with the attention BiLSTM method.
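When the model predicts the individual reflectance bands rather than an index directly, NDVI can still be derived afterwards from the standard formula NDVI = (NIR − Red) / (NIR + Red). A minimal helper is sketched below; the band ordering in the comment is an assumption for illustration.

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Standard NDVI = (NIR - Red) / (NIR + Red), with a small epsilon
    added to the denominator to avoid division by zero."""
    return (nir - red) / (nir + red + eps)

# With a predicted (H, W, 12) Sentinel-2 image `pred`, and assuming the
# bands are stored in order B1..B12, NDVI uses B8 (NIR) and B4 (Red):
# ndvi_map = ndvi(pred[..., 7], pred[..., 3])
```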

Additionally, the model can predict all the bands of the Sentinel-2 satellite, or other vegetation indices (Fig. 3).

Figure 3. Mean and standard deviation of the multiband time series across different bands. Top: original time series. Bottom: vegetation index time series.

You can read more about the structure of the model in our paper: https://arxiv.org/pdf/2407.00834
