AI in Agriculture | March 27, 2026 | Tests passing ✅

Crop Yield Predictor

This tool trains machine learning models to predict crop yields from soil quality, temperature, and rainfall, using historical yield data as the training target. It is aimed at AI developers who want to build and test prediction models on agricultural datasets.

What It Does

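Preprocess input CSV data containing soil quality, temperature, rainfall, and historical yield, standardizing the three feature columns with StandardScaler.
Train a machine learning model (Random Forest or Neural Network) to predict crop yields, holding out 20% of the rows to report a validation mean squared error.
Save the trained model and scaler for future use.
Predict crop yields for the rows of the same input file and save the predictions to an output CSV file, or print them if no output path is given.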

Installation

Install the required Python packages using pip:

pip install pandas numpy scikit-learn joblib pytest

Usage

Run the script from the command line with the following arguments:

python crop_yield_predictor.py --input <input_csv> --model <model_type> [--output <output_csv>] [--save_model <model_file>] [--n_estimators <num_trees>] [--hidden_layer_sizes <sizes>]

Arguments

--input: Path to the input CSV file containing the data.
--model: Model type to use for training (random_forest or neural_network).
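--output: (Optional) Path to save the predictions as a CSV file. If omitted, the predictions are printed to the console.
--save_model: (Optional) Path to save the trained model and scaler, stored together in a single joblib file.
--n_estimators: (Optional) Number of trees for the Random Forest model (default: 100). Ignored when --model is neural_network.
--hidden_layer_sizes: (Optional) Comma-separated hidden layer sizes for the Neural Network model (default: "100"). Ignored when --model is random_forest.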
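
To illustrate how the comma-separated --hidden_layer_sizes value becomes the tuple that MLPRegressor expects, here is a minimal sketch. The helper name parse_hidden_layer_sizes is hypothetical; it wraps the same conversion that main() performs inline.

```python
def parse_hidden_layer_sizes(value: str) -> tuple:
    """Convert a comma-separated string such as "100,50" into (100, 50).

    Hypothetical helper: the script does this conversion inline in main().
    """
    return tuple(int(part) for part in value.split(","))


print(parse_hidden_layer_sizes("100"))     # (100,)
print(parse_hidden_layer_sizes("100,50"))  # (100, 50)
```

A single value yields one hidden layer, and each additional value adds another layer. A non-numeric entry raises ValueError.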

Input File Format

The input CSV file must contain the following columns. The first three are used as features, and historical_yield is the prediction target:

soil_quality
temperature
rainfall
historical_yield
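
For trying the tool without a real dataset, a file with these four columns can be generated. The sketch below uses only the standard library. The function name write_sample_csv, the value ranges, and the formula for the target are all assumptions made up for illustration, since the page does not specify units or realistic ranges.

```python
import csv
import random

COLUMNS = ["soil_quality", "temperature", "rainfall", "historical_yield"]


def write_sample_csv(path, n_rows=50, seed=0):
    """Write a synthetic CSV with the four required columns (made-up values)."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        for _ in range(n_rows):
            soil = round(rng.uniform(1, 10), 2)      # assumed 1-10 score
            temp = round(rng.uniform(10, 35), 1)     # assumed degrees Celsius
            rain = round(rng.uniform(200, 1200), 1)  # assumed millimetres
            # Arbitrary formula so the target depends on the features.
            noise = rng.gauss(0, 0.2)
            yld = round(0.3 * soil + 0.002 * rain - 0.02 * abs(temp - 22) + noise, 3)
            writer.writerow([soil, temp, rain, yld])


write_sample_csv("data.csv")
```

The resulting data.csv only exercises the pipeline end to end; it says nothing about real crops.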

Examples

Training a Random Forest Model

python crop_yield_predictor.py --input data.csv --model random_forest --n_estimators 200 --output predictions.csv --save_model model.joblib

Training a Neural Network Model

python crop_yield_predictor.py --input data.csv --model neural_network --hidden_layer_sizes 100,50 --output predictions.csv --save_model model.joblib
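
The script has no prediction-only mode, but a file written with --save_model can be loaded back in separate code. The following is a minimal sketch, assuming the file holds the (model, scaler) tuple that main() passes to dump(). The function name predict_with_saved_model and the FEATURES constant are hypothetical.

```python
import pandas as pd
from joblib import load

# Feature columns, in the order the script uses for training.
FEATURES = ["soil_quality", "temperature", "rainfall"]


def predict_with_saved_model(model_path, new_data):
    """Load a saved (model, scaler) tuple and predict yields for a DataFrame."""
    # Unpack in the same order the script uses in dump((model, scaler), ...).
    model, scaler = load(model_path)
    # Apply the training-time scaling before predicting.
    X_scaled = scaler.transform(new_data[FEATURES])
    return model.predict(X_scaled)
```

It could be called as predict_with_saved_model("model.joblib", pd.read_csv("new_fields.csv")), where new_fields.csv is a placeholder for any CSV with the three feature columns. Because joblib files are pickle-based, only load model files from a trusted source.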

Source Code

import argparse
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
from joblib import dump
import os

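def preprocess_data(input_file):
    """Load the CSV, validate its columns, and standardize the features.

    Returns the scaled feature matrix, the target series, and the fitted
    scaler. Note that the scaler is fitted on every row, before
    train_model() splits off its validation set.
    """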
    try:
        data = pd.read_csv(input_file)
        if data.empty:
            raise ValueError("Input CSV file is empty.")

        required_columns = ['soil_quality', 'temperature', 'rainfall', 'historical_yield']
        for col in required_columns:
            if col not in data.columns:
                raise ValueError(f"Missing required column: {col}")

        X = data[['soil_quality', 'temperature', 'rainfall']]
        y = data['historical_yield']

        scaler = StandardScaler()
        X_scaled = scaler.fit_transform(X)

        return X_scaled, y, scaler
    except Exception as e:
        # Chain the original exception so the traceback keeps the root cause.
        raise ValueError(f"Error processing input file: {e}") from e

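def train_model(X, y, model_type, **kwargs):
    """Fit the chosen regressor on 80% of the rows and print validation MSE.

    Extra keyword arguments are passed to the scikit-learn estimator.
    """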
    if model_type == 'random_forest':
        model = RandomForestRegressor(**kwargs)
    elif model_type == 'neural_network':
        model = MLPRegressor(max_iter=500, **kwargs)
    else:
        raise ValueError("Unsupported model type. Choose 'random_forest' or 'neural_network'.")

    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
    model.fit(X_train, y_train)

    predictions = model.predict(X_val)
    mse = mean_squared_error(y_val, predictions)
    print(f"Validation Mean Squared Error: {mse}")

    return model

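def predict_and_save(model, scaler, input_file, output_file):
    """Predict yields for input_file and write them to output_file.

    If output_file is not given, the predictions are printed instead.
    """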
    try:
        data = pd.read_csv(input_file)
        if data.empty:
            raise ValueError("Input CSV file is empty.")

        required_columns = ['soil_quality', 'temperature', 'rainfall']
        for col in required_columns:
            if col not in data.columns:
                raise ValueError(f"Missing required column: {col}")

        X = data[['soil_quality', 'temperature', 'rainfall']]
        X_scaled = scaler.transform(X)
        predictions = model.predict(X_scaled)

        data['predicted_yield'] = predictions

        if output_file:
            data.to_csv(output_file, index=False)
            print(f"Predictions saved to {output_file}")
        else:
            print(data[['soil_quality', 'temperature', 'rainfall', 'predicted_yield']])
    except Exception as e:
        # Chain the original exception so the traceback keeps the root cause.
        raise ValueError(f"Error during prediction or saving: {e}") from e

def main():
    parser = argparse.ArgumentParser(description="Crop Yield Predictor")
    parser.add_argument('--input', required=True, help="Path to input CSV file")
    parser.add_argument('--model', required=True, choices=['random_forest', 'neural_network'], help="Model type")
    parser.add_argument('--output', help="Path to save predictions (optional)")
    parser.add_argument('--save_model', help="Path to save trained model (optional)")
    parser.add_argument('--n_estimators', type=int, default=100, help="Number of trees for Random Forest (if applicable)")
    parser.add_argument('--hidden_layer_sizes', type=str, default="100", help="Hidden layer sizes for Neural Network (comma-separated, if applicable)")

    args = parser.parse_args()

    try:
        X, y, scaler = preprocess_data(args.input)

        if args.model == 'random_forest':
            model = train_model(X, y, 'random_forest', n_estimators=args.n_estimators)
        elif args.model == 'neural_network':
            hidden_layer_sizes = tuple(map(int, args.hidden_layer_sizes.split(',')))
            model = train_model(X, y, 'neural_network', hidden_layer_sizes=hidden_layer_sizes)

        if args.save_model:
            dump((model, scaler), args.save_model)
            print(f"Model and scaler saved to {args.save_model}")

        predict_and_save(model, scaler, args.input, args.output)

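    except Exception as e:
        # Exit with a non-zero status so callers can detect the failure;
        # SystemExit prints the message to stderr.
        raise SystemExit(f"Error: {e}")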

if __name__ == "__main__":
    main()
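
One point worth noting about the code above: preprocess_data() fits the StandardScaler on all rows before train_model() splits off the validation set, so the validation rows influence the scaling statistics. A possible alternative, shown here as a sketch and not as the tool's own method, is to split first and wrap the scaler and model in a scikit-learn pipeline. The function name train_with_pipeline and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_with_pipeline(X, y, n_estimators=100):
    """Split first, then fit scaler and model together on the training rows."""
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    pipeline = make_pipeline(
        StandardScaler(),
        RandomForestRegressor(n_estimators=n_estimators, random_state=42),
    )
    # The scaler's mean and variance come from X_train only.
    pipeline.fit(X_train, y_train)
    mse = mean_squared_error(y_val, pipeline.predict(X_val))
    return pipeline, mse


# Synthetic data, made up purely to exercise the function.
rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=100)

pipeline, mse = train_with_pipeline(X, y, n_estimators=20)
print(f"Validation MSE: {mse:.4f}")
```

A fitted pipeline can be saved with a single dump(pipeline, path) call, since the scaler travels with the model. For a Random Forest the choice matters little, because tree splits are unaffected by feature scaling; it is more relevant for the Neural Network option.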

Details

Tool Name: crop_yield_predictor
Category: AI in Agriculture
Generated: March 27, 2026
Tests: Passing ✅
Fix Loops: 4

Quick Install

Clone just this tool:

git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/ptulin/autoaiforge.git
cd autoaiforge
git sparse-checkout set generated_tools/2026-03-27/crop_yield_predictor
cd generated_tools/2026-03-27/crop_yield_predictor
pip install -r requirements.txt 2>/dev/null || true
python crop_yield_predictor.py --help
