🔧 AI-Generated Content Moderation | May 31, 2026 | ✅ Tests passing

Content Guard

Content Guard is a CLI tool and Python library for detecting harmful or inappropriate text generated by AI models. It runs a pre-trained text-classification model through the Hugging Face transformers pipeline (the CLI loads unitary/toxic-bert) and reports every result whose score exceeds 0.5, covering categories such as hate speech, toxicity, or explicit material, so developers can filter problematic outputs.

What It Does

Detects harmful or inappropriate text using pre-trained NLP models.
Classifies content into categories such as hate speech, toxicity, or explicit material.
Outputs the flagged results (labels and scores) as JSON.
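
To make the output format concrete, here is a minimal sketch of the JSON that gets emitted. It mirrors the json.dumps(flagged, indent=4) call in the source code; the label and score below are invented for illustration, and real values depend on the model.

```python
import json

# Illustrative flagged results; the label and score are invented,
# not real model output.
flagged = [{"label": "toxic", "score": 0.97}]

# Same serialization call the CLI uses when --output is omitted.
rendered = json.dumps(flagged, indent=4)
print(rendered)

# When nothing scores above the threshold, the result is an empty list.
print(json.dumps([], indent=4))
```

Each entry is an object with a label and a score; an empty list means no result exceeded the 0.5 threshold.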

Installation

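Install the required dependencies. The transformers pipeline also needs a deep learning backend such as PyTorch to load the model, and pytest is only needed to run the tests:

pip install transformers torch pytest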

Usage

CLI

To use Content Guard as a CLI tool:

python content_guard.py --input <input_file_path> --output <output_file_path>

--input or -i: Path to the input text file.
--output or -o: Path to save the flagged content as JSON. If not provided, the flagged content will be printed to the console.
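
The CLI passes the entire input file to the classifier as one string. For files that hold many independent messages, one option is to classify each non-empty line separately through the library. The following is a minimal sketch under that assumption: classify_lines is a hypothetical helper that is not part of Content Guard, classify_text is inlined so the block runs on its own, and the keyword-based stub stands in for a real pipeline.

```python
def classify_text(text, classifier):
    # Same logic as Content Guard's classify_text, inlined so this runs standalone.
    results = classifier(text)
    return [r for r in results if r["score"] > 0.5]


def classify_lines(lines, classifier):
    """Hypothetical helper: classify each non-empty line on its own."""
    report = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        flagged = classify_text(stripped, classifier)
        if flagged:
            report.append({"line": number, "text": stripped, "flagged": flagged})
    return report


def stub_classifier(text):
    """Toy stand-in for a pipeline: scores text by a keyword, not a model."""
    score = 0.9 if "idiot" in text.lower() else 0.1
    return [{"label": "toxic", "score": score}]


sample = ["Have a nice day.", "", "You are an idiot."]
report = classify_lines(sample, stub_classifier)
print(report)
```

Recording the line number alongside each flagged entry makes it easier to trace a result back to its place in the input file.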

Python Library

You can also use Content Guard as a Python library:

from content_guard import classify_text
from transformers import pipeline

classifier = pipeline('text-classification', model='unitary/toxic-bert')
text = "This is a toxic comment."
flagged = classify_text(text, classifier)
print(flagged)
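
The 0.5 cut-off is hard-coded inside classify_text. If a stricter or looser filter is needed, a small wrapper can expose it as a parameter. This is a sketch, not part of Content Guard: classify_with_threshold is a hypothetical name, and stub_classifier returns fixed, invented scores so the example runs without downloading a model. How many entries a real pipeline returns per input depends on how it is configured.

```python
def classify_with_threshold(text, classifier, threshold=0.5):
    """Hypothetical variant of classify_text with a configurable cut-off."""
    results = classifier(text)
    return [r for r in results if r["score"] > threshold]


def stub_classifier(text):
    """Stand-in for a transformers pipeline; returns fixed, invented scores."""
    return [
        {"label": "toxic", "score": 0.91},
        {"label": "insult", "score": 0.62},
        {"label": "threat", "score": 0.08},
    ]


default = classify_with_threshold("some text", stub_classifier)
strict = classify_with_threshold("some text", stub_classifier, threshold=0.9)
print([r["label"] for r in default])  # ['toxic', 'insult']
print([r["label"] for r in strict])   # ['toxic']
```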

Source Code

import json
import sys
import argparse
from transformers import pipeline

def classify_text(text, classifier):
    """
    Classifies the given text using the provided classifier.

    Args:
        text (str): The input text to classify.
        classifier: The NLP pipeline for text classification.

    Returns:
        list: A list of flagged results with labels and scores.
    """
    results = classifier(text)
    flagged = [result for result in results if result['score'] > 0.5]
    return flagged

def main():
    """
    Main function for the Content Guard CLI tool.

    Parses command-line arguments and processes the input file.
    """
    parser = argparse.ArgumentParser(description="Content Guard: Detect harmful or inappropriate text.")
    parser.add_argument('--input', '-i', type=str, required=True, help='Path to the input text file.')
    parser.add_argument('--output', '-o', type=str, help='Path to save the flagged content as JSON.')
    args = parser.parse_args()

    # Initialize the text classification pipeline
    try:
        classifier = pipeline('text-classification', model='unitary/toxic-bert')
    except Exception as e:
        print(f"Error initializing classifier: {e}", file=sys.stderr)
        sys.exit(1)

    # Read input text
    try:
        # Read as UTF-8 explicitly so behavior does not depend on the platform default
        with open(args.input, 'r', encoding='utf-8') as file:
            text = file.read()
    except FileNotFoundError:
        print("Error: Input file not found.", file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f"Error reading input file: {e}", file=sys.stderr)
        sys.exit(1)

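    # Classify text. The model can raise at inference time, for example when
    # the input is longer than its maximum sequence length.
    try:
        flagged = classify_text(text, classifier)
    except Exception as e:
        print(f"Error classifying text: {e}", file=sys.stderr)
        sys.exit(1)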

    # Output results
    if args.output:
        try:
            with open(args.output, 'w', encoding='utf-8') as file:
                json.dump(flagged, file, indent=4)
            print(f"Flagged content saved to {args.output}")
        except Exception as e:
            print(f"Error writing to output file: {e}", file=sys.stderr)
            sys.exit(1)
    else:
        print(json.dumps(flagged, indent=4))

if __name__ == "__main__":
    main()
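
Because classify_text accepts any callable as its classifier, it can be exercised without downloading a model. Below is a sketch of pytest-style tests along those lines. The function is copied inline to keep the block self-contained (in the repository it would be imported from content_guard), and FakeClassifier and the test names are hypothetical, not the project's actual test suite.

```python
def classify_text(text, classifier):
    # Copied from the source above so the sketch runs standalone.
    results = classifier(text)
    return [result for result in results if result["score"] > 0.5]


class FakeClassifier:
    """Hypothetical stand-in that records calls and returns canned results."""

    def __init__(self, canned):
        self.canned = canned
        self.calls = []

    def __call__(self, text):
        self.calls.append(text)
        return self.canned


def test_keeps_results_above_threshold():
    fake = FakeClassifier([{"label": "toxic", "score": 0.8}])
    assert classify_text("bad text", fake) == [{"label": "toxic", "score": 0.8}]
    assert fake.calls == ["bad text"]


def test_drops_results_at_or_below_threshold():
    # The comparison is strict, so a score of exactly 0.5 is not flagged.
    fake = FakeClassifier([{"label": "toxic", "score": 0.5}])
    assert classify_text("fine text", fake) == []
```

Saved as a file such as test_content_guard.py (an assumed name), these would be collected and run by the pytest command.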

Details

Tool Name: content_guard
Category: AI-Generated Content Moderation
Generated: May 31, 2026
Tests: Passing ✅
Fix Loops: 2

Quick Install

Clone just this tool:

git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/ptulin/autoaiforge.git
cd autoaiforge
git sparse-checkout set generated_tools/2026-05-31/content_guard
cd generated_tools/2026-05-31/content_guard
pip install -r requirements.txt 2>/dev/null || true
python content_guard.py --input <input_file_path>
