๐ง AI for Large-Scale Text ProcessingMay 18, 2026โ
Tests passing
Batch Text Summarizer
A command-line tool that takes a large set of text files, processes each one using an AI model like Claude via API, and generates concise summaries for each file. This is useful for developers working on projects that require summarizing large datasets of textual information efficiently.
What It Does
- Processes multiple text files in a single batch.
- Uses OpenAI's GPT-based API for high-quality text summarization.
- Configurable summary length with the
--max-tokensparameter. - Skips empty files automatically.
- Generates summaries with metadata for traceability.
Installation
1. Clone the repository or download the script batch_text_summarizer.py.
2. Install the required dependencies:
pip install -r requirements.txtUsage
Run the script from the command line:
python batch_text_summarizer.py --input-dir ./texts --output-dir ./summaries --api-key YOUR_API_KEYArguments
--input-dir: Path to the directory containing text files to summarize.--output-dir: Path to the directory where summaries will be saved.--api-key: Your OpenAI API key for accessing the summarization model.--max-tokens: (Optional) Maximum number of tokens for the summary. Default is 100.
Example
python batch_text_summarizer.py --input-dir ./texts --output-dir ./summaries --api-key sk-abc123 --max-tokens 150Source Code
import os
import argparse
from tqdm import tqdm
import openai
def summarize_text(api_key, text, max_tokens=100):
"""Summarize the given text using OpenAI's API."""
openai.api_key = api_key
try:
response = openai.Completion.create(
engine="text-davinci-003",
prompt=f"Summarize the following text:\n{text}\n",
max_tokens=max_tokens
)
return response["choices"][0]["text"].strip()
except Exception as e:
raise RuntimeError(f"Error during summarization: {e}")
def process_files(input_dir, output_dir, api_key, max_tokens):
"""Process all text files in the input directory and save summaries."""
if not os.path.exists(output_dir):
os.makedirs(output_dir)
for filename in tqdm(os.listdir(input_dir), desc="Processing files"):
input_path = os.path.join(input_dir, filename)
if os.path.isfile(input_path) and filename.endswith(".txt"):
with open(input_path, "r", encoding="utf-8") as file:
text = file.read()
if not text.strip():
print(f"Skipping empty file: {filename}")
continue
try:
summary = summarize_text(api_key, text, max_tokens)
output_path = os.path.join(output_dir, f"summary_{filename}")
with open(output_path, "w", encoding="utf-8") as summary_file:
summary_file.write(summary)
except RuntimeError as e:
print(f"Failed to summarize {filename}: {e}")
def main():
parser = argparse.ArgumentParser(description="Batch Text Summarizer")
parser.add_argument("--input-dir", required=True, help="Path to the directory containing text files")
parser.add_argument("--output-dir", required=True, help="Path to the directory where summaries will be saved")
parser.add_argument("--api-key", required=True, help="OpenAI API key for text summarization")
parser.add_argument("--max-tokens", type=int, default=100, help="Maximum number of tokens for the summary")
args = parser.parse_args()
try:
process_files(args.input_dir, args.output_dir, args.api_key, args.max_tokens)
except Exception as e:
print(f"An error occurred: {e}")
if __name__ == "__main__":
main()Community
Downloads
ยทยทยท
Rate this tool
No ratings yet โ be the first!
Details
- Tool Name
- batch_text_summarizer
- Category
- AI for Large-Scale Text Processing
- Generated
- May 18, 2026
- Tests
- Passing โ
Quick Install
Clone just this tool:
git clone --depth 1 --filter=blob:none --sparse \ https://github.com/ptulin/autoaiforge.git cd autoaiforge git sparse-checkout set generated_tools/2026-05-18/batch_text_summarizer cd generated_tools/2026-05-18/batch_text_summarizer pip install -r requirements.txt 2>/dev/null || true python batch_text_summarizer.py