AI for Real-Time News Summarization
April 15, 2026
Tests passing
News Trend Tracker
A Python library that tracks emerging news trends by analyzing real-time updates across multiple sources. It identifies common topics and generates concise summaries, helping AI developers create applications that adapt to dynamic, real-world information.
What It Does
- Fetches and parses RSS feed entries.
- Clusters news articles into topics using KMeans clustering.
- Summarizes each cluster using a pre-trained summarization model.
Installation
- Python 3.7+
- Required Python packages:
  - feedparser
  - scikit-learn
  - transformers
Install the required packages using pip:
pip install feedparser scikit-learn transformers
Usage
Run the script from the command line with the following arguments:
python news_trend_tracker.py --feeds <RSS_FEED_URL_1> <RSS_FEED_URL_2> ... --clusters <NUM_CLUSTERS> --summary_length <SUMMARY_LENGTH>
Arguments
- --feeds: List of RSS feed URLs to analyze (required).
- --clusters: Number of clusters to form (default: 5).
- --summary_length: Maximum length of the summary for each cluster (default: 50).
Example
python news_trend_tracker.py --feeds https://example.com/rss https://another.com/rss --clusters 3 --summary_length 100
Source Code
import feedparser
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from transformers import pipeline
from typing import List, Dict
def fetch_feed_entries(feed_urls: List[str]) -> List[str]:
    """
    Fetches and parses RSS feed entries from the provided URLs.

    Args:
        feed_urls (List[str]): List of RSS feed URLs.

    Returns:
        List[str]: List of news article titles and descriptions.
    """
    entries = []
    for url in feed_urls:
        try:
            feed = feedparser.parse(url)
            if 'entries' in feed:
                for entry in feed['entries']:
                    content = entry.get('title', '') + ' ' + entry.get('description', '')
                    entries.append(content.strip())
        except Exception as e:
            print(f"Error fetching or parsing feed {url}: {e}")
    return entries
def cluster_topics(documents: List[str], num_clusters: int = 5) -> Dict[int, List[str]]:
    """
    Clusters the given documents into topics using KMeans.

    Args:
        documents (List[str]): List of text documents to cluster.
        num_clusters (int): Number of clusters to form.

    Returns:
        Dict[int, List[str]]: A dictionary with cluster indices as keys and lists of documents as values.
    """
    if not documents:
        return {}
    # KMeans raises a ValueError when n_clusters exceeds the number of samples,
    # so clamp the requested cluster count to the number of documents.
    num_clusters = min(num_clusters, len(documents))
    vectorizer = TfidfVectorizer(stop_words='english')
    X = vectorizer.fit_transform(documents)
    kmeans = KMeans(n_clusters=num_clusters, random_state=42, n_init=10)
    kmeans.fit(X)
    clusters = {i: [] for i in range(num_clusters)}
    for idx, label in enumerate(kmeans.labels_):
        clusters[label].append(documents[idx])
    return clusters
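The TF-IDF + KMeans step can be exercised on its own with a handful of toy headlines; with a fixed random_state the grouping is reproducible. A sketch (the headlines are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "stocks rally as the market climbs",
    "market rally lifts stocks higher",
    "storm brings heavy rain and wind",
    "heavy rain and wind from the storm",
]

# Vectorize and cluster, mirroring the parameters used in cluster_topics.
X = TfidfVectorizer(stop_words="english").fit_transform(docs)
labels = KMeans(n_clusters=2, random_state=42, n_init=10).fit(X).labels_

# Headlines about the same topic should share a cluster label.
print(labels)
```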
def summarize_clusters(clusters: Dict[int, List[str]], summary_length: int = 50) -> Dict[int, str]:
    """
    Summarizes each cluster using an AI-based summarization model.

    Args:
        clusters (Dict[int, List[str]]): Dictionary of clustered documents.
        summary_length (int): Maximum length of the summary for each cluster.

    Returns:
        Dict[int, str]: A dictionary with cluster indices as keys and summaries as values.
    """
    summarizer = pipeline("summarization")
    summaries = {}
    for cluster_id, documents in clusters.items():
        combined_text = " ".join(documents)
        try:
            # truncation=True keeps long clusters within the model's input limit
            # instead of raising an error on oversized inputs.
            summary = summarizer(combined_text, max_length=summary_length, min_length=10,
                                 do_sample=False, truncation=True)
            summaries[cluster_id] = summary[0]['summary_text']
        except Exception as e:
            print(f"Error summarizing cluster {cluster_id}: {e}")
            summaries[cluster_id] = ""
    return summaries
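Truncation silently drops the tail of a large cluster. An alternative, sketched here and not part of the original tool, is to split the combined text into word-bounded chunks and summarize each chunk separately (the 400-word threshold is an illustrative assumption, not a model constant):

```python
def chunk_words(text: str, max_words: int = 400) -> list:
    """Split text into word-bounded chunks so each stays within a model's input budget."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

# A 1000-word input yields two full chunks and one partial chunk.
chunks = chunk_words("word " * 1000, max_words=400)
print([len(c.split()) for c in chunks])
```

Each chunk could then be passed to the summarizer and the per-chunk summaries joined, trading one long (truncated) pass for several complete ones.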
def track_trends(feed_urls: List[str], num_clusters: int = 5, summary_length: int = 50) -> Dict[int, Dict[str, str]]:
    """
    Tracks emerging news trends by analyzing real-time updates across multiple sources.

    Args:
        feed_urls (List[str]): List of RSS feed URLs.
        num_clusters (int): Number of clusters to form.
        summary_length (int): Maximum length of the summary for each cluster.

    Returns:
        Dict[int, Dict[str, str]]: A dictionary with cluster indices as keys, containing topics and summaries.
    """
    entries = fetch_feed_entries(feed_urls)
    clusters = cluster_topics(entries, num_clusters=num_clusters)
    summaries = summarize_clusters(clusters, summary_length=summary_length)
    result = {}
    for cluster_id, documents in clusters.items():
        result[cluster_id] = {
            "topics": documents,
            "summary": summaries.get(cluster_id, "")
        }
    return result
if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="News Trend Tracker")
    parser.add_argument("--feeds", nargs='+', required=True, help="List of RSS feed URLs.")
    parser.add_argument("--clusters", type=int, default=5, help="Number of clusters to form.")
    parser.add_argument("--summary_length", type=int, default=50, help="Maximum length of the summary for each cluster.")
    args = parser.parse_args()

    trends = track_trends(args.feeds, num_clusters=args.clusters, summary_length=args.summary_length)
    for cluster_id, data in trends.items():
        print(f"Cluster {cluster_id}:")
        print(f"Topics: {data['topics']}")
        print(f"Summary: {data['summary']}")
        print()
Details
- Tool Name
- news_trend_tracker
- Category
- AI for Real-Time News Summarization
- Generated
- April 15, 2026
- Tests
- Passing ✓
- Fix Loops
- 3
Quick Install
Clone just this tool:
git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/ptulin/autoaiforge.git
cd autoaiforge
git sparse-checkout set generated_tools/2026-04-15/news_trend_tracker
cd generated_tools/2026-04-15/news_trend_tracker
pip install -r requirements.txt 2>/dev/null || true
python news_trend_tracker.py --feeds <RSS_FEED_URL>