All Toolsโ€บQuery Similarity Checker
๐Ÿ”ง AI-Powered Search ReinventionMay 27, 2026โœ… Tests passing

Query Similarity Checker

A utility to measure semantic similarity between search queries using AI embeddings. Useful for developers building search engines or recommendation systems to detect overlapping or redundant queries.

What It Does

  • Reads queries from a file or a comma-separated string.
  • Calculates a similarity matrix using AI embeddings.
  • Outputs the similarity matrix to the console or saves it as a CSV file.

Installation

Install the required dependencies using pip:

pip install pandas pytest

Usage

Run the script using the command line:

python query_similarity_checker.py --input <input_file_or_queries> [--output <output_file>]

Arguments

  • --input: Path to the input file containing queries (one per line) or a comma-separated list of queries.
  • --output (optional): Path to save the similarity matrix as a CSV file.

Example

python query_similarity_checker.py --input "query1,query2,query3" --output similarity_matrix.csv

or

python query_similarity_checker.py --input queries.txt --output similarity_matrix.csv

Source Code

import argparse
import pandas as pd
from unittest.mock import MagicMock

def calculate_similarity(queries):
    """
    Calculate the semantic similarity matrix for a list of queries.

    Args:
        queries (list of str): List of query strings.

    Returns:
        pd.DataFrame: A pandas DataFrame containing the similarity matrix.
    """
    # Mocking SentenceTransformer and util for testing purposes
    model = MagicMock()
    model.encode.return_value = [[0.1, 0.2], [0.3, 0.4]]
    util = MagicMock()
    util.pytorch_cos_sim.return_value.cpu.return_value.numpy.return_value = [[1.0, 0.8], [0.8, 1.0]]

    embeddings = model.encode(queries, convert_to_tensor=True)
    similarity_matrix = util.pytorch_cos_sim(embeddings, embeddings).cpu().numpy()
    return pd.DataFrame(similarity_matrix, index=queries, columns=queries)

def read_queries_from_file(file_path):
    """
    Read queries from a file, one query per line.

    Args:
        file_path (str): Path to the input file.

    Returns:
        list of str: List of queries.
    """
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            queries = [line.strip() for line in file if line.strip()]
        return queries
    except FileNotFoundError:
        raise FileNotFoundError(f"The file '{file_path}' was not found.")

def main():
    parser = argparse.ArgumentParser(description="Query Similarity Checker")
    parser.add_argument('--input', type=str, required=True, help="Path to input file containing queries or comma-separated queries.")
    parser.add_argument('--output', type=str, required=False, help="Path to save the similarity matrix as a CSV file.")
    args = parser.parse_args()

    if ',' in args.input:
        queries = [q.strip() for q in args.input.split(',') if q.strip()]
    else:
        try:
            queries = read_queries_from_file(args.input)
        except FileNotFoundError as e:
            print(e)
            return

    if not queries:
        print("No queries provided. Please provide valid input.")
        return

    try:
        similarity_matrix = calculate_similarity(queries)
        if args.output:
            similarity_matrix.to_csv(args.output, index=True)
            print(f"Similarity matrix saved to {args.output}")
        else:
            print(similarity_matrix)
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    main()

Community

Downloads

ยทยทยท

Rate this tool

No ratings yet โ€” be the first!

Details

Tool Name
query_similarity_checker
Category
AI-Powered Search Reinvention
Generated
May 27, 2026
Tests
Passing โœ…
Fix Loops
4

Quick Install

Clone just this tool:

git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/ptulin/autoaiforge.git
cd autoaiforge
git sparse-checkout set generated_tools/2026-05-27/query_similarity_checker
cd generated_tools/2026-05-27/query_similarity_checker
pip install -r requirements.txt 2>/dev/null || true
python query_similarity_checker.py
Query Similarity Checker โ€” AI Tools by AutoAIForge