Local LLM Deployment
June 12, 2026
Tests passing
LLM Edge Deployer
LLM Edge Deployer is a Python library and CLI tool designed to streamline deploying optimized LLMs on edge hardware. It provides utilities to convert models to ONNX, validate the target accelerator (NVIDIA TensorRT or Intel OpenVINO), and run a test inference on the converted model.
What It Does
- Convert models to ONNX format.
- Check that the requested target device is supported (currently tensorrt or openvino, for NVIDIA TensorRT and Intel OpenVINO).
- Run test inference on ONNX models with sample input data.
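In the source further down, the compatibility check only validates the device name against a fixed list. A slightly stronger check could also ask whether the matching ONNX Runtime execution provider is present on the machine. The sketch below is an assumption about how that might look, not part of the tool: `PROVIDER_FOR_DEVICE` and `provider_available` are hypothetical names, and the list of available providers is passed in explicitly so the function runs without ONNX Runtime installed.

```python
# Hypothetical mapping from this tool's device names to ONNX Runtime
# execution provider names.
PROVIDER_FOR_DEVICE = {
    "tensorrt": "TensorrtExecutionProvider",
    "openvino": "OpenVINOExecutionProvider",
}


def provider_available(target_device, available_providers):
    """Return True if the provider for target_device is in available_providers."""
    provider = PROVIDER_FOR_DEVICE.get(target_device.lower())
    if provider is None:
        raise ValueError(f"Unsupported target device: {target_device}")
    return provider in available_providers


# With ONNX Runtime installed, the real list would come from
# onnxruntime.get_available_providers(); fixed lists are used here.
print(provider_available("tensorrt", ["CPUExecutionProvider"]))
print(provider_available("OpenVINO", ["OpenVINOExecutionProvider", "CPUExecutionProvider"]))
```

Passing the provider list as an argument keeps the check easy to test; a caller would typically supply the list reported by the installed runtime.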
Installation
Install the required dependencies using pip:
pip install onnx onnxruntime optimum numpy pytest
Usage
CLI
Run the tool from the command line:
python llm_edge_deployer.py --input_model <path_to_model> \
--target_device <device_type> \
--test_sample <path_to_test_sample> \
--output_model <path_to_output_model>
- --input_model: Path to the optimized model file.
- --target_device: Target edge device type (e.g., tensorrt, openvino).
- --test_sample: Path to the test sample JSON file.
- --output_model: (Optional) Path to save the converted ONNX model. Default is output_model.onnx.
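The `--test_sample` file is read with `json.load` and converted to a float32 NumPy array, so it should contain a (nested) list of numbers whose shape matches the model's first input. A minimal sketch for producing such a file follows; the shape `[1, 4]`, the values, and the file name are illustrative assumptions, not values the tool requires.

```python
import json
import os
import tempfile

# Illustrative batch of one sample with four features (assumed shape [1, 4]).
sample = [[0.1, 0.2, 0.3, 0.4]]

# Write to a temporary directory so the example does not clobber real files.
sample_path = os.path.join(tempfile.mkdtemp(), "test_sample.json")
with open(sample_path, "w") as f:
    json.dump(sample, f)

# Reading it back mirrors what the tool does before building the input array.
with open(sample_path) as f:
    loaded = json.load(f)

print(len(loaded), len(loaded[0]))
```

Because the tool casts the sample to float32, a model that expects integer token IDs (as many language models do) would likely need a different cast.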
Library
You can also use the tool as a Python library:
from llm_edge_deployer import convert_to_onnx, check_device_compatibility, run_test_inference
# Convert model to ONNX
output_model_path = convert_to_onnx("path/to/input_model", "path/to/output_model.onnx")
# Check device compatibility
check_device_compatibility("tensorrt")
# Run test inference
result = run_test_inference("path/to/output_model.onnx", "path/to/test_sample.json")
print("Inference result:", result)
Source Code
import argparse
import os
import json
import numpy as np
import onnx
import onnxruntime as ort
from optimum.onnxruntime import ORTModel


def convert_to_onnx(input_model_path, output_model_path):
    """Convert the input model to ONNX format."""
    try:
        model = ORTModel.from_pretrained(input_model_path)
        model.save_pretrained(output_model_path)
        return output_model_path
    except Exception as e:
        raise RuntimeError(f"Failed to convert model to ONNX: {e}")


def check_device_compatibility(target_device):
    """Check compatibility of the target device."""
    supported_devices = ["tensorrt", "openvino"]
    if target_device.lower() not in supported_devices:
        raise ValueError(f"Unsupported target device: {target_device}. Supported devices are: {supported_devices}")
    return True


def run_test_inference(onnx_model_path, test_sample_path):
    """Run a test inference on the ONNX model."""
    try:
        session = ort.InferenceSession(onnx_model_path)
        with open(test_sample_path, "r") as f:
            test_sample = json.load(f)
        input_name = session.get_inputs()[0].name
        input_data = np.array(test_sample, dtype=np.float32)
        result = session.run(None, {input_name: input_data})
        return [np.array(r) for r in result]  # Ensure result is a list of numpy arrays
    except Exception as e:
        raise RuntimeError(f"Failed to run test inference: {e}")


def main():
    parser = argparse.ArgumentParser(description="LLM Edge Deployer")
    parser.add_argument("--input_model", required=True, help="Path to the optimized model file.")
    parser.add_argument("--target_device", required=True, help="Target edge device type (e.g., tensorrt, openvino).")
    parser.add_argument("--test_sample", required=True, help="Path to the test sample JSON file.")
    parser.add_argument("--output_model", default="output_model.onnx", help="Path to save the converted ONNX model.")
    args = parser.parse_args()
    try:
        check_device_compatibility(args.target_device)
        print(f"Target device {args.target_device} is compatible.")
        output_model_path = convert_to_onnx(args.input_model, args.output_model)
        print(f"Model converted to ONNX format and saved at {output_model_path}.")
        inference_result = run_test_inference(output_model_path, args.test_sample)
        print(f"Test inference result: {inference_result}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
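One thing to watch in the flow above: `save_pretrained`-style methods generally write a directory of files rather than a single file, while `ort.InferenceSession` expects a path to an `.onnx` file. If that applies in your environment, a small helper can bridge the two. The sketch below is an assumption, not part of the tool; `resolve_onnx_path` is a hypothetical name, and the demo at the end uses an empty placeholder file rather than a real model.

```python
import glob
import os
import tempfile


def resolve_onnx_path(path):
    """Return an .onnx file path for `path`, which may be a file or a directory."""
    if os.path.isfile(path):
        return path
    if os.path.isdir(path):
        # Sort for a deterministic choice when several .onnx files exist.
        candidates = sorted(glob.glob(os.path.join(path, "*.onnx")))
        if candidates:
            return candidates[0]
    raise FileNotFoundError(f"No ONNX model file found at: {path}")


# Demo: a temporary directory holding an empty placeholder model file.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "model.onnx"), "wb").close()
resolved = resolve_onnx_path(demo_dir)
print(os.path.basename(resolved))
```

A caller could pass the result of `convert_to_onnx` through such a helper before handing it to `run_test_inference`.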
Details
- Tool Name: llm_edge_deployer
- Category: Local LLM Deployment
- Generated: June 12, 2026
- Tests: Passing
- Fix Loops: 3
Quick Install
Clone just this tool:
git clone --depth 1 --filter=blob:none --sparse \
  https://github.com/ptulin/autoaiforge.git
cd autoaiforge
git sparse-checkout set generated_tools/2026-06-12/llm_edge_deployer
cd generated_tools/2026-06-12/llm_edge_deployer
pip install -r requirements.txt 2>/dev/null || true
python llm_edge_deployer.py