User:
I want to detect images from a trash can of Paper, glass, metal, plastic, and I want to classify it using an AI model recommend me a model and guide me a process to deploy it so that I will use it through its api
Assistant:
1 — Pick a model that can both find and label recyclables
For a trash‑can view you’ll often have several items in one frame, so an object‑detection network is a better fit than a “whole‑image” classifier.
The current sweet spot for real‑time detectors in 2025 is Ultralytics YOLO‑v10 (choose the s variant, yolov10s
, unless you need to squeeze onto very small hardware)—it’s fast, accurate and NMS‑free, with ready‑made training/export tooling and many deployment formats (TorchScript, ONNX, TensorRT, CoreML, etc.) (docs.ultralytics.com, docs.ultralytics.com)
2 — Build (or extend) a dataset
What you need | Where to get it | Notes |
---|---|---|
Images of paper, glass, metal, plastic (plus “other”) | • [TrashNet] 2 527 photos (6 classes) • Kaggle Garbage Classification • Your own phone/camera images | Shoot your own photos in the target trash‑can to capture lighting & background. Balance classes with augmentation if necessary. |
Bounding‑box annotations | Label by hand in Roboflow, Label‑Studio or CVAT | Four classes: paper, glass, metal, plastic . Optionally add other or cardboard for completeness. |
TrashNet already covers the four materials you listed and is widely used in research (github.com).
3 — Train the model
# 3.1 install
python -m pip install "ultralytics>=0.4" # includes YOLO‑v10
# 3.2 folder structure
datasets/
├── images # train/val/test images
└── labels # YOLO txt labels
# 3.3 dataset yaml (trash.yaml)
path: datasets
train: images/train
val: images/val
names: [paper, glass, metal, plastic]
# 3.4 train
yolo detect train \
model=yolov10s.pt \
data=trash.yaml \
imgsz=640 \
epochs=150 \
batch=16 \
device=0
Ultralytics saves your best checkpoint to runs/detect/train/weights/best.pt
.
4 — Export for inference
yolo export model=runs/detect/train/weights/best.pt format=onnx
# others: torchscript, tensorrt, coreml …
The YOLO‑v10 docs list every supported export backend (docs.ultralytics.com).
5 — Expose an HTTP API (two common paths)
Path | When to choose | Steps (summary) |
---|---|---|
A. Ultralytics HUB | Need “zero‑ops” hosting, auto‑scaling, HTTPS & auth handled for you | 1) yolo hub login 2) yolo hub push runs/detect/train/weights/best.pt 3) In the HUB UI press Start Endpoint. You receive a URL + API‑KEY. (docs.ultralytics.com) |
B. Self‑host FastAPI + Docker | You already run containers on AWS/GCP/Azure/Jetson edge | 1) Create app.py (below) 2) Build image 3) Deploy to ECS, Cloud Run, etc. 4) Front‑end with an API‑gateway or Traefik for TLS. (docs.ultralytics.com) |
Minimal FastAPI (CPU or GPU):
# app.py
from io import BytesIO
from fastapi import FastAPI, File, UploadFile
from ultralytics import YOLO
from PIL import Image
app = FastAPI()
model = YOLO("best.pt") # path to your exported model
@app.post("/predict")
async def predict(file: UploadFile = File(...)):
img = Image.open(BytesIO(await file.read()))
results = model(img)[0]
return [
{"cls": model.names[int(b.cls)],
"conf": float(b.conf),
"xyxy": [float(x) for x in b.xyxy[0]]}
for b in results.boxes
]
Dockerfile
(GPU build shown):
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
WORKDIR /app
RUN apt-get update && apt-get install -y python3-pip
COPY requirements.txt . # ultralytics fastapi uvicorn[standard] pillow
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
6 — Calling the API
curl -X POST http://YOUR_HOST/predict \
-H "accept: application/json" \
-H "Content-Type: multipart/form-data" \
-F "file=@photo.jpg"
Sample JSON response:
[
{"cls":"plastic","conf":0.92,"xyxy":[34,12,180,220]},
{"cls":"paper","conf":0.87,"xyxy":[200,40,350,260]}
]
7 — Production checklist
Area | Recommendation |
---|---|
Accuracy | Use Mosaic & HSV augmentations; oversample under‑represented classes. |
Latency | Export TensorRT or OpenVINO on GPU/CPU respectively; batch requests if camera FPS allows. |
Monitoring | Log incoming images & predictions, compute a running confusion matrix, set alerts on confidence or class distribution drift. |
Update loop | Periodically label mis‑predictions, retrain, and roll out a new best‑v2.pt (Ultralytics HUB can handle versioning automatically). |
Security | Require an API key or OAuth in FastAPI middleware; limit request size to, e.g., 5 MB. |
8 — If classification‑only is truly enough
When you are certain you will have exactly one item per photo, a light classifier such as EfficientNet‑V2‑S or MobileNet‑V3‑Large fine‑tuned on the same dataset will be simpler and smaller; you can still deploy it with the identical FastAPI + Docker pattern.
TL;DR Fine‑tune YOLO‑v10s on a TrashNet‑plus‑custom dataset, export to ONNX/TensorRT, and serve it behind a FastAPI container (or one‑click Ultralytics HUB). You’ll get millisecond‑level inference and a clean REST/JSON interface ready for any client.