AI/ML pipeline
Recommendations, image recognition, and fraud detection
Overview
Current state
Item listings are served by basic queries, categorization is manual, and there is no fraud detection.
Target state
- Personalized recommendations per user
- Auto-tagging images (brand, condition, category)
- Fraud detection for listings and users
- Price suggestions based on market data
Tech stack
- Embeddings: OpenAI or Cohere for text embeddings; image embeddings need a multimodal model, since OpenAI's embeddings endpoint accepts text only
- Vector DB: Supabase pgvector or Pinecone
- ML Models: Hugging Face for image classification
- Processing: Vercel AI SDK, background jobs
Features
Personalized Recommendations
- Track user views, likes, purchases
- Generate user preference embeddings
- Similar items based on browsing history
- "You might like" carousel on home
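One way to turn tracked views, likes, and purchases into a user preference embedding is a weighted average of the embeddings of the items the user interacted with. The sketch below assumes that approach; the `Interaction` shape, the weights, and the function names are illustrative and not part of the existing codebase.

```typescript
type InteractionKind = "view" | "like" | "purchase";

interface Interaction {
  kind: InteractionKind;
  itemEmbedding: number[];
}

// Assumed weights: stronger intent signals count more. Tune against real data.
const INTERACTION_WEIGHTS: Record<InteractionKind, number> = {
  view: 1,
  like: 3,
  purchase: 5,
};

// Scale a vector to unit length so cosine distance behaves consistently.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v : v.map((x) => x / norm);
}

// Weighted sum of item embeddings, normalized. Returns null for users with no history.
function buildUserPreferenceEmbedding(interactions: Interaction[]): number[] | null {
  if (interactions.length === 0) return null;
  const dim = interactions[0].itemEmbedding.length;
  const sum = new Array<number>(dim).fill(0);
  for (const { kind, itemEmbedding } of interactions) {
    if (itemEmbedding.length !== dim) {
      throw new Error("All embeddings must have the same dimension");
    }
    const weight = INTERACTION_WEIGHTS[kind];
    for (let i = 0; i < dim; i++) sum[i] += weight * itemEmbedding[i];
  }
  return normalize(sum);
}
```

The resulting vector can be compared against item embeddings with the same cosine-distance query used for item-to-item similarity, which would feed the "You might like" carousel.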
Image Recognition
- Auto-detect brand from logo/tags
- Suggest category from item photo
- Condition assessment (new, used, worn)
- Background removal for cleaner listings
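Classifier output still has to be turned into suggestions the seller can accept or reject. The sketch below assumes predictions arrive as `{ label, score }` pairs and uses two hypothetical thresholds: one for showing a suggestion and a stricter one for pre-selecting it. The names and threshold values are assumptions.

```typescript
interface Prediction {
  label: string;
  score: number; // model confidence in [0, 1]
}

interface TagSuggestion extends Prediction {
  autoApply: boolean; // true when confident enough to pre-select in the listing form
}

// Keep the most confident predictions above `suggestAt`, best first.
function suggestTags(
  predictions: Prediction[],
  suggestAt = 0.5,
  autoApplyAt = 0.9,
  maxTags = 3,
): TagSuggestion[] {
  return predictions
    .filter((p) => p.score >= suggestAt)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxTags)
    .map((p) => ({ ...p, autoApply: p.score >= autoApplyAt }));
}
```

Running brand, category, and condition through the same function with separate thresholds would allow a cautious setting for condition, where a wrong guess is more costly.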
Fraud Detection
- Duplicate listing detection (image similarity)
- Suspicious pricing alerts
- Account behavior scoring
- Automated flagging for review
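Duplicate listing detection via image similarity can be illustrated with cosine similarity over image embeddings. This is an in-memory sketch for clarity; in practice the comparison would more likely run as a pgvector query. The `ListingImage` shape and the 0.95 threshold are assumptions.

```typescript
interface ListingImage {
  listingId: string;
  imageEmbedding: number[];
}

interface DuplicateMatch {
  listingId: string;
  similarity: number;
}

// Cosine similarity in [-1, 1]; returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return existing listings whose main image is nearly identical to the candidate's.
function findLikelyDuplicates(
  candidate: ListingImage,
  existing: ListingImage[],
  threshold = 0.95, // assumed cutoff; calibrate on labeled duplicates
): DuplicateMatch[] {
  return existing
    .filter((l) => l.listingId !== candidate.listingId)
    .map((l) => ({
      listingId: l.listingId,
      similarity: cosineSimilarity(candidate.imageEmbedding, l.imageEmbedding),
    }))
    .filter((m) => m.similarity >= threshold)
    .sort((a, b) => b.similarity - a.similarity);
}
```

Matches above the threshold would be flagged for review rather than blocked outright, since legitimate sellers may list several units of the same item.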
Price Suggestions
- Market analysis for similar items
- Historical price trends
- "Price to sell fast" vs "maximize profit"
- Alerts when items are underpriced
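A simple way to derive the two pricing modes is to take percentiles of recent prices for comparable items. The sketch below assumes the 25th percentile for "price to sell fast" and the 75th for "maximize profit"; these cutoffs, the minimum sample size, and the underpricing ratio are illustrative choices.

```typescript
interface PriceSuggestion {
  sellFast: number;
  median: number;
  maximizeProfit: number;
}

// Percentile with linear interpolation; expects an ascending, non-empty array.
function percentile(sorted: number[], p: number): number {
  const idx = (sorted.length - 1) * p;
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
}

// Returns null when there are too few comparables to say anything useful.
function suggestPrices(comparablePrices: number[]): PriceSuggestion | null {
  if (comparablePrices.length < 3) return null; // assumed minimum sample size
  const sorted = [...comparablePrices].sort((a, b) => a - b);
  return {
    sellFast: percentile(sorted, 0.25),
    median: percentile(sorted, 0.5),
    maximizeProfit: percentile(sorted, 0.75),
  };
}

// Flag a listing priced well below the "sell fast" level.
function isUnderpriced(price: number, suggestion: PriceSuggestion, ratio = 0.8): boolean {
  return price < suggestion.sellFast * ratio;
}
```

The comparables would typically come from the similarity search, restricted to items that actually sold.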
Architecture
Data flow
User action → Event tracking → Feature store
                                    ↓
                               ML Pipeline
                                    ↓
                    Predictions → Cache → API

Embedding pipeline
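import OpenAI from "openai";
import { eq } from "drizzle-orm";
// `db`, `items`, `Item`, and `generateImageEmbedding` come from the application's own modules.

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function generateItemEmbeddings(item: Item) {
  // Embed the searchable text fields as one string; empty fields are skipped.
  const textEmbedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: [item.title, item.description, item.brand].filter(Boolean).join(" "),
  });
  const imageEmbedding = await generateImageEmbedding(item.mainImageUrl);
  // Store both vectors on the item row.
  await db.update(items)
    .set({
      textEmbedding: textEmbedding.data[0].embedding,
      imageEmbedding,
    })
    .where(eq(items.id, item.id));
}

Similarity search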
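import { sql } from "drizzle-orm";

async function findSimilarItems(itemId: string, limit = 10) {
  // `<=>` is pgvector's cosine distance, so 1 - distance is cosine similarity.
  // The target item's embedding is read by a scalar subquery.
  return db.execute(sql`
    SELECT *,
      1 - (text_embedding <=> (SELECT text_embedding FROM items WHERE id = ${itemId})) AS similarity
    FROM items
    WHERE id != ${itemId}
    ORDER BY text_embedding <=> (SELECT text_embedding FROM items WHERE id = ${itemId})
    LIMIT ${limit}
  `);
}

Implementation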
Embeddings foundation
Set up pgvector in Supabase, generate embeddings for existing items, and build similarity search API.
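Generating embeddings for existing items is a one-off backfill that should tolerate individual failures and avoid flooding the embedding API. A minimal sketch, assuming a batch-by-batch approach with a caller-supplied `embedItem` function (for example a wrapper around `generateItemEmbeddings`); the helper name and batch size are hypothetical.

```typescript
interface BackfillResult {
  processed: number;
  failed: string[]; // ids of items whose embedding call rejected
}

// Process items in fixed-size batches; a failure in one item does not stop the run.
async function backfillEmbeddings<T extends { id: string }>(
  items: T[],
  embedItem: (item: T) => Promise<void>,
  batchSize = 20, // assumed; pick a value that respects the provider's rate limits
): Promise<BackfillResult> {
  const result: BackfillResult = { processed: 0, failed: [] };
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Wrap in an async arrow so synchronous throws also become rejections.
    const outcomes = await Promise.allSettled(batch.map(async (item) => embedItem(item)));
    outcomes.forEach((outcome, i) => {
      if (outcome.status === "fulfilled") result.processed++;
      else result.failed.push(batch[i].id);
    });
  }
  return result;
}
```

The `failed` list can be fed back into a later run, which keeps the job safe to repeat.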
Recommendations
Implement user preference tracking, recommendation API endpoints, and "Similar items" on item detail page.
Image recognition
Integrate image classification model, auto-suggest categories on upload, and brand detection from images.
Fraud detection
Add duplicate detection on listing, pricing anomaly alerts, and user behavior scoring.
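For pricing anomaly alerts, one robust option is the modified z-score, which uses the median and the median absolute deviation (MAD) so that a few extreme comparables do not distort the baseline. The sketch below assumes that technique with the commonly used 3.5 cutoff; the minimum sample size and the function names are assumptions.

```typescript
// Median of a non-empty array (does not mutate the input).
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Modified z-score: 0.6745 * (x - median) / MAD.
// Returns null when there is too little data or no spread to compare against.
function modifiedZScore(price: number, comparablePrices: number[]): number | null {
  if (comparablePrices.length < 5) return null; // assumed minimum sample size
  const med = median(comparablePrices);
  const mad = median(comparablePrices.map((p) => Math.abs(p - med)));
  if (mad === 0) return null;
  return (0.6745 * (price - med)) / mad;
}

// Flag prices that sit far from comparable listings in either direction.
function isSuspiciousPrice(price: number, comparablePrices: number[], threshold = 3.5): boolean {
  const z = modifiedZScore(price, comparablePrices);
  return z !== null && Math.abs(z) > threshold;
}
```

A flagged price would go to the review queue alongside the duplicate and account behavior signals rather than triggering an automatic takedown.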
Checklist
Infrastructure
- Enable pgvector extension in Supabase
- Add embedding columns to items table
- Set up background job for embedding generation
- Create vector similarity indexes
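The infrastructure items above map onto a small set of SQL statements. The helper below is a sketch that generates them; the column names and the 512-dimension image embedding in the usage example are assumptions (1536 is the default output size of `text-embedding-3-small`), and the HNSW index type requires a pgvector version that supports it.

```typescript
interface EmbeddingColumn {
  name: string;
  dimensions: number;
}

// Only allow plain identifiers, since names are interpolated into SQL.
function assertIdentifier(name: string): void {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) throw new Error(`Invalid identifier: ${name}`);
}

// Build the statements that enable pgvector, add embedding columns, and index them.
function buildPgvectorMigration(table: string, columns: EmbeddingColumn[]): string[] {
  assertIdentifier(table);
  const statements = ["CREATE EXTENSION IF NOT EXISTS vector;"];
  for (const { name, dimensions } of columns) {
    assertIdentifier(name);
    statements.push(
      `ALTER TABLE ${table} ADD COLUMN IF NOT EXISTS ${name} vector(${dimensions});`,
    );
    // vector_cosine_ops matches the `<=>` cosine distance operator used in queries.
    statements.push(
      `CREATE INDEX IF NOT EXISTS ${table}_${name}_hnsw_idx ON ${table} USING hnsw (${name} vector_cosine_ops);`,
    );
  }
  return statements;
}

// Example: the items table with one text and one image embedding column.
const migration = buildPgvectorMigration("items", [
  { name: "text_embedding", dimensions: 1536 },
  { name: "image_embedding", dimensions: 512 }, // assumed; depends on the image model
]);
```

Each statement is idempotent, so the migration can be re-run without failing on objects that already exist.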
Features
- Similar items API
- User recommendations API
- Image auto-tagging
- Price suggestion engine
- Fraud detection alerts