OpenMark AI
Benchmark AI models on your own prompts with deterministic scoring and real API costs
OpenMark AI provides a web-based benchmarking environment for testing 100+ AI models with deterministic scoring and real API cost metrics. It supports vision and document inputs, helping users identify the most efficient models for their specific RAG pipelines and workflows.
Ideal for: Developers, Data Scientists, and Product Managers who need to validate model performance and stability using their own specific prompts and documents.
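The underlying idea is simple enough to sketch. Below is a minimal, hypothetical Python illustration of deterministic scoring with cost tracking, not OpenMark AI's actual implementation: each model answers the same prompt set, outputs are graded by an exact-match rule rather than an LLM judge, and spend is tallied from per-token prices. The model names, prices, and the run_model callable are illustrative assumptions.

```python
# Minimal sketch of deterministic benchmarking, not OpenMark AI's code.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Result:
    model: str
    accuracy: float  # fraction of prompts answered exactly right
    cost_usd: float  # total spend across the prompt set

# Hypothetical per-1M-token prices (input, output); use your provider's real ones.
PRICES = {
    "flagship-model": (2.50, 10.00),
    "cheaper-model": (0.40, 1.60),
}

def benchmark(model: str,
              cases: list[tuple[str, str]],
              run_model: Callable[[str, str], tuple[str, int, int]]) -> Result:
    correct, cost = 0, 0.0
    for prompt, expected in cases:
        # run_model returns (output_text, input_tokens, output_tokens).
        output, in_toks, out_toks = run_model(model, prompt)
        # Deterministic check: the same output always gets the same score.
        correct += int(output.strip() == expected.strip())
        in_price, out_price = PRICES[model]
        cost += (in_toks * in_price + out_toks * out_price) / 1_000_000
    return Result(model, correct / len(cases), cost)
```

Because the score is a pure function of the model's output, rerunning the same prompt set weeks later makes drift directly visible as a change in accuracy or cost.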
Comments (4)
This is super compelling, especially the focus on reproducible results and real cost efficiency. Testing models against your own prompts without LLM-as-judge feels like a much more honest way to choose the right model.
Built OpenMark AI after finding a cheaper model beat a 'flagship' one for my task. Stop trusting generic benchmarks: test models on YOUR prompts with deterministic scoring, real costs, and 100+ models.
@kean this is very timely. I could have used this when I chose gpt-4o for a client's agentic flow several months ago, only to find out months later that 4.1-mini was performing better for his use case AND was much cheaper...
@theaspirinv thank you! This is exactly what happened to me 8 months ago. I built a RAG pipeline and found out that cheaper models would actually perform better! So I made this benchmarking tool. Now I regularly use it to check for drift.