Arnav Gupta's Tech Blog

#model-evaluation

Articles tagged with #model-evaluation

Evaluating SotA LLM Models trying to solve a net-new LeetCode style puzzle
Claude, GPT, Gemini and DeepSeek try to find optimal placement for men occupying urinal stalls in a restroom!
Jan 23, 2025 · 20 min read
