housing_qa_knowledge_only
- Task Description: Answer questions about eviction law under state law as of 2021, using only the knowledge encoded in the model's weights.
- Task Type: Binary classification
- Document Type: legal question
- Number of Samples: 6853
- Input Length Range: 42-148 tokens
- Evaluation Metrics: accuracy (maximize), f1_macro (maximize)
- Tags: eviction law, legal knowledge
- Paper: A Reasoning-Focused Legal Retrieval Benchmark
- Dataset Download: https://huggingface.co/datasets/reglab/housing_qa
4 submissions
Rank | Model | accuracy | f1_macro | Date |
---|---|---|---|---|
1 | gpt-5-2025-08-07 | 0.715 | 0.705 | 2025-08-08 |
2 | claude-3-haiku-20240307 | 0.593 | 0.588 | 2025-08-04 |
3 | claude-3-5-haiku-20241022 | 0.584 | 0.580 | 2025-08-04 |
4 | gpt-4o-mini-2024-07-18 | 0.544 | 0.544 | 2025-08-04 |
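
The two reported metrics, accuracy and f1_macro, can be illustrated with a minimal sketch for a binary task like this one. It is not the leaderboard's own scoring code. The function names and the "Yes"/"No" label strings in the toy example are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_macro(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    labels: Optional[Sequence[str]] = None,
) -> float:
    """Unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    if labels is None:
        # Assumption: score every label seen in either the gold or predicted set.
        labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with hypothetical labels (not taken from the dataset).
gold = ["Yes", "Yes", "No", "No"]
pred = ["Yes", "No", "No", "No"]
print(f"accuracy={accuracy(gold, pred):.3f}  f1_macro={f1_macro(gold, pred):.3f}")
```

Because f1_macro averages the per-class scores without weighting by class frequency, it can sit below accuracy when a model favors one answer, which is consistent with most rows in the table. scikit-learn's `accuracy_score` and `f1_score(..., average="macro")` compute the same quantities.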