High Tier (r >= 0.85)
Models: Claude-3-Opus, GPT-4o, Gemini-Pro
WVS Correlation: r > 0.90 | Self-Consistency: > 0.90
Mid-High Tier (0.75 <= r < 0.85)
Models: GPT-4, GPT-4o-mini, Mistral-Large, Phi-3-medium, Command-R-Plus
WVS Correlation: 0.80-0.89 | Self-Consistency: 0.85-0.90
Mid-Lower Tier (0.65 <= r < 0.75)
Models: Claude-3-Haiku, o1-mini, Gemma-2-9B, Mistral-Small
WVS Correlation: 0.70-0.79 | Self-Consistency: 0.80-0.85
Lower Tier (r < 0.65)
Models: Smaller open models (Llama-3.1-8B, Phi-3-mini, etc.)
WVS Correlation: < 0.70 | Self-Consistency: < 0.80
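
The tier boundaries in the headings above can be sketched as a small lookup. This is only an illustration: it assumes the r in each heading is the single score used to assign a tier, and the names `TIER_FLOORS` and `tier_for` are hypothetical, not taken from the source.

```python
from typing import List, Tuple

# Lower bounds copied from the tier headings, highest first.
# Assumption: a model's tier is decided by one score r compared to these cutoffs.
TIER_FLOORS: List[Tuple[float, str]] = [
    (0.85, "High"),
    (0.75, "Mid-High"),
    (0.65, "Mid-Lower"),
]


def tier_for(r: float) -> str:
    """Return the tier name for a correlation-style score r."""
    # A correlation lies in [-1, 1]; this check also rejects NaN.
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"r must be in [-1, 1], got {r!r}")
    for floor, name in TIER_FLOORS:
        if r >= floor:
            return name
    # Anything below the lowest floor (r < 0.65) falls in the last tier.
    return "Lower"


if __name__ == "__main__":
    for score in (0.91, 0.85, 0.80, 0.70, 0.50):
        print(score, tier_for(score))
```

Keeping the floors in one descending list means each boundary is stated once, so the half-open intervals (for example 0.75 <= r < 0.85) follow from the order of the checks rather than from repeated upper bounds.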