
Most reasoning benchmarks for LLMs emphasize factual accuracy or step-by-step logic. In finance, however, professionals must not only converge on optimal decisions but also generate creative, plausible futures under uncertainty. We introduce ConDiFi, a benchmark that jointly evaluates divergent and convergent thinking in LLMs for financial tasks.
ConDiFi features 607 macro-financial prompts for divergent reasoning and 990 multi-hop adversarial MCQs for convergent reasoning. Using this benchmark, we evaluated 14 leading models and uncovered striking differences. Despite high fluency, GPT-4o underperforms on Novelty and Actionability. In contrast, models like DeepSeek-R1 and Cohere Command R+ rank among the top for generating actionable insights suitable for investment decisions. ConDiFi provides a new perspective for assessing reasoning capabilities essential to the safe and strategic deployment of LLMs in finance.
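The convergent-reasoning half of such an evaluation reduces to scoring model answers against an MCQ answer key. A minimal sketch of that scoring loop follows; the `MCQ` type, `score_convergent` function, and toy data are hypothetical illustrations, not ConDiFi's actual API or data.

```python
from dataclasses import dataclass

@dataclass
class MCQ:
    """One multiple-choice item: a question, labeled choices, and the key."""
    question: str
    choices: dict[str, str]  # label -> choice text, e.g. {"A": "...", "B": "..."}
    answer: str              # correct label, e.g. "B"

def score_convergent(items: list[MCQ], predict) -> float:
    """Accuracy: fraction of items where the model's predicted label
    matches the answer key. `predict` maps an MCQ to a label string."""
    if not items:
        return 0.0
    correct = sum(1 for q in items if predict(q) == q.answer)
    return correct / len(items)

# Toy usage with a trivial predictor that always answers "B".
items = [
    MCQ("2 + 2 = ?", {"A": "3", "B": "4"}, "B"),
    MCQ("Capital of France?", {"A": "Paris", "B": "Rome"}, "A"),
]
print(score_convergent(items, lambda q: "B"))  # 0.5
```

Scoring the divergent half (Novelty, Actionability) is the harder part and cannot be reduced to key matching; it requires graded judgments over free-form generations.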
Presented at KDD2025: Workshop on Evaluation and Trustworthiness of Agentic and Generative AI Models, Oral Track
(Joint work with the Government Technology Agency of Singapore)
Copyright © 2025 Deep Insight Labs: Collaborative AI Agents for Investment Research and Analysis - All Rights Reserved.