Logical Thinking Performance Task

GLM 4.7 AI Brings Stronger Reasoning, Higher HLE Scores & Cleaner Web Output with Tools

GLM version 4.7 lifts software engineering accuracy from 68% to 73.8%, helping you ship cleaner code and UI faster. Terminal Bench rises from 24.5% to 41%, giving teams steadier ...

Geeky Gadgets

Deepseek-r1 vs OpenAI-o1 – AI Reasoning Performance Comparison

Deepseek, a Chinese company, has introduced its Deepseek R1 model, attracting attention for its potential to rival OpenAI’s latest offerings. Reportedly outperforming OpenAI’s o1 Preview in benchmarks ...

VentureBeat

LLMs excel at inductive reasoning but struggle with deductive tasks, new research shows

Large language models (LLMs) have shown impressive performance on various ...

Hosted on MSN

Scientists just developed a new AI modeled on the human brain — it's outperforming LLMs like ChatGPT at reasoning tasks

Scientists have developed a new type of artificial intelligence (AI) model that can reason differently from most large language models (LLMs) like ChatGPT, resulting in much better performance in key ...

EurekAlert!

AI makes human-like reasoning mistakes

Manipulating content within fixed logical structures. In each of the authors’ three datasets, they instantiate different versions of the logical problems. Different versions of a problem offer the ...

Computerworld

Microsoft introduces Phi-4, an AI model for advanced reasoning tasks

Microsoft has announced Phi-4 — a new AI model with 14 billion parameters — designed for complex reasoning tasks, including mathematics. Phi-4 excels in areas such as STEM question-answering and ...

WinBuzzer

Z.ai Releases GLM-4.7, Claiming GPT-5.1 Parity with ‘Preserved Thinking’ for Agents

The gist: Z.ai has released GLM-4.7, an open-weights AI model that claims performance parity with proprietary leaders like ...

VentureBeat

Phi-4 proves that a 'data-first' SFT methodology is the new differentiator

AI engineers often chase performance by scaling up LLM parameters and data, but the trend toward smaller, more efficient, and better-focused models has accelerated. The Phi-4 fine-tuning methodology ...

Indiatimes

AI models struggle with complex scientific reasoning tasks

New Delhi: A new study by researchers from IIT Delhi and an international university found that today's leading AI models perform well on simple tasks but struggle with the complex reasoning needed ...
