EP29 - Orca 2: Teaching Small Language Models How to Reason


Download the paper - Read the paper on Hugging Face

Charlie: Welcome to episode 29 of Paper Brief! I’m your host Charlie, diving into fascinating ML topics with experts. Today we’re stoked to have Clio with us, a machine learning expert.

Clio: Thrilled to be here, Charlie! And I can’t wait to unpack the intricacies of Orca 2’s reasoning capabilities.

Charlie: Now, Orca 2 is a step forward in teaching small language models to reason. But can you tell us why it veers away from imitation learning, the go-to approach?

Clio: It’s quite the paradigm shift! Instead of just mimicking the outputs of larger models, Orca 2 is trained to tackle different tasks with a range of reasoning strategies, like step-by-step reasoning, recall-then-generate, and direct answering, and to pick the right one for the job. That fosters more independent, strategic thinking.

Charlie: Fascinating! So does this mean Orca 2 essentially learns how to ‘learn’?

Clio: Exactly! It’s learning to select the most effective reasoning method for each task, rather than applying a single strategy to everything.
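
To make that concrete, here’s a minimal, hypothetical sketch of the data-construction idea Clio describes. The strategy names and helper function below are illustrative, not taken from the Orca 2 codebase; the one detail grounded in the paper is that the detailed strategy instruction is shown to the teacher but withheld from the student (the paper’s “prompt erasing” idea).

```python
# Hypothetical sketch of strategy-specific training data; names are
# illustrative, not from the actual Orca 2 pipeline.

# Each task type gets a system instruction that elicits a different
# reasoning strategy from the teacher model.
STRATEGIES = {
    "math_word_problem": "Solve the problem step by step, showing your work.",
    "reading_comprehension": "First recall the relevant facts, then answer.",
    "factual_qa": "Answer directly and concisely.",
}

def build_training_example(task_type: str, question: str, teacher_answer: str) -> dict:
    """Pair a question with a teacher answer elicited under a
    task-appropriate strategy instruction.

    The 'prompt erasing' twist: the detailed instruction goes to the
    teacher but is withheld from the student, so the student must learn
    *which* strategy a task calls for rather than just copying text.
    """
    instruction = STRATEGIES[task_type]  # sent to the teacher when generating
    return {
        "system": "",                 # student trains without the instruction
        "user": question,
        "assistant": teacher_answer,  # produced by the teacher under `instruction`
    }

example = build_training_example(
    "math_word_problem",
    "A train travels 60 km in 1.5 hours. What is its average speed?",
    "Step 1: speed = distance / time. Step 2: 60 / 1.5 = 40. Answer: 40 km/h.",
)
print(example["assistant"])
```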

Charlie: That sounds empowering. But how does Orca 2 perform compared to, let’s say, larger models out there?

Clio: Surprisingly well! It not only surpasses similarly sized models but rivals models 5 to 10 times its size on complex reasoning tasks, and it does so in zero-shot settings, with no task-specific examples.

Charlie: Astounding! Now, we’ve seen language models handle prompts, but how well does Orca 2 tackle reasoning in practical examples?

Clio: Let’s take a scenario where two characters, John and Mark, interact with a ball. Orca 2 correctly deduces what each character knows and therefore believes, a theory-of-mind skill that some of its counterparts didn’t quite nail.
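
For a feel of the task, here’s a Sally-Anne-style false-belief prompt of the kind Clio is referring to. This is a paraphrase for illustration; the exact wording in the paper may differ.

```python
# A paraphrase of the kind of false-belief scenario discussed above (a
# classic Sally-Anne-style test); the paper's exact wording may differ.
prompt = (
    "John and Mark are in a room with a ball, a basket, and a box. "
    "John puts the ball in the box and leaves for work. While John is away, "
    "Mark moves the ball into the basket and leaves for school. "
    "Both return later. Where does each of them think the ball is?"
)

# What the model must infer: Mark saw the move, so he expects the ball in
# the basket; John did not, so he still believes it is in the box.
print(prompt)
```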

Charlie: So, distinguishing perspectives is a part of its reasoning. What else is Orca 2 bringing to the table?

Clio: Beyond perspective-taking, it’s the methodical step-by-step reasoning and the ability to adjust its strategy to a task’s complexity that make Orca 2 exceptional.

Charlie: And that’s ultimately why we’re seeing such impressive benchmark results, right?

Clio: Absolutely. Fine-tuning on data that targets specific reasoning processes is a game-changer for smaller models.

Charlie: Wow, Clio, diving into Orca 2 has been quite the journey! Thanks for shedding light on this innovative leap.

Clio: I loved discussing it, Charlie. Thanks for having me! Anyone curious about Orca 2, dive into the paper – the model weights are publicly available, and it’s a treasure trove of insights.

Charlie: For our listeners, that’s a wrap on episode 29. Stay curious and join us next time on Paper Brief!