The year of agents

January Edition

Nick Sorros and Matthew Upson

Jan 15, 2025

Happy New Year 🎉 Welcome to the January edition of the Token! We hope you had a nice break over the holidays!

Sharing personal data such as health, banking or telecom records requires special care to ensure the data does not end up in the wrong hands or get used in the wrong way. One way to resolve some of those issues is to use a de-centralised platform that makes data discoverable without the data leaving the data store. We recently completed a project in which we evaluated whether such a platform was fit for purpose for a telecoms client that wanted to enable its members to share data safely.

In this issue, we also cover a case study on evaluating a de-centralised platform for a large telecoms consortium 📡

📰 Bites from AI news

OpenAI released o3 with significant jumps in reasoning abilities, mainly demonstrated through coding and math benchmarks. For coding, it ranks 175th globally on Codeforces, one of the most popular online competitions 😮. In math, it is able to solve 1 in 4 of the hardest problems expert mathematicians have created 🤯 https://techcrunch.com/2024/12/20/openai-announces-new-o3-model/
Microsoft released Phi-4, a 14B model with performance similar to some frontier models 🚀 https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft’s-newest-small-language-model-specializing-in-comple/4357090
Google started rolling out its Gemini 2.0 models, beginning with Flash, which seems to outperform 1.5 Pro ⚡ The 2.0 series is advertised as optimised for agentic workflows, some examples of which were shown through Project Astra and Project Mariner. https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/
Meta released Llama 3.3 70B, which delivers performance comparable to its flagship Llama 3.1 405B at a much lower cost 🔥 It matches GPT-4o, Claude 3.5, Gemini 1.5 Pro and Nova Pro in a couple of benchmarks. https://techcrunch.com/2024/12/06/meta-unveils-a-new-more-efficient-llama-model/
Cerebras clocks an inference speed of around 1,000 tokens per second for the largest Llama model, the 405B 😮. For comparison, Groq, another leading AI inference provider, advertises a speed of 736 t/s 🐌 for the small Llama model 🦙 https://cerebras.ai/blog/llama-405b-inference

💼 Evaluating a de-centralised platform for the telecoms industry

Sharing personal data such as health, banking or telecom records requires special care to ensure the data does not end up in the wrong hands or get used in the wrong way. One way to resolve some of those issues is to use a de-centralised platform that makes data visible without the data leaving the data store. We recently completed a project in which we evaluated whether such a platform was fit for purpose for a telecoms client that wanted to enable its members to share data safely.

🔗 Read the entire case study here https://mantisnlp.com/work/telecoms/

💡 The year of agents

To build useful assistants with agents, AI needs world knowledge ✅ to understand our requests and their context, as well as the ability to plan and reason ✅ so that it can navigate complex environments and complete tasks. Arguably, both requirements have now largely been met with the release of o3 🔥

The next step for AI is to take actions in the real world: first by using our computers and eventually by interacting with the physical world, once robotics catches up with the rapid advances in AI. We can already see glimpses of the former with Project Astra from Google, computer use from Anthropic and ChatGPT working with apps. It seems that this year AI will finally break out of the chat box and start completing small tasks for us, like checking us in before a flight 🛫

This will be even more transformational within companies, since it opens up the possibility of automating business tasks that were previously out of reach, as well as taking on others that were previously too costly 🚀 This does not mean agents will replace employees or reinvent your business, but it does mean your business economics will change.

Interested in seeing how your organisation can optimise its processes by integrating AI? Get in touch for a free consultancy call with one of our AI experts https://mantisnlp.com/contact/#cta
