Welcome to the October edition of the Token! Q4 is here and the days are certainly starting to feel a bit shorter (if you’re in the Northern hemisphere!), but we’re still here with your monthly rundown of the latest in the world of AI and NLP.
September was packed with AI news, most notably the release of the much-anticipated o1 model from OpenAI. It quickly jumped to the top of the leaderboards 🥇 and represents a new generation of models that are better at solving complex problems requiring more ‘thinking’.
In this issue, we also cover a new case study with an engineering consultancy client: building an AI assistant that helps them prepare responses to new project tenders. We also share our thoughts on why AI projects sometimes go wrong ❌
📰 Bites from AI news
🖱️ OpenAI introduced o1, a model that is better at complex problems that require more ‘thinking’. o1 represents a step change from GPT-4o: most of GPT-4o’s abilities come from its training, whereas part of o1’s abilities come from spending more time at inference working through the problem 🤔 https://openai.com/index/introducing-openai-o1-preview/
🧠 Qwen 2.5 was released, trained on 18T tokens 😮 The 72B model outperforms Llama 3.1 70B and approaches the performance of the 405B model. The 14B model performs comparably to GPT-4o mini 🔥 https://qwenlm.github.io/blog/qwen2.5/
🧠 Meta released Llama 3.2, with new state-of-the-art small models at 1B and 3B that outperform Phi and Gemma 🔥 It also adds vision to the Llama series with the 11B and 90B variants; the latter is on a par with GPT-4o mini. https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
🧪 GOT was introduced, an end-to-end OCR model that achieves state-of-the-art results with only 580M parameters 😮 https://arxiv.org/abs/2409.01704
💬 Anthropic introduced Contextual Retrieval, a new way to improve your RAG solution. It improves on classic RAG solutions by (1) using an LLM to summarise the context that surrounds each chunk so that it’s easier to retrieve all relevant chunks, and (2) leveraging the traditional BM25 ranking technique 💬 https://www.anthropic.com/news/contextual-retrieval
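To make the Contextual Retrieval idea concrete, here is a minimal sketch of the two steps described above: prepending LLM-generated context to each chunk before indexing, then ranking with BM25. The `contextualize` function, the document summary, and the example chunks are all illustrative assumptions (in the real approach an LLM writes a short blurb situating each chunk in its document), and the `BM25` class is a tiny stdlib reimplementation of the classic Okapi formula rather than a production ranker.

```python
import math
from collections import Counter

def contextualize(chunk: str, doc_summary: str) -> str:
    # Assumption: stand-in for the LLM call that, in Anthropic's approach,
    # generates a short context blurb for each chunk. Here we simply
    # prepend a document summary to illustrate the effect on retrieval.
    return f"{doc_summary} {chunk}"

class BM25:
    """Minimal Okapi BM25 scorer over lowercased whitespace tokens."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = [d.lower().split() for d in docs]
        self.k1, self.b = k1, b
        self.n = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.n
        # Document frequency: in how many docs each term appears.
        self.df = Counter(t for d in self.docs for t in set(d))

    def score(self, query: str, idx: int) -> float:
        doc = self.docs[idx]
        tf = Counter(doc)
        s = 0.0
        for t in query.lower().split():
            if t not in tf:
                continue
            idf = math.log(1 + (self.n - self.df[t] + 0.5) / (self.df[t] + 0.5))
            denom = tf[t] + self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
            s += idf * tf[t] * (self.k1 + 1) / denom
        return s

    def rank(self, query: str):
        # Indices of all docs, best match first.
        return sorted(range(self.n), key=lambda i: self.score(query, i), reverse=True)

# Chunks whose bare text lacks context ("Revenue grew 3%... of what?")
# become retrievable once the prepended context names the company and period.
doc_summary = "ACME Q2 2023 earnings report."
chunks = [
    "Revenue grew 3% over the previous quarter.",
    "Headcount remained flat year over year.",
]
contextualized = [contextualize(c, doc_summary) for c in chunks]
index = BM25(contextualized)
print(index.rank("ACME Q2 revenue growth"))  # → [0, 1]
```

In the full technique, BM25 results are combined with embedding-based retrieval; the sketch only shows the lexical half, which is the part classic RAG pipelines usually leave out.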
💼 AI copilot for engineering projects
We worked with a large Spanish engineering consultancy to build an AI assistant that helps them respond to tenders for new projects by surfacing relevant information from organisational knowledge and past projects.
You can read the entire case study here.
❌ Why AI projects go wrong
While AI is a transformative technology, using it effectively is not always straightforward. We’ve written a blog post about the root causes we most often see behind AI projects that fail to deliver the value the business anticipated. In short: poor planning, too little focus on data before starting, and a lack of ongoing investment.
Read our entire piece here.