Llama 2, ChatGPT performance, GPT4 architecture and more


Jul 28, 2023

Welcome to the seventh edition of our newsletter, The Token! In this edition we take a brief look at the release of Llama 2, currently the best open source LLM by a margin, whether the performance of ChatGPT is degrading, and some rumours about the GPT4 architecture. We also discuss Wix's new AI site generator, which lets you create a website completely from prompts, and OpenAI's custom instructions.

As ever, let us know what you think, and if you find yourself in need of help with an NLP problem, get in touch at hi@mantisnlp.com.

🧪 Llama 2 🦙

Llama 2 landed last week and it is the best open source model available by a margin 🔥 Most importantly, it comes with a commercially friendly license, which means it can be used for most industry use cases 🚀

Llama 2 performs better than all other open source models on most benchmarks and its performance is close to ChatGPT (GPT-3.5). It comes with a context window of 4096 tokens, which can easily be extended using rope_scaling. It is also currently available through the Hugging Face Hub 🤗 and chat.
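To make the rope_scaling idea a little more concrete, here is a minimal sketch of linear scaling of rotary position embeddings in plain Python. It is an illustration under assumptions, not the library's implementation: the function name rope_angles, the tiny head_dim and the scaling factor of 2 are all hypothetical choices made for the example. The idea is that positions are divided by a factor, so positions beyond the trained 4096 window map back into the range the model saw during training.

```python
import math

TRAINED_CONTEXT = 4096  # context window stated for Llama 2


def rope_angles(position, head_dim=8, base=10000.0, scaling_factor=1.0):
    """Rotary angles for one token position (hypothetical helper).

    With linear scaling the position is divided by scaling_factor
    before the usual per-dimension frequencies are applied.
    """
    scaled_position = position / scaling_factor
    return [
        scaled_position / (base ** (2 * i / head_dim))
        for i in range(head_dim // 2)
    ]


# Assumed factor of 2: a position near 8192 is squeezed back inside
# the trained window, so it gets angles the model has already seen.
extended = rope_angles(8190, scaling_factor=2.0)
original = rope_angles(4095)
same_angles = all(math.isclose(a, b) for a, b in zip(extended, original))
```

In practice the model usually still benefits from some fine tuning at the longer length, since the spacing between neighbouring positions becomes finer than what it saw in training.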

Llama 2 comes in three sizes, 7B, 13B and 70B, and in two flavors, base model and chat. The chat flavor is fine tuned for conversations using supervised fine tuning and reinforcement learning in order to increase safety 👏 and helpfulness, using collectively more than 1M examples.

The gap between state of the art and open source is narrowing 💪

🔗 Read more in the technical report https://arxiv.org/pdf/2307.09288.pdf

🧪 ChatGPT performance over time

Has ChatGPT performance dropped over time? The short answer is no. A recently published paper caused a stir online as it seemed to confirm speculation that the performance of ChatGPT has degraded over time 😮 The paper tested the same prompts at different points in time and noticed a consistent drop in performance in various categories such as solving math problems, code generation and question answering.

As others have rightly pointed out, and consistent with the paper's claims, this does not mean that the capability of the model has degraded but rather that its behavior has changed. That is, you may need different prompts over time to elicit similar capabilities. If anything, this speaks to the importance of monitoring when building on top of LLMs, as well as using a particular version of the model when possible 👌
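As a rough illustration of what monitoring plus version pinning can look like, here is a minimal sketch. Everything in it is an assumption made for the example: call_model stands in for whatever client you use, fake_model is a stub so the snippet runs offline, and the pinned model name is just an example of a dated snapshot identifier.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed example of a dated snapshot name; pin whichever version you rely on.
PINNED_MODEL = "gpt-3.5-turbo-0613"


@dataclass
class PromptCheck:
    prompt: str
    expected_substring: str  # a simple pass criterion for the answer


def pass_rate(
    call_model: Callable[[str, str], str],
    checks: List[PromptCheck],
    model: str = PINNED_MODEL,
) -> float:
    """Fraction of checks whose answer contains the expected text."""
    passed = sum(
        1 for check in checks
        if check.expected_substring in call_model(model, check.prompt)
    )
    return passed / len(checks)


def fake_model(model: str, prompt: str) -> str:
    """Offline stub standing in for a real API call (hypothetical)."""
    return "Yes, 17 is a prime number." if "17" in prompt else "I am not sure."


checks = [
    PromptCheck("Is 17 a prime number?", "Yes"),
    PromptCheck("Is 21 a prime number?", "No"),
]
rate = pass_rate(fake_model, checks)
```

Running the same checks on a schedule and alerting when the pass rate moves is one simple way to notice a change in behavior before your users do.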

🔗 Read more in the paper https://arxiv.org/pdf/2307.09009.pdf and this blog

AI Snake Oil

Is GPT-4 getting worse over time?

A new paper making the rounds is being interpreted as saying that GPT-4 has gotten worse since its release. Unfortunately, this is a vast oversimplification of what the paper found. And while the findings are interesting, some of the methods are questionable, so it’s worth digging into the details…

By Arvind Narayanan and Sayash Kapoor

🧪 GPT4 architecture

There have been a number of reports at this point about the GPT4 architecture which, even though not confirmed by OpenAI, are worth keeping in mind. According to those reports, GPT4 has about 1.8T parameters 🔥, an order of magnitude more than GPT3 and the biggest model to date in terms of parameter count. Interestingly, these parameters are spread across 16 expert models 🧠, each with 111B parameters, which makes both training and inference more efficient. This strategy might be more relevant to generalist models than to the specialized models used in industry, but it is worth keeping in mind.

The model has reportedly seen 13T tokens during training, which is also an order of magnitude more than the roughly 1T tokens seen by most open source models released recently. As we have discussed in previous posts, the number of tokens seen is as important as parameter count for performance, so this is another number worth keeping in mind when new models emerge claiming similar performance 📈
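The reported figures are easy to sanity check with a few lines of arithmetic. The numbers below are the unconfirmed ones quoted above, and the variable names are ours; this is a back-of-the-envelope check, not a description of the actual model.

```python
# Unconfirmed figures from the reports discussed above.
NUM_EXPERTS = 16
PARAMS_PER_EXPERT = 111_000_000_000      # 111B
REPORTED_TOTAL_PARAMS = 1_800_000_000_000  # about 1.8T
TRAINING_TOKENS = 13_000_000_000_000     # 13T

# 16 experts of 111B each come to 1.776T, consistent with "about 1.8T".
expert_total = NUM_EXPERTS * PARAMS_PER_EXPERT

# Training tokens seen per parameter, using the rounded total.
tokens_per_param = TRAINING_TOKENS / REPORTED_TOTAL_PARAMS

print(f"expert parameters: {expert_total / 1e12:.3f}T")
print(f"tokens per parameter: {tokens_per_param:.1f}")
```

So the expert count and size roughly add up to the headline parameter count, and the model would have seen around 7 training tokens per parameter.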

Finally, the model was reportedly trained using an 8K context window, which was later increased to 32K via fine tuning. As we have seen recently, increasing context window size has proven quite hard for others, so it is possible they use a combination of fine tuning and position interpolation, which is something we discussed in a previous post.

🔗 Read more in this latest post https://www.linkedin.com/feed/update/urn:li:activity:7085589908110635008

🖱 Wix AI Site Generator

Wix is one of the most popular no code platforms to create websites. It already integrates a number of tools based on generative AI such as the ability to create copy or images based on prompts as well as changing layouts, removing the background from an image and more.

Wix has now announced the next set of AI tools it's building, one of which allows you to create a complete website based purely on a conversation where you describe your website and pick relevant images 🤯 What's even more interesting in this case is that the site created can be further customized using either traditional Wix tools or further prompting in the assistant ✨ This is important as it is unlikely that the tool will get it right the first time, but it should decrease the time it takes you to create a working website for your business.

Another interesting tool announced is an AI assistant for managing your business. This will help you tailor your website based on current content and analytics 🔮 It offers us another glimpse of what is to come, i.e. specialised assistants that help us manage different aspects of our personal and professional lives.

🖱 OpenAI custom instructions

OpenAI released the ability to include custom instructions in your conversations with ChatGPT that persist between sessions ✨ This allows you to customize the responses you receive without having to repeat the same prompts to get your ideal response. For example, you might prefer to see code examples instead of mathematical equations, or prefer tables to bullet points.

We believe customizing the assistants we use to better suit our needs will be the norm in the near future. We might have different personalization settings for different types of conversations. We might also be able to change those instructions via text or have the assistant learn them by interacting with us 🔮

❗ Note that this feature is not yet available in the UK and EU
