Technology Innovation Institute Releases World’s Best Performing Open Source State Space Language Model

Jul 22, 2024
  • The Falcon Mamba 7B is the best-performing open source State Space Language Model (SSLM) in the world, as independently verified by Hugging Face
  • SSLMs have a low memory cost and don't require additional memory to generate arbitrarily long sequences
  • Falcon Mamba 7B outperforms traditional transformer architecture models such as Meta's Llama 3 8B and Mistral's 7B
  • New model reflects the innovation and pioneering approach of Abu Dhabi in AI research and development

Abu Dhabi, UAE, 22 July 2024: The Technology Innovation Institute (TII), a leading global scientific research center and the applied research pillar of Abu Dhabi's Advanced Technology Research Council (ATRC), has released a new large language model in its Falcon series, the Falcon Mamba 7B. The new model is the best-performing open source State Space Language Model (SSLM) in the world, as independently verified by Hugging Face.

The first SSLM in the Falcon series, the new model departs from prior Falcon releases, which all use a transformer architecture. Falcon Mamba 7B is yet another example of the pioneering research the institute conducts and of the breakthrough tools and products it makes available to the community in an open source format.
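Because the model is released openly, it can be loaded through the Hugging Face transformers library in the same way as other Falcon checkpoints. The snippet below is a minimal usage sketch only; the repository ID tiiuae/falcon-mamba-7b is an assumption, as this release does not name the exact checkpoint.

    # Minimal usage sketch; the repository ID below is an assumption,
    # not stated in this release.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "tiiuae/falcon-mamba-7b"  # assumed Hugging Face repo ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer("State space language models are", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))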

Hugging Face has created a new leaderboard for open source LLMs with more stringent benchmarks. Among transformer architecture models, Falcon Mamba 7B outperforms Meta's Llama 3 8B and Mistral's 7B on both Hugging Face's old and new benchmarks. Among SSLMs, Falcon Mamba 7B beats all other open source models on the old benchmarks, and it will be the first model of its kind on Hugging Face's new, tougher leaderboard.

State Space models are extremely performant at understanding complex systems that evolve over time, and they differ from transformer models in how they take in information: rather than holding the entire input in view at once, an SSLM compresses everything it has seen so far into a fixed-size state. This means SSLMs don't require additional memory to digest arbitrarily long inputs.

Transformer models, on the other hand, are efficient at remembering and using information they processed much earlier in a sequence, because they keep the full context available. That strength is also their limitation: the memory they need grows with the length of the input, which constrains how much information they can process.
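To make the contrast concrete, here is a minimal Python sketch of the general principle (an illustration only, not Falcon Mamba's actual implementation): an SSM-style model folds each new token into a fixed-size state, while a transformer appends to a cache that grows with every token.

    import numpy as np

    # Illustrative sketch of the memory contrast; the matrices and sizes
    # here are arbitrary, not Falcon Mamba's actual parameters.
    d_state, d_model = 16, 64
    rng = np.random.default_rng(0)
    A = 0.01 * rng.standard_normal((d_state, d_state))  # state transition
    B = 0.01 * rng.standard_normal((d_state, d_model))  # input projection

    def ssm_step(state, x):
        # SSM: everything seen so far is compressed into a fixed-size state.
        return np.tanh(A @ state + B @ x)

    state = np.zeros(d_state)
    kv_cache = []  # a transformer instead caches keys/values for every past token
    for t in range(10_000):  # arbitrarily long sequence
        x = rng.standard_normal(d_model)
        state = ssm_step(state, x)  # memory stays constant at d_state floats
        kv_cache.append(x)          # memory grows linearly with t

    print(f"SSM state: {state.size} floats (constant)")
    print(f"KV cache: {len(kv_cache)} entries (grows with sequence length)")

The constant-size state is what allows an SSLM to generate arbitrarily long sequences without additional memory, as noted in the highlights above.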

SSLMs can find applications in various fields such as estimation, forecasting, and control tasks. Transformer architecture models excel at Natural Language Processing tasks such as machine translation and text summarization, and they are also commonly applied to computer vision and audio processing.

H.E. Faisal Al Bannai, Secretary General of ATRC and Adviser to the UAE President for Strategic Research and Advanced Technology Affairs, said: “This new language model exemplifies Abu Dhabi’s emergence as a leading hub for AI research and development. It underscores the region’s unwavering commitment to innovation, consistently pushing the boundaries in the field of artificial intelligence and shaping our connected future.”

Dr. Najwa Aaraj, Chief Executive of TII, said: “It is my pleasure to recognize Dr. Hakim’s exceptional research that resulted in the Falcon Mamba 7B. This new language model is pioneering work and paves the way for further innovations that will enhance human capabilities and improve lives.”

Dr. Hakim Hacid, Acting Chief Researcher of TII’s AI Cross-Center Unit, said: “As we introduce the Falcon Mamba 7B, I’m proud of the collaborative ecosystem of TII that nurtured its development. This novel language model represents a significant stride forward, inspiring fresh perspectives and further fueling the quest for intelligent systems.”