Meta AI has released Code Llama 70B, a new version of its code generation model. One of the largest open-source AI models for code generation, it is a significant upgrade over its predecessor, and is both faster and more accurate.
Code Llama 70B is trained on 500 billion tokens of code and code-related data, and has a large context window of 100,000 tokens, allowing it to process and generate longer and more complex code in a range of languages including C++, Python, PHP and Java.
Based on Llama 2, one of the largest large language models (LLMs) in the world, Code Llama 70B has been fine-tuned for code generation. Like its base model, it relies on self-attention, a mechanism that lets the model weigh every token in its input against every other token, which helps it capture relationships and dependencies in code.
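To give a rough sense of what self-attention does, here is a minimal sketch in plain Python. It is an illustration only, not Meta's implementation: the toy two-dimensional embeddings and the function names are assumptions made for this example, and production models add learned query, key and value projections, multiple attention heads and causal masking.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Each token's output is a weighted average of all token vectors,
    weighted by how similar each one is to the token itself.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this token to every token, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        weights = softmax(scores)
        # Blend all token vectors according to those weights.
        mixed = [sum(w * value[i] for w, value in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy embeddings for the token sequence "x", "=", "x".
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(embeddings)
```

In this toy input the two occurrences of "x" share an embedding, so the first "x" attends to the second as strongly as to itself and more strongly than to "=". That is the sense in which attention links related parts of a program, such as a variable's definition and its later uses.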
Uphill battle
Another highlight of the new model is CodeLlama-70B-Instruct, a variant tailored to understanding natural language instructions and generating code accordingly.
Meta CEO Mark Zuckerberg said: “The ability to code has also proven important for AI models to process information in other domains more rigorously and logically. I am proud of the progress here and look forward to bringing this progress to Llama 3 and future models.”
Code Llama 70B is available for free download under the same license as Llama 2 and previous Code Llama models, allowing both researchers and commercial users to use and modify it.
Despite the improvements, Meta faces a tough challenge in winning over developers currently using GitHub Copilot, the leading AI coding tool, created by GitHub and OpenAI. Many developers are also suspicious of Meta and its data collection processes, and many are not fans of AI-generated code in the first place. Such code can often require serious debugging, and it lets non-programmers produce code that they enjoy using but don’t understand, which can lead to problems later.