Opening up generative AI

Generative AI has enormous potential to revolutionize business, create new opportunities and make employees more efficient in the way they work. According to McKinsey, more than a quarter of business leaders say generative AI is a board-level agenda item, while 79 percent of respondents have already used generative AI.

These technologies are already reshaping the software industry. IDC found that 40 percent of IT managers think generative AI “will allow us to create much more innovative software,” while GBK Collective estimates that 78 percent of companies expect to use AI for software development within the next three to five years. About half of video game companies already use generative AI in their work processes, according to research from the Game Developers Conference.

All these signals show that generative AI adoption is accelerating. However, the number of developers with the skills needed to start building generative AI-powered applications is limited. For companies looking to build and operate their own generative AI-powered services, rather than relying on a provider’s offering, integration will be key to using business data more effectively.

Carter Rabasa

Head of Developer Relations at DataStax.

Where are the holes?

So what are the challenges surrounding generative AI? The first is making data ready for generative AI systems. The second is integrating these systems with each other and developing software around generative AI capabilities.

For many companies, generative AI is inextricably linked to large language models (LLMs) and services like ChatGPT. These tools take text input, translate it into a semantic query that the service can understand, and then provide answers based on their training data. For simple questions, a ChatGPT response may be sufficient. But for companies, this level of general knowledge is not enough.

To solve this problem, techniques such as Retrieval Augmented Generation (RAG) are needed. RAG covers how companies can collect their data, make it available for querying, and then provide that information to the LLM as context. This data can exist in multiple formats, from corporate knowledge bases or product catalogs to text in PDFs or other documents. The data must be collected and converted into vectors, which encode it as numerical values that preserve semantic information and relationships.
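To illustrate the shape of that conversion, here is a deliberately simplified sketch. Real systems use a trained embedding model (typically called through an embeddings API); the hash-based `toy_embed` below is a stand-in that only shows the transformation itself: text in, fixed-length numeric vector out.

```python
import hashlib
import math

def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Toy embedding: hash each word into one of `dims` buckets.

    A production embedding model captures semantics; this stand-in
    only demonstrates the pipeline step: arbitrary-length text is
    reduced to a fixed-length numeric vector.
    """
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    # Normalize to unit length so vectors can be compared by cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

vector = toy_embed("refund policy for enterprise customers")
print(len(vector))  # 8 - a fixed-length vector regardless of input length
```

However long the input text, the output has the same dimensionality, which is what lets vectors from different documents be compared against each other.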

Preparing that data involves a step called chunking: breaking your text down into separate units that can each be represented by a vector. There are several approaches here, from individual words to sentences or paragraphs. The smaller each chunk, the more vectors you have to store and the higher the cost; conversely, the larger each chunk, the less precise your matches become. Chunking is still a very new area and best practices are still being developed, so you may need to experiment with your approach to get the best results.
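A minimal sketch of one common approach, sentence-based chunking with a size cap, might look like this (the `max_chars` threshold is an illustrative parameter you would tune for your own data):

```python
import re

def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Greedy sentence-based chunking: pack whole sentences into
    chunks of at most max_chars characters.

    Smaller chunks mean more vectors to store and query (more cost);
    larger chunks dilute what each vector represents.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Lowering `max_chars` produces more, finer-grained chunks; raising it produces fewer, broader ones, which is exactly the trade-off described above.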

Once your data has been chunked and converted into vectors, you need to make it available as part of your generative AI system. When a user request comes in, it is converted into a vector that can then be used to query your data. By comparing the user’s query vector against your company’s vector data, you can find the best semantic matches. These matches can then be passed to your LLM and used as context when the LLM creates its response to the user.
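The retrieval step above can be sketched in a few lines. This is a toy in-memory version, assuming you already have vectors for your chunks and for the query; a real deployment would use a vector database for the similarity search, and the prompt template here is purely illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

def top_matches(query_vec: list[float],
                store: list[tuple[str, list[float]]],
                k: int = 3) -> list[str]:
    """Return the k chunk texts whose vectors best match the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Prepend the retrieved chunks as context for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The string returned by `build_prompt` is what gets sent to the LLM, so the model answers from your retrieved company data rather than from its general training data alone.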

RAG has two main benefits. First, it allows you to provide information to your LLM service for processing without that data being absorbed into the model itself, where it could surface in responses to other users. This means you can use generative AI with sensitive data, because with RAG you maintain control over how that data is used. Second, you can include more time-sensitive data in your responses: you can continue to update the data in your vector database to keep it as current as possible, and then serve it to customers when the right request comes in.

Implementing RAG is a potential challenge because it relies on multiple systems that are currently very new and rapidly evolving. The number of developers familiar with all the technology involved – data chunking, vector embedding, LLMs and the like – is still relatively small, and those skills are in high demand. So everyone will benefit if it becomes easier for more developers to get started with RAG and generative AI.

This is where there can be challenges for developers. Generative AI is most associated with Python, the language data scientists use to build data pipelines. However, Python only ranks third on Stack Overflow’s 2023 list of most popular languages. Expanding support to other languages like JavaScript (the most popular programming language) can help more developers get involved in building generative AI applications or integrating them with other systems.

Abstracting AI with APIs

One approach that can make this process easier is to provide APIs in the languages developers already want to work with. By looking at the most common languages and offering APIs for them, developers can get started with generative AI faster and more efficiently.

This also helps solve another, bigger problem for developers around generative AI: how to get all the parts working together effectively. Generative AI applications will cover a wide range of use cases, from extending current customer service bots or search functions to more autonomous agents that can take over entire work processes or customer requests. Each of these use cases requires multiple components to work together to fulfill a request.

Without APIs, this integration work incurs significant overhead. Each connection between system components has to be managed, updated, and changed as more functionality is requested or new elements are added to the AI application. Using standardized APIs instead makes developers’ work easier to manage over time. It also opens up generative AI to more developers, who can consume components through APIs as services rather than having to create and run their own instances for vector data, data integration, or chunking. Developers can also choose the LLM they want to work with and switch if they find a better alternative, rather than being tied to a specific LLM.
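One way to picture that decoupling is a minimal interface that the application codes against, with each provider wrapped in its own adapter. Everything here is illustrative (the `ChatModel` interface and `EchoModel` adapter are invented for the sketch); the point is that switching LLMs means writing a new adapter, not rewriting the application:

```python
from typing import Protocol

class ChatModel(Protocol):
    """The narrow interface the application depends on, rather than
    any one vendor's SDK."""
    def complete(self, prompt: str) -> str: ...

class EchoModel:
    """Stand-in adapter for demonstration; a real adapter would call
    a provider's API behind the same complete() method."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"

def answer(model: ChatModel, prompt: str) -> str:
    # Application logic only ever sees the ChatModel interface.
    return model.complete(prompt)

print(answer(EchoModel(), "Hello"))  # [echo] Hello
```

Swapping in a different provider then touches one adapter class, leaving `answer` and everything built on top of it unchanged.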

This also makes it easier to integrate generative AI systems into front-end developer tools such as React and Vercel. Enabling developers to implement generative AI in their applications and websites combines front-end design and delivery with back-end infrastructure, so simplifying the stack will be essential for more developers to join in. Making it easier to work with the full Retrieval Augmented Generation stack of technologies – or RAGStack – will be necessary as companies adopt generative AI in their business.


This article was produced as part of TechRadar Pro’s Expert Insights channel, where we profile the best and brightest minds in today’s technology industry. The views expressed here are those of the author and are not necessarily those of TechRadar Pro or Future plc. If you are interested in contributing, you can read more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro
