How Much Does It Cost to Build an AI Application Based on LangChain?
With the recent surge in LLM services, many companies find that existing products do not fully meet their needs, and those that do often come at excessive costs. This has led some to consider developing an AI application tailored to their requirements.
Today, we will estimate the cost for a company to build an AI application based on LangChain.
1. LangChain-Based AI Applications
LangChain is a framework for building applications powered by large language models (LLMs). It provides a standard interface for connecting various components and includes features such as chaining, data awareness, and agents, which make applications easier and faster to build.
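To make the chaining idea concrete, here is a minimal sketch of a LangChain pipeline that pipes a prompt template into a model and an output parser. The package layout and model name follow recent LangChain releases and may differ in your version; treat it as illustrative rather than definitive.

```python
# A minimal LangChain "chain": prompt -> LLM -> output parser.
# Assumes the langchain-core and langchain-openai packages and an OPENAI_API_KEY;
# the model name below is a placeholder.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Summarize the following support ticket in one sentence:\n\n{ticket}"
)
llm = ChatOpenAI(model="gpt-4o-mini")

# The | operator composes the components into one runnable chain.
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"ticket": "The export button returns a 500 error on Safari."}))
```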
The diagram below shows how LangChain and agents operate.
In short, this kind of agent works by running a Python agent in the background that executes code generated by the LLM: a user's question or prompt is interpreted by the LLM and converted into Python code, which is then passed to a Python engine for execution. The Python engine can access and read whatever data the task requires.
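The loop below is a deliberately simplified sketch of that flow, not LangChain's actual agent implementation. `ask_llm` is a hypothetical helper standing in for any chat-completion call; note that executing model-generated code like this requires sandboxing in any real deployment.

```python
# Simplified illustration of the agent flow described above (not LangChain's code).
# ask_llm() is a hypothetical helper wrapping any chat-completion API.
import contextlib
import io

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a real chat-completion call")

def python_agent(question: str, data_path: str) -> str:
    # 1. The LLM translates the user's question into Python code.
    code = ask_llm(
        f"Write Python code that answers: {question!r}\n"
        f"The data is a CSV file at {data_path!r}. Print the answer."
    )
    # 2. The Python engine executes the generated code and captures its output.
    #    WARNING: untrusted generated code must run in a sandbox in production.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {"__name__": "__main__"})
    return buffer.getvalue()
```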
2. LangChain (LLM & AI Agent) Ecosystem
The diagram below illustrates the LangChain ecosystem.
As shown in the diagram, there are numerous solutions in each area of LangChain's ecosystem (and even this does not cover everything), placing the challenging task of choosing and orchestrating the right solutions entirely on the company.
Since it is impractical to select a solution in every area and estimate each one's cost, we will instead focus on a few key components, outlined below, and look at each in turn.
3. Solutions and Estimated Costs for Each Component
As seen in the diagram above, the components that make up the LangChain ecosystem are highly diverse, but for convenience, we have simplified them to a few key elements here.
These mainly include the LLM, vector DB, in-memory data store, and data platform.
To get a rough sense of costs, we have chosen representative solutions and estimated expenses based on hypothetical criteria.
Keep in mind that the actual costs will vary depending on the scale of datasets and the performance or quality requirements of each organization. The figures in this post are for reference only.
1) LLM
The cost of commercial LLMs varies by model and context window size. For popular models, prices per one million tokens are as follows:
Assuming two usage scenarios, 1,000 tokens per day and 1 million tokens per day, the annual costs work out roughly as follows:
For OpenAI models, low usage (1,000 tokens per day) comes to around $1,000 to $50,000 annually, while high usage (1 million tokens per day) ranges from about $1 million to $5.6 million annually.
Self-hosted open-source models such as Falcon-7B, Llama-2-7B, and Vicuna-33B, on the other hand, can be significantly cheaper at high usage than OpenAI, since their hosting cost is largely fixed rather than billed per token.
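As a sanity check on this kind of estimate, the snippet below turns a per-million-token price into an annual figure for the token charges alone; the price used is an illustrative placeholder, not a vendor quote.

```python
# Back-of-the-envelope annual LLM spend from a per-million-token price.
# The example price below is an illustrative placeholder, not a vendor quote.

def annual_cost(tokens_per_day: float, price_per_million_tokens: float) -> float:
    """Annual spend in dollars for a given daily token volume."""
    return tokens_per_day * 365 / 1_000_000 * price_per_million_tokens

# Heavy-usage scenario from the text (1 million tokens per day) at a
# hypothetical $30 per million tokens:
print(f"${annual_cost(1_000_000, 30.0):,.0f} per year in token charges")
```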
2) Vector DB
We selected Pinecone as a representative vector DB solution and set hypothetical criteria as follows:
- Use of four P1.x8 pods: $3,300 per month
- One-time embedding creation cost: $3,300
- Additional cost for each embedding model change: $3,300
- Additional recurring costs for the query-embedding service and response generation: +α (amount unspecified)
Assuming the +α portion is zero, multiplying the above amount by 12 puts the annual cost at approximately 164 million KRW.
(As mentioned previously, please keep in mind that these figures are not absolute and are provided solely as a reference.)
Note that this setup covers only a relatively small dataset with a non-optimized runtime configuration, handling just 30 queries per second. For context, loading an initial 10 million vectors would take about four days at this rate. If higher performance or speed is required, additional replicas become necessary, increasing costs significantly. Companies with large corpora for RAG applications might expect costs 10 to 100 times higher than this baseline estimate.
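For a sense of what that spend corresponds to in practice, here is a minimal sketch of the one-time embedding load and the recurring query path against Pinecone. It follows the v3+ Python client, the index name and key are placeholders, and `embed()` stands in for whatever embedding model you choose.

```python
# Minimal Pinecone sketch: one-time upsert plus a recurring similarity query.
# Uses the pinecone Python client (v3+ API); index name and key are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("docs-index")  # assumes the index already exists

def embed(text: str) -> list[float]:
    # Hypothetical helper: replace with a real embedding model call.
    raise NotImplementedError

# One-time embedding creation: vectorize documents and upsert them.
docs = {"doc-1": "Quarterly revenue grew 12 percent...",
        "doc-2": "The new HR policy takes effect in March..."}
index.upsert(vectors=[
    {"id": doc_id, "values": embed(text), "metadata": {"text": text}}
    for doc_id, text in docs.items()
])

# Recurring query-time cost: embed the question, then search for neighbors.
matches = index.query(vector=embed("How did revenue change?"),
                      top_k=3, include_metadata=True)
```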
3) In-Memory Data Store
An in-memory data store is a database that keeps its working data in RAM, providing sub-millisecond data access, scalability, and high availability.
We selected Redis as a representative in-memory data store and set hypothetical criteria for the cost calculation as shown below.
- 1 Node
- 128GB Memory
- 20 CPU cores
- 2TB Storage
- 14-day backups
Running the above specification as a managed service would cost approximately $6 per hour, or about $4,300 per month. Multiplied by 12, the annual estimate comes to around 71 million KRW.
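In a LangChain-style stack, this store typically serves as a response cache or conversation memory, cutting repeat LLM calls. Below is a minimal sketch of caching LLM answers in Redis with the standard redis-py client; the key scheme, TTL, and `call_llm` helper are illustrative assumptions.

```python
# Caching LLM responses in Redis via redis-py; key scheme and TTL are arbitrary.
import hashlib
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a real LLM call")

def cached_completion(prompt: str, ttl_seconds: int = 3600) -> str:
    key = "llm:" + hashlib.sha256(prompt.encode()).hexdigest()
    cached = r.get(key)                # sub-millisecond lookup on a hit
    if cached is not None:
        return cached
    answer = call_llm(prompt)          # slow, costly call only on a miss
    r.setex(key, ttl_seconds, answer)  # expire stale answers after the TTL
    return answer
```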
4) Data Platform
Leading data platform solutions in the market include Databricks, Snowflake, and Dataiku.
Since all data platform costs fluctuate based on usage, it is difficult to provide a definitive amount.
For mid-sized U.S. companies, Databricks costs typically range from $100,000 to $1 million annually, depending on average data usage. (Source)
Thus, the estimated annual expense for a company might range from about 138 million KRW to 1.38 billion KRW (the $100,000 to $1 million range above).
5) Pipeline Construction and Integration
Finally, there are the costs of constructing and integrating a pipeline that links all components of the LangChain ecosystem. This part is challenging to standardize, and since it can vary from case to case, we will not be estimating the costs for it.
It is important to consider, however, that pipeline construction and integration demand a high level of expertise and considerable time from multiple specialists, potentially resulting in substantial (or even the largest) costs.
Additionally, there will be hidden extra costs related to Machine Learning Operations (MLOps) or Large Language Model Operations (LLMOps).
So, what is the total when we add up all five components? That is right: building an AI application based on LangChain is extremely expensive. Beyond the raw costs, it also requires harmonizing numerous components, as well as managing ongoing operations and maintenance. This is precisely why many companies find AI adoption challenging, and why projects that do get started often fail.
4. Is a Data Platform Necessary?
Apart from LLMs, data platforms make up the largest cost. In most cases, developing an AI application based on LangChain necessitates a data platform. While it is possible to build an AI service without one, doing so significantly increases the amount and difficulty of custom (or outsourced) development.
Below is a brief comparison of characteristics, costs, and difficulty levels for each approach with or without a data platform.
The mentioned solution costs and labor expenses are approximate amounts under specific conditions and are intended only for comparative reference.
5. Conclusion
LangChain is an open-source toolkit for combining LLMs, prompt engineering, RAG, text-to-SQL, other ML services, and external APIs.
RAG and prompt engineering require a vector DB and an in-memory data store, while other machine learning models require a data platform.
As a protocol, LangChain allows any solution that conforms to its standards to plug into its functionality. From a customer perspective, however, using LangChain still means individually selecting and paying for each necessary solution: the vector DB, the in-memory data store, the data platform, and so on.
Meanwhile, the data platforms used for data analysis and machine learning development were designed before LLMs and GPT models, so they do not natively handle vector data. This means unstructured data like images, videos, and text must be stored in the data platform's costly data storage.
If a company opts to develop an AI application using ThanoSQL instead of the LangChain approach, these complex issues can be resolved all at once.
Additionally, ThanoSQL offers significantly higher accuracy and execution speed compared to the LangChain approach, all at a lower cost. This is because, along with LLMs, ThanoSQL leverages Transformer models for its other machine learning workloads, making full use of its built-in vector DB. In practice, raw data is stored in inexpensive distributed storage, such as S3, while the data used for analysis is converted into much smaller embedding vectors, which are then stored and processed in the vector DB.
ThanoSQL enables this through its TAG framework, which lets LLMs, prompt engineering, RAG, text-to-SQL, additional ML models, and external APIs all operate within SQL by integrating Transformer models for both LLM and other machine learning (analysis) tasks.