...
Motivation
Retrieval-Augmented Generation (RAG) systems often struggle to identify and prioritize the most relevant information in large document collections. As a result, irrelevant or low-quality content can end up in the generated responses. Two potential approaches to address this issue are:
...
RAG improvement option | Model used / Tokens to process per request | Cost per 1k input tokens | Cost per month (21,120 requests) | Total monthly cost (estimate) | Total cost per search |
---|---|---|---|---|---|
None | GPT-4o / 7k | $0.005 | $740 | $740 | $0.04 |
More Input Tokens | GPT-4o / 30k | $0.005 | $3,168 | $3,168 | $0.15 |
Chunk Relevancy Sort | GPT-4o / 7k; GPT-4o-mini / 100k | $0.005; $0.000165 | $740 + $350 | $1,090 | $0.05 |
Reranker | GPT-4o / 7k; reranker / 100k | $0.005; $0.00 | $740 + $300 (fixed) | $1,040 | $0.05 |
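The figures in the table can be reproduced with simple arithmetic. The sketch below assumes that each monthly cost is tokens per request divided by 1,000, multiplied by the price per 1k input tokens and by the number of requests. It also assumes the reranker is billed as a flat $300 per month. The function and variable names are illustrative and do not come from any pricing API. The table rounds the results.

```python
# Illustrative cost model (an assumption): tokens / 1000 * price per 1k * requests.
REQUESTS_PER_MONTH = 21_120


def monthly_cost(tokens_per_request: int, price_per_1k: float,
                 requests: int = REQUESTS_PER_MONTH) -> float:
    """Monthly input-token cost in USD for one model stage."""
    return tokens_per_request / 1000 * price_per_1k * requests


# Per-stage monthly costs, using the token counts and prices from the table.
gpt4o_7k = monthly_cost(7_000, 0.005)          # about $739, shown as $740
gpt4o_30k = monthly_cost(30_000, 0.005)        # about $3,168
mini_100k = monthly_cost(100_000, 0.000165)    # about $348, shown as $350
reranker_fixed = 300.0                         # flat monthly fee (assumed from "$300 fixed")

# Total monthly cost per improvement option.
totals = {
    "None": gpt4o_7k,
    "More Input Tokens": gpt4o_30k,
    "Chunk Relevancy Sort": gpt4o_7k + mini_100k,
    "Reranker": gpt4o_7k + reranker_fixed,
}

# Cost per search is the monthly total divided by the monthly request count.
per_search = {name: total / REQUESTS_PER_MONTH for name, total in totals.items()}

if __name__ == "__main__":
    for name, total in totals.items():
        print(f"{name}: ${total:,.2f} per month, ${per_search[name]:.3f} per search")
```

Changing `REQUESTS_PER_MONTH` or the token counts shows how each option scales. The reranker's fixed fee becomes cheaper per search as volume grows, while the token-based options scale linearly.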
...
Conclusion
This documentation provides a comprehensive guide for improving RAG results based on different customer scenarios. Whether dealing with rate limits, latency concerns, or cost management, the options outlined above will help you choose the best configuration for your needs.
For any further questions or personalized recommendations, please contact the customer success team.
For more information: RAG Assessment and Improvement
...