D-Star builds state-of-the-art RAG infrastructure and applications

If you need to build an LLM application that connects to external data sources, we're probably the best team in the world to build it for you.

OUR TEAM

Zach and Nick McCormick

We're brothers who have been working together for our entire careers. We began by raising a few million dollars for a quant hedge fund right out of college. We ran that for five years before starting Superpowered AI (YC S22). Superpowered was one of the first RAG-as-a-service providers, so we've been building RAG systems since the GPT-3 days.

Zach McCormick
Co-founder
Nick McCormick
Co-founder

Services

Our Offerings

Implementation

We can build production-quality full-stack applications (web and mobile) from scratch. Alternatively, if you just need us to do the AI backend, we can do that too.

Advisory

If you already have strong engineering talent in-house and just need to augment your team with specialized RAG expertise, we're your guys. We can help you set up evals, identify failure modes, and iterate on the solution until it meets your performance requirements.

Our Open-Source Philosophy

Pushing the State-of-the-Art in RAG Performance

We love pushing the state of the art in RAG performance. We've developed a good deal of novel technology that improves accuracy for RAG applications. We're also big believers in open-source software, so we open-source much of the core RAG tech we develop.

dsRAG

High-performance retrieval engine for unstructured data that achieves state-of-the-art accuracy on challenging RAG benchmarks

dsParse

Sub-module of dsRAG that does visual file parsing and semantic chunking

Our Latest Insights

Stay up-to-date with our latest thoughts on RAG and LLM technologies.

10 min read

How we went from 32.0% to 96.7% accuracy on FinanceBench

FinanceBench is one of the best and most challenging RAG benchmarks available. It contains a corpus of 368 financial documents (mostly 10-Ks) and 150 questions, which are designed to be the sort of real-world questions a financial analyst might want to ask

Zach McCormick

Co-founder

10 min read

How to use dynamic retrieval granularity to improve RAG performance

We can think of the questions a user might ask over a document or set of documents as falling on a spectrum based on the length of the context string required to accurately answer the question. Let’s call this the optimal context length. On one end of the spectrum we have factoid question answering.

Zach McCormick

Co-founder

10 min read

Solving the out-of-context chunk problem for RAG

Many of the problems developers face with RAG come down to this: Individual chunks don’t contain sufficient context to be properly used by the retrieval system or the LLM. This leads to the inability to answer seemingly simple questions and, more worryingly, hallucinations.

Zach McCormick

Co-founder
