Introducing a huge update for Swiftide, 0.12, with hybrid search for Qdrant, filters for similarity search, a parquet loader, a massive performance boost, and many other improvements.
Trumpets and a big thanks to @ephraimkunz for his first contribution!
Swiftide is a Rust-native library for building LLM applications. Large language models are amazing, but need context to solve real problems. Swiftide allows you to ingest, transform and index large amounts of data fast, and then query that data so it can be injected into prompts. This process is called Retrieval Augmented Generation.
To get started with Swiftide, head over to swiftide.rs, check us out on github, or hit us up on discord.
Hybrid search support
Retrieving the most relevant information for a given query is the key challenge in Retrieval Augmented Generation. Research and our own experience show that similarity search on vectors alone is not enough. The idea behind hybrid search is fairly simple:
- Retrieve n documents with similarity search
- Retrieve n documents with another kind of search (e.g. full text, sparse vectors)
- Rerank the documents for relevancy
- Take the top k documents
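The four steps above can be sketched in plain Rust. This is a minimal illustration, not Swiftide's or Qdrant's implementation: the `reciprocal_rank_fusion` function, the document ids and the constant 60 are assumptions made for the example, and reciprocal rank fusion is only one common way to do the rerank step.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with reciprocal rank fusion (RRF)
/// and return the `top_k` ids with the highest fused score.
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], top_k: usize) -> Vec<String> {
    // Damping constant; 60 is a commonly used default, assumed here.
    const K: f64 = 60.0;

    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, id) in ranking.iter().enumerate() {
            // `enumerate` is 0-based, the usual RRF formula uses 1-based ranks.
            *scores.entry(*id).or_insert(0.0) += 1.0 / (K + rank as f64 + 1.0);
        }
    }

    let mut fused: Vec<(&str, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    fused
        .into_iter()
        .take(top_k)
        .map(|(id, _)| id.to_string())
        .collect()
}

fn main() {
    // Step 1: n results from similarity search, best first.
    let dense = vec!["doc_a", "doc_b", "doc_c"];
    // Step 2: n results from another kind of search, best first.
    let sparse = vec!["doc_b", "doc_d", "doc_a"];
    // Steps 3 and 4: rerank by fused score and keep the top k.
    let top = reciprocal_rank_fusion(&[dense, sparse], 2);
    println!("{:?}", top);
}
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) end up ahead of documents that only one search returned.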
There are two broad ways to implement hybrid search: use multiple data stores, or use a single database that supports both kinds of search.
Qdrant supports hybrid search with sparse vectors. They recently reworked their implementation and Swiftide now fully supports it.
To use hybrid search in Qdrant, both the indexed data and the query need sparse embeddings. You can then build a query pipeline with the HybridSearch strategy.
A full example of the implementation is available on github.
Unfortunately, Lancedb does not yet support hybrid search in its Rust client. Shoot us a message on discord if your solution needs more elaborate search, and we’re happy to see what is possible.
Search filters
Both lancedb and Qdrant now support search filters in their native client language with SimilaritySingleEmbedding search. This exposes the full filter API of both without the need for a wrapper.
Filters are set in the SearchStrategy when creating the query pipeline.
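Conceptually, a filter narrows the candidate set by metadata so that only matching documents are ranked by similarity. The sketch below shows that idea in plain Rust. It does not use Swiftide, Qdrant or lancedb, and the `Doc` struct, its `path` field and `filtered_search` are hypothetical names chosen for the illustration.

```rust
/// A stored chunk with an embedding and one metadata field (hypothetical layout).
struct Doc {
    path: &'static str,
    embedding: [f32; 2],
}

/// Cosine similarity between two vectors of equal length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Keep only documents accepted by `filter`, then return the `top_k` most similar.
fn filtered_search<'a>(
    docs: &'a [Doc],
    query: &[f32],
    filter: impl Fn(&Doc) -> bool,
    top_k: usize,
) -> Vec<&'a Doc> {
    let mut hits: Vec<(&Doc, f32)> = docs
        .iter()
        .filter(|doc| filter(*doc))
        .map(|doc| (doc, cosine(&doc.embedding, query)))
        .collect();
    // Most similar first.
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits.into_iter().take(top_k).map(|(doc, _)| doc).collect()
}

fn main() {
    let docs = [
        Doc { path: "src/lib.rs", embedding: [1.0, 0.0] },
        Doc { path: "README.md", embedding: [0.9, 0.1] },
        Doc { path: "src/main.rs", embedding: [0.0, 1.0] },
    ];
    // Only consider Rust source files, then rank those by similarity.
    let hits = filtered_search(&docs, &[1.0, 0.1], |doc| doc.path.ends_with(".rs"), 1);
    for doc in hits {
        println!("{}", doc.path);
    }
}
```

In this toy data `README.md` is the closest vector overall, but the filter removes it before ranking, so the search returns `src/lib.rs`.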
Lancedb filters with strings in a SQL-like format. Unlike Qdrant, lancedb needs the fields it indexes configured when indexing.
Parquet loader
Swiftide can now load parquet files. Parquet is quickly becoming the de facto standard for ML datasets. This enables experimenting with datasets from HuggingFace, as well as using your own. For now, only plain text columns are supported.
The loader fully streams the content of the parquet file.
Massive performance boost
Well, this is a bit embarrassing and very exciting at the same time. Concurrency was not working fully in streaming pipelines if the future did not yield.
With this fixed, testing large datasets both bound by local resources (with fastembed and local models) and IO-bound (with openai), we see a 30% to 50% improvement in overall performance. Tests were done on a MacBook M3 Pro. The benchmark indexes around 10k chunks, with concurrency set to the number of available CPUs. Of course, the IO-bound case could run with much higher concurrency.
Openai
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| swiftide-0.12 | 4.202 | 4.202 | 4.202 | 1.00 |
| swiftide-0.11 | 6.352 | 6.352 | 6.352 | 1.51 |
FastEmbed
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| swiftide-0.12 | 30.385 ± 0.795 | 29.505 | 31.051 | 1.00 |
| swiftide-0.11 | 41.290 ± 0.127 | 41.161 | 41.415 | 1.36 ± 0.04 |
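The Relative column is each mean divided by the fastest mean, which can be checked against the tables. A short sketch of that arithmetic (the `relative` helper is a name made up for this example):

```rust
/// Ratio of a mean runtime to the fastest mean runtime.
fn relative(mean: f64, fastest: f64) -> f64 {
    mean / fastest
}

fn main() {
    // Mean runtimes in seconds, copied from the tables above.
    let openai = relative(6.352, 4.202);
    let fastembed = relative(41.290, 30.385);
    println!("openai: 0.11 takes {:.2}x as long as 0.12", openai);
    println!("fastembed: 0.11 takes {:.2}x as long as 0.12", fastembed);
}
```

This prints 1.51 and 1.36, matching the tables. Read the other way around, 0.12 needs roughly 34% (openai) and 26% (fastembed) less wall-clock time than 0.11.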
Other notable improvements
Other notable improvements since our last release post on 0.9:
- Debug logging is much less verbose, truncating text and tuning
- Ollama embeddings support (thanks @ephraimkunz!)
- IDs for both lancedb and qdrant are now generated with uuidv3, based on the chunk and path, for more reliable upserts
- Rust 1.81 support
- Chunking regular text with text_splitter
- All steps in pipelines now support Box<dyn trait>, enabling more generic pipelines
- Failed embeddings now properly propagate errors
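On the ID change in the list above: deriving IDs from the chunk and path means that indexing the same content twice produces the same ID, so an upsert overwrites the existing record instead of adding a duplicate. The sketch below illustrates only that property. To stay dependency-free it swaps in the standard library's `DefaultHasher` for uuidv3 (which hashes a namespace and a name with MD5), and `deterministic_id` is a hypothetical helper, not Swiftide's implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Derive an ID from the path and chunk content (stand-in for uuidv3).
fn deterministic_id(path: &str, chunk: &str) -> u64 {
    // `DefaultHasher::new()` always starts from the same state, so equal
    // inputs give equal IDs within one build of the program.
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    chunk.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // A toy "store" keyed by ID; inserting an existing key overwrites it.
    let mut store: HashMap<u64, String> = HashMap::new();
    for _ in 0..2 {
        let id = deterministic_id("src/lib.rs", "fn main() {}");
        store.insert(id, "fn main() {}".to_string());
    }
    println!("records after indexing twice: {}", store.len());
}
```

Indexing the same chunk twice leaves a single record in the store. A real index needs IDs that stay stable across builds and machines, which is what a name-based UUID provides and `DefaultHasher` does not guarantee.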
Call for contributors
There is a large list of desired features, and many more unlisted ones, over at our issues page. They range from great starter issues to fun, complex challenges.
You can find the full changelog here.
To get started with Swiftide, head over to swiftide.rs or check us out on github.