Swiftide 0.12 - Hybrid Search, search filters, parquet loader, and a giant speed bump

Introducing a huge update for Swiftide, 0.12, with hybrid search for Qdrant, filters for similarity search, a parquet loader, a massive performance boost, and many other improvements.

Trumpets and a big thanks to @ephraimkunz for his first contribution!

Swiftide is a Rust-native library for building LLM applications. Large language models are amazing, but need context to solve real problems. Swiftide allows you to ingest, transform and index large amounts of data fast, and then query that data so it can be injected into prompts. This process is called Retrieval Augmented Generation (RAG).

To get started with Swiftide, head over to swiftide.rs, check us out on GitHub, or hit us up on Discord.

Hybrid search support

Retrieving the most relevant information for a given query is the key challenge in Retrieval Augmented Generation. Research and our own experience show that similarity search on vectors alone is not enough. The idea behind hybrid search is fairly simple:

Retrieve n documents with similarity search
Retrieve n documents with another kind of search (e.g. full text or sparse vectors)
Rerank the documents for relevancy
Take the top k documents

There are two broad ways to go about this: use multiple data stores, or use a single database that can do both.
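To make the steps above concrete, here is a minimal, standalone sketch of the merge-and-rerank part. It assumes reciprocal rank fusion as the reranking method, which is one common choice and not necessarily what Swiftide or Qdrant do internally. The function name, the document ids, and the constant k = 60 are all illustrative assumptions.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with reciprocal rank fusion.
/// Each document scores the sum over all lists of 1 / (k + rank),
/// where rank starts at 1 for the best hit in a list.
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64, top_k: usize) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, id) in ranking.iter().enumerate() {
            let rank = (index + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused.truncate(top_k);
    fused
}

fn main() {
    // Hypothetical results: ids ranked by dense similarity and by sparse search.
    let dense = vec!["doc_a", "doc_b", "doc_c"];
    let sparse = vec!["doc_c", "doc_a", "doc_d"];
    let top = reciprocal_rank_fusion(&[dense, sparse], 60.0, 2);
    for (id, score) in &top {
        println!("{id}: {score:.4}");
    }
}
```

Documents that appear near the top of both lists end up ahead of documents that only one search found, which is the behaviour hybrid search is after.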

Qdrant supports hybrid search with sparse vectors. They recently reworked their implementation and Swiftide now fully supports it.

To use hybrid search in Qdrant, both the indexed data and the query need sparse embeddings. You can then build a query pipeline with the HybridSearch strategy.

It can be implemented as follows:

let fastembed = FastEmbed::try_default()?;
let fastembed_sparse = FastEmbed::try_default_sparse()?;

// Ensure Qdrant has vectors configured for both dense and sparse
// In this case we're working with combined vectors (chunk + any metadata)
let qdrant = Qdrant::builder()
    .batch_size(batch_size)
    .vector_size(384)
    .with_vector(EmbeddedField::Combined)
    .with_sparse_vector(EmbeddedField::Combined)
    .collection_name("swiftide-hybrid-example")
    .build()?;

// Then add sparse embeddings for indexing:

// <snip> rest of pipeline
indexing_pipeline.then_in_batch(
  256,
  // Clone so the same sparse embedder can be reused in the query pipeline
  transformers::SparseEmbed::new(fastembed_sparse.clone())
)

// And set up the query pipeline with hybrid search and sparse embeddings
let query_pipeline = query::Pipeline::from_search_strategy(
    // By default it uses the Combined fields, no need to configure, with a top_k of 10 and a top_n of 10
    HybridSearch::default()
)
// Generate sub questions on the initial query to increase our query coverage
.then_transform_query(query_transformers::GenerateSubquestions::from_client(
    openai.clone(),
))
// Generate the same embeddings we used for indexing
.then_transform_query(query_transformers::Embed::from_client(fastembed.clone()))
.then_transform_query(query_transformers::SparseEmbed::from_client(
    fastembed_sparse.clone(),
))
.then_retrieve(qdrant.clone())
// Answer with Simple, which either takes the documents as is (in this case), or any transformations applied
// after querying
.then_answer(answers::Simple::from_client(openai.clone()));

The full example is available on github.

Unfortunately, LanceDB does not yet support hybrid search in its Rust client. Shoot us a message on Discord if your solution needs more elaborate search, and we’re happy to see what is possible.

Filters for similarity search

Both LanceDB and Qdrant now support search filters with the SimilaritySingleEmbedding search strategy. Filters are written in each client's native filter format, which exposes the full filter API of both without an extra wrapper.

Filters are set in the SearchStrategy when creating the query pipeline.

For example in Qdrant:

// Given we have a field "filter" on our data (e.g. from indexed metadata, which Qdrant stores by default)
let search_strategy = SimilaritySingleEmbedding::from_filter(qdrant::Filter::must([
  qdrant::Condition::matches("filter", "true".to_string()),
]));

// Then build the pipeline from the strategy like this:
query::Pipeline::from_search_strategy(search_strategy)

LanceDB filters are strings in a SQL-like format. Unlike Qdrant, LanceDB needs the fields it stores to be configured at indexing time. Once that is set up, the same filter query looks like this:

let search_strategy =
    SimilaritySingleEmbedding::from_filter("filter = \"true\"".to_string());
query::Pipeline::from_search_strategy(search_strategy)

Parquet loader

Swiftide can now load parquet files. Parquet is quickly becoming the de facto standard for ML datasets. This enables experimenting with datasets from HuggingFace, as well as using your own. Currently, only plain text columns are implemented.

For example:

let loader = Parquet::builder()
    .path(path)
    .column_name(column)
    .build()?;

indexing::Pipeline::from_loader(loader)

The loader fully streams the content of the parquet file.

Massive performance boost

Well, this is a bit embarrassing and very exciting at the same time. Concurrency was not working fully in streaming pipelines if the future did not yield.
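For readers unfamiliar with why a future that does not yield hurts concurrency, here is a generic illustration of cooperative scheduling. It is not Swiftide's code or its actual fix: the single-threaded round-robin executor, the YieldNow future, and the task names are hypothetical and built only on the standard library.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the toy executor below simply re-polls in a loop.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Future that returns Pending once, handing control back to the executor.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<String>>>;

/// Two steps of "work"; only yields between steps when `cooperative` is true.
async fn task(name: &'static str, cooperative: bool, log: Log) {
    for step in 0..2 {
        log.borrow_mut().push(format!("{name}{step}"));
        if cooperative {
            YieldNow(false).await;
        }
    }
}

/// Poll two tasks round-robin on one thread and return the order of their steps.
fn run(cooperative: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut tasks: Vec<Pin<Box<dyn Future<Output = ()>>>> = Vec::new();
    tasks.push(Box::pin(task("a", cooperative, log.clone())));
    tasks.push(Box::pin(task("b", cooperative, log.clone())));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        // Keep only the tasks that are still pending after this poll.
        tasks.retain_mut(|t| t.as_mut().poll(&mut cx).is_pending());
    }
    let result = log.borrow().clone();
    result
}

fn main() {
    println!("without yielding: {:?}", run(false));
    println!("with yielding:    {:?}", run(true));
}
```

Without a yield point, task "a" runs to completion before "b" gets its first poll, so the work is effectively sequential. With a yield point, the steps of both tasks interleave.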

With this fixed, we tested large datasets in two setups: bound by local resources (with FastEmbed and local models) and IO-bound (with OpenAI). We see a 30% to 50% improvement in overall performance. Tests were run on a MacBook M3 Pro. The benchmark indexes around 10k chunks, with concurrency set to the number of available CPUs. Of course, in the IO-bound case we could set concurrency a lot higher.

OpenAI

Command	Mean [s]	Min [s]	Max [s]	Relative
swiftide-0.12	4.202	4.202	4.202	1.00
swiftide-0.11	6.352	6.352	6.352	1.51

FastEmbed

Command	Mean [s]	Min [s]	Max [s]	Relative
swiftide-0.12	30.385 ± 0.795	29.505	31.051	1.00
swiftide-0.11	41.290 ± 0.127	41.161	41.415	1.36 ± 0.04
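
As a quick sanity check on the tables above, the Relative column can be reproduced from the mean runtimes. The helper name below is an illustrative assumption; the numbers are copied from the tables.

```rust
/// Ratio of the slower mean runtime to the faster one, as in the Relative column.
fn relative(slower_secs: f64, faster_secs: f64) -> f64 {
    slower_secs / faster_secs
}

fn main() {
    // Mean runtimes in seconds (0.11 first, then 0.12), copied from the tables.
    let runs = [("OpenAI", 6.352, 4.202), ("FastEmbed", 41.290, 30.385)];
    for (name, v0_11, v0_12) in runs {
        let ratio = relative(v0_11, v0_12);
        // Extra time 0.11 needs compared to 0.12, as a percentage.
        let extra_percent = (ratio - 1.0) * 100.0;
        println!("{name}: 0.11 takes {ratio:.2}x as long as 0.12 (+{extra_percent:.0}%)");
    }
}
```

The ratios come out at roughly 1.51 for OpenAI and 1.36 for FastEmbed, matching the tables.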

Other notable improvements

Other notable improvements since our last release post on 0.9:

Debug logging is much less verbose, truncating text and tuning
Ollama embeddings support (thanks @ephraimkunz!)
IDs for both LanceDB and Qdrant are now generated with UUIDv3, based on the chunk and path, for more reliable upserts
Rust 1.81 support
Chunking regular text with text_splitter
All steps in pipelines now support Box<dyn Trait>, enabling more generic pipelines
Failed embeddings now properly propagate errors
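
To illustrate why content-based IDs make upserts more reliable, here is a small sketch. It does not use UUIDv3: a hand-written FNV-1a hash stands in for it so the example needs only the standard library, and the function names and the in-memory store are hypothetical. The property it shows is the same: the ID is a pure function of path and chunk.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a name-based id: FNV-1a over path and chunk.
/// This is NOT UUIDv3 (which hashes a namespace and a name with MD5); it only
/// illustrates that the id is fully determined by the content.
fn deterministic_id(path: &str, chunk: &str) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    // A zero byte separates the two fields so ("ab", "c") and ("a", "bc") differ.
    for byte in path.bytes().chain(std::iter::once(0u8)).chain(chunk.bytes()) {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Insert or overwrite a chunk keyed by its deterministic id.
fn upsert(store: &mut HashMap<u64, String>, path: &str, chunk: &str) {
    store.insert(deterministic_id(path, chunk), chunk.to_string());
}

fn main() {
    let mut store = HashMap::new();
    upsert(&mut store, "src/main.rs", "fn main() {}");
    // Re-indexing the same chunk from the same path overwrites the existing entry.
    upsert(&mut store, "src/main.rs", "fn main() {}");
    // The same chunk under a different path gets its own id.
    upsert(&mut store, "src/lib.rs", "fn main() {}");
    println!("{} entries", store.len());
}
```

With random IDs, re-running an indexing pipeline would add duplicates. With deterministic IDs, re-running it replaces the entries that are already there.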

Call for contributors

There is a large list of desired features on our issues page, and many more that are not listed yet. They range from great starter issues to fun, complex challenges.

You can find the full changelog here.

To get started with Swiftide, head over to swiftide.rs or check us out on GitHub.