Swiftide 0.5 - Bedrock support, stream splitting, chaining and merging, and more

Published: at by Timon Vonk

Introducing Swiftide 0.5! This release introduces Bedrock support, stream splitting, chaining and merging, and more.

To get started with Swiftide, head over to swiftide.rs or check us out on github.

AWS Bedrock Support

AWS provides a straightforward way to access managed, publicly available models.

In this release we’ve added support for the Anthropic and Titan model families through AwsBedrock, which implements SimplePrompt.

You need to have access to the models in your AWS account, and the AWS CLI must be configured.

For example, to use Claude Sonnet:

let aws_bedrock = AwsBedrock::build_anthropic_family(
    "anthropic.claude-3-sonnet-20240229-v1:0",
)
.build()?;
let memory_storage = MemoryStorage::default();
IngestionPipeline::from_loader(FileLoader::new("./README.md"))
    .then_chunk(ChunkMarkdown::from_chunk_range(100..512))
    .then(MetadataSummary::new(aws_bedrock))
    .then_store_with(memory_storage)
    .run()
    .await?;

Stream splitting and merging

It is now possible to split and merge streams. You can split a stream into multiple streams, process them in parallel, and merge them back together. This is useful for processing nodes conditionally in a stream.

let (left, right) = IngestionPipeline::from_loader(FileLoader::new("./README.md"))
    .then_chunk(ChunkMarkdown::from_chunk_range(100..512))
    .then(MetadataSummary::new(aws_bedrock))
    // Results for which the closure returns true go to `left`;
    // everything else, including errors, goes to `right`.
    .split_by(|result| {
        if let Ok(node) = result {
            node.chunk.starts_with("will go left")
        } else {
            false
        }
    });
// Pipeline methods consume the pipeline, so rebind the returned value.
let left = left.then(move |mut node| {
    node.chunk = "left".to_string();
    Ok(node)
});
let right = right.then(move |mut node| {
    node.chunk = "right".to_string();
    Ok(node)
});
left.merge(right)
    .then_store_with(memory_storage)
    .run()
    .await?;

Splitting is not lazy. The resulting streams are buffered, so you should avoid slow side effects before running the final streams.
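To make the "not lazy, but buffered" behaviour concrete, here is a minimal standard-library sketch of the idea. It is not Swiftide's implementation: `split_eagerly` is a hypothetical helper, and threads with bounded channels stand in for the async streams. The point is only that routing starts as soon as the split is created, and that each side holds a limited number of items.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Eagerly routes `items` into two bounded channels based on `predicate`.
/// Hypothetical helper for illustration only; not part of Swiftide.
pub fn split_eagerly<T, F>(
    items: Vec<T>,
    buffer: usize,
    predicate: F,
) -> (Receiver<T>, Receiver<T>)
where
    T: Send + 'static,
    F: Fn(&T) -> bool + Send + 'static,
{
    let (left_tx, left_rx) = sync_channel(buffer);
    let (right_tx, right_rx) = sync_channel(buffer);

    // The routing thread starts immediately, before anyone reads the outputs.
    thread::spawn(move || {
        for item in items {
            let target = if predicate(&item) { &left_tx } else { &right_tx };
            // `send` blocks while the target buffer is full. It only fails
            // when that receiver was dropped, in which case the item is discarded.
            let _ = target.send(item);
        }
        // Both senders are dropped here, which ends the receiving iterators.
    });

    (left_rx, right_rx)
}
```

In this sketch, draining one side completely while the other side's buffer is full would stall the routing thread, which is one reason to read both sides concurrently or to merge them.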

By default, merge alternates between the two streams so that concurrency settings are respected. Other Rust mechanisms can of course also be used to process the streams. Streams do not need to be merged.
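As a rough illustration of what alternating between two sources means, the sketch below interleaves two plain iterators. It is a simplified, synchronous model: `merge_alternating` is a hypothetical function, not Swiftide's API, and it says nothing about how the real merge handles items that are not yet ready.

```rust
/// Merges two sources by taking one item from each in turn.
/// When one side runs out, the rest of the other side is passed through.
/// Hypothetical helper for illustration only; not part of Swiftide.
pub fn merge_alternating<T>(
    left: impl IntoIterator<Item = T>,
    right: impl IntoIterator<Item = T>,
) -> Vec<T> {
    // `fuse` makes it safe to keep calling `next` after a side is exhausted.
    let mut left = left.into_iter().fuse();
    let mut right = right.into_iter().fuse();
    let mut merged = Vec::new();
    let mut take_left = true;

    loop {
        // Prefer the side whose turn it is; fall back to the other one.
        let next = if take_left {
            left.next().or_else(|| right.next())
        } else {
            right.next().or_else(|| left.next())
        };
        match next {
            Some(item) => merged.push(item),
            None => break, // both sides are exhausted
        }
        take_left = !take_left;
    }
    merged
}
```

Taking turns keeps either side from starving the other, which is the property the default merge relies on to respect concurrency settings.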

Pipeline chaining

Pipelines can now start from anything that implements Into<IngestionStream> using IngestionPipeline::from_stream. This opens up options for building complex constructions or hooking pipelines up to existing code.

Current implementers:

Iterators can also be converted to a stream directly using IngestionStream::iter.
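The pattern behind this can be sketched with the standard library alone. Everything below is a hypothetical stand-in (`Node`, `NodeStream`, and `Pipeline` are simplified types written for this example, not Swiftide's own), and it only shows how a constructor that accepts `impl Into<...>` lets callers pass either a ready-made collection or a wrapped iterator.

```rust
/// Hypothetical stand-in for a node flowing through a pipeline.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub chunk: String,
}

/// Hypothetical stand-in for a stream of fallible nodes.
pub struct NodeStream {
    inner: Box<dyn Iterator<Item = Result<Node, String>>>,
}

impl NodeStream {
    /// Wraps any iterator of results, in the spirit of `IngestionStream::iter`.
    pub fn iter<I>(iter: I) -> Self
    where
        I: IntoIterator<Item = Result<Node, String>>,
        I::IntoIter: 'static,
    {
        NodeStream { inner: Box::new(iter.into_iter()) }
    }
}

/// One example conversion; more `From` impls can be added per source type.
impl From<Vec<Result<Node, String>>> for NodeStream {
    fn from(nodes: Vec<Result<Node, String>>) -> Self {
        NodeStream::iter(nodes)
    }
}

/// Hypothetical pipeline that can start from anything convertible to a stream.
pub struct Pipeline {
    stream: NodeStream,
}

impl Pipeline {
    pub fn from_stream(stream: impl Into<NodeStream>) -> Self {
        Pipeline { stream: stream.into() }
    }

    /// Drains the stream, stopping at the first error.
    pub fn run(self) -> Result<Vec<Node>, String> {
        self.stream.inner.collect()
    }
}
```

With this shape, existing code that already produces nodes only needs a `From` implementation (or a call to `iter`) to feed a pipeline, without a dedicated loader.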

Breaking changes


You can find the full changelog here.
