Quick Start¶
This guide walks through a complete workflow: defining a schema, creating a collection, inserting documents, querying by vector similarity, and cleaning up.
1. Define a Schema¶
Every collection needs at least a primary key field and one vector field. Scalar fields store metadata alongside vectors.
require "zvec"
require "tmpdir"
pk = Zvec::FieldSchema.create("pk", Zvec::DataType::STRING)
title = Zvec::FieldSchema.create("title", Zvec::DataType::STRING)
year = Zvec::FieldSchema.create("year", Zvec::DataType::INT32)
embedding = Zvec::FieldSchema.create("embedding", Zvec::DataType::VECTOR_FP32,
dimension: 4,
index_params: Zvec::HnswIndexParams.new(Zvec::MetricType::COSINE))
schema = Zvec::CollectionSchema.create("movies", [pk, title, year, embedding])
2. Create a Collection¶
A collection is stored on disk at a specified path. Use create_and_open for new collections:
Or use a temporary directory for experiments:
Dir.mktmpdir("zvec") do |dir|
col = Zvec::Collection.create_and_open(File.join(dir, "movies"), schema)
# ... work with col ...
end
3. Insert Documents¶
Build Doc objects with typed fields matching your schema:
doc = Zvec::Doc.new
doc.pk = "mov1"
doc.set_field("pk", Zvec::DataType::STRING, "mov1")
doc.set_field("title", Zvec::DataType::STRING, "The Matrix")
doc.set_field("year", Zvec::DataType::INT32, 1999)
doc.set_field("embedding", Zvec::DataType::VECTOR_FP32, [0.9, 0.1, 0.0, 0.2])
statuses = col.insert([doc])
statuses.each { |s| puts s } # => "OK"
col.flush # persist to disk
4. Query by Vector Similarity¶
Find the most similar documents to a query vector:
results = col.query_vector("embedding", [0.85, 0.2, 0.05, 0.15], top_k: 3)
results.each do |doc|
h = doc.to_h(col.schema)
puts "#{h['title']} (#{h['year']}) score=#{h['score']}"
end
Score Interpretation
The score field contains the distance between vectors. For cosine distance, 0 means identical and values closer to 0 indicate higher similarity. To convert to a similarity value where higher is better, use 1.0 - score.
5. Fetch by Primary Key¶
Retrieve specific documents without a vector search:
docs = col.fetch(["mov1"])
docs.each do |pk, doc|
h = doc.to_h(col.schema)
puts "#{pk}: #{h['title']}"
end
6. Delete Documents¶
7. Block-Form Collections¶
Zvec.open_collection opens an existing collection, yields it, and auto-flushes on block exit:
Zvec.open_collection("/tmp/my_movies") do |col|
results = col.query_vector("embedding", [0.1, 0.2, 0.3, 0.4], top_k: 5)
# ...
end
# col is flushed automatically here
Next Steps¶
- Schemas and Fields — Learn about all supported data types and field options.
- Vector Search — Understand distance metrics, query parameters, and result scoring.
- Examples — See complete working examples.