Collections¶
A collection is a persistent, on-disk container for documents. It owns the schema, the vector indexes, and the forward storage for scalar fields.
Creating a Collection¶
Use create_and_open to create a new collection at a file system path.
This creates the directory structure on disk and returns an open collection handle.
Warning
Calling create_and_open on a path that already exists raises Zvec::AlreadyExistsError.
Opening an Existing Collection¶
Opening a collection raises Zvec::NotFoundError if the path does not contain a valid collection.
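The two error cases above (AlreadyExistsError on create, NotFoundError on open) suggest an "open or create" helper. The sketch below is a minimal illustration under stated assumptions: ZvecSketch is a stand-in module written for this example so that it runs without the gem, its marker file is hypothetical, and the real create_and_open may take further arguments, such as a schema, that are not modelled here.

```ruby
require "tmpdir"
require "fileutils"

# Stand-in for the real library so the sketch runs without the gem.
# Only the two documented error behaviours are modelled.
module ZvecSketch
  class AlreadyExistsError < StandardError; end
  class NotFoundError < StandardError; end

  class Collection
    MARKER = "collection.meta" # hypothetical marker file

    attr_reader :path

    def initialize(path)
      @path = path
    end

    # Fails if anything already exists at the path.
    def self.create_and_open(path)
      raise AlreadyExistsError, path if File.exist?(path)

      FileUtils.mkdir_p(path)
      File.write(File.join(path, MARKER), "")
      new(path)
    end

    # Fails if the path does not hold a collection.
    def self.open(path)
      raise NotFoundError, path unless File.exist?(File.join(path, MARKER))

      new(path)
    end
  end
end

# Open the collection if it exists, otherwise create it.
def open_or_create(path)
  ZvecSketch::Collection.open(path)
rescue ZvecSketch::NotFoundError
  ZvecSketch::Collection.create_and_open(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "my_collection")
  first = open_or_create(path)  # nothing there yet, so it is created
  second = open_or_create(path) # now it exists, so it is opened
  puts first.path == second.path
end
```

If the path exists but does not hold a valid collection, the helper lets AlreadyExistsError propagate instead of overwriting whatever is there.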
Block-Form Open¶
Zvec.open_collection yields the collection and auto-flushes when the block exits:
Zvec.open_collection("/path/to/my_collection") do |col|
  results = col.query_vector("embedding", query_vec, top_k: 5)
  # ...
end
# col is flushed automatically here
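To illustrate how a block-form opener can guarantee the flush, here is a minimal sketch of the usual Ruby pattern with ensure. SketchCollection and open_collection_sketch are hypothetical names written for this example, not part of Zvec. Whether the real method also flushes when the block raises is not stated above; this sketch assumes it does.

```ruby
# Stand-in collection that only counts flushes; not the real Zvec class.
class SketchCollection
  attr_reader :flush_count

  def initialize
    @flush_count = 0
  end

  def flush
    @flush_count += 1
  end
end

# Hypothetical block-form opener: yields the handle, then flushes in
# `ensure` so the flush runs on normal exit and when the block raises.
def open_collection_sketch(col)
  yield col
ensure
  col.flush
end

col = SketchCollection.new
open_collection_sketch(col) { |c| c } # normal exit
begin
  open_collection_sketch(col) { |_c| raise "boom" }
rescue RuntimeError
  # the error still propagates, but the flush has already happened
end
puts col.flush_count
```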
Collection Options¶
Pass a CollectionOptions object to control how a collection is opened:
opts = Zvec::CollectionOptions.new
opts.read_only = true # open in read-only mode
opts.enable_mmap = true # memory-map data files
opts.max_buffer_size = 4096 # write buffer size
col = Zvec::Collection.open("/path/to/my_collection", options: opts)
Writing Documents¶
Insert¶
An insert returns an array of Status objects, one per document, in the same order as the documents passed in.
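Because there is one status per document, a caller can pair each document with its status to find the ones that failed. The sketch below assumes the statuses come back in input order and that a status exposes ok and message accessors; both accessors are hypothetical, and SketchStatus is a stand-in struct rather than the real Status class.

```ruby
# Hypothetical stand-in for a per-document status; the real Status
# class may expose different accessors.
SketchStatus = Struct.new(:ok, :message)

# Pair each document with its status and keep the ones that failed.
def failed_inserts(docs, statuses)
  docs.zip(statuses).reject { |_doc, status| status.ok }
end

docs = [{ "pk" => "a" }, { "pk" => "b" }, { "pk" => "c" }]
statuses = [
  SketchStatus.new(true, nil),
  SketchStatus.new(false, "duplicate primary key"),
  SketchStatus.new(true, nil)
]

failed_inserts(docs, statuses).each do |doc, status|
  puts "#{doc['pk']}: #{status.message}"
end
```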
Upsert¶
Insert new documents, or replace existing documents that have the same primary key.
Update¶
Update fields on existing documents.
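The difference between upsert and update can be sketched with an in-memory hash keyed by primary key. SketchStore is a stand-in written for this example and does not use the real API; in particular, returning false when update targets an unknown key is an assumption of the sketch, not documented behaviour.

```ruby
# In-memory stand-in keyed by primary key; not the real storage engine.
class SketchStore
  def initialize
    @docs = {}
  end

  # Upsert: insert the document, or replace the whole stored document.
  def upsert(pk, doc)
    @docs[pk] = doc.dup
  end

  # Update: change only the given fields of an existing document.
  # Returning false for an unknown key is an assumption of this sketch.
  def update(pk, fields)
    return false unless @docs.key?(pk)

    @docs[pk] = @docs[pk].merge(fields)
    true
  end

  def fetch(pk)
    @docs[pk]
  end
end

store = SketchStore.new
store.upsert("pk1", { "title" => "Draft", "year" => 2023 })
store.update("pk1", { "title" => "Final" })   # "year" is kept
store.upsert("pk1", { "title" => "Rewrite" }) # "year" is gone
p store.fetch("pk1")
```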
Flushing¶
Writes are buffered in memory. Call flush to persist them to disk.
The block-form Zvec.open_collection flushes automatically on block exit.
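The buffering behaviour described above can be illustrated with a small stand-in writer: records sit in memory until flush appends them to a file. SketchBufferedWriter is a hypothetical class for this example only; it does not reflect Zvec's actual on-disk format or buffer management.

```ruby
require "tmpdir"

# Stand-in writer: buffers records in memory and appends them to a file
# on flush. The real on-disk format is not modelled.
class SketchBufferedWriter
  def initialize(path)
    @path = path
    @buffer = []
  end

  def write(record)
    @buffer << record
  end

  def flush
    File.open(@path, "a") { |f| @buffer.each { |r| f.puts(r) } }
    @buffer.clear
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "data.log")
  writer = SketchBufferedWriter.new(path)
  writer.write("doc-1")
  puts File.exist?(path) # nothing is on disk before the flush
  writer.flush
  puts File.read(path)
end
```

The practical consequence is the same as in the real library: anything written after the last flush is lost if the process exits before the next one.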
Reading Documents¶
Fetch by Primary Key¶
docs = col.fetch(["pk1", "pk2", "pk3"])
docs.each do |pk, doc|
  h = doc.to_h(col.schema)
  puts "#{pk}: #{h['title']}"
end
Returns a hash mapping primary keys to Doc objects. Missing keys are omitted.
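Since missing keys are simply absent from the result, a caller that needs to know which keys were not found can compare the request against the returned hash. The sketch below uses a plain Hash as a stand-in for the fetch result; missing_keys is a hypothetical helper, not part of the library.

```ruby
# Return the requested keys that are absent from a fetch result.
def missing_keys(requested, fetched)
  requested - fetched.keys
end

requested = ["pk1", "pk2", "pk3"]
# Stand-in for the hash returned by col.fetch; values would be Doc objects.
fetched = { "pk1" => :doc1, "pk3" => :doc3 }

p missing_keys(requested, fetched)
```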
Vector Query¶
See Vector Search for the full query interface.
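As a conceptual aid only, the sketch below shows what a top_k vector query computes when done by brute force: score every stored vector against the query and keep the k closest. It is not a description of how Zvec's indexes work. The Euclidean metric and the helper names are assumptions of this example, and a real vector index typically avoids scanning every vector.

```ruby
# Euclidean distance between two equal-length vectors.
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Exact top-k search: score every vector, sort by distance, keep the k closest.
def brute_force_top_k(vectors, query, top_k:)
  vectors
    .map { |pk, vec| [pk, euclidean(vec, query)] }
    .sort_by { |_pk, dist| dist }
    .first(top_k)
end

vectors = {
  "pk1" => [0.0, 0.0],
  "pk2" => [1.0, 1.0],
  "pk3" => [5.0, 5.0]
}

brute_force_top_k(vectors, [0.9, 0.9], top_k: 2).each do |pk, dist|
  puts "#{pk}: #{dist.round(3)}"
end
```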
Deleting Documents¶
Delete by Primary Key¶
Delete by Filter¶
Collection Metadata¶
col.path # => "/path/to/my_collection"
col.schema # => CollectionSchema
col.stats # => CollectionStats
col.options # => CollectionOptions
Collection Stats¶
Schema Modification¶
Add a Column¶
A column can also be added with a default expression.
Drop a Column¶
Alter a Column¶
Rename a column or change its schema.
Index Management¶
Create an Index¶
Add an index to an existing column.
Drop an Index¶
Optimize¶
Compact and optimize indexes.
Destroying a Collection¶
Permanently delete the collection and all its data from disk.
Danger
This is irreversible. All data is deleted.
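Because the operation cannot be undone, application code may want a guard in front of it. The sketch below is one possible pattern, not part of the library: the caller must repeat the collection's directory name before anything is deleted. SketchDestroyable is a stand-in class, and destroy is a hypothetical method name used only for this example.

```ruby
require "tmpdir"
require "fileutils"

# Stand-in collection; `destroy` is a hypothetical method name here.
class SketchDestroyable
  attr_reader :path

  def initialize(path)
    @path = path
  end

  def destroy
    FileUtils.remove_entry(@path)
  end
end

# Refuse to destroy unless the caller repeats the collection's directory name.
def destroy_confirmed(col, confirm:)
  expected = File.basename(col.path)
  raise ArgumentError, "confirmation does not match #{expected}" unless confirm == expected

  col.destroy
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "my_collection")
  FileUtils.mkdir_p(path)
  col = SketchDestroyable.new(path)
  destroy_confirmed(col, confirm: "my_collection")
  puts Dir.exist?(path)
end
```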