Vector Search¶
Vector search finds the most similar documents to a query vector using approximate nearest neighbor (ANN) algorithms.
Convenience Method: query_vector¶
The simplest way to search:
Full signature:
col.query_vector(
field_name, # name of the vector field to search
vector, # query vector (Array of floats)
top_k:, # number of results to return
filter: nil, # optional scalar filter expression
include_vector: false, # include vector data in results
query_params: nil, # optional QueryParams for tuning
output_fields: nil # optional array of field names to return
)
Distance Metrics¶
The distance metric is set when you define the vector field's index:
| Metric | Constant | Score Meaning |
|---|---|---|
| Cosine | MetricType::COSINE |
0 = identical, 2 = opposite |
| L2 (Euclidean) | MetricType::L2 |
0 = identical, increases with distance |
| Inner Product | MetricType::IP |
Higher = more similar (negated internally) |
Converting Cosine Distance to Similarity
Zvec returns cosine distance (lower is better). To get a similarity score where higher is better:
Reading Results¶
Query results are returned as an array of Doc objects, sorted by distance (closest first):
results = col.query_vector("embedding", query_vec, top_k: 3)
results.each do |doc|
h = doc.to_h(col.schema)
puts "#{h['pk']}: #{h['title']} (distance: #{h['score'].round(4)})"
end
Advanced: VectorQuery¶
For full control, build a VectorQuery directly:
vq = Zvec::VectorQuery.new
vq.topk = 10
vq.field_name = "embedding"
vq.filter = "year > 2020"
vq.include_vector = true
vq.query_params = Zvec::HnswQueryParams.new(ef: 500)
vq.output_fields = ["title", "year"]
# Set the query vector using the field schema for type dispatch
field_schema = col.schema.get_field("embedding")
vq.set_vector(field_schema, query_vec)
results = col.query(vq)
VectorQuery Properties¶
| Property | Type | Description |
|---|---|---|
topk |
Integer | Number of results to return |
field_name |
String | Vector field to search |
filter |
String | Scalar filter expression |
include_vector |
Boolean | Include vector data in results |
include_doc_id |
Boolean | Include internal doc IDs |
query_params |
QueryParams | Index-specific search parameters |
output_fields |
Array | Specific fields to include in results |
Group-By Query¶
Group results by a scalar field value:
gq = Zvec::GroupByVectorQuery.new
gq.field_name = "embedding"
gq.group_by_field_name = "category"
gq.group_count = 5 # number of groups
gq.group_topk = 3 # results per group
field_schema = col.schema.get_field("embedding")
gq.set_vector(field_schema, query_vec)
groups = col.group_by_query(gq)
groups.each do |group|
puts "Group: #{group.group_by_value}"
group.docs.each do |doc|
h = doc.to_h(col.schema)
puts " #{h['title']}"
end
end
Output Fields¶
By default, all scalar fields are returned. Use output_fields to limit the response:
Including Vectors¶
Vector data is excluded from results by default for performance. Set include_vector: true to get vectors back: