Class: FactDb::Services::EntityService
Inherits: Object
- Object
  - FactDb::Services::EntityService
Defined in: lib/fact_db/services/entity_service.rb
Overview
Service class for managing entities in the database.
Provides methods for creating, searching, and managing entities including name resolution, alias management, and duplicate detection.
Instance Attribute Summary
- #config ⇒ FactDb::Config (readonly)
  The configuration object.
- #resolver ⇒ FactDb::Resolution::EntityResolver (readonly)
  The entity resolver instance.
Instance Method Summary
- #add_alias(entity_id, alias_name, kind: nil, confidence: 1.0) ⇒ FactDb::Models::EntityAlias
  Adds an alias to an entity.
- #auto_merge_duplicates! ⇒ void
  Automatically merges high-confidence duplicates.
- #by_kind(kind) ⇒ ActiveRecord::Relation
  Returns entities of a specific kind.
- #create(name, kind:, aliases: [], attributes: {}, description: nil) ⇒ FactDb::Models::Entity
  Creates a new entity in the database.
- #facts_about(entity_id, at: nil, status: :canonical) ⇒ ActiveRecord::Relation
  Returns facts about an entity.
- #find(id) ⇒ FactDb::Models::Entity
  Finds an entity by ID.
- #find_by_name(name, kind: nil) ⇒ FactDb::Models::Entity?
  Finds an entity by exact name match.
- #find_duplicates(threshold: nil) ⇒ Array<Hash>
  Finds potential duplicate entities.
- #fuzzy_search(query, kind: nil, threshold: 0.3, limit: 20) ⇒ Array<FactDb::Models::Entity>
  Searches entities using PostgreSQL trigram similarity (handles typos).
- #initialize(config = FactDb.config) ⇒ EntityService (constructor)
  Initializes a new EntityService instance.
- #merge(keep_id, merge_id) ⇒ FactDb::Models::Entity
  Merges two entities, keeping one as canonical.
- #relationship_types ⇒ Array<Symbol>
  Returns all relationship types used in the database.
- #relationship_types_for(entity_id) ⇒ Array<Symbol>
  Returns relationship types for a specific entity.
- #resolve(name, kind: nil) ⇒ FactDb::Resolution::ResolvedEntity?
  Resolves a name to an existing entity.
- #resolve_or_create(name, kind:, aliases: [], attributes: {}, description: nil) ⇒ FactDb::Models::Entity
  Resolves a name to an entity, creating one if not found.
- #search(query, kind: nil, limit: 20) ⇒ ActiveRecord::Relation
  Searches entities by name or alias using LIKE pattern matching.
- #semantic_search(query, kind: nil, limit: 20) ⇒ ActiveRecord::Relation
  Searches entities using semantic similarity (vector search).
- #stats ⇒ Hash
  Returns aggregate statistics about entities.
- #timeline_for(entity_id, from: nil, to: nil) ⇒ FactDb::Temporal::Timeline
  Builds a timeline of facts for an entity.
- #timespan_for(entity_id) ⇒ Hash
  Returns the timespan of facts for an entity.
Constructor Details
#initialize(config = FactDb.config) ⇒ EntityService
Initializes a new EntityService instance
    # File 'lib/fact_db/services/entity_service.rb', line 24

    def initialize(config = FactDb.config)
      @config = config
      @resolver = Resolution::EntityResolver.new(config)
    end
Instance Attribute Details
#config ⇒ FactDb::Config (readonly)
Returns the configuration object.
    # File 'lib/fact_db/services/entity_service.rb', line 16

    def config
      @config
    end
#resolver ⇒ FactDb::Resolution::EntityResolver (readonly)
Returns the entity resolver instance.
    # File 'lib/fact_db/services/entity_service.rb', line 19

    def resolver
      @resolver
    end
Instance Method Details
#add_alias(entity_id, alias_name, kind: nil, confidence: 1.0) ⇒ FactDb::Models::EntityAlias
Adds an alias to an entity
    # File 'lib/fact_db/services/entity_service.rb', line 141

    def add_alias(entity_id, alias_name, kind: nil, confidence: 1.0)
      entity = Models::Entity.find(entity_id)
      entity.add_alias(alias_name, kind: kind, confidence: confidence)
    end
#auto_merge_duplicates! ⇒ void
This method returns an undefined value.
Automatically merges high-confidence duplicates
    # File 'lib/fact_db/services/entity_service.rb', line 281

    def auto_merge_duplicates!
      @resolver.auto_merge_duplicates!
    end
#by_kind(kind) ⇒ ActiveRecord::Relation
Returns entities of a specific kind
    # File 'lib/fact_db/services/entity_service.rb', line 242

    def by_kind(kind)
      Models::Entity.by_kind(kind).not_merged.order(:name)
    end
#create(name, kind:, aliases: [], attributes: {}, description: nil) ⇒ FactDb::Models::Entity
Creates a new entity in the database
    # File 'lib/fact_db/services/entity_service.rb', line 37

    def create(name, kind:, aliases: [], attributes: {}, description: nil)
      # NOTE: identifiers on the next line were lost in extraction. The variable
      # name `embedding` and the helper name `generate_embedding` are assumed.
      embedding = generate_embedding(name)

      entity = Models::Entity.create!(
        name: name,
        kind: kind.to_s,
        description: description,
        metadata: attributes,
        resolution_status: "resolved",
        embedding: embedding
      )

      aliases.each do |alias_text|
        entity.add_alias(alias_text)
      end

      entity
    end
#facts_about(entity_id, at: nil, status: :canonical) ⇒ ActiveRecord::Relation
Returns facts about an entity
    # File 'lib/fact_db/services/entity_service.rb', line 252

    def facts_about(entity_id, at: nil, status: :canonical)
      Temporal::Query.new.execute(
        entity_id: entity_id,
        at: at,
        status: status
      )
    end
#find(id) ⇒ FactDb::Models::Entity
Finds an entity by ID
    # File 'lib/fact_db/services/entity_service.rb', line 61

    def find(id)
      Models::Entity.find(id)
    end
#find_by_name(name, kind: nil) ⇒ FactDb::Models::Entity?
Finds an entity by exact name match
    # File 'lib/fact_db/services/entity_service.rb', line 70

    def find_by_name(name, kind: nil)
      scope = Models::Entity.where(["LOWER(name) = ?", name.downcase])
      scope = scope.where(kind: kind) if kind
      scope.not_merged.first
    end
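The lookup above is a case-insensitive exact match that skips merged entities. The following is a minimal in-memory sketch of that behavior, not the library's implementation: `NamedEntity` and `find_by_exact_name` are hypothetical names, and it assumes the `not_merged` scope excludes rows whose `resolution_status` is "merged".

```ruby
# Hypothetical stand-in for an entity row; not part of FactDb.
NamedEntity = Struct.new(:name, :kind, :resolution_status)

# Returns the first non-merged entity whose name equals `name`,
# ignoring case, optionally restricted to one kind. Returns nil if none.
def find_by_exact_name(entities, name, kind: nil)
  wanted = name.downcase
  entities.find do |e|
    e.resolution_status != "merged" &&
      e.name.downcase == wanted &&
      (kind.nil? || e.kind == kind.to_s)
  end
end

entities = [
  NamedEntity.new("Acme Corp", "organization", "merged"),
  NamedEntity.new("ACME Corp", "organization", "resolved")
]

match = find_by_exact_name(entities, "acme corp", kind: :organization)
puts match.name
```

The merged row is skipped even though its name also matches, so the surviving canonical record is returned.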
#find_duplicates(threshold: nil) ⇒ Array<Hash>
Finds potential duplicate entities
    # File 'lib/fact_db/services/entity_service.rb', line 274

    def find_duplicates(threshold: nil)
      @resolver.find_duplicates(threshold: threshold)
    end
#fuzzy_search(query, kind: nil, threshold: 0.3, limit: 20) ⇒ Array<FactDb::Models::Entity>
Searches entities using PostgreSQL trigram similarity (handles typos)
Requires pg_trgm extension. Falls back to LIKE search if unavailable.
    # File 'lib/fact_db/services/entity_service.rb', line 192

    def fuzzy_search(query, kind: nil, threshold: 0.3, limit: 20)
      return [] if query.to_s.strip.length < 3

      sql = <<~SQL
        SELECT DISTINCT e.id,
          GREATEST(
            similarity(LOWER(e.name), LOWER(?)),
            COALESCE(MAX(similarity(LOWER(a.name), LOWER(?))), 0)
          ) as sim_score
        FROM fact_db_entities e
        LEFT JOIN fact_db_entity_aliases a ON a.entity_id = e.id
        WHERE e.resolution_status != 'merged'
          AND (
            similarity(LOWER(e.name), LOWER(?)) > ?
            OR similarity(LOWER(a.name), LOWER(?)) > ?
          )
        GROUP BY e.id
        ORDER BY sim_score DESC
        LIMIT ?
      SQL

      sanitized = ActiveRecord::Base.sanitize_sql(
        [sql, query, query, query, threshold, query, threshold, limit]
      )

      results = ActiveRecord::Base.connection.execute(sanitized)
      entity_ids = results.map { |r| r["id"] }

      return [] if entity_ids.empty?

      # Preserve ordering by fetching in order
      entities_by_id = Models::Entity.where(id: entity_ids).index_by(&:id)
      ordered_entities = entity_ids.map { |id| entities_by_id[id] }.compact

      # Apply kind filter if specified
      if kind
        ordered_entities = ordered_entities.select { |e| e.kind == kind.to_s }
      end

      ordered_entities
    rescue ActiveRecord::StatementInvalid => e
      # pg_trgm extension not available, fall back to LIKE search
      config.logger&.warn("Fuzzy search unavailable (pg_trgm not installed): #{e.message}")
      search(query, kind: kind, limit: limit).to_a
    end
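To give a feel for what the `similarity` calls and the `threshold` of 0.3 mean, here is a rough pure-Ruby approximation of pg_trgm's trigram similarity. It is a sketch under stated assumptions, not the extension's code: `trigrams` and `trigram_similarity` are hypothetical helpers, and they follow the documented pg_trgm convention of lowercasing, padding each word with two leading spaces and one trailing space, and dividing shared trigrams by total distinct trigrams.

```ruby
require "set"

# Builds the set of trigrams for a string, approximating pg_trgm:
# each alphanumeric word is lowercased and padded as "  word ".
def trigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).each_with_object(Set.new) do |word, set|
    padded = "  #{word} "
    (0..padded.length - 3).each { |i| set << padded[i, 3] }
  end
end

# Shared trigrams divided by all distinct trigrams (0.0 to 1.0).
def trigram_similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = (ta | tb).size
  return 0.0 if union.zero?

  (ta & tb).size.to_f / union
end

score = trigram_similarity("cat", "cart")
puts format("%.4f", score)
puts score > 0.3
```

"cat" and "cart" share 2 of 7 distinct trigrams, so the score is about 0.2857 and the pair would fall just under the default threshold.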
#merge(keep_id, merge_id) ⇒ FactDb::Models::Entity
Merges two entities, keeping one as canonical
    # File 'lib/fact_db/services/entity_service.rb', line 130

    def merge(keep_id, merge_id)
      @resolver.merge(keep_id, merge_id)
    end
#relationship_types ⇒ Array<Symbol>
Returns all relationship types used in the database
    # File 'lib/fact_db/services/entity_service.rb', line 302

    def relationship_types
      Models::EntityMention.distinct.pluck(:mention_role).compact.map(&:to_sym)
    end
#relationship_types_for(entity_id) ⇒ Array<Symbol>
Returns relationship types for a specific entity
    # File 'lib/fact_db/services/entity_service.rb', line 310

    def relationship_types_for(entity_id)
      Models::EntityMention
        .where(entity_id: entity_id)
        .distinct
        .pluck(:mention_role)
        .compact
        .map(&:to_sym)
    end
#resolve(name, kind: nil) ⇒ FactDb::Resolution::ResolvedEntity?
Resolves a name to an existing entity
Uses exact alias matching, canonical name matching, and fuzzy matching.
    # File 'lib/fact_db/services/entity_service.rb', line 83

    def resolve(name, kind: nil)
      @resolver.resolve(name, kind: kind)
    end
#resolve_or_create(name, kind:, aliases: [], attributes: {}, description: nil) ⇒ FactDb::Models::Entity
Resolves a name to an entity, creating one if not found
Also checks if any provided aliases match existing entities.
    # File 'lib/fact_db/services/entity_service.rb', line 97

    def resolve_or_create(name, kind:, aliases: [], attributes: {}, description: nil)
      # First, try to resolve the canonical name
      resolved = @resolver.resolve(name, kind: kind)
      if resolved
        # Add any new aliases to the resolved entity
        add_new_aliases(resolved.entity, aliases)
        return resolved.entity
      end

      # Check if any of the provided aliases match an existing entity
      # This handles cases like: name="Lord", aliases=["Jesus"] where "Jesus" already exists
      aliases.each do |alias_text|
        next if alias_text.to_s.strip.empty?

        resolved_by_alias = @resolver.resolve(alias_text.to_s.strip, kind: kind)
        if resolved_by_alias
          entity = resolved_by_alias.entity
          # Add the new canonical name as an alias to the existing entity
          entity.add_alias(name) unless entity.name.downcase == name.downcase
          # Add all the other aliases too
          add_new_aliases(entity, aliases)
          return entity
        end
      end

      create(name, kind: kind, aliases: aliases, attributes: attributes, description: description)
    end
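The resolution order above (canonical name first, then each alias, then create) can be illustrated with a small in-memory sketch. Everything here is a hypothetical stand-in: `TinyEntity` and `TinyRegistry` are not FactDb classes, and the exact-match `resolve` is far simpler than the real resolver, which also does fuzzy matching.

```ruby
# Hypothetical stand-in for an entity with its aliases; not part of FactDb.
TinyEntity = Struct.new(:name, :kind, :aliases)

class TinyRegistry
  attr_reader :entities

  def initialize
    @entities = []
  end

  # Exact, case-insensitive match on canonical name or any alias.
  def resolve(name, kind: nil)
    key = name.to_s.strip.downcase
    @entities.find do |e|
      next false if kind && e.kind != kind.to_s

      e.name.downcase == key || e.aliases.any? { |a| a.downcase == key }
    end
  end

  # Tries the name, then each alias; creates a new entity only if all miss.
  def resolve_or_create(name, kind:, aliases: [])
    entity = nil
    [name, *aliases].each do |candidate|
      next if candidate.to_s.strip.empty?

      entity = resolve(candidate, kind: kind)
      break if entity
    end

    if entity
      add_new_aliases(entity, [name, *aliases])
      return entity
    end

    entity = TinyEntity.new(name, kind.to_s, [])
    add_new_aliases(entity, aliases)
    @entities << entity
    entity
  end

  private

  # Adds only names the entity does not already carry (case-insensitive).
  def add_new_aliases(entity, names)
    names.each do |n|
      text = n.to_s.strip
      next if text.empty?

      known = [entity.name, *entity.aliases].map(&:downcase)
      entity.aliases << text unless known.include?(text.downcase)
    end
  end
end

registry = TinyRegistry.new
jesus = registry.resolve_or_create("Jesus", kind: :person)
same = registry.resolve_or_create("Lord", kind: :person, aliases: ["Jesus"])

puts same.equal?(jesus)
puts jesus.aliases.inspect
```

The second call matches through the alias "Jesus", so no new entity is created and "Lord" is recorded as an alias of the existing one.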
#search(query, kind: nil, limit: 20) ⇒ ActiveRecord::Relation
Searches entities by name or alias using LIKE pattern matching
    # File 'lib/fact_db/services/entity_service.rb', line 152

    def search(query, kind: nil, limit: 20)
      scope = Models::Entity.not_merged

      # Search canonical names and aliases
      scope = scope.left_joins(:aliases).where(
        "LOWER(fact_db_entities.name) LIKE ? OR LOWER(fact_db_entity_aliases.name) LIKE ?",
        "%#{query.downcase}%", "%#{query.downcase}%"
      ).distinct

      scope = scope.where(kind: kind) if kind

      scope.limit(limit)
    end
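The query above is a case-insensitive substring match over both canonical names and aliases. A minimal in-memory sketch of the same idea follows; `SearchRow` and `like_search` are hypothetical names. One difference worth noting: because the SQL version interpolates the query into a LIKE pattern without escaping, a `%` or `_` in the query acts as a wildcard there, whereas this sketch treats every character literally.

```ruby
# Hypothetical stand-in for an entity joined with its aliases; not part of FactDb.
SearchRow = Struct.new(:name, :aliases, :kind)

# Returns up to `limit` rows whose name or any alias contains `query`,
# ignoring case, optionally restricted to one kind.
def like_search(rows, query, kind: nil, limit: 20)
  needle = query.downcase
  matches = rows.select do |row|
    next false if kind && row.kind != kind.to_s

    [row.name, *row.aliases].any? { |text| text.downcase.include?(needle) }
  end
  matches.first(limit)
end

rows = [
  SearchRow.new("New York", ["NYC", "Big Apple"], "place"),
  SearchRow.new("Yorkshire", [], "place"),
  SearchRow.new("Apple Inc.", ["Apple"], "organization")
]

puts like_search(rows, "york").map(&:name).inspect
puts like_search(rows, "apple", kind: :place).map(&:name).inspect
```

The second query matches "New York" only through its alias "Big Apple", which mirrors why the real query left-joins the alias table.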
#semantic_search(query, kind: nil, limit: 20) ⇒ ActiveRecord::Relation
Searches entities using semantic similarity (vector search)
Requires an embedding generator to be configured.
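    # File 'lib/fact_db/services/entity_service.rb', line 174

    def semantic_search(query, kind: nil, limit: 20)
      # NOTE: identifiers in this method were lost in extraction. The variable
      # name `embedding` and the helper name `generate_embedding` are assumed.
      embedding = generate_embedding(query)
      return Models::Entity.none unless embedding

      scope = Models::Entity.not_merged.nearest_neighbors(embedding, limit: limit)
      scope = scope.where(kind: kind) if kind
      scope
    end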
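Conceptually, a nearest-neighbor vector search ranks stored embeddings by their distance to the query embedding and keeps the closest few. The sketch below illustrates that in plain Ruby. It assumes cosine distance, which the source does not confirm, and `cosine_distance` and `nearest` are hypothetical helpers rather than FactDb or database functions.

```ruby
# Cosine distance between two equal-length numeric vectors (0.0 = same direction).
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  return 1.0 if norm.zero?

  1.0 - dot / norm
end

# Returns up to `limit` entities ordered from closest to farthest.
# Mirrors the early return above: no query vector means no results.
def nearest(entities, query_vector, limit: 20)
  return [] if query_vector.nil?

  entities
    .reject { |e| e[:embedding].nil? }
    .min_by(limit) { |e| cosine_distance(e[:embedding], query_vector) }
end

entities = [
  { name: "a", embedding: [1.0, 0.0] },
  { name: "b", embedding: [0.0, 1.0] },
  { name: "c", embedding: [0.9, 0.1] },
  { name: "d", embedding: nil }
]

puts nearest(entities, [1.0, 0.0], limit: 2).map { |e| e[:name] }.inspect
```

A real deployment would push this ranking into the database's vector index instead of scanning rows in Ruby.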
#stats ⇒ Hash
Returns aggregate statistics about entities
    # File 'lib/fact_db/services/entity_service.rb', line 288

    def stats
      {
        total: Models::Entity.not_merged.count,
        total_count: Models::Entity.not_merged.count,
        by_kind: Models::Entity.not_merged.group(:kind).count,
        by_status: Models::Entity.group(:resolution_status).count,
        merged_count: Models::Entity.where(resolution_status: "merged").count,
        with_facts: Models::Entity.joins(:entity_mentions).distinct.count
      }
    end
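To show the shape of the returned hash, here is a small in-memory sketch of the same aggregation. `StatEntity` and `entity_stats` are hypothetical names, the sketch assumes `not_merged` means `resolution_status` is not "merged", and it omits `with_facts` because that count depends on the mentions table.

```ruby
# Hypothetical stand-in for an entity row; not part of FactDb.
StatEntity = Struct.new(:kind, :resolution_status)

# Mirrors the keys above: totals and per-kind counts exclude merged
# entities, while per-status counts include every row.
def entity_stats(entities)
  active = entities.reject { |e| e.resolution_status == "merged" }
  {
    total: active.size,
    total_count: active.size,
    by_kind: active.map(&:kind).tally,
    by_status: entities.map(&:resolution_status).tally,
    merged_count: entities.size - active.size
  }
end

entities = [
  StatEntity.new("person", "resolved"),
  StatEntity.new("person", "merged"),
  StatEntity.new("place", "unresolved")
]

stats = entity_stats(entities)
puts stats.inspect
```

Note that `total` and `total_count` carry the same value in the source as well, and that `by_status` is the only breakdown in which merged entities appear.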
#timeline_for(entity_id, from: nil, to: nil) ⇒ FactDb::Temporal::Timeline
Builds a timeline of facts for an entity
    # File 'lib/fact_db/services/entity_service.rb', line 266

    def timeline_for(entity_id, from: nil, to: nil)
      Temporal::Timeline.new.build(entity_id: entity_id, from: from, to: to)
    end
#timespan_for(entity_id) ⇒ Hash
Returns the timespan of facts for an entity
    # File 'lib/fact_db/services/entity_service.rb', line 323

    def timespan_for(entity_id)
      facts = Models::Fact
        .joins(:entity_mentions)
        .where(entity_mentions: { entity_id: entity_id })

      {
        from: facts.minimum(:valid_at),
        to: facts.maximum(:valid_at) || Date.today
      }
    end
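One consequence of the `|| Date.today` fallback is that an entity with no facts yields a nil `from` but a non-nil `to`. The sketch below reproduces that shape over a plain array of dates. `timespan` is a hypothetical helper, and the `today:` parameter is added here only to make the example deterministic.

```ruby
require "date"

# Earliest and latest date in the list; `to` falls back to `today`
# when the list is empty, while `from` stays nil.
def timespan(valid_dates, today: Date.today)
  { from: valid_dates.min, to: valid_dates.max || today }
end

dates = [Date.new(2021, 3, 1), Date.new(2019, 7, 15), Date.new(2023, 1, 9)]
fixed_today = Date.new(2024, 1, 1)

puts timespan(dates, today: fixed_today).inspect
puts timespan([], today: fixed_today).inspect
```

Callers that compute a duration from this hash may want to check for a nil `from` before subtracting.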