Class: FactDb::Models::Source
- Inherits: ActiveRecord::Base
- Inheritance chain: Object → ActiveRecord::Base → FactDb::Models::Source
- Defined in: lib/fact_db/models/source.rb
Overview
Represents a source document from which facts are extracted.
Sources are immutable content documents (emails, transcripts, documents, etc.) that serve as the provenance for extracted facts. Content is deduplicated by SHA256 hash.
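A minimal sketch of what deduplication by SHA256 hash can look like, using only Ruby's standard library. The SourceStore class, its add method, and the content_hash key are hypothetical names for illustration; the real model stores rows in PostgreSQL and its column names are not shown on this page.

```ruby
require "digest"

# Hypothetical in-memory store; stands in for the database table.
class SourceStore
  def initialize
    @by_hash = {}
  end

  # Returns the already-stored record when identical content was added before.
  def add(content, kind:)
    key = Digest::SHA256.hexdigest(content)
    @by_hash[key] ||= { content: content.dup.freeze, kind: kind, content_hash: key }
  end

  def size
    @by_hash.size
  end
end

store  = SourceStore.new
first  = store.add("Quarterly report text", kind: "report")
second = store.add("Quarterly report text", kind: "report")

puts store.size            # 1
puts first.equal?(second)  # true
```

Keying on the digest rather than the raw text keeps lookups cheap for large documents, and freezing the stored content mirrors the immutability described above.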
Constant Summary
- KINDS = %w[email transcript document slack meeting_notes contract report].freeze
  Returns valid source content kinds.
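As an illustration of how a list like KINDS is typically used, the sketch below checks a candidate kind against it. The valid_kind? helper is hypothetical and not part of FactDb; whether the model itself validates kind against KINDS is not stated on this page.

```ruby
# Copied from the constant documented above.
KINDS = %w[email transcript document slack meeting_notes contract report].freeze

# Hypothetical helper: accepts strings or symbols and checks membership.
def valid_kind?(kind)
  KINDS.include?(kind.to_s)
end

puts valid_kind?("email")  # true
puts valid_kind?(:slack)   # true
puts valid_kind?("tweet")  # false
```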
Class Method Summary
- .nearest_neighbors(embedding, limit: 10) ⇒ ActiveRecord::Relation
  Finds sources by vector similarity using pgvector.
Instance Method Summary
- #by_kind(k) ⇒ ActiveRecord::Relation
  Returns sources of a specific kind.
- #captured_after(date) ⇒ ActiveRecord::Relation
  Returns sources captured on or after a date.
- #captured_before(date) ⇒ ActiveRecord::Relation
  Returns sources captured on or before a date.
- #captured_between(from, to) ⇒ ActiveRecord::Relation
  Returns sources captured within a date range, endpoints included.
- #immutable? ⇒ Boolean
  Returns whether the source content is immutable (always true).
- #preview(length: 200) ⇒ String
  Returns a preview of the content, truncated if needed.
- #search_text(query) ⇒ ActiveRecord::Relation
  Full-text search on source content using PostgreSQL tsvector.
- #word_count ⇒ Integer
  Returns the word count of the content.
Class Method Details
.nearest_neighbors(embedding, limit: 10) ⇒ ActiveRecord::Relation
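Finds sources by vector similarity using pgvector. Results are ordered by the <=> operator (cosine distance in pgvector), nearest first, and capped at limit. Returns an empty relation when no embedding is given.

    # File 'lib/fact_db/models/source.rb', line 74

    def self.nearest_neighbors(embedding, limit: 10)
      return none unless embedding

      order(Arel.sql("embedding <=> '#{embedding}'")).limit(limit)
    end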
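To make the ordering concrete, here is a plain-Ruby sketch of the quantity pgvector's <=> operator computes (cosine distance, 1 minus cosine similarity) and of a nearest-first ranking over in-memory rows. The cosine_distance and nearest helpers and the row layout are assumptions for illustration; the real method delegates the work to PostgreSQL.

```ruby
# Cosine distance between two equal-length numeric vectors: 1 - cos(a, b).
def cosine_distance(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

# Hypothetical in-memory stand-in for ordering by distance and applying a limit.
def nearest(rows, query, limit: 10)
  rows.min_by(limit) { |row| cosine_distance(row[:embedding], query) }
end

rows = [
  { id: 1, embedding: [1.0, 0.0] },
  { id: 2, embedding: [0.0, 1.0] },
  { id: 3, embedding: [0.9, 0.1] }
]

p nearest(rows, [1.0, 0.0], limit: 2).map { |r| r[:id] }  # [1, 3]
```

A distance of 0 means the vectors point in the same direction and 1 means they are orthogonal, so smaller values rank first.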
Instance Method Details
#by_kind(k) ⇒ ActiveRecord::Relation
Returns sources of a specific kind.

    # File 'lib/fact_db/models/source.rb', line 40

    scope :by_kind, ->(k) { where(kind: k) }
#captured_after(date) ⇒ ActiveRecord::Relation
Returns sources captured on or after a date. The comparison is >=, so a source captured exactly at date is included.

    # File 'lib/fact_db/models/source.rb', line 53

    scope :captured_after, ->(date) { where("captured_at >= ?", date) }
#captured_before(date) ⇒ ActiveRecord::Relation
Returns sources captured on or before a date. The comparison is <=, so a source captured exactly at date is included.

    # File 'lib/fact_db/models/source.rb', line 59

    scope :captured_before, ->(date) { where("captured_at <= ?", date) }
#captured_between(from, to) ⇒ ActiveRecord::Relation
Returns sources captured within a date range. Both endpoints are included, because from..to is an inclusive range.

    # File 'lib/fact_db/models/source.rb', line 47

    scope :captured_between, ->(from, to) { where(captured_at: from..to) }
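The boundary behavior of the three date scopes is easy to misread, so the sketch below reproduces their comparisons (>=, <=, and an inclusive range) over in-memory rows. The ROWS data and the free-standing helper methods are hypothetical stand-ins; the real scopes build SQL conditions instead of filtering arrays.

```ruby
require "date"

# Hypothetical rows standing in for the sources table.
ROWS = [
  { id: 1, captured_at: Date.new(2024, 1, 1) },
  { id: 2, captured_at: Date.new(2024, 1, 15) },
  { id: 3, captured_at: Date.new(2024, 1, 31) }
].freeze

# Mirrors captured_at >= date.
def captured_after(rows, date)
  rows.select { |r| r[:captured_at] >= date }
end

# Mirrors captured_at <= date.
def captured_before(rows, date)
  rows.select { |r| r[:captured_at] <= date }
end

# Mirrors captured_at within from..to, both endpoints included.
def captured_between(rows, from, to)
  rows.select { |r| (from..to).cover?(r[:captured_at]) }
end

mid = Date.new(2024, 1, 15)
p captured_after(ROWS, mid).map { |r| r[:id] }   # [2, 3]
p captured_before(ROWS, mid).map { |r| r[:id] }  # [1, 2]
p captured_between(ROWS, Date.new(2024, 1, 1), Date.new(2024, 1, 31)).map { |r| r[:id] }  # [1, 2, 3]
```

Note that the row captured exactly on the boundary date appears in both the "after" and the "before" results.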
#immutable? ⇒ Boolean
Returns whether the source content is immutable. Always returns true: sources are immutable to preserve provenance integrity.

    # File 'lib/fact_db/models/source.rb', line 85

    def immutable?
      true
    end
#preview(length: 200) ⇒ String
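Returns a preview of the content. Content no longer than length characters is returned unchanged; longer content is cut to its first length characters and suffixed with "...", so the result can be up to three characters longer than length.

    # File 'lib/fact_db/models/source.rb', line 100

    def preview(length: 200)
      return content if content.length <= length

      "#{content[0, length]}..."
    end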
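The truncation rule can be tried outside the model with the stand-alone sketch below. It uses the same logic as the method body, but takes content as an argument (an adaptation for illustration, since the real method reads the record's content attribute).

```ruby
# Same logic as #preview, with content passed in explicitly.
def preview(content, length: 200)
  return content if content.length <= length

  "#{content[0, length]}..."
end

puts preview("short text")                    # short text
puts preview("abcdefghij", length: 4)         # abcd...
puts preview("abcdefghij", length: 4).length  # 7
```

Callers that need a hard cap on the output width should account for the three extra characters of the ellipsis.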
#search_text(query) ⇒ ActiveRecord::Relation
Full-text search on source content using PostgreSQL tsvector.

    # File 'lib/fact_db/models/source.rb', line 65

    scope :search_text, lambda { |query|
      where("to_tsvector('english', content) @@ plainto_tsquery('english', ?)", query)
    }
#word_count ⇒ Integer
Returns the word count of the content, counted as the number of whitespace-separated tokens.

    # File 'lib/fact_db/models/source.rb', line 92

    def word_count
      content.split.size
    end
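Because the count relies on String#split with no arguments, runs of spaces, tabs, and newlines all act as a single separator, and punctuation does not split a token. A small stand-alone sketch, with content passed as an argument for illustration:

```ruby
# Same logic as #word_count, with content passed in explicitly.
def word_count(content)
  content.split.size
end

puts word_count("Facts  come\nfrom\tsources ")  # 4
puts word_count("")                              # 0
puts word_count("state-of-the-art")              # 1
```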