Vector Search Module
The vector module centralises embeddings and similarity search for enabled entities. It complements the JSON-based query index by storing dense vectors in a configurable backend (pgvector by default) and exposes shared helpers for frontends and APIs.
Module anatomy
- Package: @open-mercato/vector
- Module id: vector
- Generated hooks: the DI graph registers vectorIndexService, vectorEmbeddingService, and driver instances at boot.
- Subscribers: the module listens to query_index.upsert_one and query_index.delete_one events so existing CRUD flows automatically trigger vector reindexing.
export function register(container: AppContainer) {
  const embeddingService = new EmbeddingService()
  const drivers = [createPgVectorDriver(), createChromaDbDriver(), createQdrantDriver()]
  const indexService = new VectorIndexService({
    drivers,
    embeddingService,
    queryEngine: container.resolve('queryEngine'),
    moduleConfigs: vectorModuleConfigs,
    containerResolver: () => container,
  })
  container.register({
    vectorEmbeddingService: asValue(embeddingService),
    vectorDrivers: asValue(drivers),
    vectorIndexService: asValue(indexService),
  })
}
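The subscriber behaviour listed under Module anatomy can be illustrated with a minimal sketch. The event names come from this page; everything else (the `TinyEventBus` class, the `QueryIndexEvent` payload shape, the `IndexServiceLike` interface, and the idea that both handlers check the auto-index toggle) is an assumption made for illustration and is not the module's actual code.

```typescript
// Hypothetical payload shape; the real event payload is not documented here.
interface QueryIndexEvent {
  entityId: string
  recordId: string
}

// Hypothetical stand-in for the two VectorIndexService methods the subscribers need.
interface IndexServiceLike {
  indexRecord(event: QueryIndexEvent): void
  deleteRecord(event: QueryIndexEvent): void
}

type Handler = (payload: QueryIndexEvent) => void

// Tiny in-process event bus, used only to keep the sketch self-contained.
class TinyEventBus {
  private handlers = new Map<string, Handler[]>()

  on(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? []
    list.push(handler)
    this.handlers.set(event, list)
  }

  emit(event: string, payload: QueryIndexEvent): void {
    const list = this.handlers.get(event) ?? []
    list.forEach((handler) => handler(payload))
  }
}

// Wire query index events to vector indexing.
// Assumption: both handlers are skipped when auto-indexing is switched off.
function attachVectorSubscribers(
  bus: TinyEventBus,
  service: IndexServiceLike,
  isAutoIndexEnabled: () => boolean,
): void {
  bus.on('query_index.upsert_one', (payload) => {
    if (isAutoIndexEnabled()) service.indexRecord(payload)
  })
  bus.on('query_index.delete_one', (payload) => {
    if (isAutoIndexEnabled()) service.deleteRecord(payload)
  })
}
```

Passing the toggle as a function rather than a boolean lets the handlers pick up configuration changes without being re-registered.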
Declaring searchable entities
Modules opt in by exporting vectorConfig from src/modules/<module>/vector.ts.
import type { VectorModuleConfig } from '@open-mercato/shared/modules/vector'

export const vectorConfig: VectorModuleConfig = {
  defaultDriverId: 'pgvector',
  entities: [
    {
      entityId: 'customers:customer_entity',
      formatResult: ({ record }) => ({
        title: record.display_name,
        subtitle: record.kind === 'person' ? record.primary_email : record.description,
      }),
      resolveUrl: ({ record }) => record.kind === 'person'
        ? `/backend/customers/people/${record.id}`
        : `/backend/customers/companies/${record.id}`,
    },
    {
      entityId: 'customers:customer_comment',
      buildSource: async (ctx) => {
        const parent = await loadCustomerEntity(ctx, ctx.record.entity_id)
        return {
          input: [`Customer: ${parent?.display_name ?? ''}`, `Note: ${ctx.record.body}`],
          presenter: {
            title: parent?.display_name ?? 'Customer note',
            subtitle: ctx.record.body,
          },
        }
      },
      resolveUrl: async (ctx) => {
        const parent = await loadCustomerEntity(ctx, ctx.record.entity_id)
        return parent ? `/backend/customers/companies/${parent.id}#notes` : null
      },
    },
  ],
}
Key callbacks:
- buildSource returns the text chunks that will be embedded, plus optional presenter metadata and checksum source. Shorthand fields fall back to the raw record and custom fields.
- formatResult, resolveUrl, and resolveLinks shape the runtime payload sent to front-end consumers (command palette, Data Designer, custom UIs).
Drivers & migrations
Drivers implement a small interface (ensureReady, upsert, delete, query, getChecksum, purge). The pgvector driver ships with an embedded migration that creates the vector_search table and IVFFLAT index with cosine distance.
Driver migrations run on first use via ensureReady. Each driver can maintain its own migration log (vector_search_migrations for pgvector) without depending on MikroORM.
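To make the driver contract more concrete, here is a minimal in-memory sketch that offers the six methods named above and ranks results by cosine similarity. The method names come from this page; the parameter lists, the `VectorEntry` and `VectorHit` types, and the synchronous signatures are assumptions chosen for brevity (a driver that talks to a database would presumably return promises). The real interface in @open-mercato/vector may differ.

```typescript
// Hypothetical row and result shapes for illustration only.
interface VectorEntry {
  entityId: string
  recordId: string
  embedding: number[]
  checksum: string
}

interface VectorHit {
  entityId: string
  recordId: string
  score: number
}

// Assumed signatures; only the method names are taken from the documentation.
interface VectorDriverSketch {
  id: string
  ensureReady(): void
  upsert(entry: VectorEntry): void
  delete(entityId: string, recordId: string): void
  query(embedding: number[], limit: number): VectorHit[]
  getChecksum(entityId: string, recordId: string): string | null
  purge(entityId: string): void
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vector dimensions differ')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB)
  return denom === 0 ? 0 : dot / denom
}

function createInMemoryDriver(): VectorDriverSketch {
  const rows = new Map<string, VectorEntry>()
  let ready = false
  const keyOf = (entityId: string, recordId: string) => `${entityId}::${recordId}`
  // A real driver would run its own migrations here; the sketch only flips a flag.
  const ensureReady = () => { ready = true }

  return {
    id: 'memory',
    ensureReady,
    upsert(entry) {
      if (!ready) ensureReady()
      rows.set(keyOf(entry.entityId, entry.recordId), entry)
    },
    delete(entityId, recordId) {
      rows.delete(keyOf(entityId, recordId))
    },
    query(embedding, limit) {
      if (!ready) ensureReady()
      return Array.from(rows.values())
        .map((row) => ({
          entityId: row.entityId,
          recordId: row.recordId,
          score: cosineSimilarity(embedding, row.embedding),
        }))
        .sort((x, y) => y.score - x.score) // highest similarity first
        .slice(0, limit)
    },
    getChecksum(entityId, recordId) {
      return rows.get(keyOf(entityId, recordId))?.checksum ?? null
    },
    purge(entityId) {
      // Remove every stored vector that belongs to the given entity.
      rows.forEach((row, key) => {
        if (row.entityId === entityId) rows.delete(key)
      })
    },
  }
}
```

The in-memory store scans every row on each query, which is fine for tests but is exactly the cost an approximate index such as the one created by the pgvector migration is meant to avoid.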
Reindexing
VectorIndexService exposes three entry points:
- indexRecord – upserts a single record, used by event subscribers.
- deleteRecord – removes a record when the base row disappears.
- reindexEntity / reindexAll – batch operations invoked via the REST API or CLI to bootstrap historic data.
Whenever the checksum computed from the record, custom fields, and optional checksumSource stays unchanged, the service skips re-embedding, preventing redundant OpenAI calls.
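The checksum-based skip can be sketched as follows. This page does not say which hash or serialization the service uses, so the sketch swaps in an FNV-1a-style hash over a key-sorted JSON string purely to stay dependency-free; `stableStringify`, `computeChecksum`, and `needsEmbedding` are hypothetical names, not the module's API.

```typescript
// Serialize with sorted object keys so that equal data produces equal strings.
// Simplification: undefined values are written as null, and Dates are not special-cased.
function stableStringify(value: unknown): string {
  if (value === undefined) return 'null'
  if (value === null || typeof value !== 'object') return JSON.stringify(value)
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`
  const obj = value as Record<string, unknown>
  const parts = Object.keys(obj)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`)
  return `{${parts.join(',')}}`
}

// FNV-1a-style 32-bit hash over UTF-16 code units (a stand-in, not the real algorithm).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) >>> 0
  }
  return hash.toString(16).padStart(8, '0')
}

// Combine the three inputs named in the documentation into one checksum.
function computeChecksum(
  record: Record<string, unknown>,
  customFields: Record<string, unknown>,
  checksumSource?: unknown,
): string {
  return fnv1a(stableStringify({ record, customFields, checksumSource: checksumSource ?? null }))
}

// Re-embed only when nothing is stored yet or the stored checksum differs.
function needsEmbedding(storedChecksum: string | null, nextChecksum: string): boolean {
  return storedChecksum !== nextChecksum
}
```

In this sketch the stored checksum would come from the driver's getChecksum method, and the embedding provider is called only when `needsEmbedding` returns true. A production implementation would more likely use a cryptographic hash to make collisions negligible.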
Frontend helpers
The package exports VectorSearchDialog (global command palette) and VectorSearchTable (Data Designer page). Both rely on the shared /api/vector/search endpoint and on the fetchVectorResults() helper from frontend/utils.ts, which wraps apiCall with a typed response. A module CLI (yarn mercato vector reindex ...) mirrors the REST endpoint so that bulk reindexing can be started from scripts or CI.
You can reuse fetchVectorResults in custom UIs to embed vector search in specialized workflows.
Runtime configuration
- Vector-specific preferences live in the shared configs module (module_id = 'vector').
- vector.auto_index_enabled determines whether query index events trigger automatic vector reindexing.
- Toggle the setting in Backend → Configuration → Vector Search; the page talks to /api/vector/settings so custom dashboards can reuse the same endpoint.
- Setting the environment flag DISABLE_VECTOR_SEARCH_AUTOINDEXING=1 forces the toggle off and disables updates from the UI/API.
- yarn mercato configs restore-defaults (automatically executed by mercato init) seeds default values and respects the environment override above.
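The precedence between the environment flag and the stored vector.auto_index_enabled value can be sketched like this. The flag name and its effect come from this page; the `resolveAutoIndexing` function, the `AutoIndexState` shape, and the default of `true` when nothing is stored are assumptions for illustration.

```typescript
// Hypothetical result shape: the effective value plus whether the UI/API may change it.
interface AutoIndexState {
  enabled: boolean
  locked: boolean
}

function resolveAutoIndexing(
  env: Record<string, string | undefined>,
  storedValue: boolean | null,
  defaultEnabled: boolean = true, // assumed default; the seeded value is not stated here
): AutoIndexState {
  // Only the documented value "1" is treated as the override in this sketch.
  const disabledByEnv = (env.DISABLE_VECTOR_SEARCH_AUTOINDEXING ?? '').trim() === '1'
  if (disabledByEnv) {
    // The environment wins: the toggle is forced off and updates are rejected.
    return { enabled: false, locked: true }
  }
  // Otherwise use the stored setting, falling back to the default when unset.
  return { enabled: storedValue ?? defaultEnabled, locked: false }
}
```

A settings endpoint built on this idea could return `locked` alongside the value so that a custom dashboard knows to render the toggle as read-only.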