Article · 7 hr ago · 6m read

Vector Search with Embedded Python in InterSystems IRIS

One objective of vectorization is to make unstructured text more machine-usable. Vector embeddings accomplish this by encoding the semantics of text as high-dimensional numeric vectors, which can then be used by advanced search algorithms (typically an approximate nearest neighbor algorithm such as Hierarchical Navigable Small World, or HNSW). This not only improves our ability to interact with unstructured text programmatically but also makes it searchable by context and meaning, beyond what is captured literally by keywords.

In this article I will walk through a simple vector search implementation that Kwabena Ayim-Aboagye and I fleshed out using Embedded Python in InterSystems IRIS for Health. I'll also dive a bit into how to use Embedded Python and Dynamic SQL generally, and how to take advantage of the vector search features offered natively by IRIS.

Environment Details:

  • OS: Windows Server 2025
  • InterSystems IRIS for Health 2025.1
  • VS Code / InterSystems Server Manager
  • Python 3.13.7
  • Python Libraries: pandas, ollama, iris**
  • Ollama 0.12.3 and model all-minilm
  • Dynamic SQL
  • Sample database of unstructured text (classic poems)

Process:

      0. Set up the environment; complete installs

  1. Define an auxiliary table

  2. Define a RegisteredObject class for our vectorization methods, which will be written in Embedded Python. First let's focus on a VectorizeTable() method, which will contain a driver function (of the same name) and a few supporting process functions, all written in Python.

    • The driver function walks through the process as follows:
      1. Load from IRIS into a Pandas Dataframe (via supporting function load_table())
      2. Generate an embedding column (via supporting class method GetEmbeddingString, which will later be used to generate embeddings for queries as well)
        • Convert the embedding column to a string that's compatible with IRIS vector type
      3. Write the dataframe into the auxiliary table
      4. Create an HNSW index on the auxiliary table
    • The VectorizeTable() class method then simply calls the driver function (a sketch of the assembled driver appears at the end of step 4 below).
    • Let's examine it step-by-step:
    1. Load the table from IRIS into a Pandas Dataframe

      • import iris
        import pandas as pd

        def load_table(sample_size='*') -> pd.DataFrame:
            # The optional sample_size limits the rows pulled back for testing
            sql = f"SELECT * FROM SQLUser.SamplePoetry{f' LIMIT {sample_size}' if sample_size != '*' else ''}"
            result_set = iris.sql.exec(sql)
            df = result_set.dataframe()
        
            # Entries without text will not be vectorized nor searchable
            for index, row in df.iterrows():
                if row['poem'] == ' ' or row['poem'] is None:
                    df = df.drop(index)
        
            return df
      • This function leverages the dataframe() method of Embedded Python SQL result set objects
      • load_table() accepts an optional sample_size argument for testing purposes. There's also a filter for entries without unstructured text. Though our sample database is curated and complete, some use cases may seek to vectorize datasets for which one cannot assume each row will have data for all columns (for example survey responses with skipped questions). As opposed to implementing a "null" or empty vector, we chose to exclude such rows from vector search by removing them at this step in the process.
      • *Note that iris is the InterSystems IRIS Python module. It functions as an API for accessing IRIS classes and methods and for interacting with the database.
      • *Note that SQLUser is the system-wide default schema which corresponds to the default package User.
    2. Generate an embedding column (support method)

      • ClassMethod GetEmbeddingString(aurg As %String) As %String [ Language = python ]
        {
          import iris
          import ollama
        
          response = ollama.embed(model='all-minilm',input=[ aurg ])
          embedding_str = str(response.embeddings[0])
        
          return embedding_str
        }
      • We installed Ollama on our VM, loaded the all-minilm embedding model, and generated embeddings using Ollama’s Python library. This allowed us to run the model locally and generate embeddings without an API key.
      • GetEmbeddingString returns the embedding as a string because TO_VECTOR by default expects the data argument to be a string, more on that to follow.
      • *Note that Embedded Python provides syntax for calling other ObjectScript methods defined within the current class (similar to self in Python): iris.cls(__name__) returns a reference to the current ObjectScript class, which lets VectorizeTable (Embedded Python inside an ObjectScript method) invoke GetEmbeddingString (an ObjectScript method).
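      • For reference, here is a minimal sketch (not the original code) of how the driver's step 2 might apply this class method to build the embedding column, assuming the dataframe returned by load_table() has a 'poem' column as in our sample table:

        import iris
        import pandas as pd

        def add_embedding_column(df: pd.DataFrame) -> pd.DataFrame:
            # iris.cls(__name__) returns a handle to the current ObjectScript class,
            # so the same GetEmbeddingString method can embed every poem
            this_class = iris.cls(__name__)
            df['embedding'] = df['poem'].apply(this_class.GetEmbeddingString)
            return df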
    3. Write the embeddings from the dataframe into the table in IRIS

      • # Write dataframe into new table
        print("Loading data into table...")
        # Prepare the INSERT once and reuse it for every row
        sql = iris.sql.prepare("INSERT INTO SQLUser.SamplePoetryVectors (ID, EMBEDDING) VALUES (?, TO_VECTOR(?, decimal))")
        for index, row in df.iterrows():
            rs = sql.execute(row['id'], row['embedding'])
        
        print("Data loaded into table.")
      • Here, we use Dynamic SQL to populate SamplePoetryVectors row by row. Because we declared the EMBEDDING property earlier to be of type %Library.Vector, we must use TO_VECTOR to convert the embeddings to the native IRIS VECTOR datatype upon insertion. We ensured compatibility with TO_VECTOR by converting the embeddings to strings earlier.
        • The iris Python module again allows us to take advantage of Dynamic SQL from within our Embedded Python function.
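      • For context, the auxiliary table from step 1 could be created with DDL along these lines. This is only a sketch: the original defines a persistent class whose EMBEDDING property is of type %Library.Vector, and the 384 dimension is an assumption based on the all-minilm model:

        import iris

        # Hypothetical DDL for the auxiliary table; column names match the INSERT above
        iris.sql.exec(
            "CREATE TABLE SQLUser.SamplePoetryVectors ("
            " ID INTEGER,"
            " EMBEDDING VECTOR(DECIMAL, 384))")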
  4. Create an HNSW index

      • # Create Index
        iris.sql.exec("CREATE INDEX HNSWIndex ON TABLE SQLUser.SamplePoetryVectors (EMBEDDING) AS HNSW(Distance='Cosine')")
        print("Index created.")
      • IRIS will natively implement an HNSW graph for use in vector search when an HNSW index is created on a compatible column. The vector search functions available through IRIS are VECTOR_DOT_PRODUCT and VECTOR_COSINE. Once the index is created, IRIS will automatically use it to optimize the corresponding vector search function when it is called in subsequent queries. The parameter defaults for an HNSW index are Distance = 'Cosine', M = 16, and efConstruction = 200.
      • Note that VECTOR_COSINE implicitly normalizes its input vectors, so we did not need to perform normalization before inserting them into the table in order for our vector search queries to be scored correctly!
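      • Putting the pieces together, a minimal sketch of the driver function (hedged: the helper names follow the sketches above, and the INSERT and CREATE INDEX statements are the ones shown in steps 3 and 4):

        import iris
        import pandas as pd

        def vectorize_table(sample_size='*'):
            df = load_table(sample_size)        # step 1: pull the source table
            df = add_embedding_column(df)       # step 2: embed each poem (see sketch above)

            # step 3: write ids and stringified embeddings into the auxiliary table
            insert = iris.sql.prepare(
                "INSERT INTO SQLUser.SamplePoetryVectors (ID, EMBEDDING) "
                "VALUES (?, TO_VECTOR(?, decimal))")
            for _, row in df.iterrows():
                insert.execute(row['id'], row['embedding'])

            # step 4: build the HNSW index so VECTOR_COSINE queries can use it
            iris.sql.exec(
                "CREATE INDEX HNSWIndex ON TABLE SQLUser.SamplePoetryVectors "
                "(EMBEDDING) AS HNSW(Distance='Cosine')")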
  3. Implement a VectorSearch() class method


    1. Generate an embedding for the query string

      • # Generate embedding of search parameter
        search_vector = iris.cls(__name__).GetEmbeddingString(aurg)
      • Reusing the class method GetEmbeddingString
    2. Prepare and execute a query that utilizes VECTOR_COSINE

      • # Prepare and execute SQL statement
        stmt = iris.sql.prepare(
                """SELECT top 5 p.poem, p.title, p.author 
                FROM SQLUser.SamplePoetry AS p 
                JOIN SQLUser.SamplePoetryVectors AS v 
                ON p.ID = v.ID 
                ORDER BY VECTOR_COSINE(v.embedding, TO_VECTOR(?)) DESC"""
        )
        results = stmt.execute(search_vector)
      • We use a JOIN here to combine the poetry text with its corresponding vector embedding so we can rank results by semantic similarity.
    3. Output the results

      • results_df = pd.DataFrame(results)
        
        pd.set_option('display.max_colwidth', 25)
        results_df.rename(columns={0: 'Poem', 1: 'Title', 2: 'Author'}, inplace=True)
        
        print(results_df)
      • This uses formatting options from pandas to tweak how the results appear in the IRIS Terminal.
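      • Assembled, the body of VectorSearch() (the Embedded Python inside the ObjectScript class method) might look roughly like the following sketch, built only from the snippets above; 'aurg' is the search string argument, as in the original:

        import iris
        import pandas as pd

        def vector_search(aurg):
            # 1. Embed the query string by reusing GetEmbeddingString
            search_vector = iris.cls(__name__).GetEmbeddingString(aurg)

            # 2. Rank poems by cosine similarity against the stored vectors
            stmt = iris.sql.prepare(
                """SELECT top 5 p.poem, p.title, p.author
                   FROM SQLUser.SamplePoetry AS p
                   JOIN SQLUser.SamplePoetryVectors AS v ON p.ID = v.ID
                   ORDER BY VECTOR_COSINE(v.embedding, TO_VECTOR(?)) DESC""")
            results = stmt.execute(search_vector)

            # 3. Print the top matches in the IRIS Terminal
            results_df = pd.DataFrame(results)
            results_df.rename(columns={0: 'Poem', 1: 'Title', 2: 'Author'}, inplace=True)
            pd.set_option('display.max_colwidth', 25)
            print(results_df)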
Article · 8 hr ago · 5m read

"IRIS-CoPilot" prototype - English (etc) as an IRIS language?

Keywords:  IRIS, Agents, Agentic AI, Smart Apps

Motive?

Transformer-based LLMs appear to be a pretty good "universal logical-symbolic abstractor". They have started to bridge the long-standing gap between human languages and machine languages, which in essence are all logical symbols that can be mapped into the same vector space.

Objective?

For three years I have been wondering whether, one day, we might be able to just use English (or other human natural languages) to do IRIS implementations as well.

Possibly, tomorrow all machines, software, and apps will be "intelligent" enough to interact with users in any human language to get the desired outcomes. And that tomorrow is likely today's tomorrow, not tomorrow's tomorrow.

Research?

Research indicates LLMs are still likely to be probabilistic sequence models that internalise statistical approximations to patterns of symbols, rather than actually implementing formal symbolic logic. While CoT and similar techniques produce outcomes that statistically emulate structured reasoning and can act as an abstraction layer between human language and machine actions, they *may not* manifest logically grounded inference and remain limited by statistical mimicry, shallow heuristics, and the absence of semantic grounding. That said, we still don't have a theoretical way to measure "intelligence" today, or even to know whether it is a single thing, so we don't actually understand LLMs' theoretical boundaries and limits that well anyway.

Evidence?

Today's "Vibe Coding" tools are using human languages to drive software lifecycle implementations.  But what if people don't even want to use vibe coding tools or Visual Studio - they just want to speak to IRIS directly that gets things "done"?   How are clinical quality and enterprise governance etc BAUs are auto enforced too. IRIS-CoPilot app is just a prototype, an initial demo towards our vision above.  

Prototype ideas?

https://github.com/zhongli1990/iris-copilot#iris-copilot

Human natural language-driven agentic AI platform for IRIS implementation lifecycles. This prototype is built for NHS Trust integration delivery: users describe clinical integration requirements in natural language; Copilot designs and generates IRIS artifacts, and deployment is executed only after explicit human approval.

Design?

https://github.com/zhongli1990/iris-copilot#architecture

  • CSP Chat UI: AIAgent.UI.Chat.cls
  • IRIS backend REST APIs: AIAgent.API.Dispatcher
  • IRIS backend: CoPilot Orchestrator/engine services in IRIS
  • Node.js bridge adapters for:
    • Claude Agent SDK
    • OpenAI Codex (standard API runner)
    • OpenAI Codex SDK runner
    • Azure OpenAI (to be added)
    • Google Gemini (to be added)
    • LiteLLM gateways (on the roadmap)

Deployment?

https://github.com/zhongli1990/iris-copilot?tab=readme-ov-file#1-deploy-...

A few very simple deployment steps on any laptop: 

0. Git clone this repo into a local working path on the IRIS server:  git clone https://github.com/zhongli1990/iris-copilot

1. Identify an existing IRIS namespace that you want the agent to have access to.

2. Import this IRIS-CoPilot package via Studio/Terminal, such as:  https://github.com/zhongli1990/iris-copilot/blob/main/deploy/AIAgent-exp...

3. Create a REST web app in IRIS Management Portal:  `/ai` for REST APIs (dispatch class `AIAgent.API.Dispatcher`)

4. Start the external Node.js bridge, which acts as a REST adaptor for intelligence agents such as OpenAI Codex and Claude Code.

I am running Node.js v24.13.0 on a Win10 laptop, so I didn't use Docker. I will dockerise it on an Ubuntu demo server later.

cd <working path>/AIAgent/bridge
npm install
npm run build
npm start

5. Configure keys and runner settings in:

  • bridge/.env (local - add in your OpenAI and/or Claude API keys)
  • bridge/.env.example (template)

 

Demo?

1. Health checks:

  • IRIS API: http://localhost:52773/ai/health
  • Bridge: http://localhost:3100/api/health
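A quick way to script both checks (a sketch using Python's requests library; the ports are the defaults listed above and may differ on your install):

import requests

# Hit both health endpoints and report their HTTP status
for name, url in [("IRIS API", "http://localhost:52773/ai/health"),
                  ("Bridge", "http://localhost:3100/api/health")]:
    try:
        r = requests.get(url, timeout=5)
        print(f"{name}: HTTP {r.status_code}")
    except requests.RequestException as err:
        print(f"{name}: unreachable ({err})")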

2. Open the CSP Chat UI page:  http://<iris-host>:<port>/csp/<namespace>/AIAgent.UI.Chat.cls

      For example: http://localhost:52773/csp/healthshare/demo2_ai2/AIAgent.UI.Chat.cls

3. Demo Chat UI when it's running:

4. Local CMD console for the bridge


 

Test report?

https://github.com/zhongli1990/iris-copilot/blob/main/docs/REALWORLD-EVA...

I created 34 demo queries in the test script, covering a typical lifecycle of NHS TIE implementation tasks. The above is a quick run of the test report.

Below are the actual sample queries and the actual response to each query, using LLM-as-a-Judge to determine Pass or Fail:

https://github.com/zhongli1990/iris-copilot/blob/main/docs/REALWORLD-LIF...

The failed test cases are also there for illustration purposes - they failed simply because I haven't added sufficient tools and resource access for them yet.

 
Next Actions?

This demo app is more about conveying the ideas. It's a lightweight implementation of agent wrappers - one of our design principles, since LLMs and Agent SDKs are evolving rapidly - we hope to rise with the tide rather than get stuck in any hard-coded LangGraph-style workflows.

Next actions could be:

 1. Make the agents more generic, aiming at the real tasks human engineers perform along daily implementation lifecycles.

 2. Embed the CSP Chat UI page better within the IRIS Management Portal, which would be more convenient.

 3. An IRIS-native agent SDK beyond the current agent runners? (Again, the hope is to stay lightweight and future-compatible.)

4. Add Skills and Hooks placeholders to automatically enforce enterprise QA, governance, and compliance per site-specific policies?

5. OK, how about a self-evolving software system: the user/clinician/engineer sets the targets, and the application just starts building and refining itself via RL-style loops, consuming tokens. The engineer would then manage the agents like managing a production line, rather than manually manufacturing each specific product on the line.

Disclaimer:

    Prototype in progress - initial versions for bouncing ideas around.

    Rushed this through in some spare time, so pardon me if some of the thinking is still being shaped.

Article · 11 hr ago · 3m read

Building a Robust Asynchronous Queue Manager with InterSystems IRIS


As applications scale, handling heavy computational tasks synchronously becomes a bottleneck. Whether it's processing large data sets, sending high-volume emails, or managing API integrations, a decoupled architecture is essential.

I’ve recently developed %ZQueue, a process-based queue management system that combines the high-performance persistence of InterSystems IRIS with a modern Angular dashboard.

The Core Architecture: Why a Queue?

The system utilizes a classic Producer-Consumer model. By decoupling task submission from execution, we ensure that the main application remains responsive while background "workers" handle the heavy lifting.
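To make the model concrete, here is a generic, in-memory illustration of the Producer-Consumer pattern in Python. This is not %ZQueue itself, which layers persistence, PIDs, and a Dead Letter Queue on top of this basic idea:

import queue
import threading

tasks = queue.Queue()

def producer():
    # Task submission stays fast: just enqueue and return
    for i in range(5):
        tasks.put(f"job-{i}")

def consumer():
    # A background worker drains the queue independently of the producer
    while True:
        job = tasks.get()
        print(f"processing {job}")
        tasks.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()
tasks.join()  # wait until the worker has processed everything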

Key Value Propositions:

  • Built-in Persistence: Unlike in-memory queues, data in %ZQueue survives system restarts or process interruptions.
  • Traceability: Every background job is assigned a unique Process ID (PID), making monitoring and debugging straightforward.
  • Error Resilience: The system distinguishes between transient and permanent failures, routing problematic tasks to a Dead Letter Queue (DLQ).

🛠 Tech Stack & Setup

The project is fully containerized using Docker, allowing for a "one-command" setup.

  • Database/Backend: InterSystems IRIS
  • Frontend: Angular (Management Dashboard)
  • Orchestration: Docker Compose

⚙️ Managing the Lifecycle

Control is split between a user-friendly UI and a powerful ObjectScript API. Once your containers are up, you can drive the lifecycle from the IRIS terminal:

1. Starting the Engine

Write ##class(%ZQueue.Manager).Start()

The system returns a PID, and the manager immediately begins processing pending entries.

2. State Monitoring

You can programmatically check if the worker is active:

Write ##class(%ZQueue.Manager).IsQueueRunning()

3. Graceful Shutdown

When you need to stop the worker, run:

Write ##class(%ZQueue.Manager).Stop()

Important: This is a non-destructive stop. It halts the process but preserves all queue entries. Processing resumes exactly where it left off upon the next Start().
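If you prefer driving the same lifecycle from Embedded Python, the documented class methods can be called through the iris module. This is only a sketch and assumes the methods return simple printable values:

import iris

mgr = iris.cls('%ZQueue.Manager')

pid = mgr.Start()             # start the worker; a PID is returned
print("Worker PID:", pid)

print("Running:", mgr.IsQueueRunning())

mgr.Stop()                    # non-destructive stop; queue entries are preserved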

📊 Workflow & Visibility

The system logic is designed to move tasks through a clear lifecycle, visible through the Angular dashboard at http://localhost:8080.

  1. New Task: Validated and persisted.
  2. Active Queue: Real-time visibility of pending and processing jobs.
  3. History: Successful tasks are moved here for auditing.
  4. Dead Letter: Failed tasks are isolated here for manual intervention or debugging.

Summary

The %ZQueue Management System provides a reliable blueprint for developers looking to implement background processing within the InterSystems ecosystem. By combining the speed of IRIS with a decoupled worker model, you can build applications that are both highly responsive and resilient.

Article · 12 hr ago · 5m read

Clinical Staff Master Data Management with RESTful APIs and Dynamic Mapping on InterSystems IRIS for Health

Project Overview:

 

The Clinical Staff Master Data Management (CSMDM) system is a full-stack healthcare integration application built on InterSystems IRIS for Health. It centralizes and standardizes clinical staff metadata into a single authoritative repository, exposed through RESTful CRUD APIs and reusable backend methods.

The platform eliminates fragmented lookup tables and hardcoded mappings that commonly cause errors in HL7 and FHIR integration workflows, ensuring data consistency and interface reliability.

Application to Other Domains:

This approach can be applied to other domains such as Financial Services, Insurance, and similar industries by extending existing data models or creating new domain-specific classes based on their data mapping needs. It centralizes master data, standardizes schemas, and uses reusable services to ensure consistent and reliable integrations.

Problem Statement:

In healthcare integration environments, particularly in HL7 and FHIR workflows, clinical staff data often faces the following challenges:

  • Scattered across multiple systems
  • Multiple redundant lookup tables
  • Inconsistent codes leading to invalid messages
  • Increased maintenance due to duplicate mapping tables

During HL7- or FHIR-based interface design, each clinical interface frequently requires multiple identifiers, such as Consultant Code (attending doctor code), Provider ID, GMC Number, Occupation Code, Specialty Code, and Department. If not managed properly, these issues create operational inefficiency, risk of interface failure, and increased maintenance overhead.

Proposed Solution:

 

The CSMDM system provides a centralized Clinical Staff Master Table and a RESTful API layer, which:

  • Stores all clinical staff metadata in a single persistent table
  • Provides full CRUD operations via REST
  • Supports real-time lookups for integration engines
  • Serves HL7, FHIR, JSON, XML, and file-based interfaces
  • Eliminates redundant lookup tables
  • Offers dynamic methods for clinical data mapping to streamline interface design

REST API Layer

Implemented using %CSP.REST, the API exposes the following endpoints:

  • POST /ClinicalStaff: add a clinical staff record
  • GET /ClinicalStaff: retrieve all clinical staff records
  • PUT /ClinicalStaff/:id: update a clinical staff record
  • DELETE /ClinicalStaff/:id: delete a clinical staff record
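For illustration, the endpoints can be exercised from any HTTP client. Below is a hedged Python sketch using the requests library; the base URL, web-application path, and the field names in the JSON payload are assumptions rather than the actual CSMDM schema:

import requests

BASE = "http://localhost:52773/csmdm"  # hypothetical web application path

# Create a staff record (field names are illustrative only)
new_staff = {"ConsultantCode": "C123", "GMCNumber": "1234567", "Specialty": "Cardiology"}
print(requests.post(f"{BASE}/ClinicalStaff", json=new_staff).status_code)

# Retrieve all records
print(requests.get(f"{BASE}/ClinicalStaff").json())

# Update and delete a record by id
requests.put(f"{BASE}/ClinicalStaff/1", json={"Specialty": "Cardiology"})
requests.delete(f"{BASE}/ClinicalStaff/1")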

Architecture Flow:

  • Frontend: HTML, CSS, Bootstrap, jQuery AJAX
  • REST API Calls
  • IRIS for Health REST Service (%CSP.REST)
  • Business Logic Layer: ObjectScript, JavaScript
  • Persistent Clinical Staff Master Table
  • Dynamic methods for clinical data mapping
    • Example methods and usage are provided in CSMDM.CustomDataLookup.Classes.ClinicalStaffMapping
    • Can be directly called during interface development (Code or DTL)

Key Benefits:

  • Single source of clinical staff information
  • Real-time access for integration engines
  • Eliminates duplicate mapping tables
  • Reduces operational maintenance
  • Full CRUD capability via REST APIs
  • Dynamic mapping for interface design
  • Supports integration intelligence for HL7, FHIR, and multi-format workflows

Test the Application

If everything is configured correctly, the application should load successfully. ⚠️ If you encounter any issues, verify:

  • Web Application configuration
  • Namespace selection
  • Files are properly imported and compiled

📸 Screenshot 1: Web App Landing Page

Below is the landing page of the Web Application after successful configuration and launch.

📸 Screenshot 2: Add Staff Record

 

Add Record

📸 Screenshot 3: Update Record

 

Update Record

📸 Screenshot 4: Delete Record

 

Delete Record

🔎 Search Functionality: You can search for records using the search bar. Simply enter any of the following, based on your requirement: consultant name, code, GMC number, or other relevant keywords.

Clinical Staff Data Mapping

Three methods are implemented in the class: CSMDM.CustomDataLookup.Classes.ClinicalStaffMapping

1️⃣ MapStaffConsult

ClassMethod MapStaffConsult(StaffID As %Integer) As %String

This is a static method.

It maps staff consultation data based on the provided StaffID.

📸 Screenshot:

MapStaffConsult

2️⃣ GetMappingValue

Method signature: ClassMethod GetMappingValue(getKeyColumn As %String, CValue As %Integer, outputValue As %String) As %String

Retrieves a single mapping value based on your requirement.

Allows you to specify:

  • getKeyColumn: the key column name
  • CValue: the column value

📸 Screenshot: GetMappingValue

3️⃣ GetMappingValues

Method signature: ClassMethod GetMappingValues(getKeyColumn As %String, CValue As %Integer, mapValues As %String) As %String

A more dynamic and robust method:

  • Allows you to retrieve multiple column values.
  • Output is returned in JSON format.
  • You can pass multiple column names as output parameters.

📸 Screenshot: GetMappingValues

✅ These methods provide flexible mapping capabilities that can be integrated into HL7, FHIR, JSON, or other interface implementations.
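These class methods can also be invoked directly from Embedded Python (or wrapped in a DTL). Below is a hedged sketch via the iris module; the StaffID, key column, and requested columns are illustrative, and the exact handling of the output parameters may differ from the real implementation:

import iris

mapper = iris.cls('CSMDM.CustomDataLookup.Classes.ClinicalStaffMapping')

# Map consultation data for a staff member by ID
print(mapper.MapStaffConsult(101))

# Retrieve several mapped columns as JSON (the column-list format is an assumption)
print(mapper.GetMappingValues('ConsultantCode', 101, 'GMCNumber,Specialty'))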


Usage Instructions and Code/Packages

Please refer to the GitHub repository and follow the instructions provided in the README file for installation, and download the required packages for CSMDM-Dynamic-Data-Mapping.  CSMDM-Dynamic-Data-Mapping
 

This is also available on InterSystems Open Exchange :  Intersystems-OpenExchange

📩 Support

If you need any assistance or support, please feel free to ask or contact me via the web comment page. I’ll be happy to help.

 

Conclusion

CSMDM centralizes clinical staff data and provides a robust REST API with full CRUD capabilities—Create, Read, Update, and Delete. This ensures consistent, accurate data across HL7 and FHIR integrations, streamlines interface development, reduces errors, and minimizes operational overhead.

Thank you.

Article · 13 hr ago · 16m read

Implementing openEHR with IRIS for Health

Does your blood run cold when you hear about openEHR? Do archetypes make your hair stand on end?

Overcome your fears with this article and master openEHR with the capabilities of InterSystems IRIS for Health!

What is openEHR?

openEHR is an open, vendor-independent specification designed to represent, store, and exchange clinical information in a semantically rich way that is sustainable over the long term. Instead of defining fixed message structures (as many interoperability standards do), openEHR separates clinical knowledge from technical implementation through a multi-layer modelling approach.

In essence, openEHR is built on three fundamental concepts:

  • Reference Model (RM): a model that defines the core structures used in health records, such as compositions, entries, observations, evaluations, and actions. The record model is deliberately generic and technology-agnostic.
  • Archetypes: machine-readable data models (expressed in ADL, the Archetype Definition Language) that define the detailed clinical semantics of a specific concept, such as a blood pressure measurement or a discharge summary. Archetypes constrain the RM and provide a reusable clinical vocabulary.
  • Templates (OPT files): specializations based on archetypes. Templates adapt archetypes to a specific use case, system, or form (for example, a vital signs template or a regional discharge note). Templates remove optionality and produce operational definitions that systems can implement safely.

Let's look at an example of a RAW file based on a Composition (in JSON):

 

This layered modelling approach allows openEHR systems to remain stable for years, even decades, while clinical models evolve independently of the underlying software platform.

A key component of the openEHR ecosystem is AQL (Archetype Query Language), the standard query language used to retrieve clinical data stored in openEHR repositories.

Archetype Query Language (AQL)

AQL is inspired by SQL but adapted to the openEHR information model. Instead of querying relational tables, AQL lets developers query structured clinical content inside openEHR compositions by leveraging archetype paths.

Key features of AQL

  • Path-based queries: AQL uses archetype paths (similar to XPath) to navigate the internal structure of a composition.
    /content[openEHR-EHR-OBSERVATION.blood_pressure.v1]/data/events/time
  • Clinical semantic awareness: queries refer to clinical concepts (archetypes, entry types, data points) rather than database column names.
  • Flexible WHERE clauses: AQL supports filtering on values inside compositions, for example systolic blood pressure > 140 or diagnoses matching a specific code.
  • Multi-composition queries: you can retrieve data from multiple compositions, for example all observations for a given patient over time.
  • Technology- and vendor-independent: any openEHR implementation that supports AQL should, in principle, accept the same queries.

Let's look at an AQL example:

SELECT
    c/uid/value AS composition_id,
    o/data[at0001]/events[at0006]/data[at0003]/value/magnitude AS systolic,
    o/data[at0001]/events[at0006]/data[at0004]/value/magnitude AS diastolic
FROM
    EHR e
    CONTAINS COMPOSITION c
    CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE
    e/ehr_id/value = '12345'
AND
    o/data[at0001]/events[at0006]/time/value > '2023-01-01T00:00:00Z'

Comparing openEHR and FHIR

Technically speaking, openEHR and FHIR are quite similar: document-based information, REST APIs, and so on. Let's look at the main concepts of FHIR and openEHR and how they map to each other:

What do we need to implement openEHR in IRIS for Health?

Raw composition storage

  • Store incoming compositions in raw JSON or XML exactly as they are received.
  • No transformation should modify the semantic content.
  • In our example we will work with JSON, but several options exist.

REST API services

Our implementation must expose the following REST API:

EHR services

To create and locate a patient's "record" (EHR).

  • POST /ehr: create a new EHR.
  • GET /ehr/{ehr_id}: retrieve EHR metadata.
  • GET /ehr?subject_id=?: locate an EHR based on external identifiers.

Composition services

To store patients' clinical information.

  • POST /ehr/{id}/composition: commit a new composition in RAW format, validating against the OPT when possible.
  • GET /composition/{version_uid}: retrieve a specific version.
  • GET /ehr/{id}/compositions: list the compositions for an EHR.
  • DELETE /composition/{uid}: mark the composition as deleted (logical deletion).

Endpoints for AQL queries

  • POST /query/aql: accepts an AQL query, transforms it into SQL (using JSON paths where necessary), and returns the requested information.

Raw composition validation

We have to enforce validation of RAW compositions against OPT2 files; we cannot store just any JSON we receive in our repository.

Implementing openEHR in IRIS for Health

Well, implementing every feature available in an openEHR server would take some time, so I will focus on the core functionality:

Using a web application to implement a REST API service

Publishing a REST API is very simple; we only need two components: a class that extends %CSP.REST and a new entry in the list of web applications. Let's look at the header of our extended %CSP.REST class:

As you can see, we have defined all the minimum routes required for our repository. We manage the compositions, the OPT2 files for RAW validation and, finally, AQL query execution.

For our example we will not define any security configuration, but JWT authentication is recommended.

Good, we now have our web application definition and our %CSP.REST class. What else?

Raw composition validation

openEHR is nothing new, so you can assume there are several libraries that support some of this functionality, such as validating RAW files. For this example we used and customized the Archie library, an open-source Java library, to validate RAW compositions against OPT2 files. The validator is a JAR file configured on the external language server (the default in the Docker image deployment) and invoked before saving the RAW file, using the JavaGateway functionality:

set javaGate = $system.external.getJavaGateway()
set result = javaGate.invoke("org.validator.openehr.Cli", "validate", filePath, optPath)

If the raw file passes validation, the JSON document is saved to the database.

Raw JSON storage

We could use DocDB, but we want to take full advantage of SQL database performance. One of openEHR's biggest problems is poor performance when querying documents, so we preprocess the compositions to extract information common to all composition types and optimize the queries.

Our Composition class defines the following properties:

Class OPENEHR.Object.Composition Extends (%Persistent, %XML.Adaptor) [ DdlAllowed ]
{
/// Description
Property ehrId As %Integer;
Property compositionUid As %String(MAXLEN = 50);
Property compositionType As %String;
Property startTime As %DateTime;
Property endTime As %Date;
Property archetypes As list Of %String(MAXLEN = 50000);
Property doc As %String(MAXLEN = 50000);
Property deleted As %Boolean [ InitialExpression = 0 ];
Index compositionUidIndex On compositionUid;
Index ehrIdIndex On ehrId;
Index ExampleIndex On archetypes(ELEMENTS);
}
  • ehrId: the patient's electronic health record identifier.
  • compositionUid: the composition identifier.
  • compositionType: the type of composition stored.
  • startTime: the creation date of the composition.
  • archetypes: the list of archetypes present in the composition.
  • doc: the JSON document itself.
  • deleted: a boolean flag for "soft" deletions that do not physically remove the record.

We use indexes to improve query performance.

AQL support

As we said before, AQL is a path-based query language. How can we emulate the same behaviour in IRIS for Health? Welcome to JSON_TABLE!

What is JSON_TABLE?

The JSON_TABLE function returns a table that can be used in an SQL query by mapping JSON values to columns. The mappings from a JSON value to a column are written as SQL/JSON path language expressions.

As a table-valued function, JSON_TABLE returns a table that can be used in the FROM clause of a SELECT statement to access data stored in a JSON value; this table does not persist between queries. Several JSON_TABLE calls can be made within a single FROM clause, and they can appear alongside other table-valued functions.
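As a minimal, standalone illustration (a sketch in the spirit of the generated SQL shown further below), the following query maps one JSON path from each stored composition to a column, executed here through Embedded Python:

import iris

# Pull the composition start time out of the raw JSON document of each row
rs = iris.sql.exec(
    "SELECT c.compositionUid, jt.comp_start_time "
    "FROM OPENEHR_Object.Composition AS c, "
    "JSON_TABLE(c.doc, '$' COLUMNS (comp_start_time VARCHAR(4000) "
    "PATH '$.context.start_time.value')) AS jt")
for row in rs:
    print(row)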

We implemented a Python class method to translate AQL into SQL, but there is a problem: AQL is based on the archetypes' relative paths, not absolute paths. We therefore need to identify the absolute path of each archetype and link it to the relative path used in the AQL.

How can we find out the absolute path? Very easily! We discover it when the user saves the composition's OPT2 file in IRIS. As soon as we obtain the absolute path, we store it in a CSV file specific to that composition (it could equally be stored in a global or in any other way). This way we only need to look up the absolute path for the specific composition or, if the AQL does not specify the composition, search the available CSV files for the absolute paths of the archetypes referenced in the AQL.

Let's see how it works. Here is an AQL example that retrieves all diagnoses with a specific ICD-10 code:

SELECT
  c/uid/value AS comp_uid,
  c/context/start_time/value AS comp_start_time,
  dx/data[at0001]/items[at0002]/value/value AS diagnosis_text,
  dx/data[at0001]/items[at0003]/value/defining_code/code_string AS diagnosis_code
FROM EHR e
CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.diagnostic_summary.v1]
CONTAINS SECTION s[openEHR-EHR-SECTION.diagnoses_and_treatments.v1]
CONTAINS EVALUATION dx[openEHR-EHR-EVALUATION.problem_diagnosis.v1]
WHERE dx/data[at0001]/items[at0003]/value/defining_code/code_string
      MATCHES {'E11', 'I48.0'}
ORDER BY c/context/start_time/value DESC

The Transform function of the OPENEHR.Utils.AuxiliaryFunctions class transforms the AQL into the following:

SELECT comp_uid, comp_start_time, diagnosis_text, diagnosis_code 
FROM ( 
    SELECT c.compositionUid AS comp_uid, 
        jt_root.comp_start_time AS comp_start_time, 
        jt_n1.diagnosis_text AS diagnosis_text, 
        jt_n1.diagnosis_code AS diagnosis_code 
    FROM OPENEHR_Object.Composition AS c, 
        JSON_TABLE( c.doc, '$' COLUMNS ( comp_start_time VARCHAR(4000) PATH
            '$.context.start_time.value' ) ) AS jt_root, 
        JSON_TABLE( c.doc, '$.content[*]?(@._type=="SECTION" && @.archetype_node_id==
            "openEHR-EHR-SECTION.diagnoses_and_treatments.v1").items[*]?
            (@._type=="EVALUATION" && @.archetype_node_id=="openEHR-EHR-EVALUATION.problem_diagnosis.v1")'
            COLUMNS ( 
                diagnosis_text VARCHAR(4000) PATH '$.data[*]?(@.archetype_node_id=="at0001").items[*]?
                    (@.archetype_node_id=="at0002").value.value', 
                diagnosis_code VARCHAR(255) PATH '$.data[*]?(@.archetype_node_id=="at0001").items[*]?
                    (@.archetype_node_id=="at0003").value.defining_code.code_string' ) ) AS jt_n1 
    WHERE ('openEHR-EHR-COMPOSITION.diagnostic_summary.v1' %INLIST (c.archetypes) 
        AND 'openEHR-EHR-EVALUATION.problem_diagnosis.v1' %INLIST (c.archetypes)) 
        AND (jt_n1.diagnosis_code LIKE '%E11%' OR jt_n1.diagnosis_code LIKE '%I48.0%') ) U 
ORDER BY comp_start_time DESC

Let's test our REST API with an AQL query:

Bingo!

And now an AQL query including numeric conditions:

SELECT
  c/uid/value AS comp_uid,
  c/context/start_time/value AS comp_start_time,
  a/items[at0024]/value/magnitude AS creatinine_value,
  a/items[at0024]/value/units AS creatinine_units
FROM EHR e
CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.lab_results_and_medications.v1]
CONTAINS OBSERVATION o[openEHR-EHR-OBSERVATION.laboratory_test_result.v1]
CONTAINS CLUSTER a[openEHR-EHR-CLUSTER.laboratory_test_analyte.v1]
WHERE a/items[at0001]/value/value = 'Creatinina (mg/dL)'
  AND a/items[at0024]/value/magnitude BETWEEN 1.2 AND 1.8
ORDER BY c/context/start_time/value DESC

It is transformed into:

SELECT comp_uid, comp_start_time, ldl_value, ldl_units 
FROM ( 
    SELECT c.compositionUid AS comp_uid, jt_root.comp_start_time AS comp_start_time,
        jt_n1.ldl_value AS ldl_value, jt_n1.ldl_units AS ldl_units 
    FROM OPENEHR_Object.Composition AS c, 
        JSON_TABLE( c.doc, '$' COLUMNS ( comp_start_time VARCHAR(4000) 
            PATH '$.context.start_time.value' ) ) AS jt_root, 
        JSON_TABLE( c.doc, '$.content[*]?(@._type=="OBSERVATION" && 
            @.archetype_node_id=="openEHR-EHR-OBSERVATION.laboratory_test_result.v1")
                .data.events[*]?(@._type=="POINT_EVENT").data.items[*]?(@._type=="CLUSTER" &&
                @.archetype_node_id=="openEHR-EHR-CLUSTER.laboratory_test_analyte.v1")'
                COLUMNS ( 
                    ldl_value NUMERIC PATH '$.items[*]?
                        (@.archetype_node_id=="at0024").value.magnitude', 
                    ldl_units VARCHAR(64) PATH '$.items[*]?
                        (@.archetype_node_id=="at0024").value.units', 
                    _w1 VARCHAR(4000) PATH '$.items[*]?
                        (@.archetype_node_id=="at0001").value.value' ) ) AS jt_n1 
    WHERE ('openEHR-EHR-COMPOSITION.lab_results_and_medications.v1' %INLIST (c.archetypes) 
        AND 'openEHR-EHR-OBSERVATION.laboratory_test_result.v1' %INLIST (c.archetypes) 
        AND 'openEHR-EHR-CLUSTER.laboratory_test_analyte.v1' %INLIST (c.archetypes)) 
        AND jt_n1._w1 = 'LDL (mg/dL)' AND jt_n1.ldl_value <= 130 ) 
    U ORDER BY comp_start_time DESC

Another complete success!

Conclusions

As you can see from this example, InterSystems IRIS for Health gives you all the tools you need to implement an openEHR repository. If you have any questions, don't hesitate to leave a comment!

Disclaimer

The associated code is only a proof of concept. It is not intended to implement a fully featured repository, but rather to show that this is perfectly possible with InterSystems IRIS for Health. Enjoy!
