
Article
· July 25, 2024 6m read

Configuring multi-volume storage for a database

 

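ISC developers, I salute you 👑.

Multi-volume databases

The following explanation of multi-volume databases is taken directly from the documentation:

In the default configuration of InterSystems IRIS, a database stores its data in a single IRIS.DAT file. You can also configure a database to spill automatically into additional files (IRIS-0001.VOL, IRIS-0002.VOL, and so on) once it reaches a specified size threshold. These files can reside in the same directory as IRIS.DAT and/or in a set of other directories.

What I want to do here is set a small threshold and inspect the resulting extension volumes saved to an alternate directory. Without a doubt, the implications for mirroring, performance, and administration are huge. Put simply, a forward-looking solution would consider whether a "callback" mechanism could be injected to provision a new cloud storage volume on the fly, just before the database overflows into a new volume.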

Environment
I have an IRIS instance running on 2024.1 (Build 263U), with $ISC_DATA_DIRECTORY set to a 50Gi OpenEBS PVC that was provisioned about a month ago.

I added an additional OpenEBS PVC to the namespace:
 

#kind: PersistentVolumeClaim
#apiVersion: v1
#metadata:
#  name: jiva-iris-volume-claim
#spec:
#  storageClassName: openebs-jiva-csi-default
#  accessModes:
#    - ReadWriteOnce
#  resources:
#    requests:
#      storage: 50Gi
#--
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: jiva-iris-volume-claim-mv
spec:
  storageClassName: openebs-jiva-csi-default
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi

Apply it:

sween@run1:~$ kubectl apply -f deezwatts-volume.yaml -n rivian
persistentvolumeclaim/jiva-iris-volume-claim-mv created

Next, the settings are applied through the init container.

<snips>
volumes:
  - name: task-pv-storage
    persistentVolumeClaim:
      claimName: jiva-iris-volume-claim
  - name: task-pv-storage-mvd
    persistentVolumeClaim:
      claimName: jiva-iris-volume-claim-mv
<snips>
volumeMounts:
  - name: task-pv-storage
    mountPath: /data
  - name: task-pv-storage-mvd
    mountPath: /data-mvd

Now we have a second disk volume, mounted at `/data-mvd`, to serve as the extension storage for the multi-volume database.



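Setup

The setup below is done through the System Management Portal.

First, create the database in the IRIS instance and set its basic properties.

We created the database "mvd" and chose the path for its primary database file. After clicking Next, the wizard page shows some new settings: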

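New Volume Threshold sets the database size that triggers the creation of a new volume. After setting it, pay attention to the hint below the field. It is important, and ignoring it will earn you a warning.

Entering zero disables automatic creation of new volumes. If the value is non-zero, a new database file named IRIS-0001.VOL is created when IRIS.DAT reaches this threshold. When that file in turn reaches the threshold, IRIS-0002.VOL is created, and so on. For non-zero values, a minimum of 1 TB is recommended to avoid an excessive number of files. Each database is limited to at most 200 database files.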
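To make the naming scheme in the hint concrete, here is a minimal Python sketch that lists the files a database of a given size would span. It is a simplification and not how IRIS itself allocates space: the function name `volume_files` is hypothetical, sizes are in MB, and it assumes that every file fills to exactly the threshold and that the 200-file limit counts IRIS.DAT.

```python
import math

# Per-database file limit quoted in the hint above.
# Assumption: the limit includes IRIS.DAT itself.
MAX_FILES = 200


def volume_files(total_mb: int, threshold_mb: int) -> list[str]:
    """Return the file names a database of total_mb would span (simplified model)."""
    # A threshold of zero disables automatic volume creation.
    if threshold_mb == 0 or total_mb <= threshold_mb:
        return ["IRIS.DAT"]
    count = math.ceil(total_mb / threshold_mb)
    if count > MAX_FILES:
        raise ValueError(f"{count} files needed, but the limit is {MAX_FILES}")
    # The first file is IRIS.DAT; the rest are numbered IRIS-0001.VOL, IRIS-0002.VOL, ...
    return ["IRIS.DAT"] + [f"IRIS-{i:04d}.VOL" for i in range(1, count)]


if __name__ == "__main__":
    print(volume_files(12, 5))
```

With a 5 MB threshold like the one used in this post, even a small database fans out into many files, which is why the hint recommends a much larger value for real use.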

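Step two: mount the new database. It cannot be configured any further until it is mounted.

Now the Volumes option appears in the database list:

Clicking Volumes lets us set an alternate location for the additional files of a multi-volume database.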



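Expansion

Because I set the database threshold quite small in this example, it generates multiple database files.

To use up the space in IRIS.DAT, I created a new namespace that uses this database:

In that namespace I invoked ZPM, installed something from Open Exchange, and ran it.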
 

irisowner@iris-deezwatts-deployment-7b9bfcff8f-dssln:~$ irissession IRIS

Node: iris-deezwatts-deployment-7b9bfcff8f-dssln, Instance: IRIS

USER>zn "MVD"
MVD>zn "%SYS" d ##class(Security.SSLConfigs).Create("z") s r=##class(%Net.HttpRequest).%New(),r.Server="pm.community.intersystems.com",r.SSLConfiguration="z" d r.Get("/packages/zpm/latest/installer"),$system.OBJ.LoadStream(r.HttpResponse.Data,"c")

Load started on 06/04/2024 13:43:08
Loading file /data/IRIS/mgr/Temp/z9mu1CvnPnaGbA.xml as xml
Imported class: %ZPM.Installer
Compiling class %ZPM.Installer
Compiling routine %ZPM.Installer.1
Load finished successfully.

%SYS>zpm

=============================================================================
|| Welcome to the Package Manager Shell (ZPM). version 0.7.1               ||
|| Enter q/quit to exit the shell. Enter ?/help to view available commands ||
|| Current registry https://pm.community.intersystems.com                  ||
=============================================================================
zpm:%SYS>install "zpm-registry"

View its properties in the database UI:

Contents of the configured directories in the pod:

irisowner@iris-deezwatts-deployment-7b9bfcff8f-dssln:/data-mvd$ ls -ltr /data-mv*
total 5140
drwxrwxrwx 2 irisowner irisowner   16384 Jun  4 11:56 lost+found
-rw-rw---- 1 irisowner irisowner      20 Jun  4 12:11 iris.dbdir
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:25 IRIS-0022.VOL
irisowner@iris-deezwatts-deployment-7b9bfcff8f-dssln:/data-mvd$ ls -ltr /data/IRIS/mgr/mvd
total 164
drwxrwxrwx 2 irisowner irisowner    4096 Jun  4 11:15 stream
-rw-rw---- 1 irisowner irisowner      63 Jun  4 12:01 iris.lck
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0001.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0002.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0003.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0004.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0005.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0006.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0007.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0008.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0009.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0010.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0012.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0015.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0018.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0016.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0019.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0020.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0017.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0014.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0013.VOL
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 12:11 IRIS-0011.VOL
-rwxrwxrwx 1 irisowner irisowner 5242880 Jun  4 13:08 IRIS.DAT
-rw-rw---- 1 irisowner irisowner 5242880 Jun  4 13:08 IRIS-0021.VOL

Looks like I have a new toy!
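The same check can be scripted instead of running `ls` in each directory. The sketch below is a hedged example, not part of any IRIS tooling: the helper names `volume_number` and `list_volumes` are hypothetical, and it relies only on the file naming convention (IRIS.DAT, IRIS-NNNN.VOL) visible in the listing above.

```python
from pathlib import Path


def volume_number(name: str) -> int:
    """Map IRIS.DAT to 0 and IRIS-NNNN.VOL to NNNN."""
    if name == "IRIS.DAT":
        return 0
    return int(Path(name).stem.split("-")[1])


def list_volumes(directories):
    """Return (number, path, size_in_bytes) for each database file, ordered by volume number."""
    found = []
    for directory in directories:
        base = Path(directory)
        if not base.is_dir():
            continue  # skip directories that do not exist or are not mounted
        for path in [*base.glob("IRIS.DAT"), *base.glob("IRIS-*.VOL")]:
            found.append((volume_number(path.name), path, path.stat().st_size))
    return sorted(found, key=lambda item: item[0])


if __name__ == "__main__":
    # Directories taken from the listing above.
    for number, path, size in list_volumes(["/data/IRIS/mgr/mvd", "/data-mvd"]):
        print(f"{number:04d}  {size:>10}  {path}")
```

Sorting by volume number makes it easy to see at which point new volumes started landing in the alternate directory.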

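Conclusion
This post is short and might have been better suited to the discussion section. But I have to get back to work; this is a new feature I picked up from @jtrog. I look forward to seeing more posts and experiences with it in the community.

See you at the Summit!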

Article
· July 25, 2024 7m read

Controlling database growth - Part 2: Graphical interface

Graphical display of tables

Here we document how you can get the results of your Data Collection displayed graphically. The output of your project will look like this:

image

Note that I am working on a local machine. If you are doing this on a server, make sure to use the correct IP address.

Discussion
· July 25, 2024

Share how Developer Community AI helped you for your chance to win

Hey Community!

As you may know, our Developer Community AI has been out for over a month now 🎉 We hope you were curious enough to give it a try 😁 If you haven't yet, please do! Anyway, since it's still in beta, we're very interested in learning what you think about it, and we look forward to hearing your thoughts and experiences.

Since we value your time and effort, we will give away a cute prize to a random member of the Community who shares how DC AI helped them. To participate in this sweepstakes, follow these guidelines:

  • be a member of the Community (InterSystems employees are welcome to participate)
  • write a comment describing how Developer Community AI helped you with your question (don't forget to add a link* to the result) in this discussion.

And that's it! At the end of summer, we will use random.org to choose one lucky owner of our cute little something out of everyone who commented here and followed the guidelines (max 5 entries per person).

Good luck!


* To get a link to the answer, click on the share button under the answer.

Article
· July 25, 2024 4m read

d[IA]gnosis: developing RAG applications with IRIS for Health

With the introduction of vector data types and the Vector Search functionality in IRIS, a whole world of possibilities opens up for application development. One example is an application I recently saw requested in a public contest by the Ministry of Health of Valencia: a tool to assist in ICD-10 coding using AI models.

How could we implement an application similar to the one requested? Let's see what we would need:

  1. List of ICD-10 codes, which we will use as context for our RAG application to search for diagnoses within the plain texts.
  2. A trained model that vectorizes the texts in which we are going to look for equivalences in the ICD-10 codes.
  3. The Python libraries necessary for the ingestion and vectorization of ICD-10 codes and texts.
  4. A friendly front-end that supports texts on which we look for possible diagnoses.
  5. Orchestration of requests received from the front-end.

What does IRIS provide us to cover the above needs?

  1. CSV import, either using the RecordMapper functionality or directly using Embedded Python.
  2. Embedded Python allows us to implement the Python code necessary to generate the vectors using the selected model.
  3. REST API publishing, so that the APIs can be invoked from the front-end application.
  4. Interoperability productions that allow tracking of information within IRIS.

With that in place, let's look at the example we developed:

d[IA]gnosis

The developed application is available together with this article. In the next articles we will look in detail at how each feature is implemented: the use of the model, the storage of the vectors, and the use of vector searches.

Let's review the application:

Importing ICD-10 codes

The configuration screen describes the format that the CSV file with the ICD-10 codes must follow. Loading and vectorization consume a lot of time and resources, which is why the Docker container deployment configures not only the RAM available to the container but also swap space on disk in case the requirements exceed the allocated RAM:

  # iris
  iris:
    init: true
    container_name: iris
    build:
      context: .
      dockerfile: iris/Dockerfile
    ports:
      - 52774:52773
      - 51774:1972
    volumes:
    - ./shared:/shared
    environment:
    - ISC_DATA_DIRECTORY=/shared/durable
    command: --check-caps false --ISCAgent false
    mem_limit: 30G
    memswap_limit: 32G

The file with the ICD-10 codes is available in the project path /shared/cie10/icd10.csv. Once the import reaches 100%, the application is ready to be used.

In our application we have defined two features for diagnostic coding: one based on HL7 messages received by the system and another based on plain text.

Diagnostic capture from HL7

The project contains some HL7 messages prepared for testing. Simply copy the /shared/hl7/messagesa01_en.hl7 file to the /shared/HL7In folder, and the associated production will extract the diagnosis from it and display it in the web application:

From the diagnosis requests screen we can see all the diagnoses received via HL7 messaging. To code them to ICD-10 we only need to click on the magnifying glass to show a list of those ICD-10 codes closest to the diagnosis received:
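To illustrate what "closest" means here, the following Python sketch ranks candidate codes by cosine similarity between embedding vectors. It is an illustration only: the application relies on IRIS Vector Search rather than this pure-Python loop, the function names are hypothetical, and the three-dimensional vectors are made-up toy values standing in for real model embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def closest_codes(query_vector, code_vectors, top_k=3):
    """Return the top_k (code, similarity) pairs, most similar first."""
    scored = [(code, cosine(query_vector, vec)) for code, vec in code_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy embeddings (assumption): real ones come from the vectorization model.
    toy_vectors = {
        "O10.91": [0.9, 0.1, 0.0],
        "I10": [0.7, 0.0, 0.7],
        "J45": [0.0, 1.0, 0.0],
    }
    for code, score in closest_codes([1.0, 0.0, 0.1], toy_vectors, top_k=2):
        print(code, round(score, 3))
```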

Once a code is selected, we will see the diagnosis and its associated ICD-10 code in the list. Clicking the button with the envelope icon generates a message based on the original one, with the newly selected code included in the diagnosis segment:

MSH|^~\&|HIS|HULP|EMPI||||ADT^A08|592956|P|2.5.1
EVN|A01|
PID|||1556655212^^^SERMAS^SN~922210^^^HULP^PI||GARCÍA PÉREZ^JUAN^^^||20150403|M|||PASEO PEDRO ÁLVAREZ 195 1 CENTRO^^LEGANÉS^MADRID^28379^SPAIN||555283055^PRN^^JUAN.GARCIA@YAHOO.COM|||||||||||||||||N|
PV1||N
DG1|1||O10.91^Unspecified pre-existing hypertension complicating pregnancy^CIE10-ES|Gestational hypertension||A||

This message can be found in the path /shared/HL7Out
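For readers less familiar with HL7 v2, here is a minimal Python sketch that pulls the coded diagnosis out of a DG1 segment like the one in the message above. It is not how the production does it (the project uses IRIS interoperability components); the function name `parse_dg1` is hypothetical, and the sketch assumes the default `|` field and `^` component separators.

```python
def parse_dg1(segment: str) -> dict:
    """Extract the coded diagnosis (DG1-3) and description (DG1-4) from a DG1 segment."""
    fields = segment.rstrip("\r\n").split("|")
    if fields[0] != "DG1":
        raise ValueError("not a DG1 segment")
    # Pad so that short segments do not raise IndexError.
    fields = fields + [""] * max(0, 5 - len(fields))
    # DG1-3 is a coded element: code ^ text ^ coding system.
    code, text, system = (fields[3].split("^") + ["", "", ""])[:3]
    return {
        "code": code,
        "text": text,
        "coding_system": system,
        "description": fields[4],
    }


if __name__ == "__main__":
    dg1 = (
        "DG1|1||O10.91^Unspecified pre-existing hypertension complicating "
        "pregnancy^CIE10-ES|Gestational hypertension||A||"
    )
    print(parse_dg1(dg1))
```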

Diagnostic capture from plain text

From the Text Analyzer option, the user can enter plain text to be analyzed. The application searches using tuples of 3 lemmatized words (after removing articles, pronouns, and other less relevant words). Once the text is analyzed, the system shows the relevant text underlined, together with the possible diagnoses found:
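The tuple-building step can be sketched in a few lines of Python. This is a rough approximation under stated assumptions: it uses a tiny hand-written stop-word list and lowercasing in place of real lemmatization, it assumes the tuples are overlapping sliding windows (the article does not say), and the name `word_triples` is hypothetical.

```python
import re

# Tiny illustrative stop-word list (assumption); a real pipeline would use
# a language-specific list and a proper lemmatizer.
STOPWORDS = {
    "a", "an", "the", "and", "or", "of", "in", "on", "with",
    "he", "she", "it", "they", "his", "her", "its", "their", "this", "that",
}


def word_triples(text: str) -> list[tuple[str, ...]]:
    """Return overlapping 3-word tuples after lowercasing and stop-word removal."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = [w for w in re.findall(r"[^\W\d_]+", text.lower()) if w not in STOPWORDS]
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


if __name__ == "__main__":
    for triple in word_triples("The patient has gestational hypertension and a mild fever"):
        print(triple)
```

Each tuple would then be vectorized and compared against the ICD-10 vectors to find candidate diagnoses.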

Once the analysis has been carried out, it can be consulted at any time from the analysis history.

Analysis history

All analyses carried out are recorded and can be consulted at any time, including all the possible ICD-10 codes found:

In the next article...

We will see how, using Embedded Python, we use a specific LLM to vectorize both the ICD-10 codes that serve as context and the free text.

If you have any questions or suggestions, do not hesitate to write a comment on the article.
