Search | InterSystems Developer Community

Note that this post is old.

InterSystems Official

Fabiano Sanches · September 18, 2023

#Alert #InterSystems IRIS #Health Connect #HealthShare #InterSystems IRIS for Health #InterSystems Official

InterSystems has corrected two defects regarding connectivity. These defects and their corrections are independent of each other.

This alert addresses them both because there are point releases containing both corrections.

Both defects only impact versions 2019.1.4 and 2020.1.4 of:

InterSystems IRIS®
InterSystems IRIS for Health™
HealthShare® Health Connect

Neither defect impacts any released version of HealthShare Unified Care Record®, Information Exchange, Health Insight, Patient Index, Provider Directory, Care Community, Personal Community, or Healthcare Action Engine.

The first defect causes failed login attempts to hang for 60 seconds before returning. The correction reduces this to two seconds and provides a better notification message. The correction is identified as DP-421918.

The second defect causes a <PROTECT> error in OAuth2 clients configured on an InterSystems IRIS instance in /csp/sys/oauth2/OAuth2.JWTServer.cls. The correction is identified as DP-418534.

InterSystems has replaced the original distributions with point releases to make these corrections available on an expedited basis. The relevant version identifiers are:

Original posting	Point Release
2019.1.4.755.0	2019.1.4.756.1
2020.1.4.536.0	2020.1.4.538.1

The corrections are also available via Ad hoc distribution.

If you have any questions regarding this alert, please contact the Worldwide Response Center.

Article

Dmitry Maslennikov · September 18, 2023 7m read

Open Exchange

#Embedded Python #SQL #Vector Search #InterSystems IRIS

There is a lot of noise around LLMs and AI these days. Vector databases are part of it, and many implementations of vector support already exist outside of IRIS.

Why Vector?

Similarity Search: Vectors allow for efficient similarity search, such as finding the most similar items or documents in a dataset. Traditional relational databases are designed for exact match searches, which are not suitable for tasks like image or text similarity search.
Flexibility: Vector representations are versatile and can be derived from various data types, such as text (via embeddings like Word2Vec, BERT), images (via deep learning models), and more.
Cross-Modal Searches: Vectors enable searching across different data modalities. For instance, given a vector representation of an image, one can search for similar images or related texts in a multimodal database.

And many other reasons.

So, for this Python contest, I decided to try to implement this support. Unfortunately, I did not manage to finish it in time; I explain why below.

A few major pieces are needed to make the support complete:

Accepting and storing vectorized data with SQL. In this simple example, 3 is the number of dimensions; it is fixed per field, and every vector in the field must have exactly that many dimensions.
```
create table items(embedding vector(3));
insert into items (embedding) values ('[1,2,3]');
insert into items (embedding) values ('[4,5,6]');
```

Similarity functions. There are several similarity algorithms; without an index, they are suitable for a simple search over a small amount of data.

-- Euclidean distance
select embedding, vector.l2_distance(embedding, '[9,8,7]') distance from items order by distance;
-- Cosine distance (1 - cosine similarity)
select embedding, vector.cosine_distance(embedding, '[9,8,7]') distance from items order by distance;
-- Inner product
select embedding, -vector.inner_product(embedding, '[9,8,7]') distance from items order by distance;

Custom indexes, which speed up search over large amounts of data. An index can use different algorithms, any of the distance functions above, and some other options:
- HNSW
- Inverted file index
A search then simply uses the created index, and the index's algorithm finds the requested information.
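To make the inverted file idea concrete, here is a minimal, generic sketch in plain Python. It is not the implementation from this project (the index was never finished); the function names `build_ivf` and `search_ivf`, the `nprobe` parameter, and the hand-picked centroids are all assumptions for illustration. In practice the centroids would usually come from a clustering step such as k-means.

```python
import math


def l2_distance(a, b):
    # Euclidean distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Group vector positions by nearest centroid (the 'inverted lists')."""
    lists = {i: [] for i in range(len(centroids))}
    for pos, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda i: l2_distance(vec, centroids[i]))
        lists[nearest].append(pos)
    return lists


def search_ivf(query, vectors, centroids, lists, nprobe=1, k=1):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    ranked = sorted(range(len(centroids)),
                    key=lambda i: l2_distance(query, centroids[i]))
    candidates = [pos for i in ranked[:nprobe] for pos in lists[i]]
    candidates.sort(key=lambda pos: l2_distance(query, vectors[pos]))
    return candidates[:k]


# Hypothetical data: two obvious clusters and hand-picked centroids
vectors = [[1, 2, 3], [4, 5, 6], [100, 100, 100], [101, 99, 100]]
centroids = [[2, 3, 4], [100, 100, 100]]
lists = build_ivf(vectors, centroids)
print(search_ivf([9, 8, 7], vectors, centroids, lists, nprobe=1, k=2))
```

The search is approximate: a true nearest neighbour that sits in a list that was not probed is missed, which is the trade-off an index like this makes for speed.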

Insert vectors

A vector is expected to be an array of numeric values, which can be integers or floats, signed or unsigned. In IRIS we can store it as a $listbuild list: it is a good representation and is already supported, so only the conversion from ODBC to logical format needs to be implemented.

The values can then be inserted as plain text through external drivers such as ODBC/JDBC, or from inside IRIS with ObjectScript.

Plain SQL

insert into items (embedding) values ('[1,2,3]');

From ObjectScript

set rs = ##class(%SQL.Statement).%ExecDirect(, "insert into test.items (embedding) values ('[1,2,3]')")

set rs = ##class(%SQL.Statement).%ExecDirect(, "insert into test.items (embedding) values (?)", $listbuild(2,3,4))

Or Embedded SQL

&sql(insert into test.items (embedding) values ('[1,2,3]'))

set val = $listbuild(2,3,4)
&sql(insert into test.items (embedding) values (:val))

The vector is always stored as $lb() and returned in textual format in ODBC.
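The conversion between the textual form and a list of numbers, together with the fixed-dimension check, can be sketched in plain Python as follows. This is only an illustration of the idea, not the project's actual type class; `parse_vector` and `format_vector` are hypothetical names, and a Python list stands in for the $lb() storage.

```python
import json


def parse_vector(text, dims):
    """Parse text such as '[1,2,3]' and enforce the field's fixed dimension."""
    values = json.loads(text)
    if not isinstance(values, list):
        raise ValueError("a vector must be written as a bracketed list")
    if len(values) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(values)}")
    for v in values:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise ValueError(f"non-numeric element: {v!r}")
    return values


def format_vector(values):
    """Render a list of numbers back into the textual form."""
    return "[" + ",".join(str(v) for v in values) + "]"


print(parse_vector("[1,2,3]", 3))
print(format_vector([4, 5, 6]))
```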

Unexpected behaviour

Calculations

Vectors are mainly needed to support calculating the distance between two vectors.

For the contest I needed to use Embedded Python, and here an issue comes up: how to work with $lb in Embedded Python. There is a ToList method in %SYS.Python, but the Python iris package does not have it built in, so it has to be called the ObjectScript way.

ClassMethod l2DistancePy(v1 As dc.vector.type, v2 As dc.vector.type) As %Decimal(SCALE=10) [ Language = python, SqlName = l2_distance_py, SqlProc ]
{
    import iris 
    import math
    
    vector_type = iris.cls('dc.vector.type')
    v1 = iris.cls('%SYS.Python').ToList(vector_type.Normalize(v1))
    v2 = iris.cls('%SYS.Python').ToList(vector_type.Normalize(v2))

    return math.sqrt(sum([(val1 - val2) ** 2 for val1, val2 in zip(v1, v2)]))
}

This does not look right at all. I would prefer that $lb be interpreted on the fly as a list in Python, or at least that there be built-in to_list and from_list functions.

Another issue appeared when I tested this function in different ways. Running SQL from Embedded Python that calls an SQL function written in Embedded Python crashes, so I had to add ObjectScript versions of the functions as well.

ModuleNotFoundError: No module named 'dc'
SQL Function VECTOR.NORM_PY failed with error:  SQLCODE=-400,%msg=ERROR #5002: ObjectScript error: <OBJECT DISPATCH>%0AmBm3l0tudf^%sqlcq.USER.cls37.1 *python object not found

The distance functions currently implemented, in both Python and ObjectScript:

Euclidean distance

[SQL]_system@localhost:USER> select embedding, vector.l2_distance_py(embedding, '[9,8,7]') distance from items order by distance;
+-----------+----------------------+
| embedding | distance             |
+-----------+----------------------+
| [4,5,6]   | 5.91607978309961613  |
| [1,2,3]   | 10.77032961426900748 |
+-----------+----------------------+
2 rows in set
Time: 0.011s
[SQL]_system@localhost:USER> select embedding, vector.l2_distance(embedding, '[9,8,7]') distance from items order by distance;
+-----------+----------------------+
| embedding | distance             |
+-----------+----------------------+
| [4,5,6]   | 5.916079783099616045 |
| [1,2,3]   | 10.77032961426900807 |
+-----------+----------------------+
2 rows in set
Time: 0.012s

Cosine similarity

[SQL]_system@localhost:USER> select embedding, vector.cosine_distance(embedding, '[9,8,7]') distance from items order by distance;
+-----------+---------------------+
| embedding | distance            |
+-----------+---------------------+
| [4,5,6]   | .034536677566264152 |
| [1,2,3]   | .11734101007866331  |
+-----------+---------------------+
2 rows in set
Time: 0.034s
[SQL]_system@localhost:USER> select embedding, vector.cosine_distance_py(embedding, '[9,8,7]') distance from items order by distance;
+-----------+-----------------------+
| embedding | distance              |
+-----------+-----------------------+
| [4,5,6]   | .03453667756626421781 |
| [1,2,3]   | .1173410100786632659  |
+-----------+-----------------------+
2 rows in set
Time: 0.025s

Inner product

[SQL]_system@localhost:USER> select embedding, vector.inner_product_py(embedding, '[9,8,7]') distance from items order by distance;
+-----------+----------+
| embedding | distance |
+-----------+----------+
| [1,2,3]   | 46       |
| [4,5,6]   | 118      |
+-----------+----------+
2 rows in set
Time: 0.035s
[SQL]_system@localhost:USER> select embedding, vector.inner_product(embedding, '[9,8,7]') distance from items order by distance;
+-----------+----------+
| embedding | distance |
+-----------+----------+
| [1,2,3]   | 46       |
| [4,5,6]   | 118      |
+-----------+----------+
2 rows in set
Time: 0.032s
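The numbers in these result sets can be cross-checked outside IRIS with a few lines of plain Python. This is a standalone sketch, not the project's code: it works on ordinary Python lists rather than $lb values, and it assumes that cosine distance is defined as 1 minus the cosine similarity, which agrees with the values shown.

```python
import math


def l2_distance(a, b):
    # Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    # 1 - cosine similarity (assumed definition)
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1 - inner_product(a, b) / (norm_a * norm_b)


query = [9, 8, 7]
for item in ([1, 2, 3], [4, 5, 6]):
    print(item,
          l2_distance(item, query),
          cosine_distance(item, query),
          inner_product(item, query))
```

The last digits can differ slightly from the SQL output because Python floats and IRIS decimals do not round the same way.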

I additionally implemented the mathematical functions add, sub, div, and mul. InterSystems IRIS supports creating your own aggregate functions, so it should be possible to sum all vectors or find their average. Unfortunately, a custom aggregate cannot use the same name and needs its own name (and schema), and aggregate functions do not support non-numeric results.

A simple vector_add function, which returns the sum of two vectors

When used as an aggregate, it shows 0 where a vector is expected.
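For reference, this is the result a vector-valued aggregate would be expected to produce, sketched in plain Python. Only the name vector_add comes from the post; `vector_sum` and `vector_avg` are hypothetical helpers for illustration and are not part of the package.

```python
def vector_add(a, b):
    """Element-wise sum of two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("dimension mismatch")
    return [x + y for x, y in zip(a, b)]


def vector_sum(vectors):
    """Fold vector_add over a sequence; None for an empty input."""
    vectors = list(vectors)
    if not vectors:
        return None
    total = list(vectors[0])  # copy so the input is never modified
    for v in vectors[1:]:
        total = vector_add(total, v)
    return total


def vector_avg(vectors):
    """Element-wise average; None for an empty input."""
    vectors = list(vectors)
    total = vector_sum(vectors)
    if total is None:
        return None
    return [x / len(vectors) for x in total]


rows = [[1, 2, 3], [4, 5, 6]]
print(vector_sum(rows))
print(vector_avg(rows))
```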

Build an index

Unfortunately, I did not manage to finish this part because of some obstacles I faced during implementation.

There are no built-in conversions from $lb to a Python list and back. Vectors in IRIS are stored as $lb, and all the index-building logic is expected to be in Python, so it is important to read data from $lb and write it back to globals too.
Lack of support for globals
- $Order in IRIS supports a direction, so it can be used in reverse, while the order implementation in Embedded Python does not have it, so it requires reading all keys and reversing them, or storing the end somewhere.
I have doubts because of the bad experience, mentioned above, with Python SQL functions called from Python.
While building the index, I expected to store the distances between vectors in the graph, but I hit a bug with storing float numbers in a global.
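The "read all keys" workaround for the missing direction can be sketched like this. The example deliberately avoids the iris package: a sorted Python list stands in for the subscripts read from a global, the `order` helper is hypothetical, and only numeric subscripts are assumed, since IRIS collation of mixed numeric and string subscripts is not reproduced here.

```python
import bisect


def order(sorted_keys, key, direction=1):
    """Emulate $Order over an already sorted list of numeric keys.

    key == "" means "start before the first key" (direction=1) or
    "start after the last key" (direction=-1); "" is returned when
    there is no further key in that direction.
    """
    if direction == 1:
        i = 0 if key == "" else bisect.bisect_right(sorted_keys, key)
        return sorted_keys[i] if i < len(sorted_keys) else ""
    i = (len(sorted_keys) if key == "" else bisect.bisect_left(sorted_keys, key)) - 1
    return sorted_keys[i] if i >= 0 else ""


keys = [1, 3, 7]  # stand-in for the subscripts read from a global

# Walk the keys in reverse, the way $Order(^global(key), -1) would
key = order(keys, "", -1)
while key != "":
    print(key)
    key = order(keys, key, -1)
```

Like $Order, the helper also accepts a starting key that does not exist in the list and moves to the nearest neighbour in the requested direction.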

I opened 11 issues for problems I found with Embedded Python during this work, so most of the time went into finding workarounds. With the help of @Guillaume Rongier's project iris-dollar-list, I managed to solve some of them.

Installation

Anyway, it is still available and can be installed with IPM and used, even with limited functionality.

zpm "install vector"

Or in development mode with docker-compose

git clone https://github.com/caretdev/iris-vector.git
cd iris-vector
docker-compose up -d

7 Comments

Article

Oleksandr Zaitsev · September 17, 2023 2m read

Open Exchange

#Python #SQL #Tools #InterSystems IRIS #Open Exchange

Enhanced Password Management: Edit Passwords Seamlessly

In the ever-evolving landscape of digital security, robust password management tools have become indispensable. Our password management application, designed to simplify and secure your online life, now comes with an enhanced feature – the ability to edit passwords with ease.

Why is this feature a game-changer?

Flexibility: Life is dynamic, and so are our online accounts. With the new edit password feature, you have the flexibility to modify your saved passwords whenever you need to. Whether you want to change a password due to security concerns or simply update it, this feature allows you to adapt effortlessly.
Streamlined Experience: Editing passwords is seamless and user-friendly. No more tedious processes or creating new entries from scratch. Just a few clicks, and your password is updated, keeping your records organized and up-to-date.
Enhanced Security: We prioritize security above all else. The edit password functionality ensures that your updated password is encrypted using your existing encryption key. This means that even when modifying passwords, your data remains protected.
Personalization: Your passwords, your way. Customize titles, logins, and passwords as needed. This feature enables you to make your password manager truly personal, fitting your unique preferences and organization style.

How it works:

Log in to your account.
Navigate to the password you want to edit.
Click the 'Edit' icon.
Modify the password title, login, or password itself.
Save your changes.
Your updated password is now securely stored and ready to use.

Stay Secure, Stay Organized:

With the enhanced edit password feature, our password manager offers an even more comprehensive solution for your security needs. Stay secure, stay organized, and manage your passwords with confidence.

What's Next:

Our commitment to improving your digital security experience doesn't stop here. We are continuously working on enhancing our password manager with new features and capabilities. Stay tuned for more updates and innovations as we strive to make your online life simpler and more secure.

Try out the edit password feature today and experience the convenience of effortless password management.

Question

Christine Nyamu · September 14, 2023

#SQL #InterSystems IRIS for Health

I need to run a SQL query and use the output to map PV1 7.1. The query is:

SELECT ID
FROM TestTable
WHERE ProviderName = 'TEST,PROVIDER' AND IDType= 'BPI'

When I run this query with 'TEST PROVIDER' I do pull the ID in question, but I can't figure out how to do it from the DTL, given that various providers are sent in PV1 7. Any assistance will be greatly appreciated.

8 Comments

Question

Diane Steidler · September 9, 2023

#InterSystems IRIS

Does anyone have experience compacting and truncating IRIS datasets that are greater than 10 TB in size? How long did it take and what was the size? Did you run into any issues?

Thanks.

Diane

4 Comments

Search

Sep. 18, 2023 – Alert: Failed login handling and OAuth2 client errors

Vectors support, well almost

Insert vectors

Calculations

Build an index

Installation

Enhanced Password Management: Edit Passwords Seamlessly

How to do a SQL query in DTL and map PV1 7.1 to results of query

Compacting and Truncating very large datasets