Article · Jan 31, 2018

Container - What is a Container?

Containers

With the launch of the InterSystems IRIS Data Platform, we now also provide our product in a Docker container. But what is a container?

The fundamental container definition is that of a sandbox for a process.  

Containers are software-defined packages that have some similarities with virtual machines (VMs); for example, both can be executed.

Containers provide isolation without emulating a full OS, which makes them much lighter than a VM.

In essence, containers are an answer to the problem of how to reliably move an application from one system to another and guarantee that it will work. By encapsulating all application dependencies inside a container and creating an isolated process space, we get a much higher degree of confidence that the application will run when moved between platforms.

An operating system allows us to run processes. These processes share the same namespaces, cgroups, filesystem, etc.; in general, they have access to the whole OS environment, and the OS schedules and manages them. All of that is a good thing. However, what if we wanted to isolate a particular process, or a number of processes, to run a specific task, operation or service? In short, that capability to isolate a process is what containers offer us. Hence we can define a container as a sandbox for a process.

What is a sandbox, then? It is the isolation boundary within which a container runs its process. This is implemented via the Linux kernel feature called namespaces (https://en.wikipedia.org/wiki/Linux_namespaces), which also allows other important system resources to be sandboxed, such as network interfaces, mount points, interprocess communication (IPC) and the UNIX time-sharing (UTS) hostname.
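
To get a feel for what a namespace-based sandbox looks like, you can build a tiny one by hand on a Linux host with the util-linux unshare tool (a generic Linux sketch, independent of Docker):

# Start a shell in new PID and mount namespaces; --mount-proc remounts /proc
# so that process-listing tools only see what lives inside the new namespace
sudo unshare --fork --pid --mount-proc bash

# Inside the sandboxed shell, the new bash is PID 1 and almost nothing else is visible
ps -ef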

The container, or sandbox, can also be governed and controlled via another kernel feature called control groups, or cgroups (https://en.wikipedia.org/wiki/Cgroups). The rules we give containers ensure that each container is a good neighbor, sharing resources fairly with other containers and with the host.
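
Docker, for instance, exposes some of these cgroup controls directly as command-line options; the image name below is only a placeholder:

# Limit the container to 512 MB of RAM and 1.5 CPUs; the kernel's cgroups enforce these limits
docker run -it --memory=512m --cpus=1.5 ubuntu bash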

To understand how a container differs from a VM, we could use the analogy that a VM is like a house while a container is like an apartment.  

VMs are self-contained and independent like a single standing house. Each house has its infrastructure: plumbing, heating, electrical, etc. A house also has minimum requirements (at least 1 bedroom, 1 roof, etc.). 

Containers instead are built to leverage a shared infrastructure so we can compare them to an apartment. The apartment building shares the plumbing, heating, electrical system, main entrance, lifts, etc.  In the same way, containers leverage the available resources of the host via the Linux kernel. Also, consider that apartments come in different sizes and shapes. 

Because containers do not ship a full OS but only the minimum Linux userland they need (some executables in /bin, some configuration and definition files in /etc, and a few other files), they can be very small in size, which makes them very nimble when it's time to move them around or spin them up in one second flat. That translates into agility from the moment you build them, throughout the provisioning pipeline of the software factory, and all the way to the final run in production. Incidentally, containers fit like a glove in a CI/CD microservices architecture context, but that's another story.
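
You can see both the small footprint and the fast startup for yourself on any machine with Docker installed; alpine below is just an example of a minimal Linux image, not an InterSystems image:

# A minimal Linux image weighs in at only a few megabytes
docker pull alpine
docker images alpine

# ...and a container created from it starts, runs and exits in well under a second
time docker run --rm alpine echo "hello from a container"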

The processes in the container are tightly coupled with the lifecycle of the container. When I start a container, I typically want all the services of my app to be up and running (as an example, think of port 80 for a web server container, and of ports 57772 and 1972 for an InterSystems IRIS container). When I stop the container, all of its processes are stopped too.
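
As a quick illustration with a generic web server image (nginx here is just an example; an InterSystems IRIS container would similarly expose ports such as 57772 and 1972):

# Start a web server container and publish its port on the host
docker run -d --name web -p 80:80 nginx

# The web server processes exist only inside, and for the lifetime of, the container
docker top web

# Stopping the container stops all of its processes as well
docker stop web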

What I have described in this post is the fundamental notion of the runtime of a container: its sandbox, which isolates its processes from the host and from other containers.

There is another part to understanding containers that is about their images. That will be covered in a second post.

Discussion (20)

Is it possible to create my own image with IRIS based on Ubuntu? The currently available distribution does not support it, and where can I get isc-main?

It is good that you prepared the image, but you also set a password in it, so to use this image I have to change the password, and that means my image becomes needlessly bigger, by at least about 176 MB in my case. I changed the password by following the instructions in the documentation.

Hi Dmitry,

Thank you for downloading our InterSystems IRIS data platform container.

Our images are carefully crafted, dependencies are checked and even pinned so we know exactly what we ship. We further test them regularly for security vulnerabilities. By the time our images are published they are a safe bet for you to use. In general, we expect our customers to derive theirs from the published one so you only have to worry about implementing your app-solution in it.

However, I understand that you might want to create your own custom image. We could make isc-main available if there is demand for it.

Password: we do not want to be in the news like a certain well-known database that was recently discovered with thousands of instances running in the cloud with default credentials. We are forcing you to do the right thing; with containers it is otherwise too easy to ignore this, and as you can appreciate, that is not a safe practice.

This is also true when you use the InterSystems Cloud Manager to provision an InterSystems IRIS data platform cloud cluster. If you forget to define the password for the system users, you will be forced to create one before the services are run.

Yes, in that sense it is a good idea to require a password. But I would prefer to have unauthenticated access locally inside the container with csession. In that case, I could install and configure everything I need in my image, while the instance would still be secured from the outside.

But anyway, I need more control over what goes into IRIS, for example if I don't yet need DeepSee, Ensemble features and so on. IRIS installed inside the container is more than 1 GB in size, but just removing files I don't need, the dev folder, maybe even the whole csp folder (an external CSP Gateway container was offered, which should be interesting to inspect), could reduce the size by a few hundred megabytes more.

root@bca19b7cb221:/# du -hd1 /usr/cachesys/ | sort -h
12K     /usr/cachesys/devuser
92K     /usr/cachesys/SNMP
112K    /usr/cachesys/patrol
264K    /usr/cachesys/lib
300K    /usr/cachesys/samples
400K    /usr/cachesys/doc
1.2M    /usr/cachesys/docs
3.9M    /usr/cachesys/httpd
33M     /usr/cachesys/fop
49M     /usr/cachesys/dist
58M     /usr/cachesys/dev
85M     /usr/cachesys/csp
241M    /usr/cachesys/bin
642M    /usr/cachesys/mgr
1.1G    /usr/cachesys/

root@bca19b7cb221:/# find /usr/cachesys/mgr/ -name *.DAT -exec ls -lah {} \;
-rw-rw---- 1 root cacheusr 888K Jan 29 22:43 /usr/cachesys/mgr/cachetemp/CACHE.DAT
-rw-rw---- 1 root cacheusr 160M Jan 29 22:43 /usr/cachesys/mgr/enslib/CACHE.DAT
-rw-rw---- 1 root cacheusr 385M Jan 29 22:43 /usr/cachesys/mgr/cachelib/CACHE.DAT
-rw-rw---- 1 root cacheusr 1.0M Jan 29 22:43 /usr/cachesys/mgr/user/CACHE.DAT
-rw-rw---- 1 root cacheusr 1.0M Jan 29 22:43 /usr/cachesys/mgr/cacheaudit/CACHE.DAT
-rw-rw---- 1 root cacheusr 1.0M Jan 29 22:43 /usr/cachesys/mgr/cache/CACHE.DAT
-rw-r----- 1 root cacheusr 65M Jan 29 22:43 /usr/cachesys/mgr/CACHE.DAT

root@bca19b7cb221:/# du -h /usr/local/etc/cachesys/
106M    /usr/local/etc/cachesys/

I am not sure I really need everything in /usr/cachesys/bin/ (241 MB) and /usr/local/etc/cachesys/ (106 MB). These two folders, plus CACHESYS (65 MB) and CACHELIB (385 MB; how much would that be if DeepSee and other unnecessary stuff were separated out?), come to around 800 MB, while the current image layer size is 1.24 GB.

The container was started with entrypoint bash, so IRIS has not started yet.

So to clarify - Cache will NOT have durable %SYS support and not be fully supported under Docker? Or Cache will support durable %SYS in the next version, which is not yet released?

The only things I've heard so far about upgrading from Cache to IRIS is that "the installer doesn't support upgrading anyhow" (from Benjamin DeBoe) and that all our sites would require "a different license key" (from the WRC IRIS Distribution page) - which sounds a little less convenient and easy :)

Hi Sebastian,

Caché and Ensemble will be fully supported in a Docker container. However, for the reason expressed above, right now there is no plan to offer Durable %SYS in Caché and Ensemble.

InterSystems IRIS is a new product, so things are different. You will find that some features are deprecated in favour of others, etc. One of them is the license key. In general, in my comment above I was referring, at least in my mind, to a licensing plan that you will hopefully find more flexible and favorable.

NP Sebastian.

You could handle %SYS persistence with creativity :)

Given that you'd have a script or a container orchestrator tool (Mesosphere, Kubernetes, Rancher, Nomad, etc.) to handle running a new version of your app, you'd have to factor in exporting those credentials and security settings before stopping your container. Then, as you spin up the new one, you'd import the same. It's a workaround if you like, but doable, IMO.
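
A rough sketch of that workaround at the docker level; the export_security.sh and import_security.sh script names are purely hypothetical placeholders for whatever export/import mechanism you put into your image, and the image tags are made up:

# Before retiring the old container: export credentials/security settings to a shared volume
# (cache-v1 is assumed to have the same /external host volume mounted as below)
docker exec cache-v1 /scripts/export_security.sh /external/security-export
docker stop cache-v1

# Spin up the new version with the same host volume and re-import the settings
docker run -d --name cache-v2 -v /host/external:/external my-cache-image:v2
docker exec cache-v2 /scripts/import_security.sh /external/security-export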

Having a DB you probably have to think about schema migration and other things anyway so... 

HTH

Yeah, my "creative" solution prior to the durable %SYS announcement (roughly sketched after this list) was to:

  • Export all system parameters to a volume every 30 seconds in case of crash (reasoning that they probably wouldn't change that often). And import them prior to Cache boot.
  • Configure the WIJ to be stored in a volume
  • Configure the journals to be stored in a volume
  • Possibly store the audit database on a volume? Idk we don't currently make use of it.
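
A rough sketch of how those volume mappings might look on the docker command line; the image name and host paths are made up, and it assumes the instance has been reconfigured to keep its exports, WIJ, journals and audit data under /external:

# Host paths on the left are hypothetical; container paths on the right are where
# the instance is assumed to have been configured to write
docker run -d --name cache \
  -v /host/cache/config:/external/config \
  -v /host/cache/wij:/external/wij \
  -v /host/cache/journals:/external/journals \
  -v /host/cache/audit:/external/audit \
  my-cache-image:2017.2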

I hit a showstopper, as it does not seem possible to relocate the journal.log file (via symlink or any other means). I did submit a product development request to change this; unfortunately, it was denied. It seems that without this file the journals cannot be recovered, so we can never truly separate code and data in Cache :(

I noticed that the durable %SYS feature relocates everything in the mgr directory except the CACHELIB database. I was hesitant to do that because I wasn't sure how a 2017.2 Cache system would handle 2017.1 data after version upgrade. I'm still not 100% sure to be honest - would this be a "fully supported" use of Cache under Docker?

Of course, if you have any other creative thoughts I'm all ears!

Hmm, in my tests I remember Cache not starting cleanly without a journal.log present. I'm not sure if this is still the case; it's been quite a while since I tried this. I'll have to take another look.

I was more concerned as to how Cache (e.g. CACHELIB, the code) would react to different versions of the Cache system databases (e.g. CACHESYS/CACHE, the system data). What I'm understanding from you is that using the CACHESYS/CACHE databases from a 2017.1 system in a 2017.2 Docker image, with no Cache upgrade, would be fully supported - correct?

In short:
A container is a sandbox that runs a single process, which in turn can spawn other processes. For us, that first process is isc-main, which takes care of running the necessary processes (write daemons, etc.). But as soon as that first process goes away, the container is shut down.
In a classical VM, a complete machine is simulated, which in turn starts up a complete OS and not just a single process context.
There are plenty of articles that explain the difference in more detail (for example: https://blog.netapp.com/blogs/containers-vs-vms/).
(It's important to note that on some OSes [Windows and macOS] Docker is "cheating" by creating a hidden Linux VM to run the Docker images.)
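
A quick way to see the "container lives only as long as its first process" behavior on any Docker host (alpine is just an example image):

# The container's only process is 'sleep 5', which becomes PID 1 inside the container
docker run -d --name demo alpine sleep 5
docker ps        # listed as running while sleep is still alive

# A few seconds later the process has exited, and so has the container
docker ps -a     # status now shows Exited (0)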