3 June 2022

From cloud storage to information management in the cloud

When the cloud comes up in conversation with enterprise users, three things usually come to mind: consuming applications from the cloud (SaaS), using computing resources in the cloud, and storing information in the cloud.

For private users, storing information in the cloud is easy and efficient: one can flexibly book (and cancel) storage space with any number of providers (Google Drive, iCloud, OneDrive, Dropbox), drag and drop files (music, videos, photos, documents), and even structure and search those files in a rudimentary way.

If you are an enterprise user, you will likely book some kind of S3-compatible object store together with computing resources, and then store and retrieve your application files and documents in a kind of “cloud file system”. This is where the road usually ends for both private and business users.

Information management in the cloud requires much more than a simple object store. Several principles govern “intelligent” document storage, which goes beyond storing and retrieving binary files:

  • Typing: Ideally, you want to give semantics to your files, that is, classify them in your business context as invoices, contracts, personnel records, or whatever kind of document you need to store.
  • Flexible attributes: For each type, you should be able to define custom attributes (metadata) that describe this document type (category). An invoice, for example, is described by the names of the issuing and receiving companies, the amount, the tax and the due date. It is also important to be able to flexibly change the type and attributes of a document during its lifecycle. Ideally, AI components will assist you in classifying and describing the documents you put into the system.
  • Search: You will use the system primarily to find your documents quickly. Documents should be searchable by their attributes (metadata), their content (if available), or both combined. Ideally, OCR and automated image tagging should be available as an option for images and other documents that are not machine-readable.
  • Identity management and access rights: It should be possible to define access rights for each user and for each document, ideally based on document attributes. For example, a given user (role) can access only invoices up to a certain amount. Furthermore, seamless integration into existing identity management must be possible without the need to replicate or synchronize users. In practice, you log in to whichever authentication system you already use and are automatically made known to your cloud document storage.
  • Folders: Documents are structured in fixed (also nested) folders, as in Windows Explorer, but can also be placed in dynamic folders. These are folders into which documents are placed automatically, based on their metadata. Simply by changing a document's metadata, you are effectively “moving” the document. Physically, the document is not moved on the underlying storage.
  • Archiving: If you need to store documents in the cloud under legal constraints, for example archiving incoming and outgoing invoices for potential audits, you effectively need records management. This encompasses a series of features (as well as certifications). For example, the history of all changes to an object must be persisted in a non-repudiable audit log, and each change should generate a new document version, with the option to revert flexibly to older versions. Retention time management is also necessary, both to protect sensitive documents from accidental deletion and to ensure that they are stored for the legally required minimum period.
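
To make a few of these principles concrete, here is a minimal Python sketch of typed documents with custom attributes, a metadata search, an attribute-based access rule and a dynamic folder. It does not show the API of any particular product: all names (`Document`, `DynamicFolder`, `search`, `can_read`, `folder_contents`) and the sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Document:
    doc_id: str
    doc_type: str  # the "typing" principle, e.g. "invoice" or "contract"
    attributes: Dict[str, Any] = field(default_factory=dict)  # custom metadata


@dataclass
class DynamicFolder:
    # A dynamic folder is modelled as a named rule over type and metadata.
    name: str
    matches: Callable[[Document], bool]


def search(docs: List[Document], doc_type: Optional[str] = None,
           **criteria: Any) -> List[Document]:
    """Return documents of the given type whose attributes equal all criteria."""
    return [
        d for d in docs
        if (doc_type is None or d.doc_type == doc_type)
        and all(d.attributes.get(k) == v for k, v in criteria.items())
    ]


def can_read(role_limits: Dict[str, float], role: str, doc: Document) -> bool:
    """Attribute-based access: a role may read invoices up to its amount limit."""
    if doc.doc_type != "invoice":
        return False  # this sketch only defines a rule for invoices
    limit = role_limits.get(role)
    return limit is not None and doc.attributes.get("amount", 0) <= limit


def folder_contents(folder: DynamicFolder, docs: List[Document]) -> List[str]:
    """Evaluate the folder rule; nothing is moved on the underlying storage."""
    return [d.doc_id for d in docs if folder.matches(d)]


docs = [
    Document("d1", "invoice", {"issuer": "ACME", "amount": 500, "status": "open"}),
    Document("d2", "invoice", {"issuer": "ACME", "amount": 20000, "status": "open"}),
    Document("d3", "contract", {"party": "ACME"}),
]

open_invoices = DynamicFolder(
    "Open invoices",
    lambda d: d.doc_type == "invoice" and d.attributes.get("status") == "open",
)

print(folder_contents(open_invoices, docs))  # both invoices are open
docs[0].attributes["status"] = "paid"        # changing metadata "moves" d1
print(folder_contents(open_invoices, docs))  # d1 has left the folder
```

The dynamic folder is evaluated at query time, which is why changing the `status` attribute is enough to take a document out of it. A real platform would add indexing, full-text search, versioning and retention on top of this.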

Those are just some of the basic criteria that separate a plain cloud object store for saving your vacation photos from a “cloud object store on steroids” for business use. You can use this set of criteria to judge whether a given cloud platform for information management fulfills your long-term needs for enterprise cloud storage. Do not forget to add the development costs, time and effort that come with trying to implement all these features from scratch on your own. And of course, you need to factor in the inevitable delays.

But this is just the tip of the iceberg. If you want not only to store and retrieve documents (that is, perform CRUD operations: create, read, update and delete) but also to build business applications on top of this storage, you should look for the following additional functions:

  • Document preview: Regardless of the document format, the platform should be able to generate, on the fly, a preview that can be viewed in a standard browser. Even better: a repository or a cache that makes these previews available quickly.
  • Frontend: If you wish to build a SaaS application based on such document storage, you should design it to appeal to end users, but also build it quickly and incrementally. To that end, it is beneficial if the platform offers a standard browsing interface for the document repository. Even better: a way to customize this interface by extending it programmatically.
  • Workflow: Finally, the purpose of managing documents digitally is not only to store and find them, but also to collaborate on them with other people. Only then does the true benefit of an information management system fully emerge. To that end, your platform of choice should provide at least rudimentary workflow capabilities: the ability to execute simple, predefined workflows such as invoice approval, and to assign tasks to colleagues. It is even better if the platform allows you to customize (model) workflows yourself, define rules and forms, and seamlessly integrate workflows into your organization.
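
A simple predefined workflow such as invoice approval can be sketched as a small state machine with task assignment. The following Python example is an illustration under assumed names only: the states, actions, the `TRANSITIONS` table and the `ApprovalTask` class are hypothetical and not taken from any specific platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical rule table for an invoice approval workflow:
# (current state, action) -> next state
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("submitted", "approve"): "approved",
    ("submitted", "reject"): "rejected",
    ("rejected", "resubmit"): "submitted",
}


@dataclass
class ApprovalTask:
    doc_id: str
    assignee: str
    state: str = "submitted"
    # Each entry records (user, action, resulting state).
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def reassign(self, new_assignee: str) -> None:
        """Hand the task over to another colleague; the state is unchanged."""
        self.history.append((self.assignee, "reassign", self.state))
        self.assignee = new_assignee

    def apply(self, user: str, action: str) -> str:
        """Perform an action if the user owns the task and a rule allows it."""
        if user != self.assignee:
            raise PermissionError(f"{user} is not assigned to this task")
        next_state = TRANSITIONS.get((self.state, action))
        if next_state is None:
            raise ValueError(f"'{action}' is not allowed in state '{self.state}'")
        self.history.append((user, action, next_state))
        self.state = next_state
        return next_state


task = ApprovalTask(doc_id="inv-001", assignee="alice")
task.reassign("bob")           # alice passes the invoice on to bob
task.apply("bob", "approve")   # bob approves it
print(task.state, task.history)
```

Keeping the rules in a table rather than in code paths is one way to make workflows customizable: adding a step means adding a row, not rewriting the task logic.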

A word of caution about pricing: It may be tempting to look into products that offer less functionality for a lower price in order to cut initial costs. Try to avoid those, because you will not be able to assess the true value of the platform and the costs will not be entirely transparent. Look instead into models based on the number of documents and/or users. It is always better to experience the full power of the tool and pay fairly, per use. Almost all platforms will offer you a free trial, so look for those that do not restrict functionality in the trial version. After that, look into the service level agreements (SLAs). How often is a backup performed? How long does it take to restore the system after a catastrophic failure? Which security standards are used? Can documents be encrypted, and by whom? There are even products that offer production-grade SLAs during evaluation. In other words: evaluate using the fullest functional scope possible, and if you like the product, convert quickly.

Author

Dr. Nikola Milanovic, Chief Technology Officer

Dr. Nikola Milanovic has been responsible for product development at OPTIMAL SYSTEMS since 2014. He oversees software development for the product lines enaio® and yuuvis®, as well as quality assurance, maintenance, and agile development processes.
