Data Integrity in the Cloud

We hear a lot about data ‘in the cloud’, and if you’re working in a regulated environment like biotech/pharma, the idea of your data floating around in the ether may have you a little freaked out. But whether your data is in the cloud or stored on internal servers, making sure that it meets the principles of “ALCOA” (attributable, legible, contemporaneous, original, and accurate) is one of the keys to data integrity.

Maintaining data integrity is critical to ensure that the data can be used to make good decisions and deliver high-quality products. For regulated industries, data integrity is defined as “the degree to which a collection of data is complete, consistent, and accurate” (FDA Glossary of Computer Systems Software Development Terminology).

Think about storage of scientific data on paper. We use ‘good documentation practices’ to make sure that the data is “ALCOA”. For electronic data, these good documentation principles are still a critical part of data integrity, and the ‘leap’ from paper to electronic (and subsequently to the ‘cloud’) is not as far-fetched as you might think.


With paper records, anyone generating or altering the data had to sign or initial the paper copies. For electronic records, comprehensive and permanently associated audit trails ensure that the data is attributable to the person(s) and system(s) that generated it, recording who did what, why, and when.
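To make the ‘who did what, why, and when’ idea concrete, here is a minimal sketch of an append-only audit trail in Python. The names AuditEntry, AuditTrail, and the sample users and instruments are hypothetical, made up for illustration; a validated system would also need secure storage, authentication, and much more.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One immutable audit trail entry (hypothetical structure)."""
    user: str       # who
    system: str     # which instrument or application
    action: str     # what
    reason: str     # why
    timestamp: str  # when (UTC, ISO 8601)


class AuditTrail:
    """Append-only list of entries tied to a single record."""

    def __init__(self, record_id: str) -> None:
        self.record_id = record_id
        self._entries: list[AuditEntry] = []

    def log(self, user: str, system: str, action: str, reason: str) -> AuditEntry:
        # Refuse entries that cannot be attributed or justified.
        if not user or not reason:
            raise ValueError("every entry needs a user and a reason")
        entry = AuditEntry(user, system, action, reason,
                           datetime.now(timezone.utc).isoformat())
        self._entries.append(entry)
        return entry

    def entries(self) -> tuple[AuditEntry, ...]:
        # Return a read-only snapshot so callers cannot edit history.
        return tuple(self._entries)


# Illustrative usage with made-up users and systems.
trail = AuditTrail("SAMPLE-001")
trail.log("jdoe", "HPLC-02", "created result", "initial analysis")
trail.log("asmith", "LIMS", "corrected units", "transcription error")
```

Each entry is frozen once written, so a correction shows up as a new entry rather than an overwrite, much like a dated, initialed strike-through on paper.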

Electronic data can be easier to read than handwriting on a piece of paper, especially as the paper ages or suffers mishaps like spilled coffee. For electronic data, legibility means that the data is permanently recorded on durable media and always available for review and retrieval.


When recording data in the lab, one of the first things analysts are told is to record directly into the lab notebook at the time the data is generated, and not to write data on post-its, gloves, paper towels, etc. With today’s devices, data can easily be electronically recorded and stored at the time it is generated, with time/date stamps so that the sequence of events can be easily followed.
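As a rough illustration of recording data at the time it is generated, the Python sketch below stamps each measurement with a UTC time and a sequence number the moment it is logged. The function names and the sample measurements are assumptions for this example, not part of any particular lab system.

```python
from datetime import datetime, timezone


def record_measurement(log: list, name: str, value: float, unit: str) -> dict:
    """Append a measurement stamped with the current UTC time."""
    entry = {
        "seq": len(log) + 1,  # order of entry, independent of the clock
        "name": name,
        "value": value,
        "unit": unit,
        "recorded_at": datetime.now(timezone.utc),
    }
    log.append(entry)
    return entry


def sequence_of_events(log: list) -> list:
    """Return measurement names in the order they were recorded."""
    ordered = sorted(log, key=lambda e: (e["recorded_at"], e["seq"]))
    return [e["name"] for e in ordered]


# Illustrative usage with made-up measurements.
notebook = []
record_measurement(notebook, "sample weight", 12.5, "mg")
record_measurement(notebook, "solution pH", 7.2, "pH")
```

Storing the time in UTC avoids confusion when people or servers sit in different time zones, which is common with cloud-hosted systems.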


It is getting harder to identify the ‘original’ when looking at paper records. Ensure that the inherent quality of the data, paper or electronic, is preserved for original source data as well as copied records. Copies, including backup/archive copies, must be verified as accurate and true, preserving the content and meaning of the original, with the data traceable to its origins.
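One common way to check that a copy is ‘accurate and true’ is to compare checksums of the original and the copy. The Python sketch below assumes SHA-256 as the checksum and uses hypothetical file names and helper functions; it only confirms that the bytes are identical, and says nothing by itself about traceability or metadata.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy a file, then confirm the copy matches the original byte for byte."""
    shutil.copy2(source, destination)
    original, copied = sha256_of(source), sha256_of(destination)
    if original != copied:
        raise IOError(f"copy of {source.name} does not match the original")
    return original  # keep this value alongside the record for later checks


# Illustrative usage in a temporary folder with made-up data.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "results.csv"
    source.write_text("sample,result\nS-001,4.2\n")
    backup = Path(folder) / "results_backup.csv"
    checksum = copy_and_verify(source, backup)
    still_matches = sha256_of(backup) == checksum
```

Saving the checksum with the record means the same comparison can be repeated later to confirm that an archived copy has not changed.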


Whether results are recorded on paper or electronically, it is essential that they are generated by validated systems. Without the proper tools in place, changes to electronic data can be hard to see. Implement systems that record changes to data, as well as the who, what, when, and why of those changes in an easy-to-read audit trail.

If these principles are implemented for cloud-based data, along with proper data governance and security, data integrity assurance in the cloud may not be as nerve-wracking as you think.

This blog was penned by Paige Holst, Senior Consultant at LabAnswer. LabAnswer is a proud supporter of Denodo DataFest.