No Data Lineage? No Trust!

The awareness of what we know is not just a nice-to-have, but the very essence of knowledge itself. Knowledge does not exist by acts of faith; it requires transparency, trust, and communicability. “Knowing something” means knowing the entire path that led us to this something, because knowledge is not a goal, but a path.

This is the spirit of data lineage: to give the path the same dignity as the goal, so that anyone who uses data can know its origin, can trust it, and can judge with reasonable autonomy whether the path that elevated the elementary data to information constitutes a sort of certification and therefore allows the data to be used safely, in line with common objectives and expectations.

Data lineage, in short, is the foundation of any data management solution. Without it, the solution would surely be incomplete: able to show what emerges on the surface, but not what lies beneath it.

We are always in chase mode

The world we live in moves with us and often moves so fast that it seems to elude us. We are forced to constantly deal with what we know about the world, in the hopes that continuous observation, and what derives from it, brings us into symbiosis with a reality that continually resists our attempts to understand it.

If in the physical world we primarily make observations through our senses, in the digital world we can only make observations through data, knowing that simple observation is only the first step of an articulated journey, whose stages are gradually richer in value, but at the same time more and more difficult to reach.

As users of data we are therefore travelers, moving not in time and space but across the rough terrain of meaning. Physically still for the most part, we move intellectually, in continuous pursuit of a goal that moves with us and that, just when we seem to have it in the frame and are ready to shoot, moves again, leaving us with a blurred image that is only partially usable.

As users of data we are photographers too, capturing images in the hopes of being able to re-compose them in an overall pattern, which, observed from the right distance, provides a vision that enables us to grasp the intricate connections between the individual elements. This overall vision would enrich our data collection, providing not only a memory of time, but also a fundamental support for our desire for exploration.

In this continuous confrontation with the reality in which we live, we must take note that everything has its place and its moment, knowing that each single “photo” has exactly the same importance as the whole photographic collection of our existence.

We shoot a series of photos (data) to tell something (information), in the hopes that what we have experienced and captured improves the image (knowledge) we have of what surrounds us and puts us in a position to make improvements (wisdom).

Our path through knowledge must always be remembered, because forgetting the point we started from inexorably limits our enjoyment in arriving at our destination. Our goal and our path are complementary elements of equal value: one does not exist without the other, because if the goal repays the effort, the path contributes to making it sustainable. For this reason, we cannot appreciate what anything is if we forget what it was. Applied to data, remembering the path, and with it our ability to reach our goals and return home safely, is the essence of data lineage: a sort of encyclopedia of the data that enables everyone, at all times, to know how each single data element has contributed to forming the information capital.

It is always a matter of discipline

No discipline is static, and none lives outside of how it is practiced. Each discipline is something alive, which is realized in the use made of it and, as in all things, this use is a combination of moments, each of which contributes to strengthening its role, to making a discipline an ideal bridge between wanting to do something and actually doing it in the best way.

It is important to assume that every single moment is of equal importance, regardless of the viewpoint of the individual who experiences it, because what is useful for one person may not be useful for another, and vice versa.

This is the consideration that I believe should be firmly established in those who propose data management solutions. Their objective – or rather, their responsibility – must be to enable the most flexible use of data, which in turn enables stakeholders to find the hidden potential in the data, while also ensuring that this occurs in full control and in compliance with governance rules derived from internal policies or external legislation.

If this principle is embraced, I hope it is then evident that data lineage has a very important part to play here: it guarantees that the trace of what happened, starting from the primordial soup where the elementary data came to life, is laid bare for a full understanding. A fully traceable data lineage enables the data to evolve, to the point at which it becomes an essential component of a correct, fully governed data management solution. It is no coincidence that today we speak of data-driven companies, a term that reflects the many critical data-oriented decisions such companies make every day.

If we think of knowledge as a heritage, then we can imagine awareness as its statement of account, something that allows us to demonstrate our control over what we know.

However, we must consider that what we know does not always form and develop entirely under our control. Indeed, in a world where collective knowledge grows ever stronger, we must expect this heritage to be the result of the action of many, where each plays their role in a harmonious ensemble that enhances individual skills and combines them in a holism of knowledge.

And then, just as in an orchestra where everyone puts full trust in the skills of their companions, so on the path to knowledge it is necessary to have confidence in what is gradually defined and formalized, constituting the embryo of what will come next.

Trust is an attitude towards others that results from a positive evaluation of the facts and leads us to rely on their abilities. Thanks to data lineage, we have access to all of the relevant facts and are able, at any moment, to know why we have what we have. We can corroborate this sentiment with factual observation, which increases its strength, or (and here we hope this does not happen) call it into question.

Data lineage is transparency, on which trust is built. Data lineage is the fundamental lever for strengthening awareness; without it, what we know will probably remain an unrealized potential. Without data lineage, there can be no trust.

Andrea Zinno