ChatGPT and Data Fabric are Streamlining the Field of Business Data

Artificial intelligence (AI) technologies are poised to bring about a revolution in the business world. Disruptive, large language model (LLM) technologies like ChatGPT, OPT, CodeGen, and PaLM 2 are about to transform both the way we communicate with applications and the way we access data.

This technology has a well-known ability to surprise us when we request a summary of a complex topic, or when it automatically generates our presentations. However, LLM capabilities do not stop there, as they can also generate code that business applications can understand, simply from natural text queries. You can ask an LLM to write code for an iPhone application or a website, and it will do just that, and often more.

ChatGPT-based bot engines can, for example, translate a user’s natural-language query (typed, or spoken and converted through “speech-to-text” technology) into a SQL expression that can be run against a business application.

ChatGPT Meets Data Fabric

This is where unified data access interfaces, like the data fabric enabled by the Denodo Platform, come into play. A data fabric enables business data to be queried in a unified manner, even if the sources sit in different repositories, whether on-premises (in databases, data lakes, etc.) or in the cloud (in SaaS applications like Salesforce, Workday, etc.). All the while, the data fabric maintains compliance with any data governance and security rules dictated by the organization.

Denodo has combined the power of ChatGPT and data fabric to offer users the best of both worlds: Users will be able to access information by expressing their needs in natural language, and they will automatically receive a response with the data they need.

Note: This functionality is not yet available to the general public, but it will be soon. To develop it, Denodo has been working with a limited number of customers as part of an early-access program.

The process takes several steps that are transparent to the user. First, the user writes a description of the data they need, in natural language, in the Denodo Data Catalog. The ChatGPT API is then automatically invoked to generate a SQL query, which is sent to the Denodo data fabric query engine (Fig. 1). ChatGPT even provides an explanation of how the SQL query was generated, for subsequent validation.

One advantage of this process is that we only have to access a single system, since the data fabric aggregates access to any business repository, both on-premises and in the cloud, in a governed, secure manner.
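The steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Denodo's implementation: the language model is represented by a plain callable (fake_llm is a hypothetical stub that returns a fixed query), an in-memory SQLite database stands in for the data fabric query engine, and the names build_prompt, is_read_only, and answer are invented for this example.

```python
import sqlite3
from typing import Callable, List, Tuple

# Assumption: the LLM is modeled as any callable that maps a prompt to SQL text.
LLMClient = Callable[[str], str]

# Hypothetical schema description that a data catalog might expose.
SCHEMA = "sales(region TEXT, amount REAL)"


def build_prompt(schema: str, request: str) -> str:
    """Combine the schema and the user's request into a single prompt."""
    return (
        "Given these views:\n" + schema + "\n"
        "Write one SQL SELECT statement that answers: " + request
    )


def is_read_only(sql: str) -> bool:
    """Rough guard: accept only a single statement that starts with SELECT."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(request: str, schema: str, llm: LLMClient,
           conn: sqlite3.Connection) -> Tuple[str, List[tuple]]:
    """Ask the model for SQL, check it, run it, and return the SQL with its rows."""
    sql = llm(build_prompt(schema, request)).strip().rstrip(";")
    if not is_read_only(sql):
        raise ValueError("generated statement is not a single SELECT: " + sql)
    return sql, conn.execute(sql).fetchall()


def fake_llm(prompt: str) -> str:
    """Hypothetical stub; a real deployment would call an LLM API here."""
    return "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region;"


def demo_connection() -> sqlite3.Connection:
    """In-memory SQLite database standing in for the query engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("EMEA", 100.0), ("EMEA", 50.0), ("US", 200.0)],
    )
    return conn


if __name__ == "__main__":
    sql, rows = answer("total sales by region", SCHEMA, fake_llm, demo_connection())
    print(sql)
    print(rows)
```

Returning the generated SQL alongside the rows mirrors the validation step mentioned above: the user can inspect the query that produced the answer. The read-only check is deliberately simplistic; in the setup the article describes, governance and security rules are enforced by the data fabric itself.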

Fig. 1 - ChatGPT user interface, from within the Denodo Data Catalog

Easy Access to Data: It’s Only Natural

These types of interactions with enterprise applications are going to be common across all industries in the foreseeable future. The use of AI and machine learning (ML) is already built into the Denodo Platform. For years, the Denodo Platform has leveraged ML models to analyze user queries and suggest data that may be of interest to users, similar to how Amazon automatically recommends products that may be of interest to us according to our profile. The Denodo Platform can also use ML models to analyze past queries and suggest partial aggregations of data that can be materialized to optimize the execution of future queries, and to automate key data management tasks. In this way, the data fabric enabled by the Denodo Platform makes it easier and easier for users to get the answers they need.
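As a rough illustration of the idea of mining past queries for aggregations worth materializing, the sketch below counts how often each table and GROUP BY column combination appears in a query log. It is a simple frequency heuristic swapped in for the ML models mentioned above, not the Denodo Platform's method; the function names, the regular expressions, and the sample log are all assumptions made for this example, and the parsing only handles simple single-table queries.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple

# Assumption: queries are simple enough for regex parsing (one table, no subqueries).
TABLE_RE = re.compile(r"\bfrom\s+(\w+)", re.IGNORECASE)
GROUP_RE = re.compile(
    r"\bgroup\s+by\s+(.+?)(?:\s+(?:having|order\s+by|limit)\b|;|$)",
    re.IGNORECASE | re.DOTALL,
)

Pattern = Tuple[str, Tuple[str, ...]]


def aggregation_pattern(sql: str) -> Optional[Pattern]:
    """Return (table, sorted group-by columns), or None if the query has no GROUP BY."""
    table = TABLE_RE.search(sql)
    group = GROUP_RE.search(sql)
    if table is None or group is None:
        return None
    columns = sorted(c.strip().lower() for c in group.group(1).split(","))
    return table.group(1).lower(), tuple(columns)


def suggest_aggregations(log: List[str], min_count: int = 2):
    """List aggregation patterns seen at least min_count times, most frequent first."""
    counts = Counter()
    for sql in log:
        pattern = aggregation_pattern(sql)
        if pattern is not None:
            counts[pattern] += 1
    return [
        (table, columns, n)
        for (table, columns), n in counts.most_common()
        if n >= min_count
    ]


# Hypothetical query log used only for demonstration.
QUERY_LOG = [
    "SELECT region, SUM(amount) FROM sales GROUP BY region",
    "select region, count(*) from sales group by region order by region",
    "SELECT product, region, SUM(amount) FROM sales GROUP BY product, region;",
    "SELECT region, product, AVG(amount) FROM sales GROUP BY region, product",
    "SELECT region, MAX(amount) FROM Sales GROUP BY Region LIMIT 5",
    "SELECT * FROM customers",
]

if __name__ == "__main__":
    for table, columns, n in suggest_aggregations(QUERY_LOG):
        print(table, columns, n)
```

Each suggested pattern is a candidate for a pre-computed summary: if most queries group the same table by the same columns, materializing that partial aggregation once lets later queries read from the summary instead of the raw data.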

In the future, Denodo plans to leverage LLMs for additional use cases, such as automatically suggesting ways to combine data from different data sources when building data products in data engineering tasks, automatically choosing business-friendly names for artifacts, or automatically applying semantic tags in the Denodo Data Catalog, such as tagging items that contain personally identifiable information. The powerful role of data fabric has been noted by several key analysts. In its ebook entitled Understand the Role of Data Fabric — Guides for Effective Business Decision Making, Gartner predicts that “By 2024 data fabric deployments will quadruple efficiency in data utilization, while cutting human-driven data management tasks in half.”

In business environments, data fabric enables data scientists to gain integrated access to curated, governed data, which they can use to generate ML models for multiple uses, such as the automatic generation of prices and individualized offers for each customer, the analysis of churn rates, or the segmentation of customers according to their profile.

AI/ML technologies are here to stay, and they will be part of everyday business in the coming years. In particular, keep an eye out for ChatGPT and data fabric.