Three Lessons From a Lifetime of Data-Centricity
Dave McComb, President and Co-Founder, Semantic Arts
April 8th, 2021
Semantic Arts helps companies uncover the true meaning of their data. They have worked with some of the biggest companies in the world, including Procter & Gamble and Goldman Sachs. In this article, Co-Founder and President Dave McComb shares some of the lessons learned over 20 years of helping companies achieve meaningful digital transformation.
We started Semantic Arts with a goal of helping people with semantic technology and knowledge graphs. For the last five years, we’ve focused on what we call becoming Data-Centric - helping clients adopt a mindset that puts their data at the heart of their business, and making their data flexible and accessible.
I believe people have secretly wanted something like Data-Centricity for a long, long time. In the earliest days, they didn’t think about it because data siloing wasn’t really an issue. Companies had a payroll system and an inventory system, and that was it. But sometime in the 90s people started thinking, “Well, we’ve created a mess. We have a lot of silos and now we need an enterprise data model to get it all together.”
But what they discovered is that these “models” were really just a sum of all the individual data models they had. These things were gigantic, with tens of thousands of tables or occasionally more than that, and totally unworkable. And because these early models were so unwieldy, they mostly dropped out of favor.
But the idea didn’t go away, nor did our belief in Data-Centricity. And we realized that we needed to help people see what we saw. So we created a website called the Data-Centric Manifesto to help recruit people to our cause, and I started writing books about how Data-Centricity can change the nature of an enterprise. And now finally, with Data Fabric and other such technology, Data-Centricity is more approachable than ever before.
And yet maybe one-half of one percent of all large enterprises are on this Data-Centric journey, and almost nobody has arrived at the end game. Remember, though, that it took 40 or 50 years to create this mess with data silos, and it’s going to take a while to clean it up. Businesses today lament the fact that they have silos and these giant monolithic systems that cost hundreds of millions of dollars to replace, and they want to find a way out. To me, the end game is using model-driven data for 90% of an enterprise’s functionality, and effectively ending monolithic applications. That’s the future of Data-Centricity.
After 40-plus years of having code at the center of enterprise architecture, this idea that, actually, data belongs at the center of everything is pretty much revolutionary. It’s like when people believed the Sun revolves around the Earth, and Copernicus first raised his hand and said, “Wouldn’t it make much more sense that the Earth revolves around the Sun?” Just because some civilizations had the Earth in the center of their solar system model for centuries didn’t mean it was correct. People today are having a similar realization that we’ve always had data revolving around the code, when in fact code needs to revolve around the data.
With all that being said, here are three lessons we’ve learned after 20-plus years spent advocating Data-Centricity.
1. Think Big, Start Small
One of our key mantras, “Think big, start small,” is the result of a hard-fought lesson. When we first “saw the light” of Data-Centricity, and how it allowed businesses to achieve more precision and simplicity at the same time, we thought everyone else would see the obvious benefit as well and it would be quick to catch on. But it didn’t happen as quickly as we expected.
Meanwhile, we saw people doing Agile projects, or even Agile projects using Gantt, where they’re just creating more semantic silos, and that’s not good either. What you need to do is use two things in concert: do enough “think big” work that you can see your future end state, then “start small” with an individual project that fits into that end state, is future-proof, and will continue to fit in with every other internal effort.
We take that approach in every project we do, and it has served us extremely well.
2. Simplify Your Data, Simplify Your Problems
Over the years, we’ve seen far too many cases of enterprise paralysis caused by an overabundance of data. Now, when we say overabundance, we mean data that is redundant, or data that is outdated, or empty data cells in tables that make it appear like you have far more data to deal with than you actually do. And a lot of it comes from needless data copies.
We have literally carried out projects where we reduced the amount of metadata in question by 100 to 1, just by cleaning up this sort of extraneous and unusable data. Just imagine what that does for simplification. This is one of the most exciting aspects of Data-Centricity: the ability to truly simplify, because so much data is just noise and fake complexity.
3. Global Identifiers Are Key
In our semantic graphs, every node is a global identifier. These identifiers look like URLs, but they’re really URIs. What that means is every single node is globally unique and you don’t need any metadata to join them. They just join themselves: two nodes with the same identifier just snap together. It doesn’t matter what column or table or anything those nodes came from, they just snap together.
This empowers simplification, and it really forms the basis of our “think big” approach. Those unique identifiers create the “big picture” your data will eventually fill in.
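To make the “snap together” idea concrete, here is a minimal sketch in plain Python. It is not how any particular graph database works internally. It only illustrates the principle with subject-predicate-object triples held in sets. The example.com URIs, the two source systems, and the helper names merge and describe are all hypothetical, chosen just for illustration.

```python
from collections import defaultdict

# Hypothetical base URI; every identifier below is illustrative only.
BASE = "https://example.com/id/"

# Triples (subject, predicate, object) exported from a hypothetical payroll system.
payroll = {
    (BASE + "person/42", BASE + "prop/name", "Pat Lee"),
    (BASE + "person/42", BASE + "prop/salary", "85000"),
}

# Triples exported from a hypothetical inventory system.
inventory = {
    (BASE + "person/42", BASE + "prop/manages", BASE + "warehouse/7"),
    (BASE + "warehouse/7", BASE + "prop/city", "Denver"),
}


def merge(*sources):
    """Union the triple sets. No join keys or mapping tables are involved."""
    graph = set()
    for source in sources:
        graph |= source
    return graph


def describe(graph, node):
    """Collect everything the merged graph says about one node."""
    facts = defaultdict(set)
    for subject, predicate, obj in graph:
        if subject == node:
            facts[predicate].add(obj)
    return dict(facts)


graph = merge(payroll, inventory)
person = describe(graph, BASE + "person/42")

# The person node now carries facts from both systems, joined only by its URI.
for predicate in sorted(person):
    print(predicate, sorted(person[predicate]))
```

Because both sources use the same identifier for the same person, the merge is a plain set union, and the facts about that person from payroll and inventory end up on a single node without any column or table mapping.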
One last point: being Data-Centric is different from being Data-Driven. A lot of companies talk about being Data-Driven, and there’s nothing wrong with that. In fact, it’s a good idea to base your decisions on data rather than on opinion or heuristic information or whatever. But being Data-Driven doesn’t make a company Data-Centric.
You could have silos of Data-Drivenness, and a lot of people do. People are realizing that but, as we know, change is surprisingly slow, even when the benefits are obvious.