When looking for an explanation of data democratization, you will most certainly find a few common themes, such as how it makes digital information accessible to the average user and gives people the tools to understand and make quick decisions. It also opens up information systems to federal government users without requiring the involvement of IT personnel. And finally, it enables government agencies to be “data first.”
Sounds pretty liberating, doesn’t it? Kind of a “Power to the People” meets “digital transformation” theme.
It’s true that the goal of data democratization is to help federal government agencies get their hands on data quickly, so they can respond quickly. That kind of power is critical and every government agency should want it.
But power brings responsibility. That’s why every smart organization wants to give its users access to data AND ensure that it’s the data to which they should have access. So here’s one more definition of data democratization for you — this time with guardrails.
What is data democratization?
Data democratization means giving federal agencies access to data so they can quickly make critical decisions with it. In data democratization, the role of IT is not to provide the data to those users, but to ensure that they access only the data they need, with organizational control.
The control is necessary to prevent wild-west scenarios. For example, organizations don’t want to lose control of the data and have it end up in unpredictable places, like USB drives and users’ personal devices. They want to ensure the data is used in compliance with statutes like HIPAA and with privacy laws like GDPR. And they want to avoid Garbage In, Garbage Out (GIGO), in which users make bad decisions because they’ve analyzed the wrong data.
The biggest reason to include organizational control in data democratization, though, is efficiency. Most users can’t work efficiently with data because they don’t understand it. Why not? Because they didn’t create the databases, structures, schemas, tables and column names in the data sources. And even if they were in the room when the data sources were being created for their department, what about the data they need from other departments in the organization?
So, if you’re going to give agency users access to data sources, keep in mind that they’re most likely not experts in IT or database programming. You don’t want them to waste their time looking for useful data in a lot of useless places.
Why data democratization?
Data democratization is related to data empowerment, a three-pillared IT approach for giving users what they need to make the best decisions for your government agency.
The first pillar is data governance. To get a 360-degree view of data, you must understand what data you have, what it means and how it relates to the agency.
As described above, governance also involves understanding the guardrails — the rules, policies and regulations that are associated with the data. Information privacy is a hot topic everywhere, and in the U.S., the regulatory map is becoming a minefield as individual states establish their own privacy mandates.
In short, data governance is about walking the fine line between getting the greatest use out of the data and reducing the risks that come with that data.
The next pillar is data operations, which involves preparing data for use and ensuring it’s available to your agency users. Giving users access to all the data in the organization doesn’t matter if a simple query takes five minutes to run. The systems that deliver the data have to perform well enough to meet the needs of the agency.
Finally, data protection covers the mechanics of ensuring your data is backed up properly and your myriad endpoints are secured. It includes archiving your data, retaining it for compliance and being prepared in case of audits. And it extends to setting policies on sensitive data so that it cannot be used improperly.
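To make the idea of a sensitive-data policy a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the column names, the role names and the `apply_policy` helper do not come from any particular product. It only shows the general shape of a rule that masks sensitive fields unless the user's role is allowed to see them.

```python
# Hypothetical policy: which columns count as sensitive, and which roles
# may see them unmasked. Both sets are invented for this example.
SENSITIVE_COLUMNS = {"ssn", "date_of_birth"}
ROLES_WITH_FULL_ACCESS = {"privacy_officer"}


def apply_policy(row, role):
    """Return a copy of the row, masking sensitive columns for most roles."""
    if role in ROLES_WITH_FULL_ACCESS:
        return dict(row)
    return {
        col: ("***" if col in SENSITIVE_COLUMNS else value)
        for col, value in row.items()
    }


# A made-up record, used only to show the two views side by side.
record = {"name": "A. Example", "ssn": "000-00-0000", "office": "Region 4"}
analyst_view = apply_policy(record, "analyst")
officer_view = apply_policy(record, "privacy_officer")
```

In this sketch the analyst still gets the non-sensitive fields, so the guardrail limits exposure without blocking the work.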
You want the IT resources in your organization to focus on those three pillars — data governance, data operations and data protection — instead of fulfilling users’ requests for query results.
That’s why data democratization.
It isn’t that IT is in the way. It’s that there is so much data and so many different tools for using it that data democratization was inevitable.
Consider these realities:
- The shortage of IT talent is real, in positions ranging from programmers to system administrators. In smart organizations, the move to groom a generation of “citizen analysts” has the potential to become part of agencies’ strategy instead of a stop-gap measure.
- Every opportunity has a shelf life, and you can miss it if you can’t get to the data you need in time. Recent history has shown the role that data analysis can play in slowing the spread of a coronavirus and devising a vaccine against it. Decisions made quickly and based on the right data are the key to coming out ahead.
- Data scientists spend as much as 45 percent of their time on data preparation tasks, including loading and cleaning data. While that is an improvement over the 75-85 percent they were spending a few years ago, it still represents a big chunk of time massaging data instead of analyzing it.
It turns out that, when you drill into that data preparation time, it usually starts with answering a question like “What data do I have that could help me solve this problem?” Most people don’t know what data is available in the organization. And, if they do know, they don’t know where it is or how to get access to it. So the next questions are “Who owns system X, system Y, etc.?” and “How can I get access to the data in those systems?”
When they access the data source, they’re likely to see an arcane description of the data in it, like non-intuitive table and field names. So they ask, “Is this the right field?” Then, to get to their target, they may need to combine multiple data points. They wonder, “Are the fields in this table related to the fields in that table? How? How do I have to combine them?”
That obstacle course of preparation slows the process of data democratization. The process goes even more slowly if you have to phone the right person in IT to walk you through the data.
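The questions about cryptic field names and related tables can be illustrated with a small Python sketch. The table contents, the column names and the `DATA_DICTIONARY` mapping are all hypothetical. The point is only that a shared data dictionary and a known join key answer "Is this the right field?" and "How do I combine them?" before the user has to phone anyone.

```python
# Hypothetical data dictionary: cryptic physical column -> friendly name.
DATA_DICTIONARY = {
    "emp_id": "employee_id",
    "dpt_cd": "department_code",
    "dpt_nm": "department_name",
    "hr_dt": "hire_date",
}

# Two made-up source tables, as lists of rows keyed by physical column names.
employees = [
    {"emp_id": 1, "dpt_cd": "FIN", "hr_dt": "2019-04-01"},
    {"emp_id": 2, "dpt_cd": "OPS", "hr_dt": "2021-09-15"},
]
departments = [
    {"dpt_cd": "FIN", "dpt_nm": "Finance"},
    {"dpt_cd": "OPS", "dpt_nm": "Operations"},
]


def friendly(row):
    """Rename physical columns to business-friendly names where known."""
    return {DATA_DICTIONARY.get(col, col): value for col, value in row.items()}


def join_on(left, right, key):
    """Inner-join two lists of rows on a shared key column."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Combine the two tables on the department code, then apply friendly names.
joined = [friendly(row) for row in join_on(employees, departments, "dpt_cd")]
```

Each row in `joined` now carries readable names such as `department_name`, which is the kind of view an analyst can work with directly.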
What does data democratization look like? Self-service shopping.
Ideally, it would be as easy for government agency users to find and use the right data as it is for them to shop online or find a movie to watch. Data democratization is about guiding the right data between the guardrails and putting it at users’ fingertips.
That means users would have a self-service shopping experience that includes features like these:
- Browsing with sensible parameters until they find data of interest
- Getting more information on the data — what it does and does not contain, and how it is derived
- Seeing related data — “People who used this data also used this other data”
- Using a shopping cart that shows the data they want and when they can expect to receive it
- Joining a community of people who have used the data and can tell you more about it
- Taking part in an entire ecosystem of agency users instead of IT professionals
- Seeing whether the data has been encrypted, in case they want to transport it
- Determining whether the data has been anonymized so they don’t run afoul of privacy laws
That’s the ideal state.
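As a rough sketch of what might sit behind such a storefront, the Python below models a tiny data catalog. The `CatalogEntry` fields, the sample entries and the `browse` and `related_to` helpers are assumptions invented for illustration, not the design of any real catalog tool. They map loosely onto the features listed above: browsing, descriptions and lineage, related data, and encryption and anonymization flags.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    description: str       # what the data does and does not contain
    derived_from: str      # how the data is derived
    encrypted: bool        # safe to transport?
    anonymized: bool       # safe with respect to privacy rules?
    related: list = field(default_factory=list)  # "also used" data sets


# Made-up catalog contents, for illustration only.
CATALOG = [
    CatalogEntry("benefit_claims", "Claims filed per month",
                 "claims system export", True, True, ["claim_payments"]),
    CatalogEntry("claim_payments", "Payments issued per claim",
                 "finance ledger", True, False),
    CatalogEntry("office_locations", "Addresses of field offices",
                 "facilities database", False, True),
]


def browse(catalog, keyword):
    """Return entries whose name or description mentions the keyword."""
    kw = keyword.lower()
    return [e for e in catalog
            if kw in e.name.lower() or kw in e.description.lower()]


def related_to(catalog, name):
    """Return the entries listed as related to the named entry."""
    by_name = {e.name: e for e in catalog}
    return [by_name[r] for r in by_name[name].related if r in by_name]
```

A user could call `browse(CATALOG, "claim")` to find candidate data sets, then check each entry's `anonymized` flag before using it.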
Now, let’s be honest: It’s a lot less fun to shop for data than it is for power tools, hair care products and red slingback pumps. So, until we reach that ideal state, here’s another take on what data democratization looks like.
You start with this:
It’s the hodgepodge of modeling tools, report generators, cloud providers, ERPs and relational databases that users in any agency have to sort through. Layered on top of them are hoops that users must jump through — the rules, regulations, standards, codes and auditing requirements associated with using the data.
Data democratization, on the other hand, looks more like this:
On the left, database professionals can find the physical data systems and structures that make sense to them, and on the right, analysts can find the entities they need, with the guardrails of policies and governance.
The combination enables data democratization — seeing how the data flows through the organization, where you can pick it up and which enterprise usage policies apply to it.
Through data democratization, you can deal with the abundance of data by giving more government agency users what they need to make critical decisions. With tools that demystify the structure and relationships among data points, users can analyze and make decisions as close to the data as possible.
This article originally appeared on the official erwin blog.