Data Enablement White Paper: Part 1 - December 2023

Finding the Data Enablement Sweet Spot – Part 1 – Can Digital Data truly live freely?

By Ellie Maier, CRM, CBCP, Data Enablement Lead, NOV, 11/17/23
To be published in the ICRM (Institute of Certified Records Managers) Newsletters

Synopsis:

Imagine a world where digital data roams joyfully as a true “Information Asset,” free to live another day instead of being continually “controlled” and eradicated out of existence.  Can we finally help our clients and ourselves find and use real, reliable, quality data sources with integrity to explain the past, predict the future, and plan for a better tomorrow?  Is there a cost to our organization for all of the technology upgrades, the integration of AI/ML (Artificial Intelligence/Machine Learning), and the substitution of human thought, processes, and collaboration, in the quest to invent better solutions and outcomes in our world?

All great questions on the way to that seemingly impossible world where data can just “BE!”  Part of the answer can be found in combining multiple data sources via large data lakes and streams, on platforms such as Databricks and Snowflake, and organizing them into an Enterprise Data Catalogue that becomes key to helping organizations innovate brilliant new ideas, products, and services.  This leads to enhanced business outcomes and products, streamlined technology tool integration and adoption, and potential revenue generation.  It is also a state of mind for many looking to improve the overall quality and integrity of data, as described in the article "Navigating the Digital Transformation" in the Journal of Petroleum Technology.

But getting there takes a strong leadership commitment, a team of dedicated data enthusiasts and aficionados, and an integrated IT architecture with the technology tools to make it all work.  Join me as we discuss how to make the impossible… possible!

Article:

For many years, I never thought I would call myself “Retired,” but here I am, starting all over again, reborn in a brand-new, all-digital transformation world where RIM/IG (Records & Information Management/Information Governance) is a fond memory and the sky is the limit for my curiosity to learn in a whole new way, every day!  Enter a world where digital data is not only encouraged, but also tracked in such a way that data assets can be integrated and discovered by those who need them most.  Our customers include not only the amazing new generation of “Data Scientists,” but also an entire community of business and IT analysts, otherwise known as our “Data Aficionados,” who are looking to collaborate around good, sound data sources to explain the past, predict the future, and plan for a better tomorrow.

Our various data sources are now neatly arranged and organized in a Data Catalogue, so we spend far less time looking for reliable information resources, and, as a bonus, we have a way to collaborate with each other online and in person for innovative purposes.  Enterprise data catalogues are used by Amazon, Google, and a variety of other companies as a single place for "Crowdsourcing" (i.e., formulating ideas and content online) and sharing data.  And this is not just for retail: think of Energy verticals studying renewables, climate change, and proper energy usage, Pharmaceuticals inventing new drug therapies, or any other company that wants to optimize and share its data sources.

In a sense, think of your favorite online store.  Since the pandemic, practically everything is online, and you can choose to just stay at home and shop for virtually anything your heart desires.  Whatever the business, being organized is critical to managing inventory levels and, even more so, to accurately describing your products and, in some cases, providing sample pictures of what is being sold.  This also requires an organization to have a strong presence online, but without proper tools to manage all that content, you can quickly end up in the wild west of data deluges.  The Data Catalogue helps to provide that “One Stop Amazon Shop,” with products including Alation Data Catalog, Collibra, and MS Azure Data Catalog.  Other names to get familiar with sound like a data jungle of monkeys, beasts, and interesting wildlife, including the Python programming language, the Spyder scientific development environment, Alteryx transformational software for data-driven ROI, Streamlit data apps, and Power BI for Data Science and Business Analytics.

Back in the day, Big Data as a concept surfaced in the 1990s, then we heard about “Dark Data,” and by 2014 it had become the new “Flavor du Jour” for the IG community, given the exponential and seemingly unlimited growth of information resources, data, and implied “Records.”  Back then, and still to some extent today, it was all about Yoda’s enduring words of advice: “Control, Control, You must learn CONTROL.”  In my mind, any wild west data that could not be controlled was at risk of becoming a detriment to the Records Manager, because our program consisted of retention and deletion principles, not letting data be free to just BE.  The whole proposition that IG can be a “wicked” problem with complex and tangled “Big Data” zombies continued to emphasize that the more the problem grew, the longer the trail of digital and physical ROT (redundant, outdated, trivial) exhaust left behind, with only a few of us fending to truly organize and clean up those trails.  To me, this time marked a true “Dark Side,” where most uncontrolled and unstructured data just became evil content in the wild west and fodder for discovery lawyers to find.  Big-time emphasis was placed on having a well-documented Records Retention Schedule and defensible disposition/destruction processes.  Privacy had not surfaced yet, and here “less data is more,” and a host of look-alike tools and other ways to control, organize, and produce data and records were front and center.

Less emphasis was placed on the fact that some data was also valuable and good to keep.  The double-edged sword seemed focused on finding manual and automated ways to destroy the trails, with ROT-trolling tools to identify the stuff, scrub it, and then send it to the digital or physical shredder without much thought.  And for WHAT?  Just to say that I didn’t need to think anymore, could push a magic easy button, and rely on AI/ML to do stuff?  I think NOT!!  Also, there was finally a shift to looking at the world through a more digital lens, rather than the paper heresy that most of us had focused on in the past 40+ years of the RIM empire.  Little did we know about the IoT (Internet of Things) Galaxy charging down the pike, almost like in the Terminator movies, where Skynet just took over and human CONTROL was completely lost.  New metaphors for how we communicated governance, RIM, and IG surfaced, and it was no longer necessarily about capturing value from these records, but rather about getting RID of them to take them out of their miserable, disorganized, and often orphaned and abandoned states.  And always, CONTROL = Risk Avoidance.  But what was the cost of risk avoidance?  I spent untold amounts of time, smiles, and tears finding, tracking, and using the ever-growing dark data, and facing the behemoth costs of litigation discovery based on data still available, all to defend the almighty RIM/IG Empire.

Those were dark times for me because I felt like I worked to preserve the little valuable data and records that were there, but only for discovery needs.  NOT to help people.  I wanted to help my clients do a great job by helping them organize and find their stuff; instead, I had to lead projects of death, destruction, and despair.

In 2014, I attended a couple of “Big Data” sessions at the ARMA Conference.  One was by Barclay T. Blair, who presented “Big Data and Information Governance: Friends or Foes?”, where it was clear that in order to move away from IG Controlism, we would need to change our view of the all-doom, all-impending “Data Governance” or “Die” proposition.  In fact, I was recently offered the opportunity to join my new company as an “Enterprise Data Governance Lead” in the Data Analytics Team, and I remember the reaction on people’s faces when I introduced myself as such.  It was the same deer-in-the-headlights stare as when I was the “Corporate Records Manager,” and the reaction was “Oh no, here comes the Records Police and Auditor!”  Things changed immediately when we changed the title to “Data Enablement Lead.”  Now it’s easy to make connections and create my new network of Family & Friends, because while governance is still important, it is not our main focus, and it is NEVER the way we communicate, advertise, or draw folks into our program!  Plus, the bonus is that we are not Compliance, IT, or the Records Police; instead, we are Data Analytics and Science, enablers who help people connect and collaborate with each other around interesting data sources to produce better business outcomes and make more informed decisions.  Imagine that!

But sorry, I digress; back to Barclay’s presentation.  It was all about harnessing data, like the power of a river or a lake, because in the end, you just can’t CONTROL it.  And certainly, with its continued growth and explosion, it’s like a faucet that can’t be turned on and off whenever you please.  He showed us some amazing slides about research showing that “human memory evolved to allow us to predict the future rather than recall the past.”  But to make future predictions, you have to keep the data long enough to understand it.  His key questions were “Does data have a downside?”, “Does all information have equal value?”, and “Is it an engineering or an organizational problem?”  All great, forward-thinking concepts to mull over during the initial “Big Data” age.

During the same ARMA conference, I also attended my good friend and colleague Brent Gatewood's session on “Using New, Analytical Approaches to Enhance the Governance of Information,” a topic I had just read an article about in the Wall Street Journal.  He addressed the “4 Vs” of Big Data, which are part of the whole Data Science education track and now drive my whole new world in wondrous and playful ways!  They include “Volume,” “Variety,” “Velocity,” and “Value.”  Never did I think I would work with people who love data as much as I do and dedicate their whole lives to it!  But here I am today, doing it not only with them but also with our many customers out in our global operations and indeed around the world.  And, yes, even I am starting to think that way too!  Now “More is Better” is the improved and more fun proposition of letting data live to see another day.  If you want to see a cool data analytics engine and the concept of a data catalogue at work, check out the Data is Beautiful YouTube videos.  Now I see how even I, after so many years of professing purist RIM/IG concepts, methods, and techniques, have “Let it all GO” into the depths of the vast Data Lakes, use a Data Catalogue to keep track of it all, and am excited to let data just be free.  And guess what?  All those years of creating taxonomies and fighting the good retention fight can now be used to develop simple, easy-to-understand ways to remember things in a fast-paced, data-rich world of digital transformation, dexterity, and governance, where terms like “Curation,” “Domains,” “Tags,” and “Lineage” optimize metadata management!

Because isn’t that what it was all about to begin with?  To help people find useful data fast, so that they didn’t spend hours trolling through the wild west of structured and unstructured data sloughs and could do their jobs more effectively?  In the end, both the Sith and the Jedi approaches to a digital data strategy are wrong, and one must balance free data and security like Bendu, who resided on the remote planet of Atollon and claimed to represent the “Center of the Force,” balancing both the light and the dark side.

The principles of setting up a Data Enablement program include:

1 – Building a dynamic Data Enablement Strategy & Program
2 – Implementing and Marketing Data Science and Catalogue Concepts
3 – Taxonomizing data sources, domains, tags, and other related catalogue objects
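
For readers who like to see things in code, here is a minimal sketch of what taxonomizing data sources by domain, tags, and lineage could look like.  It is only an illustration: the class names, fields, and sample entries below are hypothetical and do not reflect the API of Alation, Collibra, or any other catalogue product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One data source as it might appear in a catalogue (hypothetical shape)."""
    name: str
    domain: str                                        # business area that curates the source
    tags: set[str] = field(default_factory=set)        # free-form discovery labels
    lineage: list[str] = field(default_factory=list)   # names of upstream sources


class DataCatalogue:
    """In-memory stand-in for a catalogue tool; not any vendor's API."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        """Add or overwrite an entry, keyed by its name."""
        self._entries[entry.name] = entry

    def find_by_tag(self, tag: str) -> list[str]:
        """Return the sorted names of all entries carrying the tag."""
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def find_by_domain(self, domain: str) -> list[str]:
        """Return the sorted names of all entries in the domain."""
        return sorted(e.name for e in self._entries.values() if e.domain == domain)


# Made-up sample entries, purely for illustration.
catalogue = DataCatalogue()
catalogue.register(CatalogueEntry("rig_sensor_readings", "Operations", {"iot", "time-series"}))
catalogue.register(CatalogueEntry(
    "maintenance_forecast", "Operations", {"forecast", "time-series"},
    lineage=["rig_sensor_readings"],
))
catalogue.register(CatalogueEntry("retention_schedule", "Records", {"governance"}))

print(catalogue.find_by_tag("time-series"))
print(catalogue.find_by_domain("Records"))
```

The point of the sketch is simply that a shared vocabulary of domains and tags lets people find a source by what it is about, rather than by where it happens to be stored.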

Tune in next time for the next segment, which focuses on how to build the program in a way that is fun, successful, and achievable, and that can live through the years of inevitable change that will most certainly come our way.  One hint is that this is not a marathon but a journey, even in this AI/ML age, as it still requires a combination of people and technical skills to create.  Especially in this new hybrid/remote age, you really have to work on the “relationship factor” with folks who may be located around the globe.  As always, “You will get a better outcome using chocolate rather than vinegar,” and that is especially so in my new hybrid world, where things like “crowdsourcing,” in combination with potluck lunches, lots of baked goods, and general good cheer, work really well!

Biography:

Ellie Maier, CRM, CBCP, is the Data Enablement Lead at NOV.  Ellie is an accomplished global industry executive and thought leader with 30+ years in the RIM/IG, Disaster Recovery and Continuity, and Data Governance fields, and she now leads the Data Enablement Program in the Data Analytics/Science Team.  She provides organizations with the ability to strategically plan programs, build and implement project plans, and carry out complex digital data transformation, dexterity, and governance initiatives.  She does this by mastering and using change management techniques, writing and presenting, and gelling well in any new setting across a number of global verticals, and she is recognized as an SME and thought leader by the ICRM, ARMA, ACP, and DRI.  Ellie is a Certified Records Manager and Certified Business Continuity Professional, and holds a Prosci Change Management Certificate and a SharePoint for Records Management Certificate.  She was awarded the ARMA Britt Literary Award in 2006 and the ICRM Outstanding Mentorship Award in 2016, and is now an active member of the ICRM Marketing Committee.  Ellie is Dutch by heritage, is writing a healthy, sustainable eating recipe cookbook, and is currently pursuing her 200-Hour Yoga Teacher training in her spare time.  She may be contacted at elehouston@gmail.com.