Metrics of a Data Driven Organization

In my previous post I highlighted metrics of a data driven organization, and I would like to spend some time expanding upon them. Admittedly, I cannot claim these metrics as my own. However, they serve as a true north as I assess organizations. More specifically, they can serve as the true north as I judge the Coast Guard’s progress towards data maturity and whether we are even headed in the correct direction.

In late 2019, Gregory Zuckerman released “The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution”. I started and finished the book in early 2020, and I was deeply fascinated by Jim Simons. From this book and the limited public record of Jim Simons’ life, I was able to piece together some of the metrics I previously mentioned. However, these metrics were not entirely unique to Jim Simons. I heard the senior leaders of the Operations Research and Data Analytics Working Group within the Coast Guard voice some of these same metrics. The metrics of a data driven organization I am referring to are:

  1. Everyone in the organization has access to the data relevant to their team/work,
  2. There is a transparent entity (or entities) who is (or are) responsible for data quality,
  3. Opinions are voiced only when accompanied by supporting data, and
  4. Numbers are communicated even (and especially) when they communicate negative messages. 1

These metrics, as written, are very formal. In my opinion, they are also soft and forgiving of the Coast Guard’s shortcomings with respect to becoming a data driven organization. For example, if I were Commandant of the Coast Guard and I were serious about making the Coast Guard a data driven organization, I would rewrite the first two metrics more aggressively as:

  1. Everyone in the organization has access to all data, and
  2. There is a transparent entity responsible for data quality.

As I previously stated, these metrics and ideas are hardly unique. To prove my point, even the Coast Guard can offer an example of an attempt to meet some of the metrics of a data driven organization with an enterprise application: Coast Guard Business Intelligence.

Coast Guard Business Intelligence

The Department of Homeland Security Publication Library website lists a Coast Guard Privacy Impact Assessment (PIA) for a mission support tool that provides reporting and analysis capabilities. The website was updated as recently as June 6th, 2022. 2

The posted PIA is dated April 17th, 2012. 3

From a previous post of mine:

“A PIA is a document created for Government systems and posted publicly so citizens know things their government is doing; information the Government is tracking.” 4

Typically, a system starts with a privacy document called a Privacy Threshold Analysis (PTA), and a PIA is posted within a year of routing the PTA. We can conclude that sometime between early 2011 and 2012, the Coast Guard identified that it required a modern data sharing and analytic capability (modern for the time, with the caveat that CGBI arguably falls short of early-2010s data and analysis standards); enter CGBI.

CGBI utilizes standardized data and metrics, and a front-end business intelligence application to aggregate data and provide reports. From the CGBI PIA:

“This system was created to provide an integrated reporting and analysis environment… providing ‘one version of the truth’.” 5

This vision of an analytic business intelligence application and data sharing capability begins to address some of our established metrics of a data driven organization. CGBI is available to all users, which speaks to the first metric. However, it immediately fails the second metric because it does not offer any type of data governance mechanism with respect to data ownership or custody.

Palantir

In addition to the ongoing CGBI data sharing effort, the Coast Guard embarked on additional attempts at data modernization. During the Coronavirus Disease (COVID-19) Pandemic, the Coast Guard used Coronavirus Aid, Relief, and Economic Security (CARES) Act funding to procure an instance of Palantir Technologies Inc.’s (Palantir) Foundry platform as Software-as-a-Service. In a very broad summary, the contract with Palantir represented the Coast Guard’s first efforts at data modernization. Unfortunately, the Foundry solution was likely too much, too soon for the Coast Guard’s data culture.

The resolution of the Palantir contract is neither here nor there. What is interesting is how the Palantir Foundry solution began to address our identified metrics of a data driven organization.

Palantir was willing to expand to new users as quickly as the Coast Guard was willing to supply them. This highlights an ability to ensure “everyone in the organization has access to all data”. Essentially, the Coast Guard would retain ownership of its data and supply it to Palantir, who would provide the connective tissue to effectively federate the data across the enterprise. Similarly, the Palantir Foundry platform would present the data in a format cleaned with best practices, effectively placing Palantir as a transparent entity responsible for data quality. Palantir tracked data owners and could reach out to them to remedy reported errors. So, in addition to serving as connective tissue for data sharing, the Palantir Foundry platform also provided the network for crowdsourcing error identification and correction. All of this is a bonus on top of our metric that requires a transparent entity responsible for data quality.

On January 5th, 2021, the series of contracts the Coast Guard awarded to Palantir expired, and the Coast Guard proceeded with its first decommissioning of a cloud platform (ironically, it was its most advanced analytic and data sharing platform).

Integrated Data Environment

Previously, we established that the Coast Guard means a data lake and a front-end Graphical User Interface when it references an Integrated Data Environment (IDE). 6

Following a stint where Palantir acted as an IDE for the Coast Guard, it should be apparent why the Coast Guard desires an IDE. An IDE:

  1. Enables everyone in the organization to have access to all data, and
  2. Establishes a transparent entity responsible for data quality.

Once an IDE is in place to support the first two metrics of a data driven organization, the remaining two can begin to be addressed. First:

  • Opinions are voiced only when accompanied by supporting data.

After everyone in the organization has access to the organization’s data, members can reference data in support of their assertions. And once data is available across the enterprise:

  • Numbers are communicated even (and especially) when they communicate negative messages.

As members are enabled to reference data in support of their opinions, it will rapidly become the organization’s expectation. Why would anyone offer their opinion without supporting data? It would risk their opinion being immediately overrun by someone who is offering supporting data. And as it becomes the expectation, there will no longer be an excuse to overlook unfavorable data. The Coast Guard will be able to address its data culture shortcomings. And the key to this level of success across the enterprise begins with the first two metrics of a data driven organization, which are closely tied to the successful implementation of the vision of an IDE.


These views are mine and should not be construed as the views of the U.S. Coast Guard.