Thus far, we have established that the Coast Guard is seeking a data lake. We have also established that the Coast Guard’s persistent use of legacy software systems results in disparate, semi-structured data silos. The problem is compounded because the silos are themselves legacy software systems: as modern technology solutions are sought to connect or reconcile the silos, the legacy systems cannot support the connection and transfer of data.
Without a data lake, what does the Coast Guard do with video footage from a boarding, or with audio recordings of distress calls? Affordable modern technologies (microphones, video cameras, etc.) make collecting these types of data easy, so the Coast Guard collects data effectively. However, benefiting from this data to its fullest extent remains significantly more difficult.
In fact, the Coast Guard largely remains stuck between paper logs and fully benefiting from electronic logs. As I have alluded to in previous posts, mariners and seagoing services have strong roots in data collection. There is a good chance the Coast Guard’s logbooks date back further than anyone would anticipate, which speaks to how deep those roots run. However, as analog (paper) logs moved to digital (electronic) ones, the Coast Guard did not want to compromise on the logbook formats it was used to seeing. So digital log adoption moved to Adobe PDF files, with forms closely mapped to what watchstanders were used to filling out on paper.
Obviously, there are ways to use Adobe PDF files to feed structured databases. However, anyone remotely familiar with mariner logbooks would understand that they lend themselves extremely easily to tab-separated, comma-separated, or other spreadsheet formats. These logs track, at a specific interval, things like time, weather (air temperature, water temperature, wind speed, etc.), location, heading, and course.
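To make the point concrete, here is a minimal sketch of how a logbook with the fields listed above could live in a comma-separated format and be read with a few lines of Python. The column names, units, and sample values are my own illustrative assumptions, not an actual Coast Guard log format.

```python
import csv
import io

# Hypothetical column names based on the fields mentioned above;
# not an actual Coast Guard log schema.
FIELDS = [
    "time_utc", "air_temp_c", "water_temp_c", "wind_speed_kt",
    "latitude", "longitude", "heading_deg", "course_deg",
]

# Made-up sample entries, one row per logging interval.
SAMPLE_LOG = """\
time_utc,air_temp_c,water_temp_c,wind_speed_kt,latitude,longitude,heading_deg,course_deg
2020-01-01T00:00:00Z,10.5,8.0,12,36.85,-76.29,090,092
2020-01-01T01:00:00Z,10.0,8.1,15,36.86,-76.10,095,094
"""


def read_log(text):
    """Parse CSV log text into a list of dicts, converting numeric fields to float."""
    entries = []
    for row in csv.DictReader(io.StringIO(text)):
        entry = {"time_utc": row["time_utc"]}  # keep the timestamp as a string
        for name in FIELDS[1:]:
            entry[name] = float(row[name])
        entries.append(entry)
    return entries


entries = read_log(SAMPLE_LOG)
```

Once the entries are rows and columns rather than cells on a PDF form, loading them into a structured database or a data lake is a routine step.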
Coming full circle, it should be apparent that the Coast Guard is struggling to leverage its data to its fullest extent, which is why the idea of a data lake is so appealing: bring all the data silos together and enable users to interact with the entirety of Coast Guard data. This alone would represent a previously unaccomplished feat. So what would analysts in the Coast Guard want from a Coast Guard data lake?