In a previous post I highlighted that, as an analyst in the Coast Guard, each analysis I worked on began the same way: there is never an assumption that the Coast Guard has data ready to go. Common preliminary questions include:
“Where is the data?”
“You don’t have data. Where can we get data?”
Simply put, a data lake could begin to address some of the Coast Guard analyst community’s largest problems. But is there an opportunity to address more of what Coast Guard analysts want, potentially killing two (or even more) birds with one stone?
Analytic Tool Availability on CGONE NIPRnet
Any quick internet search for industry-standard data analytic tools will show both Python and R ranking high on nearly every list. Any analyst will know that this is because analytic development and open source development are largely occurring in both Python and R. Thus, these tools become indispensable to analysts who want to keep up with industry trends and leverage the wisdom of crowds by crowdsourcing capability development.
However, it is this very openness, with capabilities continually added by the open source community, that poses such a risk to NIPRnet network administrators. Or at least, this is my understanding from the CGONE NIPRnet network administrators.
In a previous post I highlighted that any Coast Guard user “will have countless stories of Information Technology (IT) related issues” pertaining to the Coast Guard’s NIPRnet (CGONE). For the Millennial and Gen Z generations, these occurrences are particularly frustrating. An affordable personal computer is bulletproof compared to what users experience on CGONE NIPRnet. So, imagine being an analyst whose organization sends you to school and trains you on industry-standard tools, only to deny you those same tools upon your return. To add insult to injury, the tools are not denied because of high cost. Rather, these tools are free and open source. On your personal computer, where you have administrative privileges, you can safely download and use the tools you learned on, all free of charge.
So, the same thing that makes Python and R so indispensable to an analyst (the free capabilities developed on them by the open source community) is the very poison pill that makes them so difficult for network administrators on NIPRnet. The situational irony of the scenario is undeniable!
Python and R through the Coast Guard’s Integrated Data Environment
There is a plethora of online Python and R courses available, ranging from free to well-established and expensive. Many of these courses direct users to download and install Python and R kernels (again free and open source) and a respective Integrated Development Environment (also free) for each language. However, many other courses provide Python or R environments directly in a web browser. Often in these scenarios the capabilities of the environments are limited. Nevertheless, the proof of concept is strong. Any Python/R user familiar with these web-embedded environments would logically infer that the full capabilities of Python and R could be made available as embedded environments, given adequate server compute and storage behind the website.
These were my initial thoughts pertaining to an Integrated Data Environment for the Coast Guard. What if analysts were not only given access to the totality of Coast Guard data within a central data lake? What if they were also given the Python and R kernels, side-by-side with the data, in the same environment?