

I'm at Splunk's .conf in Orlando, and a director at Accenture asked me this question, which I thought merited a blog post: Why don't data scientists use or like Splunk? The inner child in me was thinking, "Splunk isn't good at data science", but the more seasoned professional in me actually articulated a more logical and coherent answer, which I thought I'd share whilst waiting for a talk to start.

I cannot pretend to speak for any community of "data scientists", but it is true that I know a decent number of data scientists, some very accomplished and some beginners, and not a one would claim to use Splunk as one of their preferred tools. Indeed, when the topic of available tools comes up among most of my colleagues and the word Splunk is mentioned, it elicits groans and eye rolls. So let's look at why that is the case:

Reason 1: Splunk Isn't Open Source

The bigger data science community, which I'm defining to include big data engineers as well as those who lean more towards mathematics and statistics, tends to gravitate towards open source platforms. Indeed, all of the cutting-edge data science tools that are available are open source. I know that every time I see a "new" Splunk capability, my immediate reaction is to ask myself what open source tool can do the same thing. (Scikit-learn, Tensorflow, Jupyter, Hadoop, Spark, Drill, Keras, R, etc.) Usually there is one that will do the same or better. So at that point, the conversation drifts back to why I should pay for something that I can get for free.

The closed-source nature of Splunk has other implications as well. If you want a particular feature and it isn't available, you are at Splunk's mercy to develop it. This is in striking contrast to open source tools, where if a particular feature doesn't exist you can build it and contribute it to the community. Or, if there is enough interest, the community will build it and you can benefit. What this ultimately boils down to is that Splunk is behind the latest developments in data science.

On top of this, much of the more advanced functionality that a data scientist would be interested in, such as Splunk's Machine Learning Toolkit (MLTK), is based on open source libraries. The MLTK, for instance, is just a limited wrapper for scikit-learn. Which leads me back to the question: why would I want to pay to use a tool I can get for free?

Reason 2: Splunk is Expensive

Splunk is a proprietary tool, and its pricing is based on how much data you ingest into Splunk. This means that the more data you use, the more it will cost you. This is directly antithetical to how data scientists think. In general, data scientists want to use whatever data is available, and quite often that means merging multiple disparate data sets together. As a data scientist, I don't want to have to worry that incorporating data set X is going to cost me more money. Doctors don't consider the costs of the treatment as they are treating a patient, and likewise data scientists don't want to have to think about the license costs of using our own data. It's my data, dammit, I want to use it! Oh… and that license cost is quite expensive for large projects.

Now you might argue that you have to pay for the data indirectly regardless of whether you use Splunk or not, in the form of compute and storage costs. However, that is true whether or not you use Splunk.

Reason 3: Splunk Isn't Taught in Data Science Schooling

I think you could argue that most people who are employed in data science jobs today come from either a CS- or math-dominated background. If you have a CS background, you will be comfortable writing code, and so languages like Java or Python will appeal to you. If you come from a more academic or science/math background, you have probably worked with either Matlab or R, and hence these will be your tools of choice.

From an institutional perspective, I teach data science classes for Metis as well as my own company, GTK Cyber, and all these programs use the Python/Pandas/Scikit-Learn ecosystem as their technical stack. I am program chair for Brandeis University's Master's Program in Strategic Analytics, and our program uses a mixture of R and Python, as do most academic programs. While we could examine why this is the case, for the purposes of this article, let's just accept the fact that it IS the case. As a result, new data scientists aren't taught to use Splunk, and Splunk isn't penetrating the data science community.

In general, academic institutions have an aversion to teaching commercial products anyway, so I don't think Splunk will make much headway in academia.
