As a professional software engineer, I spend a lot of time with programming languages. Within the context of work, most of my day to day is transforming my thoughts into a particular language using it’s own constructs in a way that doesn’t make my own head explode, or the heads of my colleages. While striking this balance is an art in itself, I find the need to explore other languages, partly to learn more about programming languages as a study in it’s own right to find new ways to solve problems, but also to escape the confines of linters, code style, pull reviews, etc.
Sometimes I just want to do. And this is what lead me to the Wolfram Language.
While studying calculus in undergrad, we used a program called Maple to analyze and plot functions. On many occasions, I would have a question about how to form a construct or plot, and the tutors would respond, “I know how to do this in Mathematica, but I’m not sure how to do it here”. I remember Mathematica vaguely as a tool only professional mathematicians used, but recently, with the wave of data visualization tools like Tableau, Plotly, and even open-source tools like Jupyter Notebooks, Mathematica has found itself rebranding to capture market and provide solutions beyond an analysis tool for mathematicians.
Enter Wolfram. A suite of tools, each connected through the Wolfram Language, providing services like cloud computing, free text information retrieval, developer platforms to build and deploy API services. The goal of this project has shown itself to be a single, unified way to explore, program, and deploy informatica solutions. Standard Data Science, Machine Learning, and Artifical Intellegence algorithms are built right in and are available as soon as you open the notebook and allow you to get started with a click of a button.
On first glance, the notebook itself is something that just works, and looks clean and minimalistic from the start, and it brings an ambiance as if it would fit right at home in the screen of a MacBook Air on the wood oak desk of a modern architect. If programming languages were cars, this is certainly the Aston Martin Zagato to Python’s Civic. You don’t buy into the clean lines and elegant styling because you want something versatile for daily commutes between work and picking up your kids. You buy it because you want to have exhilarating drives through winding roads through forests overlooking bays, and you want to have fun doing it. As this image and it’s entry for Mathematica alludes, it’s damn expensive, and for good reason. It’s not a general purpose language, and it doesn’t want to be. I saw it as an elegant way to dive deeply into exploratory problems quickly, and without needing to spin up clusters or exchange conda environments, or even install requirements.
It is a tool for professional data scientists, and it really did just work. For a moment I felt like I was a greying 40 year old, with a wood oak desk in a glass office overlooking a sprawling tech campus. Of course with said Zagato parked in my partner parking space. It seemed too easy and comforting. So naturally, as I would, I wanted to exhibit chaos in this language to test how comforting it is.
A ship is only measured by how well it lasts in a storm. So I gave it storms.
The first use case I had of it was to generate a basic regression hypothesis test. A friend of mine and I were discussing the differences between using a permutation test or a t-test. Naturally, when your friends work in science fields, you coffee chats center around algorithms and methods, so we discussed preferences for basic differences of means tests for a few minutes before I remembered I registered for a Wolfram trial the night before. I had an excuse to use it, although it was a small one. Eventually, I settled on trying a basic regression hypothesis test in the language, and unsurprisingly, it worked well.
Out popped…
It worked quite well. Of course it better. This was supposed to be the Aston Martin of languages after all. What surprised me after was that without thinking, I hit “Deploy -> Published Page”. I wanted to share the regression with him. In a few seconds, a link appeared, and I sent it.
To share a simple regression and share it, I pushed a few buttons and it just happened. I didn’t need to fiddle with conda environments, install things, open a browser, search up list types in Numpy, get annoyed and install PANDAS, search up how to do a regression in Scipy, search up how to generate random numbers in Scipy… In the time it would take me to download the dependencies, I loaded a notebook, ran the regression, and shared it.
Simple.
Over the next few days, I stayed up late exploring the language. Doing basic things like importing data, plotting it, grouping it, all using the Wolfram language and it’s interesting constructs. I played with Logistic Regressions, Power Law Distributions, and even got the chance to test the capabilities of it’s Dimensionality Reduction features by Clustering TSNE results. Everything from the language, to the documentation was professional and worked well out of the box. Like a push button start for a hypercar, I knew things were happening under the hood, but the interaction was minimal and didn’t require more than a gesture to get the wheels moving.
Things were familiar seeing that I recently spent a few months doing all of my computer science coursework in Julia, a language that touts itself as being “influenced” by Mathematica, naturally because the mathematicans who designed Julia spent all their lives using Mathematica. When I ran into a problem, I remembered to myself, “Nominal dynamic type system” before I went down a foolish path of applying types to things I shouldn’t and being annoyed that I can’t define a Monad Interface. Along with this, concepts like macros raised it’s head when the documentation mentioned “Everything is symbolic”. Translation: I can pattern match at the program level and do interesting things. More on this later.
Essentially, I realized the language is fun. One of the larger hiccups I had was loading up a CSV dataset with missing values. After some reading I realized, since the dataset is represented as a list of rows, where rows are lists themselves, I could just pattern match over the rows and filter by rows that have Strings and Numbers where they should be. Functional. Map. Filter. Fold. Some Haskell began to come out…
And then I asked, “How can I apply a set of functions in concatenation?” I grew tired of this…
and I wanted more like this…
Functors over a list is such a natural construct, so I wanted a Map function over lists. I knew ‘/@’ was my friend, but how do I get it to work? The answer was…
Haskell developers know that with the right application of “Throwing some cash” into their functions, they can make extremely complicated programs intuitive. Rightfully so, because the purpose of a professional software engineer is to write code in a way that doesn’t make my own head explode, or the heads of my colleages. I learned I could write functions in an intuitive way in Wolfram, by simply adding an appropriate brackets and ‘/@’ signs before the starting variable, with occasional touches of ampersands.
At this point, this was the end of my interest in the language. Not because this was deal breaker, but because I learned:
This is a tool for professional data scientists.
Things that seemed hard, seemed so easy, but things that seemed so easy, instantly became difficult. This is the problem of languages that aren’t general purpose. Your Aston Martin just won’t stand to pick up 8 hungry functions that you don’t want applying their dirt and coffee all over your Aniline leather seats.
It’s just not what the language was designed to do.
I am hopeful for Wolfram as a language. I wanted to test the language to see if it would be an interesting addition to my tool-stack. Being someone heavily interested in software engineering, I wanted a way to explore data without having to deal with the engineering overhead of managing software packages. Much how mathematicians like to prove things to “prove to themselves they can do it, then never worry about it again until you need to”, for analytics work, I have yet to see the return on investment of maintaining exploratory analytics systems, since they don’t translate to deployment well anyway and you’ll likely need to rewrite it. Python notebook analysis can be a useful prototype, but significant rewriting needs to be done to deploy, or integrate into existing systems. Is it really worth the upfront development cost to build a fully modifiable system just to do basic work? Probably not. I can at least say, my Python notebook analysis typically stays put, and only if things are plausible do I translate it into something more useful like a web service or a package.
Can it compete with the likes of Tableau? Considering Tableau seems to only be a visualization tool for users coming from Excel, I think it can. Tableau doesn’t do it for me since I find GUI tools more invasive than code, especially if I want to reproduce actions and analysis I’ve made previously with different data. Along with this, API deployment of AI and ML code seems to be in the pipeline, if not likely an extremely expensive option for teams that have significant funding to pay for it. I consider Julia as a strong alternative for users without the cash or company provisions, especially if one wants to be able to deploy services that work fast, but I’m not sure if that language is better or worse than Python, given the package ecosystem.
The roads in the informatica landscape are vast. Python. Julia. R. SAS. Scala. Java. Sometimes you want to carry trash across town. Other times, you want to dart through traffic on a scooter. Civics and Camrys cover most daily use. Getting you from work, to class, to work with a good bit of protection and modifications along the way.
For some people though, programming isn’t a matter of getting from point A to point B. Some people want to do things with a little more style…