Submitted by thedanindanger on Tue, 04/07/2020 - 15:20
It's not as popular as it once was, but Bayesian Analytics remains a powerful tool for many supervised learning exercises.
Despite all the hype around Deep Learning Models and AI as a Service APIs, there's still a need for Data Scientists to explain - in simple terms - what factors influence a given prediction. Even more importantly, sometimes we want to construct a model that represents a real-world process, rather than have input values feed into a programmatically optimized series of neural networks and produce a predicted value.
Submitted by thedanindanger on Tue, 03/31/2020 - 17:32
Within 30 days I passed both the Google Cloud Platform Professional Data Engineer and Architect Certification exams.
However, it took me much longer than 30 days of study and experience to pass the exams. There was a lot of overlap between the two exams, so if anyone else wants to put their personal life on hold for a few months and attempt something as crazy as passing two of the hardest cloud certifications in a short period of time, here are some tips to help you out.
Submitted by thedanindanger on Thu, 05/31/2018 - 13:32
As I discussed in a previous article, Data Science is in desperate need of DevOps. Fortunately, there are finally some emerging DevOps patterns to support Data Science development. Databricks itself is providing much of it.
Two concepts keep popping up in the DevOps patterns: “Continuous Integration / Continuous Deployment” and “Test Driven Design” (moving toward “Behavioral Driven Design,” but that’s not a widely used term).
Submitted by thedanindanger on Sun, 02/18/2018 - 18:35
Over the past month I’ve taken two edX courses to brush up on Enterprise Data Integration Architecture: one on Active Directory Identity Management in Azure, and the other on deploying data application interface services with C#.
What does that have to do with Data Science? Everything. At least that is my strong suspicion. This article explores industry trends towards “Enterprise” data science and how we can build our architectures to support very rapidly evolving Data Science Solutions.
Submitted by thedanindanger on Sun, 09/10/2017 - 19:39
Frequently Asked Questions
Sometimes when I get asked a question, I send an extremely well thought out response, which may or may not be appreciated by that person, but I feel like there might be people who would - maybe...
Regardless of my delusions of grandeur, I do want to start documenting these responses, both to feed my massive ego and because I am hopelessly lazy. Since the one I get asked the most is "How do I get started in Data Science?" we might as well start there. So here's my Fall 2017 go-to answer on how to get started quickly in Data Science.
Submitted by thedanindanger on Sun, 09/10/2017 - 17:47
People say data science is difficult, which it is, but even harder is explaining it to other people!
Data Science itself is to blame for this, mostly because we don’t have a concrete definition of it either, which has created a few problems. There are companies promoting ‘Data Science’ tools as ways to enable all your analysts to become “Data Scientists”. The job market is full of people who took a course on Python calling themselves “Data Scientists”. And businesses are so focused on reporting that they think all Analytics, Data Science included, is just getting data faster and prettier.
But the tools we use are just that: tools. The code we use requires specialized knowledge to apply effectively. The data pipelines we create are there to monitor the success and failure of our models; it’s an added bonus that they help with reporting. To mitigate these challenges we have to come up with some clever metaphors, so let's explore them a little more deeply.
Submitted by thedanindanger on Tue, 11/24/2015 - 20:38
In a previous post, I showed you how to embed interactive Excel tables. I will not link to it here because, unfortunately, that feature has now been deprecated. And that was a cool feature. Why must you ruin everything I love, Microsoft!? :)