Emerging Data Science Architecture Patterns

Over the past month I’ve taken two EdX courses to brush up on Enterprise Data Integration Architecture: one on Active Directory Identity Management in Azure and the other on deploying data application interface services with C#.

What does that have to do with Data Science? Everything. At least that is my strong suspicion. This article explores industry trends towards “Enterprise” data science and how we can build our architectures to support rapidly evolving Data Science Solutions.

FAQs: Getting Started in Data Science - Fall 2017

Frequently Asked Questions

Sometimes when I get asked a question I send an extremely well thought out response, which may or may not be appreciated by that person, but I feel like there might be other people who would appreciate it - maybe...

Regardless of my delusions of grandeur, I do want to start documenting these responses both to feed my massive ego and because I am hopelessly lazy. Since the one I get asked the most is "How do I get started in Data Science?" we might as well start there. So here's my Fall 2017 go-to answer on how to get started quickly in Data Science.

Metaphors used in Data Science

People say data science is difficult, which it is, but even harder is explaining it to other people!

Data Science itself is to blame for this, mostly because we don’t have a concrete definition of it either, which has created a few problems. There are companies promoting ‘Data Science’ tools as ways to enable all your analysts to become “Data Scientists”. The job market is full of people who took a course on Python calling themselves “Data Scientists”. And businesses are so focused on reporting that they think all Analytics, Data Science included, is just about getting data faster and prettier.

But the tools we use are just that, tools. The code we use requires specialized knowledge to apply it effectively. The data pipelines we create are there to monitor the success and failure of our models; it’s an added bonus that they help with reporting. To mitigate these challenges we have to come up with some clever metaphors, so let's explore them a little more deeply.

PowerBI Example

A quick demo of Power BI using movie actors and actresses:

A Modified Comparison of Vegetable Costs

An article about the cost of fresh vs frozen or canned vegetables recently made the rounds on Gizmodo. The data was taken from the USDA 2013 Vegetables, cost per cup equivalent data set. The Bite.Gizmodo article did a decent job expounding on the data; however, it may have drawn an erroneous conclusion.

Embedding Excel in a Website

In a previous post, I showed you how to embed interactive Excel tables. I will not link to it here because, unfortunately, that feature has now been deprecated. And that was a cool feature. Why must you ruin everything I love, Microsoft!? :)

Making a Project Calculator with Shiny Apps

The past few weeks I have been working on the Coursera course "Building Data Projects." They introduce some great tools for building a data app, one of which I have been meaning to try for quite some time - Shiny Apps. Shiny is an application framework that allows the creation of sweet data apps using only some "basic" R scripting and a little understanding of how a user would need to input data. Here's my current working POC for a business intelligence project calculator. Hopefully I'll be able to add some information later around how I made it.

Tableau Holiday Quiz Viz

A fun viz using custom shapes in a scatterplot. Happy Holidays!!!

Integrating R in TERR

Lately, I've been building the Dallas Office for Syntelli Solutions. One of our core service offerings is integration:

Advanced R Programming (And Why that is Important)

Thanks to Reddit.com/r/programming for directing me to this excellent compilation of R programming information by Hadley Wickham.

Most of us in analytics use R just to get stuff done. It has a large number of packages that let us produce insights at our own pace. What I mean by "at our own pace" is that we typically analyse data either individually or within our team, with the results communicated to our audience independently of the analysis itself.