Let’s get started, again…

The hardest thing about having a blog is, without exception, having a blog! It will sit and wait forever for you to come back to it. I think about it every day, along with the hundred posts that need to be completed. In my case, content is not the problem. The problem is that some posts, like this one, take anywhere from a few minutes to an hour, while others have taken me two days to write, not because they were difficult, but because the technical accuracy of the post had to be perfect, or at least as perfect as I could make it. I have already decided the first person I hire will be responsible for going back and verifying my posts… I feel sorry for them already.

I have a series of posts I need to get out in the next few months. It will focus on what a SQL pro can do with R every day, and I will weave in some statistical analysis. I will warn you that some of those may be long posts: while I can demo linear regression in one hour (very badly in that limited time), the mechanics of lm (Linear Model) are pretty lengthy.

The other thing, which is super boring and which I will be spending little time on, is SQL calling out to R to create a model. To a SQL pro it’s just not that useful, and even to a data scientist it is the last step of operationalizing a model, so it would be the last 5% of the work in a pretty huge project. There may be an argument for testing in SQL with larger volumes of data, but even then, I’m not sure I buy it as a necessity. As I demonstrate models and how to evaluate them, I will provide the code to get the model into SQL and show how to run it, but the SQL piece will not be my primary focus. You will see why after the first lm model is created.

So, here is what to look for in the coming weeks, in no particular order:
R Graphics and ggplot
Connecting R to SQL
Querying SQL
Getting data into a dataframe
MSDB reports with R graphics
Query Store reports with R graphics
Linear Regression
Classification
ANOVA
A/B testing
SVM
And whatever else I feel compelled to pontificate about

At some point I will be producing training videos to go along with all of this, but I have yet to decide on the platform. The problem for me is that there is a lot of content already out there, and much of it is very good, but there does seem to be a niche gap for the SQL pro. I am not going to make you a data scientist, but spend enough time with my content and you can at least communicate and work with data scientists on equal footing or, at a minimum, pick up some of the low-hanging data science fruit in your org.