In my head there is always a competition for which post is next and sometimes if there will be a post at all. ODBC and RevoScaleR have been arguing and its super annoying. ODBC was the last post, how you can connect to any version of SQL using just ODBC. If you did not go to the link I published, you can connect to Oracle, MySQL, PostgreSQL, SQLite too. The point of that will become much more clear when you start querying MSDB Job History so you can write your own R ggplot reports on job length and overlapping jobs (spoilers…). I will give you the code to get you started, later, maybe tomorrow, I don’t know yet depends on who wins the next argument. For now it is connect to SQL Server using RevoScaleR package…
So here you are, you know SQL or you at least do something with it everyday and are wondering what all the hoopla is about R and data science. Lets break down the first barrier, R and data science actually have little to do with each other, R is a language, data science is an abstract field of work, sort of like saying I am a computer scientist, that will narrow it down but not by much. What is your industry, what languages do you use, what is your education, hacker, bachelors, masters, phd…? You can be a Data Scientist and never use R.
But we are going to use R, today, right now, get ready.
The hardest thing about having a blog is without exception, having a blog! It will sit and wait for you forever to come back to it, I think about it every day and the hundred post that need to be completed. In my case, content is not the problem it’s the fact that some posts like this one will take a few minutes to one hour, and I have posts that have taken me two days to write, not because they were difficult, but because the technical accuracy of the post had to be perfect, or at least as perfect as I could come up with. I have already decided the first person I hire will be responsible for going back and verifying my posts… I feel sorry for them already.