{"id":244,"date":"2017-01-11T08:55:23","date_gmt":"2017-01-11T08:55:23","guid":{"rendered":"https:\/\/sqlshep.com\/?p=244"},"modified":"2017-01-11T22:29:24","modified_gmt":"2017-01-11T22:29:24","slug":"r-markdown","status":"publish","type":"post","link":"https:\/\/sqlshep.com\/?p=244","title":{"rendered":"R Markdown"},"content":{"rendered":"<p>This is a slight diversion into a tool for R called <a href=\"http:\/\/rmarkdown.rstudio.com\">R Markdown<\/a>, and Shiny will be coming up in a few days.  Why is this important?  It gives you a living document you can add text and R scripts to, and it produces just the output from R. I wrote my Stats grad project using just R Markdown and saved it to a PDF, with no Word or OpenOffice tools. <\/p>\n<p>It&#8217;s a mix of Markdown, HTML and R, so if you know a tiny bit about HTML you will be fine; otherwise, use the <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2016\/03\/rmarkdown-cheatsheet-2.0.pdf\">R Markdown Cheat Sheet<\/a> and <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2015\/03\/rmarkdown-reference.pdf\">Reference Guide<\/a>, which I just annoyingly found out existed&#8230; <\/p>\n<p>I am going to give you a full R Markdown document to get you started.<\/p>\n<p>Create a new R Markdown file:<\/p>\n<p><a href=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM-300x95.png\" alt=\"\" width=\"300\" height=\"95\" class=\"aligncenter size-medium wp-image-246\" srcset=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM-300x95.png 300w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM-768x242.png 768w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM-1024x323.png 1024w, 
https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM-624x197.png 624w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.09.26-PM.png 1046w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <\/p>\n<p>Then run it by selecting the &#8220;Knit&#8221; drop-down in the middle left of the toolbar and choosing Knit to HTML.<\/p>\n<p><a href=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.10.22-PM.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.10.22-PM-300x261.png\" alt=\"\" width=\"300\" height=\"261\" class=\"aligncenter size-medium wp-image-249\" srcset=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.10.22-PM-300x261.png 300w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-10-at-7.10.22-PM.png 508w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>This will create an HTML document that you can open in a browser; the default template comes with some sample code (it uses the built-in cars and pressure datasets) just so you can see some output. Try out some R commands and doodle around a bit before starting the code below.  This is the data file we will be using: <a href=\"https:\/\/github.com\/sqlshep\/SQLShepBlog\/blob\/master\/data\/US-Education.csv\">US-Education.csv<\/a>.  It contains just the 2010-2014 educational attainment estimates per county in the US.  <\/p>\n<p>In the code blocks below I will put in each section of the R Markdown and discuss it; each R code block can be moved to the R console to be run. <\/p>\n<p>The first section is the title that will show up at the top of the doc; copy this into the markdown file and run it by itself.  I am using an HTML style tag as I want some of the plots to be two columns across.  
<\/p>\n<p>You will also see the first R command in an &#8220;R&#8221; block, opened with ```{r} and closed with ```.  Feel free to remove or change options to see what happens. <\/p>\n<p><strong>Notice that the style tags below are missing their leading &#8220;&lt;&#8221;; when you copy the code out you will need to put the &#8220;&lt;&#8221; back in front of both the opening and the closing style tag.<\/strong>  If I format it correctly, WordPress takes it as an internal style tag for this post.  <\/p>\n<pre><code>\r\n---\r\ntitle: \"Educational Attainment by County\"\r\n\r\noutput: html_document\r\n---\r\n\r\nstyle>\r\n  .col2 {\r\n    columns: 2 200px;         \/* number of columns and width in pixels *\/\r\n    -webkit-columns: 2 200px; \/* chrome, safari *\/\r\n    -moz-columns: 2 200px;    \/* firefox *\/\r\n     line-height: 2em;\r\n     font-size: 10pt;\r\n\r\n  }\r\n\r\n\/style>\r\n\r\n```{r setup, include=FALSE}\r\nknitr::opts_chunk$set(echo = FALSE,warning=FALSE)\r\n\r\n#require() loads a package much like library(), but returns FALSE with a warning\r\n#instead of stopping if the package is missing; it does not install anything\r\nrequire(choroplethr)\r\n\r\n```\r\n<\/code><\/pre>\n<p>This will be the next section in the markdown: load a data frame for each of the four educational attainment categories. 
<\/p>\n<pre><code>\r\n```{r one}\r\n\r\n#Load data (adjust the path to wherever you saved US-Education.csv)\r\n setwd(\"\/data\/\")\r\n usa <- read.csv(\"US-Education.csv\",stringsAsFactors=FALSE)\r\n\r\n#Separate data for choropleth: keep the FIPS code plus one percentage column,\r\n#and only the rows where FIPS.Code is greater than 0\r\n lessHighSchool <- subset(usa[c(\"FIPS.Code\",\"Percent.of.adults.with.less.than.a.high.school.diploma..2010.2014\")],FIPS.Code >0)\r\n \r\nhighSchool <- subset(usa[c(\"FIPS.Code\",\"Percent.of.adults.with.a.high.school.diploma.only..2010.2014\")],FIPS.Code >0) \r\n \r\nsomeCollege <- subset(usa[c(\"FIPS.Code\",\"Percent.of.adults.completing.some.college.or.associate.s.degree..2010.2014\")],FIPS.Code >0)\r\n \r\ncollege <- subset(usa[c(\"FIPS.Code\",\"Percent.of.adults.with.a.bachelor.s.degree.or.higher..2010.2014\")],FIPS.Code >0)\r\n\r\n#rename columns for Choropleth, which expects them to be called region and value\r\n \r\n colnames(lessHighSchool)[which(colnames(lessHighSchool) == 'FIPS.Code')] <- 'region'\r\n \r\n colnames(lessHighSchool)[which(colnames(lessHighSchool) == 'Percent.of.adults.with.less.than.a.high.school.diploma..2010.2014')] <- 'value'\r\n\r\n# \r\n# or, more simply, set both names at once\r\n#\r\n names(highSchool) <-c(\"region\",\"value\")\r\n names(someCollege) <-c(\"region\",\"value\")\r\n names(college) <-c(\"region\",\"value\")\r\n \r\n \r\n```\r\n<\/code><\/pre>\n<p>The next section will create four histograms of educational attainment, one per category.  Notice the distribution of the data in each: is it normal, right-skewed, left-skewed, bimodal?  We will discuss them in the next post. <\/p>\n<p>Notice that in the next section I have the \"div\" tags without the left \"&lt;\"; be sure to put those back. <\/p>
\n\n\n\n<pre><code>\r\n\r\ndiv class=\"col2\">\r\n\r\n```{r Histogram 1}\r\n hist(lessHighSchool$value,xlim=c(0,60),breaks=30, xlab = \"Percent of High School Dropouts\", ylab=\"Number of Counties\",main=\"\",col=\"lightblue\")\r\n\r\n\r\n hist(highSchool$value,xlim=c(0,60),breaks=30, xlab = \"Percent Completed High School\", ylab=\"Number of Counties\",main=\"\",col=\"lightblue\")\r\n \r\n```\r\n \r\n \r\n```{r Histogram 2}\r\n\r\n hist(someCollege$value,xlim=c(0,50),breaks=30, xlab = \"Percent Completed Associates or Some College\", ylab=\"Number of Counties\",main=\"\",col=\"lightblue\")\r\n \r\n hist(college$value,xlim=c(0,90),breaks=30, xlab = \"Percent Completed Bachelors Degree or Higher\", ylab=\"Number of Counties\",main=\"\",col=\"lightblue\")\r\n\r\n\r\n```\r\n\r\n\/div>\r\n\r\n\r\n<\/code><\/pre>\n<p>The next section is the choropleth for the high school dropouts; notice the R chunk options (fig.width, fig.height, fig.align) that size and position the plot area.  <\/p>\n<pre><code>\r\n\r\n```{r two, fig.width=9, fig.height=5, fig.align='right'}\r\n\r\n\r\n county_choropleth(lessHighSchool,\r\n                  \r\n                   title = \"Proportion of High School Dropouts\",\r\n                   legend=\"Proportion\",\r\n                   num_colors=9)\r\n \r\n```\r\n\r\n<\/code><\/pre>\n<p>There are three more choropleths that you will have to do on your own!  You have the data and the syntax.  
If you have trouble with this, the Rmd file I used is <a href=\"https:\/\/raw.githubusercontent.com\/sqlshep\/SQLShepBlog\/master\/Education.Rmd\">here: Education.Rmd<\/a><\/p>\n<p>In the end, you should have histograms looking like this:<\/p>\n<p><a href=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-300x239.png\" alt=\"\" width=\"300\" height=\"239\" class=\"alignnone size-medium wp-image-255\" srcset=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-300x239.png 300w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-768x613.png 768w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-1024x817.png 1024w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-624x498.png 624w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01.png 1524w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>And if you make it to the first choropleth, the percentage that did not complete high school:<\/p>\n<p><a href=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM-300x156.png\" alt=\"\" width=\"300\" height=\"156\" class=\"alignnone size-medium wp-image-257\" srcset=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM-300x156.png 300w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM-768x401.png 768w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM-1024x534.png 1024w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM-624x325.png 624w, 
https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/01\/Screen-Shot-2017-01-11-at-5.12.25-PM.png 1618w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is a slight diversion into a tool for R called R Markdown, and Shiny will be coming up in a few days. Why is this important? It gives you a living document you can add text and R scripts to, and it produces just the output from R. I wrote my Stats grad project [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12,9,15,11],"tags":[14,17,10,16,13],"class_list":["post-244","post","type-post","status-publish","format-standard","hentry","category-choropleth","category-r","category-statistics","category-visualization","tag-choroplethr","tag-histogram","tag-r","tag-statistics","tag-visualization"],"_links":{"self":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts\/244","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=244"}],"version-history":[{"count":10,"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts\/244\/revisions"}],"predecessor-version":[{"id":258,"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts\/244\/revisions\/258"}],"wp:attachment":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=244"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=244"},{"taxonomy":"post_tag","embedd
able":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=244"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}