{"id":612,"date":"2017-12-21T10:00:14","date_gmt":"2017-12-21T10:00:14","guid":{"rendered":"https:\/\/sqlshep.com\/?p=612"},"modified":"2018-01-08T19:05:32","modified_gmt":"2018-01-08T19:05:32","slug":"linear-regression-level-200-numbers","status":"publish","type":"post","link":"https:\/\/sqlshep.com\/?p=612","title":{"rendered":"Linear Regression &#8211; Level 102, all the Numbers"},"content":{"rendered":"<p>In my last linear regression post I mentioned that most of the numbers come from the residual errors; that's not entirely true.  If you have a basic understanding of lm, you learned that R-squared is the number to look at, and it is based on residual error.  You are also told to examine the p-value for each coefficient and for the entire model. The p-value is a little harder to calculate, so go search and find out for yourself.  In lieu of that, I am going to provide the actual calculation for everything else you may have seen referenced in an lm summary.<br \/>\n<!--more--><\/p>\n<p>N, n: Upper case N is the population size, lower case n is the sample size. <\/p>\n<p>Degrees of Freedom: The short, uncomplicated answer is the number of cases (rows) minus the number of coefficients. For this model that is 32 rows minus 2 coefficients (intercept and slope), or 30. <\/p>\n<p>Mean: Average of a variable or column <\/p>\n<p>Variance: sum((mpg - mean(mpg))^2) \/ (n - 1) <\/p>\n<p>Standard Deviation: sqrt(variance)<\/p>\n<p>Residuals: mpg - (intercept + slope * weight) <\/p>\n<p>Error Squared: Residuals^2<\/p>\n<p>SSE or Sum of Squared Errors: sum(Error Squared)<\/p>\n<p>Residual Standard Error RSE: sqrt(SSE \/ degrees of freedom) <\/p>\n<p>Multiple R-Squared: 1 - SSE \/ sum((mpg - mean(mpg))^2)<\/p>\n<p>Adjusted R-Squared: 1 - (SSE \/ sum((mpg - mean(mpg))^2)) * (n - 1) \/ (n - (1 + 1)), where the 1 + 1 is one predictor plus the intercept<\/p>\n<p>Even I think this is ridiculous in a blog post, so I have included the <a href=\"https:\/\/github.com\/sqlshep\/SQLShepBlog\/blob\/master\/Scripts\/R\/MTCars-lm.xlsx\">Excel spreadsheet<\/a> on my GitHub site if you really want to know how all of this works. 
If everything goes well, these numbers will match what came out of the R summary of lm for mtcars with mpg as the response and weight as the explanatory variable. <\/p>\n<p><a href=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.19.37-PM.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.19.37-PM.png\" alt=\"\" width=\"660\" height=\"936\" class=\"alignnone size-full wp-image-616\" srcset=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.19.37-PM.png 660w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.19.37-PM-212x300.png 212w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.19.37-PM-624x885.png 624w\" sizes=\"auto, (max-width: 660px) 100vw, 660px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM-1024x925.png\" alt=\"\" width=\"625\" height=\"565\" class=\"alignnone size-large wp-image-618\" srcset=\"https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM-1024x925.png 1024w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM-300x271.png 300w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM-768x693.png 768w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM-624x563.png 624w, https:\/\/sqlshep.com\/wp-content\/uploads\/2017\/12\/Screen-Shot-2017-12-20-at-5.25.45-PM.png 1442w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In my last linear regression post I mentioned that most of the numbers come from the residual 
errors; that's not entirely true. If you have a basic understanding of lm, you learned that R-squared is the number to look at, and it is based on residual error. You are also told to examine the p-value for each coefficient [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9,63,32,15],"tags":[64,10,16],"class_list":["post-612","post","type-post","status-publish","format-standard","hentry","category-r","category-regression","category-statistical-learning","category-statistics","tag-linear-regression","tag-r","tag-statistics"],"_links":{"self":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts\/612","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=612"}],"version-history":[{"count":7,"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts\/612\/revisions"}],"predecessor-version":[{"id":855,"href":"https:\/\/sqlshep.com\/index.php?rest_route=\/wp\/v2\/posts\/612\/revisions\/855"}],"wp:attachment":[{"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=612"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=612"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sqlshep.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=612"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}