Tag Archives: logistic regression

Logistic Regression 5, train, test, predict

Published / by shep2010

The last in this series is splitting the the data into train and test then attempt a prediction with the test data set. It is possible to do this early in the process, but no harm in waiting as long as you do it eventually.

First off once again lets load the data. Notice i have added a new package for splitstackshape which provides a method to stratify the data.

If you are more interested in Python i have created a notebook using the same dataset but with the python solution here.

Continue reading

Logistic Regression 4, evaluate the model

Published / by shep2010

In Logistic Regression 3 we created a model, quite blindly i might add. What do i mean by that? I spent a lot of time getting the single data file ready and had thrown out about 50 variables that you never had to worry about. If you are feeling froggy you can go to the census and every government website to create your own file with 100+ variables. But, sometimes its more fun to tale your data and slam it into a model and see what happens.

What still needs done is to look for colliniearity in the data, in the last post i removed the RUC variables form the model since the will change exactly with the population, the only thing they may add value to is if i wanted to use a factor for population vs. the actual value.

If you are more interested in Python i have created a notebook using the same dataset but with the python solution here.

Continue reading

Logistic Regression 3, the model

Published / by shep2010 / 1 Comment on Logistic Regression 3, the model

Woo Hoo here we go, in this post we will predict a president, sort of, we are going to do it knowing who the winner is so technically we are cheating. But, the main point is to demonstrate the model not the data. The ISLR has an great section on Logistic Regression, though i thing the data chose was terrible. I would advise walking through it then finding a dataset that has a 0 or 1 outcome. Stock market data for models sucks, i really hate using it and really try to avoid it.

Continue reading

Logistic Regression 2, the graphs

Published / by shep2010

I should probably continue the blog where I mentioned i would write about Logistic Regression. I have been putting this off for a while as i needed some time to pass between a paper my college stats team and I submitted and this blog post. Well, here we are. This is meant to demonstrate Logistic Regression, nothing more, i am going to use election data because it is interesting, not because i care about proving anything. Census data, demographics, maps, election data are all interesting to me, so that makes it fun to play with.

[1]
Continue reading

Logistic Regression 1.

Published / by shep2010 / 1 Comment on Logistic Regression 1.

I learned logistic regression using SPSS, which if you have no plans to use SPSS in life its sort of a waste of time. My professor said “i don’t know how to do that in SPSS” a lot, professor also said look it up on google a lot too. I expected better from Harvard Extension, but after i found out how easy it is to teach a class there i suddenly wasn’t surprised. Point being, do as much research on instructors and professors before had as you possibly can, if there is no public data or this is the their first or second year teaching, maybe pass for something else. The class assembled and demanded a refund from Harvard btw, Harvard is a for profit private school, they did not amass a 37 billion dollar endowment by issuing refunds, so, buyer beware, and we did not get our money back.
Continue reading