Saturday, April 13, 2013

Data Analysis Course

UPDATE: THE BLOG/SITE HAS MOVED TO GITHUB. THE NEW LINK FOR THE BLOG/SITE IS patilv.github.io and THE LINK TO THIS POST IS:
http://bit.ly/1gWyfUT .  PLEASE UPDATE ANY BOOKMARKS YOU MAY HAVE.

Although my initial motive for signing up for the Coursera MOOC titled "Data Analysis" in Jan 2013 was to get a sense of how MOOCs work and how quickly they might make my day job extinct, I must admit I now feel safer that my job is likely to survive (the MOOC model has its benefits and limitations, which I will share in a post at some point). More importantly, this fun course, taught by Jeff Leek from Johns Hopkins, got me acquainted with R and its related packages. It required two papers (my submissions are linked in the rest of this sentence): the first on exploratory analysis and the second on building a predictive model. The two figures below were created for the two papers.

Figure created for the first paper on exploratory analysis (created using ggplot).
 

Panel A shows that the distribution of interest rates (and the mean interest rate) differs across loan purposes. Panel B shows that the distribution of interest rates (and the mean interest rate) differs across applicants' states. Panel C shows that there is, on average, a negative relationship between FICO score and interest rate: the higher an applicant's FICO score, the lower the interest rate they paid. Panel D shows that applicants who applied for a 60-month loan generally paid a higher interest rate than those who applied for a 36-month loan. Panel E shows the interaction between loan length and the amount requested (36-month loans are shown in red and 60-month loans in blue). Specifically, it indicates that when the amount requested is low, interest rates do not differ between 36-month and 60-month loans, but as the amount requested increases, interest rates rise for 60-month loans but not for 36-month loans.

Figure for the predictive model paper (Panel A was created using ggplot).

Panel A shows results that help determine the number of components to retain in the principal components analysis of the training data. Using parallel analysis, it compares eigenvalues from the training data with eigenvalues from randomly generated datasets of the same sample size and number of variables. Please note that, for scaling purposes, the graph shows only the first 60 components (not all 561) and omits the eigenvalues of the first two components from the training data (283.39 and 36.56). According to the results, the first 42 components from the training data have higher eigenvalues than the corresponding eigenvalues from the randomly generated datasets. Components 43 through 561 have eigenvalues no larger than those obtained from random data, so they explain no more variance than would be expected by chance. Hence, only 42 components are retained.
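To make the retention rule concrete, here is a minimal sketch of parallel analysis in Python with NumPy. It is not the R code used for the paper. The function name `parallel_analysis`, the use of the correlation matrix, the 95th-percentile threshold, the number of random datasets, and the small synthetic dataset are all assumptions chosen for illustration.

```python
import numpy as np


def parallel_analysis(X, n_iter=20, percentile=95, seed=0):
    """Compare eigenvalues of X's correlation matrix with those of random data.

    Returns (observed eigenvalues, random-data thresholds, components to retain).
    The percentile threshold is an assumption; some variants use the mean.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

    # Eigenvalues from random normal datasets of the same shape.
    random_eigs = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.standard_normal((n, p))
        corr = np.corrcoef(noise, rowvar=False)
        random_eigs[i] = np.sort(np.linalg.eigvalsh(corr))[::-1]

    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Retain leading components until one fails to beat the random threshold.
    above = observed > threshold
    n_retain = p if above.all() else int(np.argmin(above))
    return observed, threshold, n_retain


# Synthetic illustration (made-up data): 12 variables driven by 3 latent factors.
demo_rng = np.random.default_rng(1)
factors = demo_rng.standard_normal((500, 3))
loadings = np.kron(np.eye(3), np.ones((1, 4)))  # each factor drives 4 variables
X_demo = factors @ loadings + 0.5 * demo_rng.standard_normal((500, 12))

observed, threshold, n_retain = parallel_analysis(X_demo)
print("components retained:", n_retain)
```

On the training data described above, the same comparison would be run over all 561 components, with the count of leading components that beat the random-data eigenvalues giving the number to keep.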


Panel B is a heat map of the confusion matrix showing the predictive ability, on the testing data, of the support vector model. The heat map shows the overlap between predicted and actual activity values. The purple regions show high degrees of overlap (accurate classification), and the aqua blue regions show low to no misclassification. Different shades of aqua blue denote the overlap between the predicted activity of standing and the actual activities of laying and sitting. The kappa measure for this confusion matrix was 0.89, suggesting that the agreement between predicted and actual activities was almost perfect.
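For readers unfamiliar with the kappa measure, the sketch below shows how Cohen's kappa can be computed from a confusion matrix in Python with NumPy. It is an illustration only, not the code used for the paper: the function name `cohens_kappa` is hypothetical, and the 2x2 counts are made up and unrelated to the activity data.

```python
import numpy as np


def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: actual, columns: predicted)."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()

    # Observed agreement: share of cases on the diagonal.
    p_observed = np.trace(cm) / total

    # Agreement expected by chance, from the row and column totals.
    p_expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / total**2

    return (p_observed - p_expected) / (1.0 - p_expected)


# Made-up counts for two classes, for illustration only.
example = [[20, 5],
           [10, 15]]
kappa = cohens_kappa(example)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.40"
```

Kappa corrects raw accuracy for the agreement expected by chance, which is why a value near 1 is a stronger statement than a high percentage of correct classifications alone.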
