The amunategui.github.io Applied Data Science Portal

Data Exploration & Machine Learning, Hands-on

Practical walkthroughs on machine learning, data exploration and insight finding.

Thanks for your interest in the amunategui.github.io blog!! Best, Manuel


Welcome to amunategui.github.io, your portal for practical data science walkthroughs in the Python and R programming languages.


I attempt to break down complex machine learning ideas and algorithms into practical applications using clear steps and publicly available data sets. If you're looking for applied walkthroughs of ML and AI concepts, you've come to the right place - happy learning!





All Posts:
  1. How to Create Your Own Free Email Signup Form and Enjoy 100% Creative Freedom - For Static & Semi-Static Web Sites Published Sep 15, 2018

  2. From Financial Compliance to Fraud Detection with Conditional Variational Autoencoders (CVAE) and Tensorflow Published Aug 18, 2018

  3. How Blogging and Making YouTube Videos Landed Me the Best Job Published Jul 24, 2018

  4. Your Git Commit Comments, and What They Reveal About You Published Jul 21, 2018

  5. Exploring Some Pair-Trading Concepts with Python Published Jul 19, 2018

  6. My Six Favorite Free Data Science Classes and the Giants Behind Them Published Jul 8, 2018

  7. Hosting a Flask Application on AWS Beanstalk Published Jun 26, 2018

  8. TensorFlow Won the Attention Battle, Who’s Next? Published Jun 23, 2018

  9. GPUs on Google Cloud - the Fast Way & the Slow Way Published Jun 16, 2018

  10. Executive Time Management — Don’t Suffocate the Creative Process Published Jun 13, 2018

  11. Pairing Reinforcement Learning and Machine Learning, an Enhanced Emergency Response Scenario Published Jun 4, 2018

  12. Find Your Next Programming Language By Measuring “The Knowledge Gap” on StackOverflow.com Published May 24, 2018

  13. My #1 Piece of Advice for Aspiring Data Scientists Published May 21, 2018

  14. Chatbot Conversations From Customer Service Transcripts Published Apr 29, 2018

  15. Serverless Hosting On Microsoft Azure - A Simple Flask Example Published Apr 8, 2018

  16. Google Video Intelligence, TensorFlow And Inception V3 - Recognizing Not-So-Famous-People Published Mar 30, 2018

  17. Rapid Prototyping on Google App Engine - Build a Trip Planner with Google Maps and Yelp Published Jan 20, 2018

  18. Yelp v3 and a Romantic Trip Across the USA, One Florist at a Time Published Jan 6, 2018

  19. Show it to the World! Build a Free Art Portfolio Website on GitHub.io in 20 Minutes! Published Jan 1, 2018

  20. Google Video Intelligence and Vision APIs - Automatically Recognize Actors and Download their Biographies in Real Time Published Dec 16, 2017

  21. Life Coefficients - Modeling Life Expectancy and Prototyping it on the Web with Flask and PythonAnywhere Published Dec 2, 2017

  22. Convolutional Neural Networks And Unconventional Data - Predicting The Stock Market Using Images Published Nov 2, 2017

  23. The Fallacy of the Data Scientist's Venn Diagram Published Nov 1, 2017

  24. Reinforcement Learning - A Simple Python Example and a Step Closer to AI with Assisted Q-Learning Published Sep 30, 2017

  25. Simple Heuristics - Graphviz and Decision Trees to Quickly Find Patterns in your Data Published Sep 13, 2017

  26. Office Automation Part 3 - Classifying Enron Emails with Google's Tensorflow Deep Neural Network Classifier Published Jul 2, 2017

  27. Office Automation Part 2 - Using Pre-Trained Word-Embedded Vectors to Categorize the Enron Email Dataset Published Jun 18, 2017

  28. Office Automation Part 1 - Sorting Departmental Emails with Tensorflow and Word-Embedded Vectors Published Jun 14, 2017

  29. Easy Market Profile in Python: Grasp Price Action Quickly Published Apr 21, 2017

  30. What-if Roadmap - Assessing Live Opportunities and their Paths to Success or Failure Published Mar 5, 2017

  31. Where Are Your Customers Coming From And Where Are They Going - Reporting On Complex Customer Behavior In Plain English With C5.0 Published Dec 26, 2016

  32. Databricks, SparkR and Distributed Naive Bayes Modeling Published Nov 26, 2016

  33. R and Azure ML - Your One-Stop Modeling Pipeline in The Cloud! Published Nov 20, 2016

  34. Get Your "all-else-held-equal" Odds-Ratio Story for Non-Linear Models! Published Nov 15, 2016

  35. Predict Stock-Market Behavior using Markov Chains and R Published Aug 31, 2016

  36. Big Data Surveillance: Use EC2, PostgreSQL and Python to Download all Hacker News Data! Published Jul 20, 2016

  37. The Peter Norvig Magic Spell Checker in R Published Jun 16, 2016

  38. Actionable Insights: Getting Variable Importance at the Prediction Level in R Published May 2, 2016

  39. Survival Ensembles: Survival Plus Classification for Improved Time-Based Predictions in R Published Mar 19, 2016

  40. Anomaly Detection: Increasing Classification Accuracy with H2O's Autoencoder and R Published Jan 11, 2016

  41. H2O & RStudio Server on Amazon Web Services (AWS), the Easy Way! Published Dec 27, 2015

  42. Analyze Classic Works of Literature from Around the World with Project Gutenberg and R Published Dec 12, 2015

  43. Speak Like a Doctor - Use Natural Language Processing to Predict Medical Words in R Published Nov 22, 2015

  44. Supercharge R with Spark: Getting Apache's SparkR Up and Running on Amazon Web Services (AWS) Published Sep 30, 2015

  45. R and Excel: Making Your Data Dumps Pretty with XLConnect Published Jul 7, 2015

  46. Going from an Idea to a Pitch: Hosting your Python Application using Flask and Amazon Web Services (AWS) Published Jun 12, 2015

  47. Getting PubMed Medical Text with R and Package {RISmed} Published Apr 17, 2015

  48. Find Variable Importance for any Model - Prediction Shuffling with R Published Mar 27, 2015

  49. Bagging / Bootstrap Aggregation with R Published Mar 7, 2015

  50. Feature Hashing (a.k.a. The Hashing Trick) With R Published Feb 21, 2015

  51. Yelp, httr and a Romantic Trip Across the United States, One Florist at a Time Published Jan 14, 2015

  52. Quantifying the Spread: Measuring Strength and Direction of Predictors with the Summary Function Published Dec 27, 2014

  53. Downloading Data from Google Trends And Analyzing It With R Published Dec 17, 2014

  54. Using String Distance {stringdist} To Handle Large Text Factors, Cluster Them Into Supersets Published Nov 30, 2014

  55. SMOTE - Supersampling Rare Events in R Published Nov 13, 2014

  56. Let's Get Rich! See how {quantmod} And R Can Enrich Your Knowledge Of The Financial Markets! Published Nov 10, 2014

  57. How To Work With Files Too Large For A Computer’s RAM? Using R To Process Large Data In Chunks Published Nov 4, 2014

  58. Predicting Multiple Discrete Values with Multinomials, Neural Networks and the {nnet} Package Published Nov 1, 2014

  59. Modeling 101 - Predicting Binary Outcomes with R, gbm, glmnet, and {caret} Published Oct 22, 2014

  60. Reducing High Dimensional Data with Principal Component Analysis (PCA) and prcomp Published Oct 13, 2014

  61. The Sparse Matrix and {glmnet} Published Oct 8, 2014

  62. Brief Walkthrough Of The dummyVars Function From {caret} Published Oct 2, 2014

  63. Ensemble Feature Selection On Steroids: {fscaret} Package Published Oct 1, 2014

  64. Mapping The United States Census With {ggmap} Published Sep 29, 2014

  65. Using Correlations To Understand Your Data Published Sep 27, 2014

  66. Brief Guide On Running RStudio Server On Amazon Web Services Published May 15, 2014