Collection of diary-style entries that are short in length and primarily about subjects I learned before my boot camp.
Matplotlib, scikit-learn, and pandas
June 11, 2018
Besides launching this site, in the past month or so I've mostly been heads down learning matplotlib, scikit-learn, and pandas, all via JupyterLab in Python. That's a mouthful. I could have just said data science tools. Unfortunately, life has been busy too, and the trade-off hasn't allowed as much time as I'd like.
Matplotlib is a library used to visualize data. It's old (relatively). It has some boilerplate. It's diverse. Based on what I've seen, it sure seems customizable. Although there are quite a few other tools, I'm taking the advice of others and making sure I have a foundation in this workhorse tool that has been a staple for some time. Here is an excellent article on other Python visualization tools. I'm planning on diving into Altair sometime soon.
I timed getting into this library pretty well I think! It's mentioned in almost any conversation about Python and data science. I'm glad I waited until I read up on some statistics before I started playing around with this library. When one is armed with the correct knowledge (aka they understand some of the underlying concepts of what the library does), it seems to take care of some of the simple yet tedious parts of creating statistical methods.
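To make "boilerplate" and "customizable" a bit more concrete, here is a minimal sketch of a matplotlib line chart. The numbers and labels are made up purely for illustration, and the Agg backend is only an assumption so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# Made-up numbers, purely for illustration.
fill_ups = [1, 2, 3, 4, 5]
gallons = [14.2, 15.1, 13.8, 16.0, 14.7]

# The boilerplate: create a figure and an axes object to draw on.
fig, ax = plt.subplots(figsize=(6, 4))

# One call to plot the data, with a few optional style choices.
ax.plot(fill_ups, gallons, marker="o", linestyle="--", color="tab:blue", label="gallons")

# The customizable part: labels, title, legend, and grid are all separate knobs.
ax.set_xlabel("Fill-up number")
ax.set_ylabel("Gallons")
ax.set_title("Gallons per fill-up (made-up data)")
ax.legend()
ax.grid(True)
```

In a notebook the figure displays on its own. In a plain script, fig.savefig("gallons.png") would write it to disk instead.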
Comfort level with this library is increasing and that makes me so jazzed! I am racing ahead of myself and imagining all the neat data wrangling I can potentially do with pandas. My Ford F-150 Gas Log data set is ripe for some data cleaning and then manipulating - I can't wait to apply pandas to it.
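As a rough idea of what that cleaning and manipulating might look like, here is a small pandas sketch. The rows, town names, and column names are all made up, and treating "price" as the total paid per fill-up is an assumption, not a fact about the real log.

```python
import io

import pandas as pd

# Made-up rows standing in for the real log; column names are hypothetical.
raw_csv = io.StringIO(
    "date,location,gallons,price\n"
    "1989-06-03,Springfield,15.2,14.90\n"
    "1989-06-17,springfield ,14.8,14.50\n"
    "1989-06-17,springfield ,14.8,14.50\n"
    "1989-07-01,Shelbyville,unknown,15.75\n"
    "1989-07-15,Springfield,16.0,\n"
)

log = pd.read_csv(raw_csv, parse_dates=["date"])

# Cleaning: drop exact duplicate rows, tidy the text column,
# and turn anything non-numeric in "gallons" into NaN.
log = log.drop_duplicates()
log["location"] = log["location"].str.strip().str.title()
log["gallons"] = pd.to_numeric(log["gallons"], errors="coerce")

# Keep only the rows where both numeric fields survived the cleaning.
clean = log.dropna(subset=["gallons", "price"]).reset_index(drop=True)

# Manipulating: a derived column, assuming "price" is the total paid.
clean["price_per_gallon"] = clean["price"] / clean["gallons"]
```

Dropping incomplete rows is just one option; a real log might deserve a closer look at each gap before anything gets thrown away.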
All my pursuits so far have been learning these libraries via tutorials. It'll be that way a little longer, and I'm hoping when I dive into my own applications soon I'll have climbed a substantial part of the learning curve due to what I'm learning now.
New Domain Name
April 27, 2018
I am the proud owner of the temporary rights to a domain name for the first time in my life. This is my first post using the Django-framework powered site: spencertollefson.com ! (I was hosting this blog under a different name before this).
Going forward this will be the permanent location for my website. Fitting, 'cause my name is pretty permanent, you know?
I have learned a ton in the past 2+ months, although my updates here haven't kept up. I've spent time learning the basics of Django, reading about statistical learning models, and learning some of the Python libs: NumPy, pandas, and matplotlib.
Now that I finally have a functional site, I intend to get back to writing frequent journal entries and begin to construct blog posts here! Additionally, I want to add more functionality to this site and beautify it as well. Some examples may include using Markdown in my posts, tapping the Twitter API to allow people to tweet about my posts, building in HTTPS security (I can't with the free-tier Heroku server I'm using right now, oh well!), and adding a resume section.
Six Point Update
March 10, 2018
Time for another blog post! Quick update on the six main topics I’ve focused on the last 2 weeks:
1. I’ve been consuming data science and python podcasts like no other. The list includes at least:
- Talk Python to Me
- Partially Derivative
- Command Line Heroes
- Data Skeptic
- Linear Digressions
- Learn to Code with Me
- Test & Code
- Python Bytes
To all those responsible for making those podcasts - THANK YOU! YOU HAVE MY GRATITUDE!
2. I’ve worked through about 60% of Jake VanderPlas’ Python Data Science Handbook. I am working through the Jupyter Notebook version of the book, which has code samples one can interact with. It covers IPython and four widely used Python data science libraries: NumPy, pandas, Matplotlib, and scikit-learn. I’ll be honest here. It’s been great at times and a complete slog at others -- the slog part is likely due to my inexperience. I’ve learned many commands that I immediately typed out and put to use during the tutorial, but I fear I will forget them when I begin working on my own projects. I shouldn't be so naive. I'll keep the learnings somewhere in my head, and no one codes without Google handy anyway.
I have been applying some of these tools to my own data. Specifically, a “gas log” my dad and I maintained in his 1989 Ford F-150 pickup. The log contains information for every gas fill-up since he purchased it brand new, including date, location, number of gallons, and price. Once I get going on the matplotlib chapter (soon!) I’ll have some fun playing with that data set.
3. Began reading An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Rob Tibshirani. I'm wading into Chapter 4 right now. This highly regarded introductory book on statistical methods for machine learning has been pushing the limits of my math acumen. Big fan. I aim to apply concepts I learn in this book to modeling projects such as the gas log. The Elements of Statistical Learning by some of the same authors is likely my next statistics book.
4. I updated my weather.py script to collect and present the 5 day forecast! Previously I was gathering current weather conditions only.
5. I learned more about Test-Driven Development (aka writing tests before the code itself) but have yet to write any tests. As of right now, I'm sold on pytest as my Python testing library of choice.
6. Finally, I got my hands dirty learning about .config files: a place to store and access sensitive information like passwords, keys, and usernames. I have begun implementing this with some of my scripts.
And what have I not focused on? I haven’t been committing to GitHub. I haven’t been working on my own scripts (much). I haven’t put any work into creating my own blog (I did purchase the rights to the domain name!). And finally I have not been collaborating with others (re: open source contributions). All these things feel so important! Logically, finishing both books I'm reading will free up time for the above things, and if I dive into open source work I can use data science tools there.
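For the .config idea in point 6, a minimal sketch using Python's built-in configparser module might look like the following. The section name, key names, and values are hypothetical, and the file contents are inlined as a string only so the example is self-contained.

```python
import configparser

# Example contents of a config file that is kept OUT of version control.
# The section and key names here are hypothetical.
SAMPLE_CONFIG = """
[weather_api]
api_key = not-a-real-key
username = example_user
"""


def load_credentials(parser, section):
    """Return the key/value pairs stored under one section as a plain dict."""
    if not parser.has_section(section):
        raise KeyError(f"No [{section}] section in the config")
    return dict(parser[section])


parser = configparser.ConfigParser()
# With a real file on disk, parser.read("secrets.config") would go here instead.
parser.read_string(SAMPLE_CONFIG)

credentials = load_credentials(parser, "weather_api")
```

The script then reads credentials["api_key"] instead of hard-coding the value, and the config file itself goes into .gitignore so it never gets committed.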
Foray into Web Dev with Django
February 23, 2018
Baby update: I've refined the cronometer script I mentioned last time and it's working today! I use it every morning. Small victories! Woooooooo.
I've earned a little bit of celebration because while I have been learning, I have also been struggling these past 12 days. Most of my learning has been in Test-Driven Development (TDD) and the Django framework. I’ve been going through various tutorials for both of these topics, and I am certain I've learned a large amount of theory. I’ve also practiced some of this "theory" via tutorials. I thought – “oh, I’ll make a simple blog in Django and maintain my blog there” – kind of like I currently am doing on GitHub. But that goal feels a bit overwhelming for now, and I am often telling myself I want to get into pandas, NumPy, and other data science packages.
I should do that. This - learning Python and related topics - shouldn’t feel like a chore. Sure - learning can be tough - but when it's gone from frustrating to borderline miserable I should be changing to a topic I enjoy. I did learn a mountain of info about TDD and web dev with Django these past two weeks. I can always jump back in and pick up those topics again. But for now, I'm moving onto what I really want to get into... data science.
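For anyone wondering what the TDD theory boils down to, here is a minimal sketch of the test-first rhythm. The total_cost function is a hypothetical example and not something from the tutorials; the tests are plain functions with assert statements, which is the shape pytest picks up when they live in a file named like test_something.py.

```python
# Step 1: write the tests first. At this point total_cost does not exist,
# so running the tests would fail, which is the whole point.
def test_total_cost_multiplies_gallons_by_price():
    assert total_cost(gallons=10, price_per_gallon=2.5) == 25.0


def test_total_cost_rejects_negative_gallons():
    try:
        total_cost(gallons=-1, price_per_gallon=2.5)
    except ValueError:
        pass  # this is the behavior the test asks for
    else:
        raise AssertionError("expected ValueError for negative gallons")


# Step 2: write just enough code to make both tests pass.
def total_cost(gallons, price_per_gallon):
    """Return the cost of a fill-up (hypothetical helper for illustration)."""
    if gallons < 0:
        raise ValueError("gallons cannot be negative")
    return gallons * price_per_gallon
```

With pytest installed, the second test would more commonly be written with pytest.raises; the try/except form is used here only to keep the sketch free of extra imports.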
Automating Food Log
February 11, 2018
I spent hours today - I'd wager 8 or 9 - creating an ugly script to automate my Cronometer entries every day. After finishing it, I committed the script to GitHub, at which point I realized how valuable it would have been to make a few commits earlier. It would have saved me precious time on the multiple occasions when I made changes that broke the script. One could say I was a bit frustrated during the process. Let's be optimistic and call it a good learning opportunity.
February 10, 2018
I had heard the word "script" too many times before understanding it. Even after hearing an explanation, years passed before I had an understanding beyond an abstract level. And creating a script sounded like rocket science. What program does one use? How is it run? Do scripts only work in .exe files? Foreign concept. After diving into Python and basic computer science theory a few months ago, it clicked. Scripts were being made. I made scripts. The concept became tangible.
By slightly modifying code laid out in Al Sweigart’s Automate The Boring Stuff, I created scripts for custom Google and Google Maps searches. Moving up a notch, I installed a version of Bash, a command-line shell, and customized it to my liking. This allowed me to run scripts for those Google searches, Google Maps searches, and even my own script for automating a work form with a few keystrokes. It was a neat feeling - I created these programs that serve a special purpose to me.
It is as if I discovered some secret treasure. Hourly I'd append ideas of automatable tasks to a list. These tasks included renaming music files, moving photo files around, and entering data in frequently visited websites. I’m more excited now about my Python and coding journey than I have been in months. A lot of the learning I had done in the last few months was finally kicking in. Syntax felt less confusing. Basic workflow made sense. I spent less time wondering where to start and more time working on real problem solving.
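A search script in that spirit can be sketched with nothing but the standard library. This is not the book's code or my exact script: the function names and the "maps"/"web" command-line convention are hypothetical, and the URL patterns are assumptions about how Google's search and Maps links are formed.

```python
import sys
import webbrowser
from urllib.parse import quote_plus


def google_search_url(query):
    """Build a Google search URL for the given query text."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def google_maps_url(address):
    """Build a Google Maps search URL for the given address."""
    return "https://www.google.com/maps/search/?api=1&query=" + quote_plus(address)


if __name__ == "__main__" and len(sys.argv) > 1:
    # First argument picks the mode, the rest is the search text.
    mode, *words = sys.argv[1:]
    text = " ".join(words)
    url = google_maps_url(text) if mode == "maps" else google_search_url(text)
    webbrowser.open(url)  # opens the URL in the default browser
```

Saved as, say, search.py, it could be run from Bash as "python search.py maps 123 Main St" or "python search.py web python tutorials", and a shell alias shrinks that down to a few keystrokes.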
Now I'm spending too much time blogging and not enough coding. Back to work!
February 8, 2018
Hello reader! Whether I’m addressing myself or you happen to be another human being, welcome! This is my first blog post here on www.spencertollefson.com and I hope many more will follow.
My goals for this blog begin small, but with potential to grow over time. I want to chronicle my journey of learning technology in an interesting manner. I intend for my posts to cover Python, statistical learning methods, and other data science tools.
As of now, I intend to center the site around blog posts. Perhaps I'll blog about a new Python library I tested out, a Git repo I took a look at, or something new I learned about statistical methods. I intend to alternate between short blurbs and longer analytical pieces.
That will do for an inaugural entry. Until next time!