Teach Yourself R Without Losing Your Mind

Congratulations, you’ve decided to learn R!  This programming language will streamline data analysis, facilitate statistics, and drive you insane!  And if you’re reading this, you’ve probably decided to go next-level-nutjob and teach yourself R.  Although no blog post can take away all the pain this will entail, I’m hoping to make your experience slightly less miserable with my reflections on learning R solo.

To clarify, this post will probably be most applicable to people who need to learn R for reasons similar to mine.  As a graduate student studying biogeochemistry, I have large data sets that require statistical analysis and graphing capabilities not available in Excel.  I also wanted to have more control over my data analysis, including ensuring that it’s reproducible.  I am by no means an R expert (I spent 4 hours making a graph yesterday), but that probably means I’m similar to a lot of other R newbies out there.  The following tips are organized roughly in the order I used them to learn R.

Once you can code, you're basically one step away from becoming a mastermind detective. Image from http://www.bbc.com/news/technology-25638870

Take a course, in person or online.  You’ve probably decided to teach yourself R because you can’t commit to enrolling in a class.  Still, it’s useful to hear a real live person talk about R, so seek out a bootcamp or short course on R and/or data management.  There are also TONS of online classes for learning R that can be found through the magical portal of Google.  Your university or local library may offer free access to some popular online course platforms.  When selecting a class, consider its cost, your level of proficiency, the length of the course, and availability of official documentation.  I’d highly recommend finding a course that includes activities so you can follow along with your disembodied instructor; it’s great to be able to pause, fiddle with some code, and rewind a segment four more times until you understand what just happened.  I took “Up and Running with R” through Lynda, mostly because I could access it for free through UNC, and thought it gave me a solid foundation, but there’s a lot out there.

Buy a reference book.  And yeah, you’ll actually want to buy it, unless your university library lets you check out books for 3 years.  Books are tricky because R evolves so quickly, but a reasonably recent book will help you find your footing and provide a reference for foundational R tasks, like importing files or writing functions.  It’s also nice to have a physical reference that won’t lead you down an internet wormhole of R minutiae.  I used R for Dummies by Andrie de Vries & Joris Meys and Statistics for Ecologists Using R and Excel by Mark Gardener.

RStudio is so beautiful! Especially when you make rainbow graphs. Image from http://www.sthda.com/english/wiki/running-rstudio-and-setting-up-your-working-directory-easy-r-programming

Get the right programs.  Did you first experience R when it was a scary black box reminiscent of MS-DOS?  RStudio has given that nonsense a major upgrade, making R a bit more intuitive.  RStudio creates a four-panel workspace that allows you to enter code, view plots, check dataframes, read help guides, and open tables, all on the same screen.  It’s a terrific surprise if you’re returning to R after a years-long hiatus, although admittedly it’s probably the only terrific surprise in your R journey.

Once you’re comfortable with RStudio, try out some of the other advancements that have made R more user-friendly.  I highly recommend R Markdown.  It allows you to write code in chunks that can be hidden or expanded, which is so helpful for complicated projects that require extensive data pre-treatment.  R Markdown is like writing an essay, in which each paragraph has its own focus, whereas regular R code is more comparable to a stream-of-consciousness poem by e.e. cummings.  R Markdown documents can also be converted to pretty HTML pages that document your analysis.

An example of version control using GitHub. The red and green lines are code that was deleted and added, respectively. Image from https://wrongsideofmemphis.wordpress.com/2013/02/19/github-for-reviewing-code/

Git and GitHub are another neat addition to your Rsenal (haR haR).  Git is a version control system, which is basically the tech equivalent of a time machine, and GitHub is the website that stores your Git history online.  Say your code worked like a charm on Monday.  Three days later, you make some changes and ERROR: the code’s broken and your PhD is ruined.  Not if you have version control!  Every time you make a significant change to your code, commit it on your computer and push it to GitHub to back it up to the cloud.  Git keeps track of each version of your code so you can view past changes and restore code if necessary.  Even if you never use your saved code, knowing that it’s there will liberate you to experiment with your scripts without worrying you’re going to ruin your work.  GitHub can also be useful if you’re collaborating with other people, especially if you’re working with a double agent who’s secretly trying to take you down.

Learn file organization.  Snore.  This sounds like the.most.boring.advice.possible, right?  So wrong, at least if you’re Type-A.  Learning to properly organize my files was the single most important breakthrough for me in learning R.  Effectively, the program is all about importing data, crunching it, and spitting it back out as a csv or graph.  At minimum, that requires an organizational plan with consistent rules for giving files meaningful names and locations.  I love nested folders for this, but do what works best for you and your project.
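Here is a rough sketch of what that import, crunch, export loop can look like with nested folders.  Everything named here is made up for illustration: the folder names (data_raw, output), the file names, and the nitrate values are my assumptions, not a required layout.  The sketch builds the folders inside a temporary directory so it runs anywhere without touching your real files.

```r
# Hypothetical project layout (names are illustrative only):
#   my_project/
#     data_raw/   untouched input files
#     output/     csv files and graphs produced by scripts

# Build the folders inside a temporary directory so nothing real is overwritten
project_dir <- file.path(tempdir(), "my_project")
raw_dir <- file.path(project_dir, "data_raw")
out_dir <- file.path(project_dir, "output")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Stand-in for a raw data file (made-up values)
raw <- data.frame(
  site = c("A", "A", "B", "B"),
  nitrate_mg_l = c(1.2, 1.4, 0.8, 1.0)
)
write.csv(raw, file.path(raw_dir, "2017-06_nitrate_raw.csv"), row.names = FALSE)

# Import the data from the raw folder
nitrate <- read.csv(file.path(raw_dir, "2017-06_nitrate_raw.csv"))

# Crunch it: mean nitrate concentration per site
site_means <- aggregate(nitrate_mg_l ~ site, data = nitrate, FUN = mean)

# Spit it back out as a csv in the output folder
out_file <- file.path(out_dir, "2017-06_nitrate_site_means.csv")
write.csv(site_means, out_file, row.names = FALSE)
```

Building paths with file.path() instead of typing them out keeps the script working if you move the project folder, and putting the date and contents in each file name means you can tell what a file is without opening it.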

You should also #mark up your code like crazy.  No, I haven’t gone all #Millennial on you: # is used to denote a line of comments in R code.  Your code might make sense while you’re writing it, but trust me, it’s not going to make sense in three weeks when you return to the project.  It’s important that collaborators can decipher your code, but it’s often more critical that it’s clear to Future You.  (Bonus points if you get the HIMYM reference.)  Think of Future You as a well-meaning but slightly daffy person who needs all the help he or she can get in understanding your script, and write to that person!  You’ll thank yourself when you’re working on thesis revisions.
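For example, here is a minimal sketch of a commented helper function.  The function name standard_error and the sample numbers are hypothetical; the point is only how much the # lines help Future You.

```r
# Standard error of the mean for a numeric vector.
# Missing values (NA) are dropped first so one gap doesn't turn the result into NA.
standard_error <- function(x) {
  x <- x[!is.na(x)]        # keep only the measurements that actually exist
  sd(x) / sqrt(length(x))  # standard deviation divided by the square root of n
}

# Made-up replicate measurements with one missing value
replicates <- c(2, 4, 6, NA)
se <- standard_error(replicates)
```

Notice that the comments explain why each step is there, not just what the code says, which is the part you will have forgotten in three weeks.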

Ask the internet.  There is a special place in heaven for the people who answer questions on Stack Overflow and Cross Validated.  I don’t know who they are or why they have decided to grace us with their endless knowledge, but they are lifesavers.  However, they won’t be helpful until you’ve found your footing with R, which is why this suggestion is near the end of my list.  R really is a language, and you won’t know the proper vocabulary until you’ve been studying it for a while.  You’ll generally need to Google your question using standard R terminology to find a suitable answer.  Once you can do that, you’ll be golden, but I found the internet to be most helpful after I’d developed a basic proficiency.

If none of these tips is useful, just motivate yourself by thinking how you too can look slightly evil while money rains on your head once you perfect R. Image from https://www.dezyre.com/article/why-r-programming-language-still-rules-data-science/161

Find a friend.  Do you have a labmate/friend/casual acquaintance who knows R?  Awesome.  Never let that person go.  Bake them cookies, sabotage their graduation, do whatever it takes to keep them around, because although this post is called “Teach Yourself R,” plot twist, that’s actually impossible on your own.  When you inevitably get stuck and/or reach the verge of a breakdown, your Best R Friend can help you troubleshoot and reassure you that everyone forgets commas sometimes.  If you really can’t find an R friend, see if your university has a programming group, or if you can start one up within your department.  Like many other things in life, talking your (coding) problems through with a friend is one of the best ways to get through them.

Have patience.  Don’t expect to learn R in a semester!  It will probably take you at least 6 months to feel proficient in the program, and that’s with regular practice.  Just as you can’t learn Spanish by watching the odd telenovela, you can’t learn R without coding at least a few times a week.  Keep at it, because just when you think no puedo hacerlo, you’ll turn a corner.  If you start dreaming in R, though, you’ve probably gone too far.

GOOD LUCK, you genius, you!
