How Many Munros have I climbed?

Mapping the Munros in R using ggmap

After learning to plot graphs in R using ggplot2, the next visualisation technique I wanted to learn was how to plot geographical data on maps. There is a package called ggmap that works with ggplot2 to let you do this. You download a map of your chosen location from Google Maps and use it as a layer in the ggplot2 plotting system. I followed the guide in the following blog post to get started.

ggmap has a function called "get_map" that lets you fetch a map of your choice from Google Maps (or from other map sources such as OpenStreetMap). You can pick your location with text (e.g. "London") or you can give a specific longitude and latitude. You also choose the zoom, which determines the scale of your map, and the type of map (satellite, roadmap etc.). After this you can use ggplot2 functions to, for example, add data points to the map and see where your data sits geographically.
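As a rough idea of what that looks like in practice, here is a minimal sketch rather than the code from the post itself. The data frame `peaks`, its approximate coordinates and the helper `plot_peaks` are all made up for illustration. It also assumes that ggmap and ggplot2 are installed and that a Google Maps API key has already been registered (for example with `register_google()`), since `get_map` needs one for Google maps.

```r
# Hypothetical points to plot (approximate coordinates, for illustration only)
peaks <- data.frame(
  name = c("Ben Nevis", "Ben Lomond"),
  lon  = c(-5.00, -4.63),
  lat  = c(56.80, 56.19)
)

# Hypothetical helper: download a base map and draw the points on top of it.
# Assumes a Google API key has been registered beforehand.
plot_peaks <- function(points, centre = "Scotland", zoom = 7) {
  # location can be a text string or a c(lon, lat) pair
  base_map <- ggmap::get_map(
    location = centre,
    zoom     = zoom,        # larger values zoom in closer
    maptype  = "roadmap",   # or "satellite", "terrain", "hybrid"
    source   = "google"
  )

  # ggmap() turns the downloaded map into a ggplot2 layer,
  # so ordinary ggplot2 geoms can be added on top
  ggmap::ggmap(base_map) +
    ggplot2::geom_point(
      data = points,
      ggplot2::aes(x = lon, y = lat),
      colour = "red",
      size = 2
    )
}
```

Calling `plot_peaks(peaks)` would then download the map and return the plot. Nothing is fetched until the function is called.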


Does Everybody Love Black Panther?

Web Scraping Reviews of Marvel’s Black Panther from Rotten Tomatoes using R

Last week I went to see the new and highly hyped Marvel film – Black Panther. I had been out the night before and was severely hungover. However, my boyfriend would not allow me to lounge around in bed all day and instead made me get up and go into town. While we were out, we went to see Black Panther. There are definitely worse things to do on a hangover, but I have a feeling my pounding headache, combined with the fact that I'm not really a Marvel fan anyway (this was only the second Marvel film I had ever seen), may be the reason I was the only person in the cinema who did not enjoy the film. Don't get me wrong, I enjoyed the visual effects and light-hearted humour; it's just that I had major issues with the plot line and thought I was going to be sick most of the way through. Anyway, almost everyone else I have talked to seems to have loved the film. In fact, it is the highest grossing film of 2018 so far. That got me wondering what data is available online about the movie and what it can tell us about what people really thought of the film.

Who has won the most Brit Awards?

Visualising the Brit Awards using ggplot2

When I first learned R at university, we were taught to do all our graphs using the base R graph functions. I had no idea until earlier this year that there was another way! While searching for R help on different forums, I kept running into a plotting package called ‘ggplot2’ that everyone seemed to be using. I decided I needed to do my research and find out what all the hype was about. In this post I will be demonstrating some simple ‘ggplot2’ visualisations using data about the Brit Awards. The 2018 award show took place only a few days ago, so as well as being topical, I thought the data would make for some pretty fun graphs.

R Plotting Packages & the Basics of ggplot2

Once I started researching the best plotting packages, I realised this is a hot topic for debate in the data science community. While it seems that ggplot2 is preferred for most styles of graph due to its logical grammar and more attractive plots, there are some things that are better done with the base R plotting functions. However, for beginners and those of us who want to do standard, straightforward graphs (the kind you might do in Excel), ggplot2 is the way to go. I found this post most insightful for explaining the different views.


When Should you Take your Driving Test?

Mathematical Optimisation in R

Back before I moved to London and had to get rid of my car (*sob*), I used to do some of my best pondering while stuck in traffic. A thought that kept coming back to me was an optimisation problem: when is the best time to take your driving test? Driving lessons are expensive but a driving test is even more expensive – so how many lessons should you have before you take your test in order to spend the least amount of money? I always figured there would be some mathematical optimisation technique to come up with an answer, one that takes the price of driving lessons, the price of a driving test and a function that estimates your likelihood of passing the test given the number of lessons you've had. However, this always seemed to hurt my head too much to work out, especially given I was trying to concentrate on driving at the same time. Since starting this blog, though, I have decided to explore this idea I've had for a few years and try to work it out.
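To make the problem above a bit more concrete, here is one possible way to set it up in base R. This is only a sketch under my own assumptions, not necessarily the approach taken in the full post. The prices are made up. The pass probability is a hypothetical logistic curve (`pass_prob`, with invented `midpoint` and `steepness` values). I also assume you take no extra lessons between attempts, so each attempt passes with the same probability and the expected number of tests is 1 / p(n).

```r
# Hypothetical prices in pounds (illustrative only)
lesson_price <- 25
test_price   <- 62

# Hypothetical chance of passing a single test after n lessons:
# a logistic curve that rises from near 0 towards 1
pass_prob <- function(n, midpoint = 30, steepness = 0.15) {
  plogis(steepness * (n - midpoint))
}

# Expected total spend: n lessons plus the expected number of test
# attempts (1 / p for repeated independent attempts) times the test price
expected_cost <- function(n) {
  n * lesson_price + test_price / pass_prob(n)
}

# Evaluate every whole number of lessons and pick the cheapest
lessons <- 0:100
costs   <- expected_cost(lessons)
best_n  <- lessons[which.min(costs)]

best_n                 # number of lessons with the lowest expected cost
expected_cost(best_n)  # the expected total spend at that point
```

Too few lessons and you pay for lots of failed tests; too many and the lessons themselves dominate the bill. The minimum sits somewhere in between, and where exactly depends entirely on the assumed pass curve and prices.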


Does Taylor Swift Have a Big Reputation?

Twitter Scraping Using R

“Big Reputation, Big Reputation, Ooh you and me we’ve got Big Reputations”

Taylor Swift has just released her sixth studio album ‘Reputation’. The old Taylor is dead, and in her place is a new edgier Taylor, toughened by years of media scrutiny, turbulent relationships and high profile celebrity feuds. As the title suggests, this is an album all about the contrast between how the world sees you and who you really are, and how a negative portrayal can affect your relationships. Whether you like the album or not (personally I love it), this post is not really about Taylor Swift. This is about my first experience delving into the world of Twitter scraping.

Twitter was created in 2006 and since then has rapidly grown into a worldwide social network with around 330 million active users in 2017. Twitter, along with other social media giants such as Facebook, YouTube, LinkedIn and Instagram, has revolutionised the way we share news and interact with both friends and celebrities. Every day we leave behind a trail of our virtual movements including comments, likes, networks of friends and even just the pages we view. With so much data being produced every day, companies are eager to collect it and use it to provide customer insight and improve products and services. For example, a company can search for tweets mentioning its name to see how it is being perceived by the public, or it can analyse its followers to look for patterns and expand its presence online. Accessing data in this way is often called social media mining.