Module # 10 Time Series and Visualization

 

Time Series and Visualizations in GGPlot2

1) This week we look over some graphs in RStudio and then discuss the impact of visualization on time series analysis.
> ### Hot Dog Graph Regular ###
> hotdogs <- read.csv("CENCORED/hot-dog-contest-winners.csv")
> head(hotdogs)
  Year                       Winner Dogs.eaten       Country New.record
1 1980 Paul Siederman & Joe Baldini       9.10 United States          0
2 1981              Thomas DeBerry       11.00 United States          0
3 1982               Steven Abrams       11.00 United States          0
4 1983                 Luis Llamas       19.50        Mexico          0
5 1984               Birgit Felden        9.50       Germany          0
6 1985             Oscar Rodriguez       11.75 United States          0
> library("ggplot2")
> 
> # Adjust colors
> colors <- ifelse(hotdogs$New.record == 1, "green", "purple")  # Changed to a more vibrant color
> barplot(hotdogs$Dogs.eaten, names.arg = hotdogs$Year, col=colors, border=NA,
+         main = "Nathan's Hot Dog Eating Contest Results, 1980-2010", 
+         xlab="Year", ylab="Hot Dogs and Buns (HDBs) Eaten",
+         cex.main=1.5,  # Increase title size
+         cex.lab=1.2)   # Increase axis label size
> 



2) The bar plot above shows the results of Nathan's Hot Dog Eating Contest from 1980 to 2010: the number of hot dogs and buns (HDBs) eaten by the winner each year. The contrasting colors (green for new records, purple for all other years) make the record-breaking performances easy to pick out. The title and axis labels are enlarged with cex.main and cex.lab for better readability, which makes for an engaging picture of this competitive eating event.
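The color rule described above can be tried on its own with a small stand-in data set. This is only a sketch: the data frame `toy` and its numbers are made up for illustration and are not contest results, and the legend is an addition that the original plot does not have.

```r
# Toy data for illustration only: these numbers are made up, not contest results.
toy <- data.frame(
  Year       = 2001:2005,
  Dogs.eaten = c(20, 25, 24, 30, 28),
  New.record = c(0, 1, 0, 1, 0)
)

# Same rule as in the transcript: green for a new record, purple otherwise.
bar_colors <- ifelse(toy$New.record == 1, "green", "purple")

barplot(toy$Dogs.eaten, names.arg = toy$Year, col = bar_colors, border = NA,
        xlab = "Year", ylab = "Hot Dogs and Buns (HDBs) Eaten")

# A legend states what each color means, which the bars alone cannot do.
legend("topleft", legend = c("New record", "No new record"),
       fill = c("green", "purple"), border = NA, bty = "n")
```

Because base R's barplot() takes one color per bar, the ifelse() call has to produce a vector that is as long as the data, in the same row order.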



> ### Hot Dog Graph GGPlot2 ###
> ggplot(hotdogs) + 
+   geom_bar(aes(x=Year, y=Dogs.eaten, fill=factor(New.record)), stat="identity") + 
+   labs(title="Nathan's Hot Dog Eating Contest Results, 1980-2010", 
+        fill="New Record") + 
+   xlab("Year") + 
+   ylab("Hot Dogs and Buns (HDBs) Eaten") + 
+   scale_fill_manual(values = c("lightblue", "lightyellow")) +  # Light blue = no record (0), light yellow = new record (1)
+   theme_minimal() +  # Add a minimal theme
+   theme(text = element_text(size=12, family = "Comic Sans MS"),  # Fun font for hot dog graphs
+         plot.title = element_text(size=16, face="bold", family = "Comic Sans MS"),  # Title style
+         axis.title = element_text(size=14, family = "Comic Sans MS"))  # Axis title style
> 



3) The ggplot2 version shows the same hot dog contest results as a bar graph with a much cleaner look. Light blue and light yellow fills separate ordinary years from new records, and the plot uses a minimal theme. I tried to set the font to Comic Sans for a playful look, but I did not have enough time to look up which package would make that font available. This version is visually appealing and easy to read.
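For comparison, here is a minimal sketch of the same kind of chart written with geom_col(), which draws bars from the y values directly and so does the job of geom_bar(stat = "identity"). The data frame `toy` is again made up for illustration. Naming the entries of `values` is one way to tie each color to a factor level explicitly instead of relying on their order.

```r
library(ggplot2)

# Toy data for illustration only: these numbers are made up, not contest results.
toy <- data.frame(
  Year       = 2001:2005,
  Dogs.eaten = c(20, 25, 24, 30, 28),
  New.record = c(0, 1, 0, 1, 0)
)

p <- ggplot(toy, aes(x = Year, y = Dogs.eaten, fill = factor(New.record))) +
  geom_col() +  # bar heights come straight from Dogs.eaten
  scale_fill_manual(
    values = c("0" = "lightblue", "1" = "lightyellow"),  # level -> color
    labels = c("No", "Yes")                              # legend text for levels 0 and 1
  ) +
  labs(x = "Year", y = "Hot Dogs and Buns (HDBs) Eaten", fill = "New Record") +
  theme_minimal()

print(p)
```

With named values, reordering the factor levels later would not silently swap the two colors.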


> ### Economics Data GGPlot ###
> head(economics)
# A tibble: 6 × 8
  date         pce    pop psavert uempmed unemploy  year decade
  <date>     <dbl>  <dbl>   <dbl>   <dbl>    <dbl> <dbl>  <dbl>
1 1967-07-01  507. 198712    12.6     4.5     2944  1967   1960
2 1967-08-01  510. 198911    12.6     4.7     2945  1967   1960
3 1967-09-01  516. 199113    11.9     4.6     2958  1967   1960
4 1967-10-01  512. 199311    12.9     4.9     3143  1967   1960
5 1967-11-01  517. 199498    12.8     4.7     3066  1967   1960
6 1967-12-01  525. 199657    11.8     4.8     3018  1967   1960
> year <- function(x) as.POSIXlt(x)$year + 1900
> economics$year <- year(economics$date)  # Setting up our analysis
> head(economics)
# A tibble: 6 × 8
  date         pce    pop psavert uempmed unemploy  year decade
  <date>     <dbl>  <dbl>   <dbl>   <dbl>    <dbl> <dbl>  <dbl>
1 1967-07-01  507. 198712    12.6     4.5     2944  1967   1960
2 1967-08-01  510. 198911    12.6     4.7     2945  1967   1960
3 1967-09-01  516. 199113    11.9     4.6     2958  1967   1960
4 1967-10-01  512. 199311    12.9     4.9     3143  1967   1960
5 1967-11-01  517. 199498    12.8     4.7     3066  1967   1960
6 1967-12-01  525. 199657    11.8     4.8     3018  1967   1960
> 
> plot1 <- qplot(date, unemploy / pop, data = economics, geom = "line",
+                main = "Unemployment Rate Over Time",
+                xlab = "Date", ylab = "Unemployment Rate") +
+   geom_line(color = "blue") +  # Custom color for line
+   theme_minimal() + 
+   theme(text = element_text(size=12, family = "Arial"),  # Professional font for economics
+         plot.title = element_text(size=16, face="bold", family = "Arial"),  # Title style
+         axis.title = element_text(size=14, family = "Arial"))  # Axis title style
> 




4) This plot shows unemployment over time as a line graph, with dates on the x-axis and the unemployment rate on the y-axis. The rate here is computed as unemploy / pop, that is, the number of unemployed people divided by the total population. The blue line makes the significant fluctuations easy to see. The minimal theme keeps the focus on the data, so the economic trends are easy to read against the clear background.
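Since qplot() has been deprecated in ggplot2 (as of version 3.4.0), the same line plot can also be sketched with ggplot() and geom_line(). The sketch below assumes the built-in economics data set that ships with ggplot2. The names `econ` and `unemp_share` are introduced here for illustration, and format() is used as an alternative to the as.POSIXlt() helper for extracting the year.

```r
library(ggplot2)

# Work on a copy of the built-in data set so the original stays untouched.
econ <- ggplot2::economics

# Extract the calendar year from the date column.
econ$year <- as.integer(format(econ$date, "%Y"))

# Unemployed people as a share of the total population
# (both columns are recorded in thousands, so the units cancel).
econ$unemp_share <- econ$unemploy / econ$pop

p1 <- ggplot(econ, aes(x = date, y = unemp_share)) +
  geom_line(color = "blue") +
  labs(title = "Unemployment Rate Over Time",
       x = "Date", y = "Unemployment Rate") +
  theme_minimal()

print(p1)
```

Note that dividing by the whole population gives a lower number than the official unemployment rate, which divides by the labor force instead.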


> ### Economics Data GGPlot2 ###
> library(gridExtra)
> 
> plot2 <- qplot(date, uempmed, data = economics, geom = "line",
+                main = "Median Unemployment Duration Over Time",
+                xlab = "Date", ylab = "Median Duration (Months)") + 
+   geom_line(color = "orange") +  # Custom color for line
+   theme_minimal() + 
+   theme(text = element_text(size=12, family = "Arial"),  # Professional font for economics
+         plot.title = element_text(size=16, face="bold", family = "Arial"),  # Title style
+         axis.title = element_text(size=14, family = "Arial"))  # Axis title style
> 
> grid.arrange(plot1, plot2, ncol=2)
> 
> # Convert year to decade for grouping
> economics$decade <- floor(economics$year / 10) * 10
> 
> # Combined plot with adjusted aesthetics
> plot_combined <- qplot(unemploy / pop, uempmed, data = economics, 
+                        geom = c("point", "path"), 
+                        color = factor(decade)) +  # Color by decade
+   labs(title = "Unemployment Rate vs Median Duration",
+        x = "Unemployment Rate", y = "Median Duration (Months)") + 
+   scale_color_viridis_d() +  # Use a discrete viridis color scale
+   theme_minimal() + 
+   theme(text = element_text(size=12, family = "Arial"),  # Professional font for economics
+         plot.title = element_text(size=16, face="bold", family = "Arial"),  # Title style
+         axis.title = element_text(size=14, family = "Arial"),  # Axis title style
+         legend.title = element_blank())  # Remove legend title
> 
> grid.arrange(plot_combined, ncol=1)
>



5) This plot compares the unemployment rate with the median duration of unemployment, with the data colored by decade. Points and paths represent the observations, and the discrete viridis color scale makes the time periods easy to tell apart. Together with the clear labels, this gives a fuller view of employment trends and works much better than fluctuations drawn in a single color.
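The decade grouping and the connected scatter plot can be sketched without qplot() as follows. This again assumes the built-in economics data set from ggplot2. `econ` is a name introduced for illustration, and the year is extracted with format() rather than the helper function used in the transcript.

```r
library(ggplot2)

econ <- ggplot2::economics

# Year from the date, then rounded down to the start of its decade
# (1967 -> 1960, 1984 -> 1980, and so on).
econ$year   <- as.integer(format(econ$date, "%Y"))
econ$decade <- floor(econ$year / 10) * 10

p <- ggplot(econ, aes(x = unemploy / pop, y = uempmed,
                      color = factor(decade))) +
  geom_path() +    # connects observations in row (time) order, one path per decade
  geom_point() +
  scale_color_viridis_d() +
  labs(title = "Unemployment Rate vs Median Duration",
       x = "Unemployment Rate", y = "Median Duration (Months)") +
  theme_minimal() +
  theme(legend.title = element_blank())

print(p)
```

Because the color is mapped to a factor, ggplot2 treats each decade as its own group, so the path is drawn separately for each decade rather than as one continuous line.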







