Convert PDF pages to JPEGs using R

I often get PDFs that contain interesting images, but how do I extract them?
This R code finds every PDF in the current folder and converts each page to a 200 dpi JPEG.

library(tidyverse)
library(pdftools)
library(fs)

# Use the fs package to list every file ending in .pdf in the current directory and store the paths in file_list

file_list <- dir_ls(glob = "*.pdf")

# The equivalent of a for loop: iterate over each element of file_list and convert every page of that PDF to a 200 dpi JPEG using the pdftools package
lapply(file_list, FUN = function(files) {
  pdf_convert(files, format = "jpeg", dpi = 200)
})

# https://stackoverflow.com/questions/49941158/how-do-i-pull-in-multiple-pdfs-into-pdf-convert-using-r-and-pdftools-package
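
By default pdf_convert picks the output file names itself. If you want control over where the JPEGs go, a rough sketch along the following lines might work. The helper names page_filenames and convert_pdf and the "jpeg" output folder are my own assumptions for illustration, not part of pdftools; the sketch relies only on pdf_info, pdf_convert and its filenames argument.

```r
# Hypothetical helper: build one output path per page,
# e.g. "jpeg/report_001.jpg", "jpeg/report_002.jpg", ...
page_filenames <- function(pdf_path, n_pages, out_dir = "jpeg") {
  stem <- tools::file_path_sans_ext(basename(pdf_path))
  file.path(out_dir, sprintf("%s_%03d.jpg", stem, seq_len(n_pages)))
}

# Hypothetical wrapper: convert every page of one PDF into out_dir
convert_pdf <- function(pdf_path, out_dir = "jpeg", dpi = 200) {
  dir.create(out_dir, showWarnings = FALSE)
  n_pages <- pdftools::pdf_info(pdf_path)$pages
  pdftools::pdf_convert(
    pdf_path,
    format = "jpeg",
    dpi = dpi,
    filenames = page_filenames(pdf_path, n_pages, out_dir)
  )
}

# Convert every PDF in the current directory (does nothing if there are none)
pdfs <- list.files(pattern = "\\.pdf$", ignore.case = TRUE)
invisible(lapply(pdfs, convert_pdf))
```

Zero-padding the page number keeps the files in page order when they are sorted by name.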

Categories
Uncategorized

Web Links for March 2020

R

Python

Visual Studio Code

Raspberry Pi

Compile/install Python 3.8 on Raspberry Pi | Michael Hirsch, Ph.D.
Raspberry Pi 4 Bootloader Firmware Updating / Recovery Guide
Hass.io – Home Assistant
Turns your Raspberry Pi (or another device) into the ultimate home automation hub powered by Home Assistant
Dual Fan Aluminium Heatsink Case for Raspberry Pi 4 Black Australia
Raspberry Pi 4 USB Boot Config Guide for SSD / Flash Drives
GitHub – log2ram: ramlog like for systemd (Put log into a ram folder)
Log2Ram: Extending SD Card Lifetime for Raspberry Pi

Cycling

JaYoe World Tour Homepage | Follow Matt Cycling Around The World!

Purchased “The Art of Statistics” from Amazon for $30

Statistics has played a leading role in our scientific understanding of the world for centuries, yet we are all familiar with the way statistical claims can be sensationalised, particularly in the media. In the age of big data, as data science becomes established as a discipline, a basic grasp of statistical literacy is more important than ever.

In The Art of Statistics, David Spiegelhalter guides the reader through the essential principles we need in order to derive knowledge from data. Drawing on real world problems to introduce conceptual issues, he shows us how statistics can help us determine the luckiest passenger on the Titanic, whether serial killer Harold Shipman could have been caught earlier, and if screening for ovarian cancer is beneficial.

How many trees are there on the planet? Do busier hospitals have higher survival rates? Why do old men have big ears? Spiegelhalter reveals the answers to these and many other questions – questions that can only be addressed using statistical science.

https://dspiegel29.github.io/ArtofStatistics/
The Art of Statistics: Code, Data, Errata and Additions | ArtofStatistics

I built something similar — a bit more on the #NLP

from Twitter https://twitter.com/stephen_hucker

Accelerate your plots with ggforce

from Twitter https://twitter.com/stephen_hucker

Publication quality figures with ggplot2

from Twitter https://twitter.com/stephen_hucker

rstats monsters illustrations

from Twitter https://twitter.com/stephen_hucker

Started studying a book I purchased in February, “Data Visualization: A Practical Introduction”

The book provides students and researchers a hands-on introduction to the principles and practice of data visualization. It explains what makes some graphs succeed while others fail, how to make high-quality figures from data using powerful and reproducible methods, and how to think about data visualization in an honest and effective way.

Data Visualization builds the reader’s expertise in ggplot2, a versatile visualization library for the R programming language. Through a series of worked examples, this accessible primer then demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics include plotting continuous and categorical variables; layering information on graphics; producing effective “small multiple” plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible.

Web Excursions for March 2019

R for Blogging

The fastest cyclists of Europe live in …
Analyzing STRAVA data to find out which city has the faster cyclists with R and R-shiny


The Best Free Books for Learning Data Science


Lecturer who uses the Academic Hugo theme for her website on GitHub

Article written in Blogdown:

Article written in Radix:


Radix is based on the Distill web framework, which was originally created for use in the Distill Machine Learning Journal. Radix combines the technical authoring features of Distill with R Markdown.

Started learning R

I am studying data science. I have chosen to learn R first (before Python) because of its excellent visualisation capabilities.
I have just finished the first unit, the Introduction to the Tidyverse course:

https://www.datacamp.com/statement-of-accomplishment/course/3ddf2e74ed840487f86ad22457af0ed00f20d956

DataCamp .. R you learning?

This week I noticed DataCamp had a half-price special, only $200 AUD for a year's subscription. So I have cut down on playing games (sorry, Battlefield V) and used that time to learn the tidyverse on DataCamp.

Visualizing median GDP per capita over time

A line plot is useful for visualizing trends over time. In this exercise, you'll examine how the median GDP per capita has changed over time.
The interactive environment where you can write R code and see the results.
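
For anyone who wants to try the exercise above outside DataCamp, here is a rough sketch of how it might look. I am assuming the gapminder package supplies the data (columns year and gdpPercap); the names by_year and medianGdpPercap are just my choices.

```r
library(dplyr)
library(ggplot2)
library(gapminder)  # assumption: the gapminder data set comes from this package

# Summarise the median GDP per capita for each year
by_year <- gapminder %>%
  group_by(year) %>%
  summarize(medianGdpPercap = median(gdpPercap))

# Line plot of the trend over time, with the y-axis starting at zero
p <- ggplot(by_year, aes(x = year, y = medianGdpPercap)) +
  geom_line() +
  expand_limits(y = 0)

print(p)
```

Starting the y-axis at zero with expand_limits keeps the size of the change in proportion.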

According to DataCamp’s website it is “the smartest way to learn Data Science Online. The skills people and businesses need to succeed are changing. No matter where you are in your career or what field you work in, you will need to understand the language of data. With DataCamp, you learn data science today and apply it tomorrow.”