A text mining function for websites

For one of my projects I needed to download text from multiple websites. In this case, I used rvest and dplyr. Accessing the information you want can be relatively easy if all sources come from the same website, but pretty tedious when the websites are heterogeneous. The reason is how the content is stored in the HTML of each website (Disclaimer: I am not an expert at all on HTML or anything website related). Assume that you want to extract the title, author information, publish date, and of course the main article text. You can identify the location of that information via Cascading Style Sheets (CSS) selectors or XML Path Language (XPath) expressions. As soon as you have the CSS or XPath locations, you can access the content in R. The following text will walk you through an example and provide the relevant code.
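
To give a rough idea of what this looks like, here is a minimal sketch (not the code from the full post). It assumes rvest version 1.0 or later. The HTML snippet, the selectors, and the helper name `extract_article` are all made up for illustration; a real website would be loaded with `read_html()` and would need its own selectors.

```r
library(rvest)

# Hypothetical stand-in for a downloaded page, so the example runs offline.
page <- minimal_html('
  <h1 class="title">An example headline</h1>
  <span class="author">Jane Doe</span>
  <time class="published">2018-03-01</time>
  <div class="article-body"><p>First paragraph.</p><p>Second paragraph.</p></div>
')

# Hypothetical helper: pull the four pieces of information from one page.
# Each location can be given as a CSS selector or as an XPath expression;
# here the author is located via XPath and everything else via CSS.
extract_article <- function(page, title_css, author_xpath, date_css, body_css) {
  list(
    title  = html_text2(html_element(page, css = title_css)),
    author = html_text2(html_element(page, xpath = author_xpath)),
    date   = html_text2(html_element(page, css = date_css)),
    # The body usually spans several nodes, so collect all of them and
    # join the paragraphs with a blank line in between.
    text   = paste(html_text2(html_elements(page, css = body_css)),
                   collapse = "\n\n")
  )
}

article <- extract_article(
  page,
  title_css    = "h1.title",
  author_xpath = "//span[@class='author']",
  date_css     = "time.published",
  body_css     = "div.article-body p"
)
str(article)
```

With heterogeneous websites, the tedious part is that every source needs its own set of selectors, which is why they are passed in as arguments here rather than hard-coded.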

Continue reading A text mining function for websites

Using RStudio and LaTeX

This post will explain how to integrate RStudio and LaTeX, in particular how to include well-formatted tables and nice-looking graphs and figures that were produced in RStudio and imported into LaTeX. To follow along you will need RStudio, MS Excel and LaTeX.

Continue reading Using RStudio and LaTeX

A modern take on how to formulate and answer a research question experimentally and empirically

Science is there to answer questions, and it is a powerful tool at that. However, the scientific method cannot answer all questions. In this post I outline how I approach the task of coming up with research questions, how to answer them and how to create a publishable manuscript describing this procedure. It is very idiosyncratic, but I hope that it might be useful for some readers, especially students.

Continue reading A modern take on how to formulate and answer a research question experimentally and empirically

Using RStudio and Git version control

It is fairly easy to link GitHub or Bitbucket with RStudio, either to enable version control or to work collaboratively on a data project, scientific article, or book. It can also be used to make your data or project publicly accessible (however, there is no guarantee that it will remain accessible forever, and it doesn’t get a DOI, so OSF, for example, might be a better alternative).

GitHub and Bitbucket are web-based file hosts that support the version control system Git. Git allows you to track changes to files, to revert files to earlier versions, and to work on files in groups. This makes it especially important for work among programmers, data analysts, and also researchers. GitHub and Bitbucket store all the information on the different versions of your project on their servers, so that collaborators can see exactly what others on the same project worked on or changed.

This post will explain how to set up GitHub and Bitbucket with RStudio in order to enable version control and storage in an external repository. In nerd-speak, it explains how to “push your commits to an external repo”. Note that the main difference between GitHub and Bitbucket relevant to this post is that the former allows you to create a public repo free of charge, while the latter allows you to create a private repo free of charge. Choose one of the two platforms (or both), depending on your needs.

I am not going to explain how to download, install, or set up Git on your computer. I expect that you did all that and now want to link it to RStudio.

Continue reading Using RStudio and Git version control

Nudging or coercing people to protect the climate?

Even though there is some disagreement as to who is responsible for climate change, it is beyond substantial scientific doubt that climate change is a major threat to humanity in the 21st century. However, there is still doubt among the public. Why? There is an intuitively appealing theory of why this is the case: solution aversion. The theory attempts to explain why Republicans are much more likely to deny the reality of man-made climate change, while Democrats tend to accept it as fact. The model proposes that people deny the existence of a problem partly or primarily because they disagree with the solutions that have been proposed to solve it. In the case of climate change, these proposed solutions are regulations, which conflict with what conservatives favor: market solutions.

Continue reading Nudging or coercing people to protect the climate?

Calculating the smallest effect size of interest with G*Power

In this post I give brief instructions on how to calculate the smallest effect size of interest with output from G*Power. My instructions are largely based on an excellent blog post from a blog named “The 20% Statistician” by Daniel Lakens. Mr. Lakens is an experimental psychologist at the Human-Technology Interaction group at Eindhoven University of Technology, The Netherlands.

Continue reading Calculating the smallest effect size of interest with G*Power

a drop in the ocean

I am a member of an NGO called Projekt Seehilfe e.V. We support refugees in Sicily with material and non-material help. Last year, I was one of three people who went to Sicily, together with another NGO from Germany: Hanseatic Help e.V.

I wrote two texts that summarized what I experienced there and what I thought when I was confronted with a type of problem that plays an important role in contemporary Europe. Because these texts are in German, I will publish an English version here. You can find the original article here.

A drop in the ocean

Mustafa* stands in front of the table, and he is looking at me. I just gave him the card of the NGO called Projekt Seehilfe e.V. with which I am in Catania, Sicily. Initially, I just gave it to his friend, who was much more talkative and who appeared to be interested. However, because Mustafa was standing right next to him, it felt wrong not to give him one as well. Holding the card, he looks like he does not know what to do with it. Timidly, he laughs. It seems that my assessment was correct. He stands in front of me, says thank you and looks me in the eyes. We didn’t even talk during the whole dinner, and I had the impression that he was just about to leave. We start talking. I cannot remember why and how we start the conversation, probably with one of these generic questions, like: “Where are you from?” Then, he starts talking.

Continue reading a drop in the ocean

mean differences or mean changes?

While analyzing data from an experiment, I found myself writing things like “The treatment changes the outcome variable by…” or “the treatment leads to changes in the outcome variable”. However, I often thought that talking about changes sounded too ‘dynamic’. After all, I was referring to two different groups of subjects (between-subjects design). What I was doing was statistically comparing the means of the outcome variable across different groups. I was fine with talking about changes when referring to within-subject differences, i.e. changes in outcomes for the same subject due to an intervention, but for the between-subjects case, shouldn’t I rather talk about differences instead of changes?

Continue reading mean differences or mean changes?

500 refugees arrive together with us in sicily

I am a member of an NGO called Projekt Seehilfe e.V. that supports refugees in Sicily with material and non-material help. Last year, I was one of three people who went to Sicily, together with another NGO from Germany, called Hanseatic Help e.V. In April this year, we will go there and help yet again.

In Sicily, I wrote two texts that summarized what I experienced, thought and felt when I was confronted with a type of problem that nowadays plays an important role in Europe. Since the texts are in German, I will translate them into English for this blog. You can find one of the two original articles, the one translated below, here.

500 refugees arrive together with us in Sicily

In 1939, John Steinbeck published “The Grapes of Wrath”. In this novel he deals with the fate of a family that decides to leave home in search of a future with paid work in the West of the USA. The novel is set in the United States during a time when farmers in the Midwest were threatened by the Dust Bowl and the Great Depression. Because of this, many, including the protagonists of Steinbeck’s novel, left for California. The author describes very empathetically and realistically how the ‘Okies’, as they were disdainfully called by the Californian inhabitants, confront many problems during their search for a new home: a shortage of fairly paid work, rejection by the local population, no future in Eden. Instead, they face labor camps and the disdain and hatred of the locals. This only fuels the desperation of those who are refugees in their own country.

Continue reading 500 refugees arrive together with us in sicily

the perfect refugee

I stumbled across a recent research article published in Science magazine, called “How economic, humanitarian, and religious concerns shape European attitudes toward asylum seekers” by Kirk Bansak, Jens Hainmueller, and Dominik Hangartner.

This article caught my attention because I find the topic very interesting, because it was an experiment and not ‘just’ an empirical investigation of data, and because they decided to present their findings not only in a boring regression table, but in a colorful ropeladder plot.

Continue reading the perfect refugee