
wikkitidy

Tidy analysis of Wikipedia in R

What’s in a name?

wiki: There are many wikis, but one dominates the Wikiverse. Wikipedia is the largest repository of facts ever assembled by human hands. Scholars the world over are turning to Wikipedia to understand how twenty-first century society understands itself.

quiddity: The ‘whatness’ of a thing. The kind of thing it is. What is Wikipedia? Is it merely another encyclopaedia? Is it news presented as history? Is it the consensus of a global village, or the battleground of an ideological war?

tidy: The best kind of data. R programmers are lucky to have access to the tidyverse, a collection of packages that make it easy to analyse, visualise and publish data. This package embodies tidy data principles by returning results from Wikipedia’s APIs as tibbles or simple vectors, and by providing a number of vectorised analysis functions that can be applied reliably and without fuss to the data you retrieve.

Thus wikkitidy’s aim: to help you work out what Wikipedia is with minimal data wrangling and cleaning.

Getting to 1.0

| Version | Feature | Done? |
| --- | --- | --- |
| 0.1 | Basic request objects | ✅ |
| 0.2 | Calls and response objects for Core and Wikimedia REST APIs | ⬜ |
| 0.3 | Calls and response objects for MediaWiki Action API Query Modules | ⬜ |
| 0.4 | Interface to Wikipedia XML dumps | ⬜ |
| 0.5 | Implementation of Wikiblame | ⬜ |
| 0.6 | Calls and response objects for the XTools and Wikimedia APIs | ⬜ |

Installation

You can install wikkitidy from CRAN with:

install.packages("wikkitidy")

You can install the development version from GitHub with:

devtools::install_github("wikihistories/wikkitidy")
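
In scripts that are run repeatedly, it can be convenient to install only when the package is missing. The following is a minimal sketch of that pattern, not part of wikkitidy: `ensure_installed()` and its `installer` argument are hypothetical names introduced for illustration, built on base R's `requireNamespace()` and `utils::install.packages()`.

```r
# Install a package only if it cannot already be loaded.
# `installer` is a parameter so the behaviour can be checked without
# touching the network; it defaults to the usual CRAN installer.
ensure_installed <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already available, nothing installed
  }
  installer(pkg)
  invisible(TRUE)             # an install was attempted
}

# "stats" ships with R, so this call never triggers an install.
installed_now <- ensure_installed("stats")
```

Calling `ensure_installed("wikkitidy")` would apply the same check to this package and fall back to the CRAN install shown above.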

Code of Conduct

Please note that the wikkitidy project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Version: 0.1.14

License: MIT + file LICENSE

Maintainer: Michael Falk

Last Published: February 13th, 2025

Monthly Downloads: 148

Functions in wikkitidy (0.1.14)

query_tbl: Representation of Wikipedia data returned from an Action API Query module as a tibble, with request metadata stored as attributes.
query_list_pages: List pages that meet certain criteria
xtools_page
prefix_params: Add required prefix to URL parameters for MediaWiki Action API request
query_category_members: Explore Wikipedia's category system
query_generate_pages: Generate pages that meet certain criteria, or which are related to a set of known pages by certain properties
wikkitidy-package: wikkitidy: Tidy Analysis of Wikipedia
query_page_properties: Choose properties to return for pages from the action API
wikimedia_rest_apis: Build a REST request to one of the Wikimedia Foundation's central APIs
wikipedia_rest_apis: Build a REST request to one of Wikipedia's specific REST APIs
wikkitidy_example: Get path to wikkitidy example
tidyeval: Tidy eval helpers
verify_xml_integrity: Check that a Wikimedia XML file has not been corrupted
wiki_action_request
get_rest_resource
gracefully: Gracefully request a resource from Wikipedia
id_or_title: Determine if a page parameter comprises titles or pageids, and prefix accordingly.
get_diff: Search for insertions, deletions or relocations of text between two versions of a Wikipedia page
get_history_count: Count how many times Wikipedia articles have been edited
get_query_results
page_vector_functions: Get data about pages from their titles
continue_query: Query the Action API continually until a continuation condition no longer holds.
append_query_result: Combine new results for a query with previously downloaded results
parse_response.wikidiff2: Convert a response from a Wikipedia API into a convenient format
%>%: Pipe operator
new_prop_query: Constructor for the property query type
check_namespace: Ensure namespace arguments are valid
check_limit: Ensure that the limit is correct for the endpoint. Raise an error if not.
process_timestamps: Convert passed objects into ISO8601 strings for API requests
new_generator_query: Constructor for generator query type
new_list_query
perform_query: Perform a single request to the Action API.
query_by_: Query the MediaWiki Action API using a vector of Wikipedia pages
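
Several entries above (prefix_params, perform_query, continue_query, append_query_result) describe steps in one workflow: prefix the parameters for a query module, make a request, and keep requesting while the API signals that more results remain. The sketch below illustrates that general pattern in base R. It is not wikkitidy's implementation: `add_prefix()`, `mock_perform()` and `fetch_all()` are hypothetical names, and the mock response is a simplified stand-in for a real Action API reply, in which the continuation value is an opaque string rather than a row index.

```r
# Hypothetical helper: prepend a module prefix (e.g. "cm" for
# list=categorymembers) to parameter names, so that
# list(title = ..., limit = ...) becomes list(cmtitle = ..., cmlimit = ...).
add_prefix <- function(params, prefix) {
  names(params) <- paste0(prefix, names(params))
  params
}

# Stand-in for a single HTTP request. It serves three fake category members
# in batches of `cmlimit`, and includes a `continue` element while more
# remain, loosely mimicking the shape of an Action API response.
mock_perform <- function(params) {
  all_titles <- c("Page A", "Page B", "Page C")
  start <- if (is.null(params$cmcontinue)) 1L else as.integer(params$cmcontinue)
  end <- min(start + params$cmlimit - 1L, length(all_titles))
  response <- list(
    results = data.frame(title = all_titles[start:end], stringsAsFactors = FALSE)
  )
  if (end < length(all_titles)) {
    response$continue <- list(cmcontinue = as.character(end + 1L))
  }
  response
}

# Keep requesting until the response no longer carries a `continue` element,
# merging the continuation parameters into the next request each time and
# appending each batch to the results collected so far.
fetch_all <- function(params, perform = mock_perform) {
  collected <- NULL
  repeat {
    response <- perform(params)
    collected <- rbind(collected, response$results)
    if (is.null(response$continue)) break
    params <- utils::modifyList(params, response$continue)
  }
  collected
}

params <- add_prefix(list(title = "Category:Example", limit = 2L), "cm")
members <- fetch_all(params)
print(members)
```

Passing the request function in as the `perform` argument keeps the loop independent of any HTTP client, so the continuation logic can be exercised offline with a mock like the one above.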