Wikistats files

Maintained by Erik Zachte

Hourly page views per article for around 30 million article titles (Sept 2013) in more than 800 Wikimedia wikis. Repackaged (much smaller, without losing hourly granularity), corrected and reformatted. Available as daily files and two monthly files (see notes below).

Hourly page views per wiki, corrected for site outages and underreporting. Also repackaged, as one tar file per year.

Raw data for reports at stats.wikimedia.org.


Notes for hourly page views

Both sets of hourly files are derived from Domas' pagecount/projectcount files, but the format is different.
What remained is that each line contains a wiki code (subproject.project), as follows:
The subproject is the language code (fr, el, ja, etc.) or meta, commons, etc.
The project is one of b (wikibooks), k (wiktionary), n (wikinews), o (wikivoyage), q (wikiquote), s (wikisource), v (wikiversity), z (wikipedia).
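
As an illustration, a wiki code can be split into its two parts with a few lines of Python. This is only a minimal sketch: the function name split_wiki_code and the PROJECT_NAMES table are made up for this example, the table holds only the project letters listed above, and any other project letter is treated as an error.

```python
# Project letters as listed in these notes (hypothetical lookup table).
PROJECT_NAMES = {
    "b": "wikibooks",
    "k": "wiktionary",
    "n": "wikinews",
    "o": "wikivoyage",
    "q": "wikiquote",
    "s": "wikisource",
    "v": "wikiversity",
    "z": "wikipedia",
}


def split_wiki_code(code):
    """Split a wiki code such as 'fr.b' into (subproject, project name)."""
    # Split on the last dot: the part before it is the subproject,
    # the part after it is the one-letter project code.
    subproject, dot, project = code.rpartition(".")
    if not dot or not subproject:
        raise ValueError("expected subproject.project, got %r" % code)
    if project not in PROJECT_NAMES:
        raise ValueError("unknown project letter %r in %r" % (project, code))
    return subproject, PROJECT_NAMES[project]
```

For example, split_wiki_code("fr.b") gives ("fr", "wikibooks") and split_wiki_code("ja.z") gives ("ja", "wikipedia").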

The huge hourly files for page views per article per wiki have been massively compressed by merging the roughly 720 hourly files of each month (24 hours x 30 days),
which removes most of the redundancy (80% of record space is the article title, and a title can occur in all 720 files). Hourly granularity is preserved.
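
The idea behind the merge can be sketched in Python as follows. This is not the actual Wikistats script, only an illustration under assumptions: the function name merge_hourly_counts and the shape of its input (a dict keyed by (day, hour), each value mapping an article title to its views in that hour) are invented for the example. It uses the day/hour letter coding described in these notes, so that each title is stored once per month instead of once per hourly file.

```python
from collections import defaultdict


def merge_hourly_counts(hourly):
    """Merge per-hour view counts into one coded string per article title.

    hourly: dict mapping (day, hour) -> dict of title -> views
            (hypothetical in-memory stand-in for the hourly input files).
    Returns a dict mapping title -> coded hourly counts, e.g. 'BE33,CH155'.
    """
    merged = defaultdict(list)
    # Process the hours in chronological order so the output is sorted.
    for day, hour in sorted(hourly):
        day_char = chr(ord("A") + day - 1)   # day 1 = A ... day 31 = _
        hour_char = chr(ord("A") + hour)     # hour 0 = A ... hour 23 = X
        for title, views in hourly[(day, hour)].items():
            merged[title].append("%s%s%d" % (day_char, hour_char, views))
    # The title is now stored once, followed by all its hourly counts.
    return {title: ",".join(parts) for title, parts in merged.items()}
```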

Line format:

Hourly counts can be deciphered as follows:
Hour:
from 0 to 23, written as 0 = A, 1 = B ... 22 = W, 23 = X
Day:
from 1 to 31, written as 1 = A, 2 = B ... 25 = Y, 26 = Z, 27 = [, 28 = \, 29 = ], 30 = ^, 31 = _
Example: 33 views on day 2, hour 4, and 155 views on day 3, hour 7 are coded as 'BE33,CH155' (day letter, then hour letter, then the count, with entries separated by commas).
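
The coding above can be reversed with a short Python function. This is a minimal sketch, assuming the hourly counts field has already been cut out of the line and contains only comma-separated entries of the form day letter, hour letter, count; the function name decode_hourly_counts is made up for this example.

```python
def decode_hourly_counts(encoded):
    """Decode a string such as 'BE33,CH155' into (day, hour, views) tuples."""
    result = []
    for item in encoded.split(","):
        if not item:
            continue  # tolerate an empty entry, e.g. from a trailing comma
        if len(item) < 3:
            raise ValueError("entry too short: %r" % item)
        day = ord(item[0]) - ord("A") + 1   # A = day 1 ... _ = day 31
        hour = ord(item[1]) - ord("A")      # A = hour 0 ... X = hour 23
        if not (1 <= day <= 31 and 0 <= hour <= 23):
            raise ValueError("bad day/hour letters in %r" % item)
        views = int(item[2:])               # the rest of the entry is the count
        result.append((day, hour, views))
    return result
```

Applied to the example above, decode_hourly_counts('BE33,CH155') returns [(2, 4, 33), (3, 7, 155)].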