Visualizing the PyPI pandas download statistics using Tableau - Downloads by version

For some background, read my previous posts on the topic - https://rahulporuri.blogspot.in/2017/01/on-whos-downloading-pandas.html and https://rahulporuri.blogspot.in/2016/12/pandas-download-statistics-pypi-and.html.

The last post looked at the total number of downloads across all versions of pandas and at the number of downloads by month. This post breaks those monthly downloads down by version.

The relevant query is

SELECT
  STRFTIME_UTC_USEC(timestamp, "%Y-%m") AS yyyymm,
  file.version,
  COUNT(*) AS total_downloads
FROM
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -6, "month"),
    CURRENT_TIMESTAMP()
  )
WHERE
  file.project = 'pandas'
GROUP BY
  file.version,
  yyyymm
ORDER BY
  total_downloads DESC

which returns a data set covering the last six months, with one row per month and version. The result can be downloaded as a CSV file, which is available at https://drive.google.com/file/d/0BxwQdgnuTo6JYzR1dUI0Zm5jbWs/view?usp=sharing.
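
If you would rather poke at the CSV outside Tableau, here is a minimal sketch of how the rows could be reshaped into a month-by-version table using only the Python standard library. The column names yyyymm, file_version and total_downloads are my assumption about how the export names the query's columns (check the header row of your file), and the helper names pivot_downloads and load_downloads are made up for this example.

```python
import csv
from collections import defaultdict


def pivot_downloads(rows):
    """Reshape long-form rows into {month: {version: downloads}}.

    Each row is a mapping with the (assumed) keys "yyyymm",
    "file_version" and "total_downloads".
    """
    table = defaultdict(dict)
    for row in rows:
        month = row["yyyymm"]
        version = row["file_version"]
        count = int(row["total_downloads"])
        # Sum in case the same (month, version) pair appears more than once.
        table[month][version] = table[month].get(version, 0) + count
    # Return the months in chronological order ("YYYY-MM" sorts correctly as text).
    return {month: table[month] for month in sorted(table)}


def load_downloads(path):
    """Read the exported CSV at `path` and pivot it by month and version."""
    with open(path, newline="") as handle:
        return pivot_downloads(csv.DictReader(handle))
```

Calling load_downloads on the downloaded file (whatever you named it locally) should give a dictionary keyed by month, where each value maps a pandas version to its download count for that month.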

I visualized the data file using Tableau Public, which is the free version of Tableau.



I wanted to use Jupyter widgets or Plotly to make similar interactive plots, but I was too lazy and too tired. So, for now, let's make do with the Tableau viz instead.
