Arxiv author affiliations using Python

Arxiv author affiliations using Python - Part III

April 05, 2016

After a very long time, I was reminded of this project on Friday. I stopped working on this pet project of mine because I couldn't find a library that let me convert a dataset to a network. I probably (definitely) didn't search hard enough, and I would have found a solution sooner had I kept working on the problem (because look, I have a solution!).

The following is the network of affiliations. You can read the last two blog posts I wrote on this topic here and here. You can find the relevant code in this GitHub repository. Note that the make_graph.py file in the repository generates this interactive image of the network.

Also note that running the code requires a couple of dependencies. First, the code is written in Python. Second, you will need the following libraries installed: BeautifulSoup and graph-canvas.

As I've mentioned, albeit briefly, in the previous blog posts, the Python code uses the arXiv API to fetch the 500 most recent papers. It then searches through the response, referred to as soup in the Python code, for the arxiv:affiliation tag, whose string contains the information we want. The result is a list containing lists of affiliations.
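To make that step concrete, here is a minimal sketch of pulling affiliations out of an arXiv Atom feed. It is not the code from the repository: it swaps BeautifulSoup for the standard library's xml.etree.ElementTree so it runs without extra installs, it parses a small made-up feed instead of calling the API, and it assumes the inner lists are grouped per paper. The function names, the sample data and the cat:astro-ph search query are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used in arXiv API responses.
ATOM = "{http://www.w3.org/2005/Atom}"
ARXIV = "{http://arxiv.org/schemas/atom}"


def build_query_url(search_query="cat:astro-ph", max_results=500):
    """Build an arXiv API URL for the most recent papers.

    The search query is a hypothetical placeholder; the original post
    does not say which query it uses.
    """
    params = {
        "search_query": search_query,
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return "http://export.arxiv.org/api/query?" + urlencode(params)


def affiliations_per_paper(feed_xml):
    """Return one list of affiliation strings per <entry> in the feed."""
    root = ET.fromstring(feed_xml)
    papers = []
    for entry in root.iter(ATOM + "entry"):
        affiliations = [
            tag.text.strip()
            for tag in entry.iter(ARXIV + "affiliation")
            if tag.text and tag.text.strip()
        ]
        papers.append(affiliations)
    return papers


# Made-up feed standing in for a real API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:arxiv="http://arxiv.org/schemas/atom">
  <entry>
    <title>Made-up paper one</title>
    <author>
      <name>A. Author</name>
      <arxiv:affiliation>Institute A</arxiv:affiliation>
    </author>
    <author>
      <name>B. Author</name>
      <arxiv:affiliation>Institute B</arxiv:affiliation>
    </author>
  </entry>
  <entry>
    <title>Made-up paper two</title>
    <author>
      <name>B. Author</name>
      <arxiv:affiliation>Institute B</arxiv:affiliation>
    </author>
    <author>
      <name>C. Author</name>
      <arxiv:affiliation>Institute C</arxiv:affiliation>
    </author>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(build_query_url())
    print(affiliations_per_paper(SAMPLE_FEED))
```

To run it against live data, you would download the text at build_query_url() (for example with urllib.request.urlopen) and pass it to affiliations_per_paper. Many arXiv authors do not supply an affiliation, so expect plenty of empty inner lists.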

We then convert this list to a dictionary, which is then fed to the graphcanvas library, which generates this image.
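The post does not spell out the shape of that dictionary, so the following is only one plausible sketch: an adjacency dictionary in which each affiliation maps to the affiliations it shares a paper with. The function name build_affiliation_graph and the sample data are hypothetical, and the dictionary layout the graphing library actually expects may differ.

```python
from itertools import combinations


def build_affiliation_graph(papers):
    """Map each affiliation to a sorted list of its co-occurring affiliations.

    `papers` is a list of lists, one inner list of affiliations per paper
    (an assumption about how the data is grouped).
    """
    graph = {}
    for affiliations in papers:
        unique = sorted(set(affiliations))
        # Make sure every affiliation appears as a node, even if isolated.
        for name in unique:
            graph.setdefault(name, set())
        # Link every pair of affiliations that appear on the same paper.
        for first, second in combinations(unique, 2):
            graph[first].add(second)
            graph[second].add(first)
    return {name: sorted(neighbours) for name, neighbours in graph.items()}


if __name__ == "__main__":
    sample = [
        ["Institute A", "Institute B"],
        ["Institute B", "Institute C"],
        ["Institute D"],
    ]
    for name, neighbours in build_affiliation_graph(sample).items():
        print(name, "->", neighbours)
```

Using sets while building the graph keeps repeated collaborations from producing duplicate edges; the final conversion to sorted lists just makes the result easy to print or serialize.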

I am going to build on this over time, so if you're interested, keep checking the blog for future updates.

Rahul gives unsolicited advice