
My last day as an Intern at Enthought, India.

For those of you who don't know about Enthought, which is probably all of you, it provides "Scientific Computing Solutions" and I have been lucky enough to land an internship in the India office. It's been an amazing two months during which I have personally accomplished a lot! Two projects that I have wanted to work on for a very long time finally bore fruit over the last two months.

The first was a map showing the locations of universities around the world where astronomical research is pursued. I wanted to, and will in the near future, add information about admission deadlines, application fees, expected GPA/GRE/TOEFL scores and what not for each university, along with a way for people to select a date range or score range to decide which places to apply to. The months I spent deciding which universities to apply to could've been put to better use had there been a better way than visiting each department's website to look for those details. While the list is of course not complete, the website is hosted on GitHub and you can create an issue or submit a PR if you want to suggest any changes. Here is the latest blog post in which I talk about it, which itself links to a couple more blog posts, and the website itself is hosted here.

Secondly, I wanted to make a network/graph that depicts collaborations between universities. Check out the website to get an idea of what I have in mind. At the bottom of that page, there is a bouncy, touchy network graph. In that graph, each dot/node represents a researcher at a university/lab. The lonely triangles, squares, pentagons and what not that you see floating around independent of one another are individual papers. The lines in those shapes show that the researchers on that paper collaborated with one another. Now, roughly in the middle of the graph, there is a biggish blob containing a lot of dots/nodes. If you look closely, you can see that this mess is actually made up of individual squares/triangles/pentagons/etc. sharing one or maybe two nodes/edges. This means that the researchers at these shared nodes are more collaborative than the rest of the lonely dots/nodes/universities. I want to quantify this network to learn more about it, but that'll take some more time. Again, if you have any comments/suggestions/corrections, you can open an issue/Pull Request on the GitHub repository that hosts this work.
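To make the structure described above a bit more concrete, here is a minimal sketch in Python. It is not the code behind the website: the paper names, author labels and the helper functions `build_graph` and `connected_components` are all made up for illustration. It assumes each paper is just a list of authors, links every pair of co-authors on a paper, and then groups the nodes into connected pieces, so that papers sharing an author merge into one bigger blob.

```python
from itertools import combinations

# Hypothetical toy data (not from the real website): paper -> list of authors.
papers = {
    "paper_1": ["A", "B", "C"],        # three authors: a lonely triangle
    "paper_2": ["D", "E", "F", "G"],   # four authors
    "paper_3": ["G", "H", "I"],        # shares author G with paper_2
    "paper_4": ["J", "K"],             # two authors: a single line
}


def build_graph(papers):
    """Return an adjacency map linking every pair of co-authors on a paper."""
    graph = {}
    for authors in papers.values():
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes into sets that are reachable from one another."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


graph = build_graph(papers)
# Largest group first: this plays the role of the "biggish blob".
components = sorted(connected_components(graph), key=len, reverse=True)
for component in components:
    print(len(component), sorted(component))
```

In this toy data, `paper_2` and `paper_3` share author G, so their authors end up in one group of six, while the other two papers stay on their own. Counting and sizing these groups is one simple way to start quantifying such a network; a graph library would offer many more measures.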

I just wanted to highlight those two things because they are what I am most proud of. There are a couple more things that I worked on, but those are relatively incomplete, so I'll try working on them next month to bring them to a satisfactory checkpoint and then write about them! And I'll keep mum about what I did here at the company because I don't know what's acceptable to share and what's not, and I really don't want to get into trouble!

And I have learnt a great deal about Python in the last two months. It's been a steep learning curve and there have been days when I felt like I was rolling back down the hill instead of making progress or even staying in the same place! But still, having reached the summit, it all feels worth the effort because what I can see from here is spectacular. I have been a little lazy and haven't written about the new Python quirks I have learnt, but I will slowly start doing that next month. Anyway, I guess that's all for now. Later.
