
An obsession

I don't have anything new to add, so I am going to write about an obsession of mine. Almost a year back, I was at IIST Trivandrum working with Prof. Anand Narayanan on quasars. It was my first serious summer project in astronomy and I was having a fun time learning about quasars and working with Python and SQL queries. I was asked to reproduce the results of the Richards et al. 2001 paper but, as I eventually learnt, the original data set was getting harder and harder to find. The original data set in question was put together using quasars from the SDSS DR3 and other surveys. So, instead of reproducing the results from the original data set, I decided to extend the results to a new data set: the quasar data from the SDSS DR9.

As I reproduced the results for the new (much larger) data set, weird artifacts started popping up. As I dug deeper, I found that these artifacts were grouped in redshift space and in coordinate space. Further, when I looked at the spectra for these objects, they looked nothing like those of quasars. Either their SNR was too low or the spectra looked more like those of late M-type stars. I kept digging up Digitized Sky Survey and SDSS images for the portion of sky where these objects were grouped, and went through my data analysis one more time to be sure I hadn't made any mistakes.

Eventually, I mailed Prof. Richards my data set and the doubts I had. He replied saying that the pipeline which identifies quasars in the SDSS data is (of course) not perfect (DUH!) and that if I wanted a data set containing only quasars, I should use either Schneider et al. 2010 or Pâris et al. 2014, which were checked and cleared of the kind of artifacts that could have crept through the SDSS pipeline. I spent close to two months on this only to arrive at the conclusion that I should have used a more refined data set and not the crude one. I wouldn't say that the time spent was a complete waste, because I got better at Python.
The data set is available as a GitHub repository here, and you can look at my previous blog posts on this topic from June and July 2013 for more on quasars and my work.

21:00:30 - 21:19:00

Popular posts from this blog

Animation using GNUPlot

I've been trying to create an animation depicting a quasar spectrum moving across the 5 SDSS passbands as the redshift changes. It is important to visualise which emission lines are moving in and out of the bands in order to understand the color-redshift plots and the changes in them.
I tried doing this using the animate function in matplotlib (Python) but I wasn't able to make it work. I worked on it for a couple of days and then gave up, not having found solutions to my problems on the internet.
And then I came across this site, where the Gunn-Peterson trough and the Lyman-alpha forest have been depicted in a beautiful manner. This got me interested in using JS and D3 to do the animations and make them dynamic, using sliders etc.
In the meanwhile, I thought I'd look up whether there was a way to create animations in gnuplot and, whoopdedoo, what do I find but nirvana!
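
To make the idea of emission lines moving in and out of bands concrete, here is a rough Python sketch (not the code behind the animation). It uses the standard relation observed wavelength = rest wavelength × (1 + z). The line wavelengths are rounded, and the band edges in `BANDS` are coarse, non-overlapping approximations assumed purely for illustration. They are not the real SDSS filter curves.

```python
# Approximate rest-frame wavelengths of common quasar emission lines (Angstrom).
LINES = {
    "Ly-alpha": 1216.0,
    "C IV": 1549.0,
    "C III]": 1909.0,
    "Mg II": 2798.0,
    "H-alpha": 6563.0,
}

# ASSUMPTION: crude, non-overlapping band edges (Angstrom), for illustration only.
BANDS = [
    ("u", 3000.0, 4000.0),
    ("g", 4000.0, 5500.0),
    ("r", 5500.0, 7000.0),
    ("i", 7000.0, 8500.0),
    ("z", 8500.0, 10500.0),
]


def observed_wavelength(rest_wavelength, redshift):
    """Redshift a rest-frame wavelength into the observed frame."""
    return rest_wavelength * (1.0 + redshift)


def band_of(line_name, redshift):
    """Return the band containing the redshifted line, or None if it is outside all bands."""
    lam = observed_wavelength(LINES[line_name], redshift)
    for name, lo, hi in BANDS:
        if lo <= lam < hi:
            return name
    return None


def lines_per_band(redshift):
    """Map each band name to the list of lines that fall in it at this redshift."""
    result = {name: [] for name, _, _ in BANDS}
    for line in LINES:
        band = band_of(line, redshift)
        if band is not None:
            result[band].append(line)
    return result


if __name__ == "__main__":
    # Step through redshift and print where each line lands.
    for z in (0.5, 1.0, 2.0, 3.0, 4.0):
        print(z, lines_per_band(z))
```

Stepping `z` in small increments and redrawing the line positions over the band curves at each step is essentially what an animation of this kind does frame by frame.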

In the image, you see 5 static curves and one dynam…

Pandas download statistics, PyPI and Google BigQuery - Daily downloads and downloads by latest version

Inspired by this blog post: https://langui.sh/2016/12/09/data-driven-decisions/, I wanted to play around with Google BigQuery myself. The blog post is pretty awesome because it has sample queries. I mixed and matched the examples from the blog post, intent on answering two questions - 
1. How many people download the Pandas library on a daily basis? Actually, if you think about it, it's more a question of how many times the pandas library was downloaded in a single day, because the same person could've downloaded it multiple times. Or a bot could've.
This was just a fun first query/question.
2. What is the adoption rate of different versions of the Pandas library? You might have come across similar graphs which show the adoption rate of various versions of Windows.
Answering this question is actually important because the developers should have an idea of what the most popular versions are, see whether or not users are adopting new features/changes they provide…

Adaptive step size Runge-Kutta method

I am still trying to implement an adaptive step size RK routine. So far, I've been able to implement the step-halving method but not the RK-Fehlberg. I am not able to figure out how to increase the step size after reducing it initially.

To give some background on the topic, Runge-Kutta methods are used to solve ordinary differential equations of any order. For example, for a first order differential equation, the method uses the derivative of the function to predict what the function value at the next step should be. Euler's method is the simplest (first-order) member of the RK family. Adaptive step size RK means changing the step size depending on how quickly or slowly the function is changing. If a function is rapidly rising or falling, it is in a region that we should sample carefully, so we reduce the step size; if the rate of change of the function is small, we can increase the step size. I've been able to implement a way to reduce the step size depending on the rate of change of …
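
For reference, one common way to let the step size grow again is to rescale it after every attempt using the estimated error, rather than only halving on failure. The Python sketch below is not the routine from this post. It is a minimal step-doubling version around classical RK4 for a single first-order equation. The function names, the safety factor 0.9, and the clamp to the range [0.1, 2] are assumptions chosen for illustration. The exponent 1/5 comes from the RK4 local error scaling as h^5.

```python
import math


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate_adaptive(f, t0, y0, t_end, h0=0.1, tol=1e-8, h_min=1e-12):
    """Integrate y' = f(t, y) from t0 to t_end with step-doubling error control."""
    t, y, h = t0, y0, h0
    ts, ys = [t], [y]
    while t < t_end:
        # Never step past the end of the interval.
        last = h >= t_end - t
        if last:
            h = t_end - t
        # Compare one full step against two half steps.
        y_full = rk4_step(f, t, y, h)
        y_mid = rk4_step(f, t, y, h / 2)
        y_half = rk4_step(f, t + h / 2, y_mid, h / 2)
        err = abs(y_half - y_full)
        if err <= tol or h <= h_min:
            # Accept the step, keeping the more accurate two-half-step value.
            t = t_end if last else t + h
            y = y_half
            ts.append(t)
            ys.append(y)
        # Rescale h: the factor is < 1 when err > tol and > 1 when err is
        # comfortably below tol, so the step can grow as well as shrink.
        factor = 2.0 if err == 0 else 0.9 * (tol / err) ** 0.2
        h *= min(2.0, max(0.1, factor))
    return ts, ys


if __name__ == "__main__":
    # y' = -y, y(0) = 1: the solution decays, so the steps get longer over time.
    ts, ys = integrate_adaptive(lambda t, y: -y, 0.0, 1.0, 5.0)
    steps = [b - a for a, b in zip(ts, ts[1:])]
    print(len(steps), steps[0], max(steps), abs(ys[-1] - math.exp(-5.0)))
```

The key line is the rescaling of `h` after each attempt. A rejected step is retried with a smaller `h`, while an accepted step with a small error estimate makes the next `h` larger, up to a factor of two per step.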