Mid-week update of Week 2 as an intern at Enthought, India

I've been formally assigned to work on the new Data Import Tool add-on to Canopy. I used to think it was similar to the data toolbox in Matlab or R, but having tried out the Canopy add-on, I can say that it is very different. I don't know if I can get into the details of how it's different from the others, so I'll postpone that for another time. Otherwise, it's been a steep learning curve and it's a bit hard to climb at times. I guess I'll just have to keep climbing even if I roll back down a bit.

Enough meta stuff. Let me now tell you about two interesting things I learnt about git.

I wish someone had told me about "$ git stash". Say I have working code, and I am making corrections/improvements to it. Now say that my prof calls me up and asks me to run some data through the working code, without the changes. Because I don't want to throw away the changes I'm making, I can *stash* them away using git. Git will save all the uncommitted changes in a separate *location* and remove them from the working directory. Eventually, when I'm done and want the changes back in the directory, all I need to do is "$ git stash apply". That link explains it better than I do, I'm sure, but I had to try.
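Here is a rough sketch of that stash round trip, run in a throwaway repository so it can't touch real work. The file name `analysis.py`, its contents and the commit identity are made up for the demo; only the `git stash` and `git stash apply` steps are the point.

```shell
set -e

# Throwaway repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "intern@example.com"   # placeholder identity for the demo
git config user.name "Intern"

# The "working code", committed.
echo 'print("working")' > analysis.py
git add analysis.py
git commit -q -m "Working version"

# Half-finished improvements that are not committed yet.
echo 'print("half-finished improvement")' >> analysis.py

# Set the uncommitted changes aside; the file goes back to the committed version.
git stash
clean_lines=$(wc -l < analysis.py)

# ... run the prof's data through the unchanged code here ...

# Bring the changes back into the working directory.
git stash apply
restored_lines=$(wc -l < analysis.py)

# "apply" keeps the entry in the stash list; "git stash pop" would
# apply it and drop it in one step.
stash_entries=$(git stash list | wc -l)
```

After `git stash` the file has only the committed line, and after `git stash apply` the extra line is back. The stash entry is still listed afterwards, so `git stash drop` (or using `git stash pop` in the first place) cleans it up.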

I also learnt how to branch repositories and how to submit pull requests after making changes/additions/corrections. There are two ways. Note that my workflow is based around code hosted on GitHub. The first would be to fork the repository into your account, work on the fork, push your changes to it and then submit a pull request. Note that the changed code resides in a copy of the original repository on your account. The second would be to create a branch of the repository, work on the branch, push to the branch and then submit a pull request. In this case, the code resides in a branch of the original repository. The first half of this article explains the branching that I was talking about. Don't merge the branch yourself if you want people to review your pull request/the changes you've made to the code.
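The branch route can be sketched roughly like this. To keep it runnable offline, a local bare repository stands in for the GitHub-hosted original; the branch name `fix-readme`, the file and the identity are all invented for the demo. The pull request itself is not a git command: it gets opened on GitHub once the branch has been pushed.

```shell
set -e

# A local bare repository plays the role of the GitHub remote (assumption for this demo).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"

# Clone it, as you would clone the real repository.
git clone -q "$work/origin.git" "$work/clone"
cd "$work/clone"
git config user.email "intern@example.com"   # placeholder identity for the demo
git config user.name "Intern"

# Give the default branch some history to branch from.
echo "v1" > README
git add README
git commit -q -m "Initial commit"
git push -q origin HEAD
base=$(git symbolic-ref --short HEAD)        # whatever the default branch is called

# Create a branch, work on it, and push the branch (not the default branch).
git checkout -q -b fix-readme
echo "v2" > README
git commit -q -am "Improve README"
git push -q -u origin fix-readme

# The remote now has the new branch, and the default branch is untouched.
remote_branch=$(git ls-remote --heads origin fix-readme | wc -l)
base_content=$(git show "origin/$base:README")
```

Because nothing was merged, the default branch on the remote still has the old README, and the difference between it and `fix-readme` is exactly what reviewers would see in the pull request. The fork route looks the same from the command line, except that the remote you push to is your own fork rather than the original repository.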

I think I'll stop here for now. There are a bunch of other things that I learnt but am not able to put together into something concrete. I'll leave that for the next time. Until then ...
