Let us Begin...
Google Summer of Code: The Kickoff
Not too long ago, I was looking for opportunities, things I could do during the summer. After having some experience with internships and the tech industry, I was set on developing for the community. So yes, I decided I would work for an open-source organization. That rang a bell because of the one thing that we were all told about as freshmen in college: the Google Summer of Code. As seniors in the Computer Science department, we had all the information we needed to get a crack at this. So I went ahead with it, and this was around October 2017. I started working on the what, and more importantly, the why.
Starting with the Program and How to Prepare for it
So yes, I believe most of you are familiar with GSoC. Nevertheless, here is a brief introduction to the program. It is an initiative by Google to partner with the best open-source organizations out there and get the best student developers to work with these organizations through the summer. The projects could be anything from what the organization is already working on to what you might want to propose to them. The program itself lasts for 3 months; that is the time you actually spend coding. So yes, right out of the gate I knew where to look. GSoC was it. Now, let us come to the how.
I will soon dive into the practical things that a GSoC aspirant can do to better their chances of getting into the program. But before we get to that, there is the equally important, abstract aspect to it. Put your mind to it. Develop the interest. Find out what field you would like to dive into. What interests you? What doesn't interest you? Until you apply your mind and figure these out, no one can help you put in the hours that are needed for the next few steps. Do not be afraid to dig deeper. And no, that is not a hint to go looking for opportunities in Deep Learning. Follow your own interest. It is the only way to achieve something truly worthwhile.
The First Few Months
Once you have figured out what intrigues you, it is time to narrow down your focus and get better search results. Around December 2017, I started looking for organizations that had been a part of GSoC that year and had also done some work with linguistics. I was interested in Knowledge Graphs and Ontology, and stumbled across the concept of Knowledge Base Embeddings. Aah, the beautifully structured internet. I knew I wanted to work in this field. So I spent this month and the one after that getting in touch with the organization that I would later get selected for. By the end of January, I was sure where I wanted to work and had a brief idea as to what I wanted to work on.
It wasn't until February this year, February 12th to be exact, that the accepted organizations were disclosed. This was it. This is when you need to be most active. The exact selection process might differ from one organization to another, but the essentials remain the same. You need to work for it. So during the first few weeks, before the student application period, I read about my organization. I read up on the work they had done so far. I read about the people who were going to be mentors. I frequently contacted them, asking for pointers, and kept reading the material they had on their website, the previous projects, and most important of all, the warm-up tasks.
Student Application Period
This was the time to either make it or break it. Believe me, mentors in GSoC are very supportive, so long as you are interested. I am grateful to my mentors, who helped me through the application period by reviewing my proposal and actively providing feedback. It helped me out a lot. And you can rely on them as well. They are there for you.
So, how to write the proposal... The application process is fairly straightforward. Nothing out of the ordinary you need to know here. The main focus is the proposal. If you have been following along, you must be aware of how important it is to know about your organization and what they do. Once I had the idea, I knew exactly where to look. With all the help I could get from my mentors, I started reading research papers on the recent developments in the field of KB Embeddings. Through all that reading, I devised a solution for the project that I was interested in. Don't get me wrong, there was still a lot of figuring out to do. But I had at least figured out where to start.
That is it. Figure out the solution, at a heuristic level first. Then you can dive into the feasibility of your proposed solution. Is it practical to implement? How would you go about it? Why this particular solution? The next thing to do is find the answers to those questions. By this point, you should be sure of two things:
- Your proposal describes exactly what you are planning to do.
- Your proposal explains why you chose this solution over other potential solutions.
If you are here, I think we are almost done. The application period ends with all the proposals being submitted for review, and all you need to do now is wait. Once the application review period ends and the student projects are announced, the first phase of GSoC officially commences: the Community Bonding period.
The Community Bonding period is exactly what the name suggests. This is the time for all the selected student developers to get familiar with their organizations. Get to know your mentors, your peers who were selected for other projects, and the community as a whole. That is how it worked out for me. I talked to my mentors about how we would go about the entire thing: the process, the workflow, and most importantly, how we would manage all of that. This is what we came up with...
To keep my mentors updated on my progress, we will be having weekly calls. Along with that, I will actively keep track of my progress through this blog. The rest is the usual: a repo on GitHub and some Jupyter notebooks with all the experimentation going on.
Bringing everyone up to speed
So right now, I am planning to work on deriving word embeddings for out-of-vocabulary (OOV) entities with the help of definition and context encoders. The definitions will be extracted from WordNet, which provides more than 100k lemmas, each with its own definition. Next week is still part of the Community Bonding period. Currently, I have an outline of how I will start off with this in a gist. You can follow my posts for the next few weeks to track my progress. Until then, see you in the next post. Cheers!
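P.S. For anyone wondering what "deriving an embedding from a definition" could look like, here is a minimal sketch. It is only a toy baseline and not the encoder I will end up building: it simply averages the vectors of the definition words that are already in the vocabulary. The three-dimensional vectors, the example definition, and the function name `encode_definition` are all made up for illustration; in the real setup the definitions would come from WordNet and the vectors from pretrained embeddings.

```python
from typing import Dict, List, Optional

# Toy 3-dimensional embeddings for in-vocabulary words.
# These values are invented purely for illustration.
EMBEDDINGS: Dict[str, List[float]] = {
    "large": [1.0, 0.0, 0.0],
    "animal": [0.0, 1.0, 0.0],
    "grey": [0.0, 0.0, 1.0],
}


def encode_definition(
    definition: str, embeddings: Dict[str, List[float]]
) -> Optional[List[float]]:
    """Average the vectors of the known words in a definition.

    A simple averaging stand-in for a learned definition encoder.
    Returns None when no word of the definition is in the vocabulary.
    """
    tokens = definition.lower().split()
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical definition of an OOV entity; "a" is skipped because it
# has no vector in the toy vocabulary.
oov_vector = encode_definition("a large grey animal", EMBEDDINGS)
print(oov_vector)
```

Averaging ignores word order and treats every word as equally important, which is exactly the kind of limitation a trained definition encoder is meant to address.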