PhD

November 2023 (1002 Words, 6 Minutes)

research grad school

At the moment, I am applying to graduate schools to continue my research into making machine learning and AI more reliable and trustworthy. I first started working in academic research during my second semester of undergrad. Since then, I think that I have gained enough experience to articulate some of the important parts of the ‘research process’. I don’t expect this information to be particularly novel to anyone who has also been involved with research in computer science, but for my own recollection, and perhaps for anyone else with an interest in what it’s like to do research, these are some of the lessons I have learned.

1. Research is iterative and incremental

Meaningful contributions that advance the state of the art don’t come often. A successful piece of work is often a minor improvement upon a previous result in the field. This is probably the most difficult part of doing research. In the worst cases, you may spend months designing an experiment, only to have your original hypothesis proven wrong, or, even worse, have someone else publish similar results before you. Times like these are very discouraging and can be hard to deal with when you have invested so much time and effort into producing results. Getting past such hurdles is the biggest part of being an effective researcher, and, to me, often feels like an accelerated form of the five stages of grief. Realizing that results aren’t directly correlated with the amount of work poured into their discovery is difficult, and not something that I have ever gotten used to. However, the flip side is that, on rare occasions, some of the most important and foundational work ends up being simple and elegant. Striving for such discoveries is the light that gets me through the tunnel.

2. 90% of your time is spent on 5% of your problems

These problems can range from debugging compiler issues to reproducing the build environment necessary to run a model from 3 years ago. In my own experience, the time spent on them often makes it feel as though research is secondary to debugging. You may spend the entire week on a single one of these problems and end up at your weekly meeting with your advisor reporting that you are still stuck. In these cases, collaboration and communication are crucial. Ask your advisor and lab mates if they have any idea how to fix your problem, as they have usually encountered a similar problem, if not the same one, themselves. If you can’t find help here, go to the internet, though YMMV. Most crucially, as a researcher, it is up to you to decide which problems are worth solving. If your confidence in the overall method is weak to begin with, then spending the next week making it run may not be worth the time. Know when to cut your losses and pivot to a new approach.

3. Make your results reproducible

Source code, seeded RNG, Docker, documentation, fixed dependencies. Do everything you can to make it as easy as possible for someone else to run the exact same experiment that you have published. It isn’t glamorous work and often doesn’t even feel worth doing when your research will be superseded by the new state of the art within the next few years. However, your research is only as relevant as the body of work that it contributes to. If no one can run your code four years from now, then that is where the extent of your contributions ends. The tools to accomplish this exist, and their long-term benefit, set against our immediate desire for results, cannot be overstated.
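As a rough illustration of the "seeded RNG" item, here is a minimal Python sketch using only the standard library. The names `run_experiment` and `environment_snapshot` are hypothetical placeholders, and the "experiment" is just a stand-in that draws random numbers; a real project would likely also need to seed whatever numerical or ML libraries it uses and pin their versions.

```python
import json
import platform
import random
import sys


def run_experiment(seed: int, n_samples: int = 5) -> list[float]:
    """Hypothetical stand-in for an experiment: draws n_samples pseudo-random numbers."""
    # A local generator avoids depending on hidden global random state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_samples)]


def environment_snapshot(seed: int) -> dict:
    """Collect a few details someone would need to rerun this exact experiment."""
    return {
        "seed": seed,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    seed = 1234
    results = run_experiment(seed)
    # Rerunning with the same seed should give identical numbers.
    assert results == run_experiment(seed)
    # Store the seed and environment next to the results they produced.
    record = {"environment": environment_snapshot(seed), "results": results}
    print(json.dumps(record, indent=2))
```

Saving a record like this alongside every result is cheap, and it answers the first question anyone asks when trying to rerun an experiment: which seed, and which environment?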

4. Learn how to read academic writing

Reading a paper can mean different things to different people. Some may skim through certain sections to get an overall idea of what is being talked about, and others may closely read each paragraph and write notes about every theorem and lemma. Neither of these approaches is wrong, but in order to be effective, they must be able to tell you two things:

1. What were the main contributions of the paper?
2. Did it change the state-of-the-art? If so, how?

An academic paper is meant to inform other researchers of the exact details of how its authors approached and solved a problem. When discussing a paper as part of a reading group or as background for your own methods, your goal isn’t to reproduce their implementation, but to understand whether and why it is important. Of course, if you end up using their work as part of your own, it will likely be necessary to know the exact details, but when evaluating the approach, answer these two questions first before devoting more time to it.

5. Keep accurate records

Document your process as you go through your work. Even if it is just a short daily summary of what you did, it can be crucial for yourself and others. I have gotten lost in my work more times than I can count, so having a record of the approaches you took and how you solved each problem can save your future self hours. It also helps to have notes in front of you when talking with your advisor so that you can accurately explain what problems you’re having and what progress you have made. Another benefit of all this is that it helps you practice your writing skills, which are a crucial part of being an effective researcher and can often fall through the cracks when most of your time is spent in VS Code.

6. Be confident in your results

Science, to me, is quite intimidating. As individual researchers, we are given the opportunity to contribute to an ever-growing body of work that defines our field. It may span 50, 100, or even thousands of years, and we, humble graduate students, have the opportunity to push the veil of knowledge just a little further. This fact can, at times, come into conflict with the reality of being a graduate student, where the mantra “Publish or Perish” is demonstrated to us daily. We need to publish new works and push the boundary of science forward, but we shouldn’t feel so pressured as to compromise our values to do so. If you put your name on a piece of published work, you are endorsing the results and every word that is part of that paper. If you need to conduct one more test to feel 100% confident in your results, don’t be afraid to run it simply because you are 99% of the way to a paper. The grueling process of finding a meaningful result can make us want to write down what we have so far and say “good enough”, but it simply isn’t. Be confident in what you publish and be prepared to defend your methods, as quality research is always a direct result of rigorous testing and review.