What's your favorite resource on how to write code in research? What are the research-code-specific equivalents of Rich Hickey's talks or SPJ's posts or the many many SWE blogposts posted to HN?
My favorite research code tends to look like the mathematics it implements. And that's really hard to do well. You need to pick abstractions that are both efficient to compute and easy to modify as the underlying model changes. My favorite research code also does the reader a lot of favors (e.g., it documents the shape of the data as it flows through the code and uses notation consistent with the writeup or standard conventions in the field).
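To make "looks like the mathematics" and "documents the shape of the data" concrete, here is a minimal sketch in Python with NumPy. The choice of ridge regression, the function name `ridge_fit`, and the symbols `X`, `y`, `lam` are my own illustrative assumptions, not something from a particular codebase.

```python
import numpy as np


def ridge_fit(X: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y.

    Shapes: X is (n, d), y is (n,), and the returned w is (d,).
    The names X, y, lam, w are chosen to mirror the formula above.
    """
    n, d = X.shape
    assert y.shape == (n,), f"expected y of shape ({n},), got {y.shape}"

    A = X.T @ X + lam * np.eye(d)  # (d, d)
    b = X.T @ y                    # (d,)
    # Solve A w = b rather than forming the inverse explicitly.
    return np.linalg.solve(A, b)   # (d,)
```

The docstring carries the equation, the variable names match it symbol for symbol, and each intermediate line has its shape written next to it, so a reader can check the code against the writeup without running it.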
Industry research code... I'm happy to see basic things. Version control (not a bunch of Jupyter notebooks). Code re-use (not copy+paste the same thing 20x). Separation of config and code (don't litter dozens of constants throughout thousands of lines of code). Functions < 1000 lines apiece. Meaningful variable names. Comments that link the theory to the code when the code has to be complicated.
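As one hedged illustration of separating config from code, the sketch below keeps every tunable constant in a single dataclass and lets the rest of the code read from it. The class `ExperimentConfig`, its fields, and the helpers `load_config` and `total_examples_seen` are hypothetical names made up for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """All tunable constants live here, not scattered through the code."""
    learning_rate: float = 1e-3
    batch_size: int = 32
    num_steps: int = 1000
    seed: int = 0


def load_config(text: str) -> ExperimentConfig:
    """Build a config from a JSON string; missing keys keep their defaults."""
    return ExperimentConfig(**json.loads(text))


def total_examples_seen(cfg: ExperimentConfig) -> int:
    """Example of code that reads from the config instead of hard-coding numbers."""
    return cfg.batch_size * cfg.num_steps


cfg = load_config('{"batch_size": 64}')
print(json.dumps(asdict(cfg)))   # log the full config alongside the results
print(total_examples_seen(cfg))
```

Because the dataclass is frozen, a run cannot silently mutate its own settings, and dumping `asdict(cfg)` next to the outputs records exactly which values produced them.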
Overall it's probably most helpful to find a researcher in your field whose code you like to read, and copy the best aspects of that style. And ask readers of your code for feedback. I really enjoy reading Karpathy's code (not my field), but that may be an exception because a lot of what I've read is intended to teach a more or less codified approach, rather than act as a testbed for iteration in a more fluid design space.
Researchers who think their code is "throwaway" dramatically limit their reach.