Semester Project
Goals
- Apply course concepts to a larger, open-ended project.
- Gain parallel/distributed systems experience that you can draw from in your career.
- Improve your confidence in dealing with complex and ill-defined problems.
- Gain an improved understanding of academic research.
Description
Academic computer science research may be a bit different from the sorts of research that you may have done in high school or other undergraduate courses. Academic research is not so much about finding or aggregating existing information; it is about discovering and creating new knowledge. Research is the process of extending the general body of human knowledge, improving our lives in both direct and indirect ways. As you approach the end of your time as an undergraduate student, it is important to understand what research is and how it works, particularly if you plan to go to graduate school.
In this project, you will gain a greater understanding of research while reinforcing general course concepts by doing a small-scale research project ("small" relative to an M.S. thesis or a Ph.D. dissertation; it is still the most significant project you'll work on this semester). This course is particularly well-suited to doing a research project because so many of the topics we cover are related to open research questions.
Guidelines
Below are some general guidelines for the semester-long research project.
- Find a topic you are personally interested in that is related to our course topics.
- Find two or three like-minded students in the class to work with.
- Include software development, rigorous experimentation, and analysis of results.
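- Avoid naturally parallel (often called "embarrassingly parallel") problems if possible; try to find something non-trivial.
- Prefer distributed-system applications where you can measure both strong scaling (fixed problem size, more processors) and weak scaling (problem size grows with the processor count).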
- Do some background research and see what others have already done.
- Find a faculty member (possibly from another department) who can answer domain questions.
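The strong and weak scaling mentioned in the guidelines above are usually reported as speedup and efficiency. The following is a minimal Python sketch of those calculations, using the common textbook definitions. The function names and the timings in the usage section are hypothetical placeholders, not measurements from any real system; substitute your own wall-clock times.

```python
def speedup(t_serial, t_parallel):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def strong_scaling_efficiency(t_serial, t_parallel, num_procs):
    """Strong scaling: problem size is fixed while the processor count grows.

    Ideal efficiency is 1.0 (run time drops by a factor of num_procs).
    """
    return t_serial / (num_procs * t_parallel)


def weak_scaling_efficiency(t_base, t_scaled):
    """Weak scaling: problem size grows in proportion to the processor count.

    t_base is the time for one unit of work on one processor; t_scaled is the
    time for num_procs units of work on num_procs processors.
    Ideal efficiency is 1.0 (run time stays constant).
    """
    return t_base / t_scaled


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print("speedup:", speedup(100.0, 25.0))
    print("strong efficiency:", strong_scaling_efficiency(100.0, 25.0, 8))
    print("weak efficiency:", weak_scaling_efficiency(10.0, 12.5))
```

Reporting efficiency alongside raw run time makes it easier to compare results across different processor counts in your writeup.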
It is impossible to build a comprehensive grading rubric for these projects because they are so highly individualized, and not every group need aim for an 'A' grade to be successful. However, here are some general guidelines for a successful project at each grade level:
- A level (novelty) - This project demonstrates a significant and well-executed new implementation and/or analysis of a parallel or distributed system. This might include a novel system or variation on an existing system, and it should apply many of the concepts discussed in class. The writeup will include a thorough analysis and discussion of performance and scaling characteristics as well as any other relevant topics. In general, a project like this is a good candidate for publication.
- B level (insight) - This project demonstrates significant insights into a parallel or distributed system. This may or may not include new implementations, but it should apply many of the concepts discussed in class. The writeup will include a thorough analysis and discussion of performance and scaling characteristics as well as any other relevant topics.
- C level (parallelism) - This project demonstrates a parallel or distributed system, using at least two or three concepts discussed in class. The writeup will include a discussion of performance and scaling characteristics as well as any other relevant topics.
There are a few examples of exceptional projects on my research website, and there are some more examples posted in the Files tab on Canvas.
Most importantly: DON'T PANIC! In the past, this project has caused a lot of undue stress, primarily because it is so open-ended. However, succeeding at an open-ended project is a skill that will likely be crucial at some point in your career. This is an opportunity to work on that skill in a relatively low-stakes context. I encourage you to think of this project as an opportunity rather than an obstacle as much as possible.
Further, I am here to help you through this process. In academic research, students always work closely with an advisor. In this course I will assume that role for all of you, although I will not be able to work as closely with each individual as I would on a normal research project. In the past, the groups that performed the best have been the ones that stayed in frequent contact with me via email and office hour visits. If you start work early, schedule time weekly to meet and work with your group, leave time to deal with setbacks, and communicate regularly with me, your project will be successful.
Ideas
Below are some general ideas for the semester-long research project.
- Work with a faculty member (try math or chemistry!) to develop a new parallel code or parallelize an existing serial code.
- Parallelize an existing serial code from another source.
- Perform a performance analysis and optimize an existing parallel code.
- Design and implement or extend software that operates on a distributed system (e.g., our cluster or the Internet at large).
- Design and implement or extend a cloud-based system.
- Design and implement or extend a highly parallel GPU-based code.
- Compare and analyze at least three existing algorithms or approaches to solving a problem in parallel.
- Perform floating-point analysis on a given scientific code, preferably incorporating scaling effects. See Dr. Lam for more information.
- Study variability in floating-point accuracy in heavily-multithreaded code. See Dr. Lam for more information.
- Extend a previous class project. See the Files tab in Canvas or Dr. Lam for more information.
- Deploy and use Horovod for TensorFlow/Keras/PyTorch on the cluster.
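As a starting point for the floating-point ideas above, the Python sketch below shows why the order of summation matters. Floating-point addition is not associative, so combining per-thread partial sums can give a different answer than a single serial loop. The function names and the input values are illustrative assumptions, and the chunking only imitates how threads might divide the work; no real threads are used.

```python
def naive_sum(values):
    """Accumulate left to right with plain floating-point addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, num_chunks):
    """Sum each chunk separately, then combine the partial sums.

    This imitates a parallel reduction in which each thread sums its own
    slice of the data before the partial results are added together.
    """
    chunk_size = (len(values) + num_chunks - 1) // num_chunks  # ceiling division
    partials = [
        naive_sum(values[i:i + chunk_size])
        for i in range(0, len(values), chunk_size)
    ]
    return naive_sum(partials)


if __name__ == "__main__":
    # Hypothetical input mixing very large and very small magnitudes.
    values = [1e16, 1.0, -1e16, 1.0]
    print("serial: ", naive_sum(values))
    print("chunked:", chunked_sum(values, 2))
    # Regrouping the same three numbers changes the rounded result.
    print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))
```

An explicit loop is used instead of the built-in `sum` because recent Python versions apply compensated summation to floats, which would hide the effect being demonstrated.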