NP Complete Project

Learning Objectives

  • implement a brute-force approach to solve a known NP-complete problem and analyze its runtime complexity
  • present/discuss an NP-completeness proof to a group of colleagues and justify that an approximation method is appropriate
  • create an approximation method for a known NP-complete problem and present its strengths and weaknesses.
  • improve presentation skills (both speaking and slide creation) by giving and receiving peer feedback

Introduction

In this project you will explore a known NP-complete problem and perform the following tasks:

  • Develop an optimal solution – develop code that solves the problem with the optimal answer (no approximation).
  • Problem presentation explain the problem and its reduction to another accepted NP-complete problem.
  • Approximation code – develop code that performs a polynomial approximation
  • Approximation presentation – explain your approach for computing the approximation and illustrates its performance
  • Evaluation of Other Projects – you will submit one peer review of another group’s project.

The sections below detail each of these components. Individual assignments will be created for all submission pieces (or they will be submitted in Github and an assignment will be created in Canvas to represent your grade).

Problem and Group Selection

This is a group project that can have up to 2 members.
It may be necessary to work in a group of 3, and if so a group of this size will be required to complete 2 different approximations (Part C). Your group will select and rank three NP-complete problems that you would like to work on for this project from the list below. You should do some research before selecting one of these problems. Ideally, we will get a nice distribution of projects across these problems.

  • Max 3-Sat
  • Max clique
  • Traveling Saleman Problem (TSP) with a complete graph
  • Longest Path
  • Min Vertex Cover
  • Min Graph Coloring
  • Min Set Cover

I have created example input and output files for each problem. After I receive and review all the group preferences, I will assign each group a single NP-complete problem to work on.

Role Selection and Tasks

Each of the tasks for the project are outlined below. One person in the group will be responsible for Parts A and B, and the other person will perform Parts C and D.

Part A - Exact Solution Code

Develop Python code that provides the optimal solution to the problem you are studying. You must develop test cases of varying sizes and illustrate how the runtime (wall clock) of your program varies for different size input. Your test cases must include at least one instance that causes your program to run for more than 20 minutes.

Make sure you cite in the comments of your program ALL external sources used in the development of your code.

Submission

Create a folder named exact_solution with the GIT repository. Place your code in this folder including a README.txt file that provides instructions and exact commands to run your code. Within the exact_solution folder, create a test_cases subfolder that contains all test cases. Create a shell script named run_test_cases.sh that executes your program on all test cases. Clearly label which test case causes your program to run for more than 20 minutes.
An example set of shell scripts for performing this work is provided (with a credit to Martin Nester for developing these very nice scripts).

Part B - Problem Presentation

Perform a 5 to 6 minute presentation on your group’s work for Part A. This presentation must include:

  • a description of the problem being solved. Show the difference between the decision and optimization version of the problem
  • example inputs and outputs while describing the problem
  • why the problem is important (some applications that use this problem)
  • an explanation of the certifier process being polynomial
  • a reduction from a known/accepted NP-hard problem (with examples graphically shown)
  • an explanation of the coded solution your group has created.
  • an analytical (big \(\mathcal{O}\)) runtime analysis of your coded solution and showcase the code that is the dominant term (the code that drives the highest order term).
  • a wallclock (empirical) runtime analysis of your code on your test cases. This analysis must show input size on the X axis and runtime on the Y axis.

Submission

Create a folder named presentations with the GIT repository. This presentation should use slides in either PowerPoint, Keynote, or Google slides (other slides formats can be requested, but are subject to instructor approval). The slides (or a file with a link to the google slides) should be placed in the git repository. Name this presentation problem_presentation with whatever file extension is appropriate.

Part C - Approximation Solution Code

Develop Python code that provides a reasonable approximation of the solution. Your approximation must run in polynomial time and process large problems (n > 1000) quickly. Some popular strategies are:

  • Make greedy local choices
  • utilize randomness coupled with an anytime algorithm

Your program needs to run without arguments (so that it works on gradescope), but you can incorporate optional command line arguments to augment the function of your program.

Submission

Create a folder named approx_solution with the GIT repository. Place your code in this folder and create a README file that provides instructions on how to run your program. You must create a file named run_examples.sh that successfully runs your program on several test cases you created. One of these test cases must be one where your approximation solution does not achieve the optimal answer. Document notes on your solution in a file named approximation_notes.pdf and at a minimum these notes must addresses the items listed in the rubric below.

Part D - Approximation Presentation

Perform a 5 to 6 minute presentation on your group’s work for Part C. To manage all of the presentations, you will be asked to stop presenting promptly at 6 minutes, so, it is a good idea to practice before class to make sure that your timing works. This presentation must include:

  • pseudocode for the approximation and a discussion of the strategy used (greedy, random, etc.)
  • analytical run time analysis (the worse case \(\mathcal{O}\) time) of your approximation algorithm
  • lower bound analysis of the problem and the approximation
  • plots that illustrate the run-time (wall clock) performance of your exact solution versus the approximation solution on your test cases
  • plots that compare the result/solution of your exact solution versus the approximation on your test cases.

For the data that is used to create run times on your test cases, you should provide these in a script named compute_approx_wallclock.sh. Running this script should run your test cases that you used to gather your data for the plots. Recall that UNIX has a time command that may proof useful in collecting this information.

Submission

Within the folder presentation add a presentation/slides file constructed in PowerPoint, Keynote, or Google slides (other slides formats can be requested, but are subject to instructor approval). The slides (or a file with a link to the google slides) should be placed in the git repository. Name this presentation approximation_presentation with whatever file extension is appropriate.

Rubrics

Common rubric (13 points)

ItemPoints
Team Formation3
Student Participation during in-class work days6
Peer review submission4

Exact solution rubrics (18 points).

ItemPoints
Gradescope tests8
Small test cases are documented and explained to show correctness4
Large test case (more than 20 mins to run) is discussed3
run_test_cases.sh script works and executes all test cases3

Exact solution presentation rubrics (18 or 19 points).

ItemPoints
Problem Description (what, why, applicaitons). Illustrate8
Show proof of NP-Completeness (decision, NP certificate, and reduction). Include a small sample of the reduction4
Longest Path Only (show why taking negative weights on all edges and using Bellman-Ford does not work)1
Sketch Exact Solution3
Discuss test case generation3

Approximation code (18 points).

Discussion points should be in approximation_notes.pdf within the approx_solution folder in the github repo.

ItemPoints
Explain strategy (greedy, anytime, etc.).3
Discuss the analytical runtime analysis of your approximation. Recall it must run in polynomial time6
Program performs “reasonably well” on test cases on Gradescope6
run_test_cases.sh showscases at least 3 test cases (1 with nonoptimal results)3

Approximation Presentation (18 points).

ItemPoints
Explain strategy (greedy, anytime, etc.) and how the code works3
Discuss problem instance where approximation does not achieve the optimal value. Use illustrations.3
Provide wallclock runtime results comparing the exact and approx solutions on one plot.6
Discuss a lower bound for your solution and show the delta to this lower bound (from the value you computed) for large problem instances you created for run_examples.sh6
Last modified April 21, 2024: Update deploy.yml (1c66268)