The official editorial provides a thorough solution for this problem, but here are a few details to pay attention to.

Doing a Floyd-Warshall at the start of the algorithm is necessary because, although the bitmask DP enumerates the next unvisited node, it may be optimal or even necessary to go through a node we've already visited in order to get there.

For instance, consider the following adjacency list:

1 -> 2
2 -> 1
1 -> 3

To get from node 2 to node 3, we have to pass through node 1 no matter what.

Also, it may seem more intuitive to define the bitmask DP as dp[mask][i], where mask represents the nodes we've currently visited (in reverse order), and i represents the node we're currently at.

The issue with this definition, however, is that our state has no idea what the final set of visited nodes is, meaning we have no way of accurately computing the effect of traversing an edge.

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!