2018-ACM-ICPC-SoCal-Regional

We went to the SoCal ICPC regional and got 3rd. Here's our code and a brief postmortem. One disclaimer: I hand-copied the code from a Linux box running off of a USB drive that we got after the contest, since our login didn't work for some reason. So some code is probably typed wrong, although I'm fairly sure it'll all compile and run (but maybe not produce the right answers).

Final standings

Problems

Postmortem:

(it's basically two weeks late, so take it with a grain of salt)

Going into the competition we thought it was going to be 10 questions, so we split the initial reading 3-4-3, with Victor reading the first segment, Sean the middle, and me the end. It was actually 11, so I just took the last four instead.

Initial impressions:

(feel free to fill in your own impressions here)

I personally started reading from 11 backwards through 8 for some reason. 11 looked a bit like a knapsack, so I left it for later. 10 was number theory, and we'd had some tricky number theory in recent practices, so I tabled that one as well. 9 stuck out as comparatively easy, just a straightforward simulation. Finally, 8 seemed doable but with some very annoying input -- I naively assumed that if the question had tricky input, the algorithm was probably pretty simple (how wrong I was).

Problem-by-problem, chronologically (our solve order: 9, 6, 1, 10, 11, 2, 3, 7, 5, (4), ((8))):

It didn't seem like either Victor's or Sean's segments had any particularly simple questions at the time, so I just hopped on the computer to start on 9. The algorithm was in fact as easy as I had predicted (read in the probability associated with each letter, then multiply the letter probabilities in a word to find the total probability of reading it correctly, since the per-letter events are independent), but unfortunately a few implementation details prevented me from getting it until very late in the contest. First, the sample test case in the packet was extraordinarily misleading -- the first word, "PR0GRAM", has a 0 in the middle instead of an O, which had me absolutely convinced that I was misreading the problem statement somehow. Then, reading the input turned out to be a nontrivial task in C++. If you look at 9.cpp, you'll see my attempted workaround, which fails if the first word after the letter probabilities happens to also be one character long. I eventually recoded it in Python, which immediately fixed it, but in the meantime I gave the computer up to Sean to code 6.
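The computation itself is tiny once the input cooperates. Here's a minimal sketch of the idea; the input handling is stripped out and the probability table is invented for illustration, so don't read it as our actual solution:

```python
# Hypothetical sketch of the problem-9 computation. Per-letter misreads
# are independent, so a word's probability is the product of its
# letters' probabilities. The table below is made up.
def word_probability(letter_prob, word):
    p = 1.0
    for ch in word:
        p *= letter_prob[ch]
    return p

letter_prob = {"P": 0.99, "R": 0.98, "0": 0.95, "G": 0.97, "A": 0.99, "M": 0.96}
print(word_probability(letter_prob, "PR0GRAM"))  # note the 0, as in the sample
```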

6 was made a lot easier in Python, where int(s) can take an optional second argument: the base to interpret the string in. The solution is a brute force over the two bases for the two integers, with some pruning to avoid erroring out when a string isn't a valid numeral in a particular base. All in all, I probably should have ceded the computer to him earlier, instead of bashing against 9 for so long (and I should have started in Python instead of fighting C++'s input). Either way, we got the first two questions done in the first hour, which put us fairly far down the standings, but still top 10 if I recall correctly.
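Something along these lines; the exact goal (a pair of bases making the two numerals equal in value) and the 2-36 base range are my reconstruction, not a transcript of Sean's code:

```python
# Sketch of the base brute force. int(s, b) parses s as a base-b numeral
# and raises ValueError on an out-of-range digit -- that's the pruning.
def equal_bases(s1, s2):
    for b1 in range(2, 37):
        for b2 in range(2, 37):
            try:
                if int(s1, b1) == int(s2, b2):
                    return b1, b2
            except ValueError:
                continue  # invalid numeral in this base, skip it
    return None

print(equal_bases("12", "5"))  # (3, 6): "12" in base 3 == "5" in base 6 == 5
```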

In the meantime, Victor had been working on 1, which he finished pretty quickly after a brief mishap with the "No mismatches" case. Essentially the algorithm goes like this: store the image names in a set for quick access; then, for each backup file name, scan backwards for the second-to-last underscore (since the image name can itself include underscores, which messed up at least one other team from our school) and add the full backup file name to a dictionary mapping image names to lists of corresponding backup files. From there it's straightforward to see which image names have no backup files and which backup files are orphaned. Another question that's much easier in Python because of its input format.
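A sketch of that matching with made-up file names; the one structural assumption (implied by the second-to-last-underscore detail) is that a backup name is the image name plus two trailing underscore-separated fields:

```python
# Hypothetical sketch of the problem-1 matching. Assumes backup names look
# like "<image name>_<field>_<field>", where the image name may itself
# contain underscores -- hence rsplit on the last two underscores.
from collections import defaultdict

images = {"summer_trip", "cat", "dog"}
backups = ["summer_trip_2018_01", "cat_2018_02", "orphan_2018_03"]

by_image = defaultdict(list)
for b in backups:
    image = b.rsplit("_", 2)[0]  # drop the last two underscore fields
    by_image[image].append(b)

unbacked = [img for img in images if img not in by_image]
orphaned = [b for img, bs in by_image.items() if img not in images for b in bs]
print(unbacked, orphaned)  # ['dog'] ['orphan_2018_03']
```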

At the time, 10 and 11 seemed like the next-most approachable, so Sean was working on 11 using DP and I was looking at a brute-force method for 10. As always, brute force is much easier to code than DP, so I took the computer next to do 10. The problem essentially boils down to quickly finding the prime factors of a number, which I did by first sieving for all primes less than 2,000,000 and then using that list to factor a, b, and a+b=c -- rad(a * b * c) = (product of the distinct primes of a) * (... of b) * (... of c), as long as a and b don't share any prime factors. I wasn't quite sure about the running time of a sieve, but I was fairly certain that memoizing the list of primes across all the runs in a test case would help. To be honest, I really should have thought more about the time complexity of the algorithm as a whole, because as written it looks extremely close to TLEing. A simple prune would be to cut off the trial division once the current prime exceeds √n (anything left over is itself prime), but even without it the list of primes is fairly short (~150,000 entries), so it really shouldn't be a problem. Also, for some reason the contest time limits were ~30 seconds, so there were essentially no problems with TLE.
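For reference, a sketch of that approach with the √n prune included; the driver values at the bottom are illustrative:

```python
# Sketch of the problem-10 approach: sieve once, then factor a, b, and
# a + b by trial division over the prime list. Assumes gcd(a, b) = 1, as
# noted above, so no prime is counted twice across a, b, and c.
def sieve(limit):
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return [p for p in range(limit + 1) if is_prime[p]]

PRIMES = sieve(2_000_000)  # memoized across every run in the test case

def distinct_primes(n):
    out = []
    for p in PRIMES:
        if p * p > n:
            break  # the prune: whatever remains is itself prime
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
    if n > 1:
        out.append(n)
    return out

def rad_abc(a, b):
    r = 1
    for n in (a, b, a + b):
        for p in distinct_primes(n):
            r *= p
    return r

print(rad_abc(5, 27))  # rad(5 * 27 * 32) = 5 * 3 * 2 = 30
```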

I was slightly on the wrong track when I judged 11 to be a knapsack problem. Sean (being our resident math major) quickly saw the DP approach: first fill a 2D DP array where ways[n][v] = the number of ways to roll a total of v with n (distinguishable) dice, then brute-force over every subset of the initially-rolled dice to count how many ways there are to reroll that subset and have the resulting total hit the target T. If there are (# ways) to reroll a subset and reach the target, the probability of doing so is (# ways) / 6^(size of subset). Ultimately you choose the largest of these probabilities, breaking ties by smallest subset size. Personally, I was worried at first about the time complexity, but it fits incredibly neatly: filling the DP array takes O(K^3) with a large constant but is still quite fast, and the brute force takes O(K * 2^K), which dominates the running time but is still under a billion operations and certainly fast enough against a 30-second time limit. This solution seemed particularly tricky to debug, but Sean got it in quite quickly. In fact we solved 11 the fastest of any team, and only 8 teams out of 100 got it at all. This was right at 2 hours, and we had solved 5 questions total, putting us in the top 5 at the time.
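A sketch of the whole thing; the dice values and target below are invented:

```python
# Sketch of the problem-11 DP plus subset brute force. ways[n][v] counts
# ordered rolls of n six-sided dice summing to v; ties between equal
# probabilities go to the smaller subset because sizes are tried in order.
from itertools import combinations

def best_reroll(dice, T):
    K = len(dice)
    ways = [[0] * (6 * K + 1) for _ in range(K + 1)]
    ways[0][0] = 1
    for n in range(1, K + 1):
        for v in range(n, 6 * n + 1):
            ways[n][v] = sum(ways[n - 1][v - f] for f in range(1, 7) if v >= f)

    best = (-1.0, None)  # (probability, subset of dice to reroll)
    total = sum(dice)
    for size in range(K + 1):
        for subset in combinations(range(K), size):
            need = T - (total - sum(dice[i] for i in subset))
            if 0 <= need <= 6 * size:
                p = ways[size][need] / 6 ** size
                if p > best[0]:
                    best = (p, subset)
    return best

print(best_reroll([3, 5, 1], 10))  # reroll the 3; P(rolling a 4) = 1/6
```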

In the meantime, Victor and I had been working on 2 and 3. For 2, the main difficulty was just hardcoding all of the different cases: each input had at most 4000 words, so an O(n^2 * (max word length)) solution would be well under the time limit. Just loop through each pair of words, check whether they are similar, and implement each similarity check in the simplest (to code) way possible. One simplification: a deletion from one word is the same as an insertion into the other, so only three checks per word pair are necessary.
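I don't have the statement in front of me, so the three checks below (one substitution, one insertion/deletion, one adjacent swap) are illustrative rather than a quote of our solution, but the shape of the brute force is the same:

```python
# Sketch of an O(n^2 * max word length) similarity brute force. The exact
# edit rules are assumed; note that checking insertion in both directions
# also covers deletion, per the simplification above.
def one_substitution(a, b):
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def one_insertion(a, b):  # b is a with one extra character
    return (len(b) == len(a) + 1 and
            any(b[:i] + b[i + 1:] == a for i in range(len(b))))

def one_swap(a, b):
    return (len(a) == len(b) and
            any(a[:i] + a[i + 1] + a[i] + a[i + 2:] == b
                for i in range(len(a) - 1)))

def similar(a, b):
    return (one_substitution(a, b) or one_insertion(a, b)
            or one_insertion(b, a) or one_swap(a, b))

words = ["cat", "cart", "cot", "act"]
print([(w1, w2) for i, w1 in enumerate(words)
       for w2 in words[i + 1:] if similar(w1, w2)])
```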

3 was quite intimidating at first -- for good reason; only 6 teams got it. Looking back at the contest, Victor and I both agreed that it was probably the best problem. Admittedly that's extremely subjective, but considering how much of the test was brute force this year, it was nice to have a question that required actual algorithmic thinking. When I first started brainstorming with Victor, he immediately brought up binary searching on the distance required of the two companies, which leads to a valid solution if we can quickly (in ~O(n^2)) determine, for any given D, whether the customer set can be partitioned into two groups, each with diameter at most D. Abstract away the specifics by connecting two vertices (customers) with an edge whenever the Manhattan distance between them is at most D; the problem then boils down to whether a graph can be split into two complete graphs. The final insight is to consider the complement graph: instead of requiring that the two subgraphs be complete, we now require that there are no edges between members of the same subgraph. But wait, that's just asking whether the complement graph is bipartite! Start 2-coloring it, and if you ever have to assign a vertex two different colors, it isn't bipartite. A straight BFS runs in O(time to construct the complement graph) + O(BFS) = O(n^2) + O(|E|) = O(n^2), and combined with the outer binary search, the entire algorithm runs in O(n^2 log n).
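Putting the pieces together, assuming the objective is the smallest D for which such a split exists (the coordinates below are made up):

```python
# Sketch of problem 3: binary search on D over the distinct pairwise
# Manhattan distances; for each D, 2-color the complement graph (edges
# between customers farther apart than D) to test bipartiteness.
from collections import deque

def splittable(pts, D):
    n = len(pts)
    adj = [[j for j in range(n) if j != i and
            abs(pts[i][0] - pts[j][0]) + abs(pts[i][1] - pts[j][1]) > D]
           for i in range(n)]
    color = [-1] * n
    for s in range(n):
        if color[s] != -1:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if color[v] == -1:
                    color[v] = color[u] ^ 1
                    q.append(v)
                elif color[v] == color[u]:
                    return False  # odd cycle: not bipartite
    return True

def min_diameter(pts):
    dists = sorted({abs(a[0] - b[0]) + abs(a[1] - b[1])
                    for a in pts for b in pts})
    lo, hi = 0, len(dists) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if splittable(pts, dists[mid]):
            hi = mid
        else:
            lo = mid + 1
    return dists[lo]

print(min_diameter([(0, 0), (0, 10), (10, 0), (10, 10)]))  # 10
```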

Continuing the graph theme, Sean took problem 7 as I began to wonder whether we could actually win the whole contest. For 7, the number of geometric figures is just the number of connected components, and if every vertex in a component has degree 2, that component is also a polygon. To count connected components, process each edge as follows: if you've seen neither endpoint before, both belong to a new component; if you've seen one but not the other, the other joins the one's component; and if you've seen both, either do nothing (they were already in the same component) or merge the two components (move every vertex of one component into the other). It was a bit unfortunate that we got to this question so late, because it ended up being much more straightforward than I had anticipated.
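The same bookkeeping falls out naturally from a small union-find; the edge list here is made up:

```python
# Union-find rendition of the component merging described above, plus the
# all-degrees-equal-2 test for polygons.
from collections import defaultdict

def figures_and_polygons(edges):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        parent[find(u)] = find(v)  # merge the endpoints' components

    components = defaultdict(list)
    for node in list(parent):
        components[find(node)].append(node)

    polygons = sum(all(degree[n] == 2 for n in nodes)
                   for nodes in components.values())
    return len(components), polygons

# a triangle (a polygon) and an open path (not one)
print(figures_and_polygons([(1, 2), (2, 3), (3, 1), (4, 5), (5, 6)]))  # (2, 1)
```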

At this point, only 4, 5, and 8 remained unsolved; we had two hours left to solve as many of those as we could. Victor and I brainstormed for a while and came up with an algorithm for 4:
