Advent of PowerShell 2018, pt I

Advent of Code is a fun christmas serial consisting of 50 small programming challenges. Let's solve them using PowerShell! In this post we'll cover the 2 challenges from December 1st

Day 1

Advent of Code has been a delightful christmas pastime of mine for the last two years. According to the website:

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. People use them as a speed contest, interview prep, company training, university coursework, practice problems, or to challenge each other.

You don't need a computer science background to participate - just a little programming knowledge and some problem solving skills will get you pretty far. Nor do you need a fancy computer; every problem has a solution that completes in at most 15 seconds on ten-year-old hardware.

This year I haven't really been following along with the daily challenges, so I thought this would be a nice opportunity to solve them all and invite you along - it might just turn out to be a nice excuse to discuss and showcase some aspects of my favourite language - PowerShell!

Without further ado, let's proceed to sign up for Advent of Code over at https://adventofcode.com (you can do so using your GitHub profile!), and then check out the first challenge!

Challenge 1

For the first challenge we must callibrate a device to its correct frequency.

The test input for December 1st is a list of signed integers and, starting from zero as our "base frequency", we must apply each number as a change to the frequency in order to calculate the correct one.

We're given the following examples of the end result after correct calculation of the frequency:

  • +1, +1, +1 results in 3
  • +1, +1, -2 results in 0
  • -1, -2, -3 results in -6

Even if it might not have been obvious from the description, we can immediately see that we simply need to sum all the frequency changes to calculate the answer - easy peasy, PowerShell already has Measure-Object -Sum built in, so we end up with a lovely one-liner:

Challenge 2

Once we submit the output from our solution to the first challenge, a second challenge based on the same test input is revealed!

This time, we need to apply the changes one by one (repeating the list of changes if necessary), and then looking at the intermediate results find the first frequency that we reach twice.

Basic setup

Alright, this is obviously a bit trickier than before and we'll have to sum the changes ourselves. Let's write a simple but slightly more involved function for this purpose:

param(
  [int[]]$Changes
)

$done  = $false
$index = $sum = 0

do {
  # using the modulo operator (%) we can iterate over the $Changes 
  # array as if it were an infinite circular list
  $sum += $Changes[$index % $Changes.Length]

  # now we just need to inspect $sum and figure out if we're $done

} while(-not $done)

return $sum

Now that we have a mechanism for the ongoing tracking of the sum, we'll need a way to check if it's the second time that we've reached that frequency. Let's try with an array of integers - we add each intermediate sum to the array, and add a check just before to see if the array already contains the sum, if so it must mean that we've seen it once!

param(
  [int[]]$Changes
)

$done  = $false
$index = $sum = 0
$array = @($sum)

do {
  # using the modulo operator (%) we can iterate over the $Changes 
  # array as if it were an infinite circular list
  $sum += $Changes[$index++ % $Changes.Length]

  # now we just need to inspect $sum and figure out if we're $done
  if($array -contains $sum){
    $done = $true
  }

  # keep track of the intermediate sum for the second time
  $array += $sum
} while(-not $done)

return $sum

Initial testing

Now, if we test this solution against the sample inputs, it'll work out just fine:

PS C:\> $changes = +7, +7, -2, -7, -4
PS C:\> .\2.ps1 -Changes $changes
14

But if we then test it against the real input, we'll find that it seems to run forever...

Is += really a good idea...?

If you're a seasoned PowerShell adventurer, you may have already spotted what's going on here: we foolishly tried to add the $sum to the $array variable using the += operator, causing PowerShell to re-size the underlying array.

Alright, we'll simply switch to a List<int> instead! While we're at it, we'll change the $array -contains $sum test to List<int>.Contains() - using the internal containment method will be faster than having PowerShell iterate over the entire list until it finds a match (or not) using -contains:

param(
  [int[]]$Changes
)

$done  = $false
$index = $sum = 0
($array = [System.Collections.Generic.List[int]]::new()).Add($sum)

do {
  # using the modulo operator (%) we can iterate over the $Changes 
  # array as if it were an infinite circular list
  $sum += $Changes[$index++ % $Changes.Length]

  # now we just need to inspect $sum and figure out if we're $done
  if($array.Contains($sum)){
    $done = $true
  }

  # keep track of the intermediate sum for the second time
  $array.Add($sum)
} while(-not $done)

return $sum

On my i7-6700HQ it now solves the puzzle in about 25 seconds - still pretty slow... so what else can we do?

The remaining slowdown is likely due to the contaiment test - as the list grows larger and larger, going through every item takes longer and longer every time. We need a data structure which, just a list or an array, is able to hold multiple items and for which testing if a specific item exists is fast - sounds familiar?

Crank it up to 11

Sounds to me like a we need some akin to a hashtable - but since we don't really need to keep track of any values, we can instead use a HashSet<int>!

A HashSet is an unordered list of unique values - a bit like the Keys of a hashtable - and that is exactly what we need, since we only need to know if we've seen each frequency one.

param(
  [int[]]$Changes
)

$done  = $false
$index = $sum = 0
$array = [System.Collections.Generic.HashSet[int]]::new()

$null  = $array.Add($sum)

do {
  # using the modulo operator (%) we can iterate over the $Changes 
  # array as if it were an infinite circular list
  $sum += $Changes[$index++ % $Changes.Length]

  # now we just need to inspect $sum and figure out if we're $done
  if($array.Contains($sum)){
    $done = $true
  }

  # keep track of the intermediate sum for the second time
  $null = $array.Add($sum)
} while(-not $done)

return $sum

Woah, all of a sudden execution time goes from 25 seconds to ~100ms - a 250x increase in speed!

Pretty sweet - my only worry is that this solution is confusingly verbose - I think we can simplify it significantly...

Shine it up

The keen-eyed will have noticed that we now need to assign some return value from HashSet<int>.Add() to $null - because whenever you attempt to add a new value to the set, it'll tell you whether that value was added or whether it already exists in the set!

We can abuse this to simplify the do{}while loop condition - if a frequency value has previously been seen, it means we've reached the end result and we can end the loop as soon as HashSet<int>.Add() returns $false!

param(
  [int[]]$Changes
)

$index = $sum = 0
$array = [System.Collections.Generic.HashSet[int]]::new()

$null  = $array.Add($sum)

do {
  # using the modulo operator (%) we can iterate over the $Changes 
  # array as if it were an infinite circular list
  $sum += $Changes[$index++ % $Changes.Length]
} while($array.Add($sum))

return $sum

Finally we can trigger the loop conditional expression by using a while{} loop instead and then all we need to do is give our variables a little more meaningful names and we have ourselves a lovely solution to the second puzzle!