Some tips and tricks for doing Advent of Code with Haskell
As I write this, it is almost time for the yearly Advent of Code again. If you have never done one of those, it is a yearly event where the organizers publish a new programming puzzle every day from the 1st of December until the 25th, with each day being slightly harder than the one before. There is a leaderboard for those who want to try and solve the puzzles as fast as possible, but most people don’t bother. Some use the puzzles to practice a new programming language, to make the most optimized version of the code they can, to make the code “beautiful” in some special way, or even (especially!) just for fun. It is an entirely voluntary activity, so you can do it in any way you want. I usually solve these puzzles in Haskell, as it is the language I enjoy most.
Since every year there seems to be a sizable contingent trying to learn Haskell, I thought it would be nice to share some tips and tricks that I have picked up over the years. This blog post is aimed at intermediate Haskellers; it assumes you understand the basic syntax and have a grasp of what monads and monoids are, but it doesn’t go into super deep type level stuff. It is also not meant to be a style guide for production-grade Haskell code, I merely show off some tricks that have helped me write clearer and more concise code for situations that seem to show up often in Advent of Code problems.
Reducing boilerplate with <$>
Many times, Advent of Code puzzles will involve reading some input from a file and then doing something for each line. A very simple example would be to read a file with an integer on each line and return their sum. Reading a file into a String
is straightforward with the readFile
function, which takes a FilePath
and returns the contents of the file. You can then use lines
to split the string into separate lines, convert the String
to an Integer
with read
and finally sum
them all together. A basic first try would be:
= do
day1part1 <- readFile "input_day1.txt"
contents let fileLines = lines contents
-- Type annotation is needed to let GHC know we want to convert
-- to a list of Integer and not (say) a list of Floats
let integers = map read fileLines :: [Integer]
return $ sum integers
This is probably close to how you would write this in an imperative language and it works fine. However, it is much longer than necessary and uses several intermediate variables that are not really required. One way to shorten this function would be to compose several of the pure functions into one by composing them with the .
operator. We can also apply a function to the result of an IO
operation with <$>
.
= do
day1part1 <- (map read) . lines <$> readFile "input_day1.txt" :: IO [Integer]
integers return $ sum integers
This works because <$>
works for all functors, readFile
is an operation in the IO monad, and all monads are functors. Now it is much clearer what the solution is: just the sum of the lines of the file, after converting them to Integer
s. This is usually short enough for me, but if you wanted to make this a oneliner you could of course also compose the sum
application:
= sum . (map read) . lines <$> readFile "input_day1.txt" :: IO Integer day1part1
Do be careful though. It is easy to make very long function pipelines this way and they can make your code very difficult to comprehend.
Generating coordinates
Many puzzles in Advent of Code involve doing some operation on a grid. This can be a 2D grid, a 3D grid or even a grid with more dimensions than we can easily visualize. Almost always, you need to compute something for all coordinates on the grid. In imperative languages this would usually take the form of nested for-loops, but in Haskell these can be unwieldy. A more functional way of doing this is to define all the coordinates upfront and then map
over them. The simplest way to generate these coordinates is with list comprehensions and with Applicatives.
List comprehensions follow a math-like notation and work a follows:
> [ (x,y) | x <- [1..3], y <- [1..3]]
ghci1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)] [(
Another very straightforward way is with Applicative
s:
> (,) <$> [1..3] <*> [1..3]
ghci1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)] [(
This works by combining the applicative instance for lists and the fact that the tuple constructor (,)
is a normal function taking two values. In the above example, [1..3] <*> [1..3]
will generate a list of all combinations from both lists, then <$>
will feed them to the tuple constructor to generate a list of tuples.
I personally slightly prefer the applicative version, as it is slightly shorter. All that typing does add up over 25 days!
Using laziness to your advantage
A very common pattern in Advent of Code puzzles is to have some state and a step
function of type State -> State
. The question will then usually be something like “after how many iterations will the state have some property?”. Conway’s Game of Life is a great example of this type of question, see for example … and … . A very neat pattern for solving this kind of question is to use the iterate
function. It will take a starting state with some type a
and a function from a -> a
, and (lazily) generate an infinite list by repeatedly applying the function to the latest state:
> take 5 $ iterate (*2) 1
ghci1,2,4,8,16] [
Using iterate step startState
, you can generate a list of all states that the system will reach. Then, the takeWhile
and dropWhile
functions from Data.List
can be used to answer questions about the list of states. For example length . takeWhile verifySomeConditionIsNotSatisfiedYet $ iterate step startState
finds how many iterations it takes until the condition is satisfied (for some function verifySomeConditionIsNotSatisfiedYet :: State -> Bool
of course), while head . dropWhile verifySomeConditionIsNotSatisfiedYet $ iterate step startState
will give you the first state for which the condition is True
.
Parsing the input files with parsec
In Advent of Code problems, the input is often given in the form of a text file that will need some parsing before any useful computations can be done with the information. A good example is day 22 from 2021:
on x=-3..47,y=-33..16,z=-29..20
off x=33..46,y=-26..-13,z=-21..-3
on x=-36..16,y=-33..17,z=-7..42
off x=-20..-10,y=-25..-10,z=-11..6
<many more lines with the same format>
While this particular problem would lend itself well to being parsed with regular expressions, Advent of Code inevitably throws in a few problems each year (2021 day 16 for example) that cannot be solved with a regex. Also of course, if you encounter a problem and decide to use a regex you now have two problems. A more robust solution is to use one of the many parsing libraries available. My personal favorites are the ones based upon the idea of parser combinators, which let you build up a big parser from multiple smaller parsers. This allows for solutions that read very natural to humans, as they mirror the way a human would describe the structure of the text to be parsed. For the example above:
- First the string “on” or “off”, designating the operation to be done;
- A space;
- The string
"x="
followed by a range forx
; - A comma;
- The string “
y="
followed by a range fory
; - A comma;
- The string
"z="
followed by a range forz
.
Where a range is defined as:
1. A signed integer designating the start of the range;
2. The string ".."
;
3. Another signed integer designating the end of the range.
This translates almost directly to Haskell:
data Switch = On | Off deriving (Show,Eq)
data Instruction = Ins {
switch :: Switch,
xRange :: (Int,Int),
yRange :: (Int,Int),
zRange :: (Int,Int)
deriving (Show,Eq)
}
switchOn :: Parser Switch
= try $ string "on" *> pure On
switchOn switchOff :: Parser Switch
= try $ string "off" *> pure Off
switchOff
range :: Parser (Int,Int)
range = do
min <- integer
".."
string max <- integer
return (min,max)
instruction :: Parser Instruction
= do
instruction <- switchOn <|> switchOff
s " x="
string <- range
xr ",y="
string <- range
yr ",z="
string <- range
zr return $ Ins s xr yr zr
Since parsing input files happens so often, I usually define some utility functions for applying a parser to a file and for applying a parser to each line in a file. Both are very straightforward:
-- Parse an entire file with the given Parser.
parseFile :: Parser a -> FilePath -> IO (Either ParseError a)
= parse p fp <$> readFile fp
parseFile p fp
-- Parse each line of a file with the given Parser.
parseFileLines :: Parser a -> FilePath -> IO (Either ParseError [a])
= parseFile p'
parseFileLines p where p' = p `sepBy` endOfLine >>= \ps -> eof >> return ps
If you want to know more about parser combinators, there are a lot of great posts available describing how they work and how to make custom parsers with them.
String vs Text vs ByteString
As a sidenote, Haskell has long had several different types to represent text. The String
type is the oldest and is internally a linked list of Char
s. This makes it easy to consume character by character or to pattern match on the first few characters, but is quite memory inefficient. Text
is the type you want for storing UTF encoded text in a memory efficient manner. Finally, ByteString
is also memory efficient but treats all its contents as just bytes or ASCII characters.
As a rule of thumb, you almost never want to use String
. Human readable strings should almost always be Text
and binary data should always be ByteString
.
Miscellaneous tips and advice
This article is already getting quite long, so I’ll finish with a motley assortment of advice for solving:
- For all but the most basic problems, take a moment to think about the best data structure to represent the problem. When your program needs to remember if you have seen an item before, lists and arrays don’t have the best credentials for inserting and lookup time but a (hash)set might be just what you want. The containers and unordered-containers packages offer a host of handy data structures.
- The fgl package provides a plethora of useful graph algorithms.
- If you are doing a depth first search and the sub-solutions overlap, try memoizing intermediate solutions to prevent having to search an exponentionally growing search space.
- If you have a bug and just want to check the state of a variable at some point,
Debug.Trace
is a great way to quickly do some “printf debugging” even in pure functions where you usually wouldn’t be able to runIO
functions. - Every so often, a question pops up that requires a hexagonal grid. The guide over at Red Blob Games has everything you would ever want to know about hexagonal grids and more.
Conclusion
In this post I wrote up a few tips and tricks to solve problems that come up often in Advent of Code problems. I hope you learned something interesting and even if you didn’t, I hope I managed to inspire you to take part in this years’ edition. Happy coding!