Introduction to state repair in Terraform

Sometimes you want to change Terraform code without destroying and re-creating resources. The following use case might come up: you create an S3 bucket that serves as the input bucket for some other party. The other party is supposed to upload data into that bucket to be processed by your application. The following code might be used to create the bucket:

resource "aws_s3_bucket" "bucket" {
  bucket = "4bc627ca-23f7-41ab-a9b0-d800128beb56"
}

After an apply the bucket is created. Let’s see the current state:

$ terraform state list
aws_s3_bucket.bucket

Now that the bucket exists, you tell the other party the bucket name so that they know where to upload data. We ignore policy setup here. After some thinking you conclude it is time to rename the Terraform resource from bucket to input_bucket, as that is more appropriate for its purpose. So you change the name and do an apply:

  # aws_s3_bucket.bucket will be destroyed
  - resource "aws_s3_bucket" "bucket" {
    - bucket = "4bc627ca-23f7-41ab-a9b0-d800128beb56" -> null
    ... 
  }
 
  # aws_s3_bucket.input_bucket will be created
  + resource "aws_s3_bucket" "input_bucket" {
    + bucket = "4bc627ca-23f7-41ab-a9b0-d800128beb56"
    ...
  }
 
Plan: 1 to add, 0 to change, 1 to destroy.

This change destroys the old bucket and re-creates it. For a brief moment the bucket name is available again for others to claim, because all S3 bucket names share one global namespace. We want to avoid that scenario even though it is unlikely.

To reconnect the existing bucket with the new resource name, we need to move the item in the Terraform state:

$ terraform state mv aws_s3_bucket.bucket aws_s3_bucket.input_bucket
Move "aws_s3_bucket.bucket" to "aws_s3_bucket.input_bucket"
Successfully moved 1 object(s).

A listing of the state now shows:

$ terraform state list
aws_s3_bucket.input_bucket

The resource is now tracked under the new name. Let’s see what happens when we try an apply:

$ terraform apply
aws_s3_bucket.input_bucket: Refreshing state... [id=4bc627ca-23f7-41ab-a9b0-d800128beb56]
 
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

We avoided re-creating the resource and potentially losing the bucket name!
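
On Terraform 1.1 or later, the same rename can also be recorded in the configuration itself with a moved block, so the move happens as part of the next plan and apply instead of a manual state command. The following is a minimal sketch under that version assumption; the resource and bucket names are the ones from the example above.

```hcl
resource "aws_s3_bucket" "input_bucket" {
  bucket = "4bc627ca-23f7-41ab-a9b0-d800128beb56"
}

# Tells Terraform that the object formerly tracked as aws_s3_bucket.bucket
# is now aws_s3_bucket.input_bucket, so the plan shows a move
# rather than a destroy followed by a create.
moved {
  from = aws_s3_bucket.bucket
  to   = aws_s3_bucket.input_bucket
}
```

This variant is handy when several people or several state files use the same configuration, because nobody has to remember to run the state command by hand.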

Multi-Account Terraform

Software projects for the cloud often involve multiple accounts. A typical setup associates a root account with multiple secondary accounts. The root account holds the Terraform state for all infrastructure on the secondary accounts. The secondary accounts commonly represent stages for development, testing and production.

Let’s say you are working with AWS and use an S3 bucket to hold the state. The following Terraform code manages this:

terraform {
    required_providers {
        aws = {
            source = "hashicorp/aws"
            version = "~> 3.29"
        }
    }
  
    backend "s3" {
        bucket = "mybucket"
        key = "terraform.tfstate"
        region = "eu-central-1"
    }
}
  
provider "aws" {
    profile = var.profile
    region = "eu-central-1"
}

In a basic scenario with just one account, Terraform simply uses the current AWS profile, normally provided via the environment variable AWS_PROFILE. In a multi-account scenario we want to use the S3 bucket on the root account, while the infrastructure should be built on the secondary account. Terraform can be told about the two accounts like this:

export AWS_PROFILE='root-account-profile'
terraform apply -var 'profile=secondary-account-profile'

The backend block in Terraform does not allow variables. Therefore we set AWS_PROFILE to the root account profile, which then applies to the backend block. The secondary account profile is passed in as a variable. We have chosen a -var parameter on the command line, but you could use -var-file to pass multiple variables in one file.

This works fine until you need to build the next secondary account. You need another state file in the S3 bucket to hold the state of the new account. This is what workspaces are for. At the moment we are in the default workspace, which always exists. Export the variable AWS_PROFILE if you have not done so already. Show the current workspaces:

> terraform workspace list
* default

Let’s create a new workspace and use that for the new secondary account:

> terraform workspace new prod
Created and switched to workspace "prod"!
 
You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.

Now that we have another separate state we can repeat the apply for a different secondary account:

terraform apply -var 'profile=another-secondary-profile'

To switch back to another workspace use select and the workspace name:

terraform workspace select default

You can now manage a cloud infrastructure setup with a root account and several secondary accounts using Terraform and workspaces.
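
Passing the matching profile on every apply is easy to get wrong once there are several workspaces. One possible refinement, sketched here as an assumption rather than as part of the setup above, is to derive the profile from the selected workspace via terraform.workspace. The local name workspace_profiles is hypothetical; the profile names are the ones used in the commands above.

```hcl
locals {
  # Hypothetical mapping from workspace name to AWS CLI profile.
  workspace_profiles = {
    default = "secondary-account-profile"
    prod    = "another-secondary-profile"
  }
}

provider "aws" {
  # Picks the profile that belongs to the currently selected workspace.
  profile = local.workspace_profiles[terraform.workspace]
  region  = "eu-central-1"
}
```

With such a mapping the -var parameter is no longer needed, and selecting a workspace that has no entry fails during planning instead of silently using the wrong account. The backend still takes its credentials from AWS_PROFILE as before.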

Spock: Nothing to test here

A reminder why a test should fail first. The following test is green:

def "test absolutely nothing"() {
 
    when:
    def list = [1, 2, 3]
 
    then:
    list.each {
        it > 0
    }
}

The following test is also green:

def "test absolutely nothing"() {
 
    when:
    def list = [-1, -2, -3]
 
    then:
    list.each {
        it > 0
    }
}

The Spock framework documentation states for then and expect blocks:

Except for calls to void methods and expressions classified as interactions, all top-level expressions in these blocks are implicitly treated as conditions.

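The comparison it > 0 sits inside a closure, so it is not a top-level expression and its result is simply discarded. To repair the test we have to assert ourselves inside the closure: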

assert it > 0

Kotlin’s return

Recently I tried out Kotlin and was a bit surprised by its return expression. If you have a Java background like me, this might hit you, too.

Look at the following function which does not work as intended:

fun sum(numbers: List<BigDecimal>): BigDecimal {
    return numbers.fold(BigDecimal.ZERO, { acc, value ->
        return acc + value
    })
}

The fold function accepts an initial value and feeds it, together with an element of the list, into the lambda. The result of the lambda becomes the accumulated value for the next element, and fold iterates through the whole list until no more elements are left. This example function clearly wants to sum all elements of the list and return the result.

Now, let’s test-drive it with a couple of calls.

println(sum(listOf()))
println(sum(listOf(BigDecimal(3))))
println(sum(listOf(BigDecimal(3), BigDecimal(7))))
println(sum(listOf(BigDecimal(9), BigDecimal(7))))

The output is:

0
3
3
9

It seems ok at first, but the last two calls got wrong results. The outcome hints that the function did not iterate through the whole list. Instead it stopped after the first element.
Have another look at the function and you see there are two different return expressions. The innermost return is inside a lambda, and on the first call it ends the enclosing function, not just the lambda. From the Kotlin documentation:

– return. By default returns from the nearest enclosing function or anonymous function.

So, the intention of returning the result led us to use return, but this ended the whole function prematurely.
How to repair this? Just remove the inner return as lambdas return the last expression anyway:

fun sum(numbers: List<BigDecimal>): BigDecimal {
    return numbers.fold(BigDecimal.ZERO, { acc, value ->
        acc + value
    })
}

Now it works as intended. There is more than one way to patch it up. Sometimes you really want to jump out on the spot and you can.
Kotlin allows a qualified return. Put a label at the lambda and tell the return about it:

fun sum(numbers: List<BigDecimal>): BigDecimal {
    return numbers.fold(BigDecimal.ZERO, accumulator@ { acc, value ->
        return@accumulator acc + value
    })
}

Most often you do not even need to label it yourself but can use an implicit label, which is the name of the function the lambda is passed to:

fun sum(numbers: List<BigDecimal>): BigDecimal {
    return numbers.fold(BigDecimal.ZERO, { acc, value ->
        return@fold acc + value
    })
}
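
The documentation quote above also mentions anonymous functions. As a further variant, sketched here with the hypothetical name sumWithAnonymousFunction, the lambda can be replaced by an anonymous function. A bare return inside it leaves only the anonymous function, so it behaves the way a Java developer would expect:

```kotlin
import java.math.BigDecimal

// Variant using an anonymous function instead of a lambda.
// A bare return inside an anonymous function returns from that
// function only, so fold keeps iterating over the whole list.
fun sumWithAnonymousFunction(numbers: List<BigDecimal>): BigDecimal {
    return numbers.fold(BigDecimal.ZERO, fun(acc: BigDecimal, value: BigDecimal): BigDecimal {
        return acc + value
    })
}
```

Calling sumWithAnonymousFunction(listOf(BigDecimal(3), BigDecimal(7))) yields 10. The price is more verbose syntax, since the parameter types have to be spelled out here.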

Getting Started with Envers in Grails

Envers is a natural choice for revisioning your Grails domain classes. Getting it to work with Grails is not hard, but reliable information about the setup is scattered. The first obstacle is to use the correct dependency setting (Grails 3.1.0):

compile('org.hibernate:hibernate-envers:4.3.11.Final') {
    transitive = false
}

The transitive dependencies somehow break things: you would get weird exceptions about transactions. So we need to disable them.

Domain classes are considered for auditing when they are annotated with @Audited.

@Audited
class Machine {
    String name
    Date purchaseDate
}

A class like the one above spawns two tables: Machine and Machine_Aud. Current data lives in the Machine table. Older revisions are stored in Machine_Aud and can be queried with Envers’ own query tools. Each time you save, modify or delete a machine, the change is recorded in the revision table.

Now let’s assume that your domain model dictates that the purchaseDate of a machine never changes once it is set. Writing a value that is never touched into every revision again and again would be a waste. To exclude a single field from revisioning, annotate its getter with @NotAudited. Annotations on fields do not seem to have any effect, which is why we add a getter just to carry the annotation.

@Audited
class Machine {
    String name
    Date purchaseDate
 
    @NotAudited
    Date getPurchaseDate() {
        return purchaseDate
    }
}

Machine_Aud now records only name and no longer purchaseDate.

Let’s enhance our domain model and create a Part class to track the parts the machine is built from:

class Part {
    String name
    Machine machine
}

This is the simplest unidirectional one-to-many relation possible between Machine and Part. Part is not audited but references a Machine. This is no problem – audited and non-audited entities do not interfere here.

We want to make the connection bidirectional and so we add hasMany to Machine. For the first iteration, we do not want Envers to audit the connection and use @NotAudited on it:

@Audited
class Machine {
 
    String name
    Date purchaseDate
 
    @NotAudited
    Date getPurchaseDate() {
        return purchaseDate
    }
 
    static hasMany = [parts: Part]
 
    @NotAudited
    Set<Part> getParts() {
        return parts
    }
}

Why is this necessary at all? Envers audits connections because they are valuable information for the state of an audited entity. If you do not create a getter and annotate it, Envers will try to audit the connection and will assume that the target entity is also audited. Part is not marked as audited, so this would end in an exception.

The current configuration means we can request the current parts of the machine. But since they are not revisioned, they always point to the current state even if you pull out revisioned machine states.

For our final iteration, we want to audit the connection from the machine to its parts, but not the parts themselves.

@Audited
class Machine {
 
    String name
    Date purchaseDate
    List<Part> parts
 
    @NotAudited
    Date getPurchaseDate() {
        return purchaseDate
    }
 
    static hasMany = [parts: Part]
 
    @Audited(targetAuditMode = RelationTargetAuditMode.NOT_AUDITED)
    List<Part> getParts() {
        return parts
    }
}

This achieves exactly what we want. The audit data for the relation is saved in the table Machine_Part_Aud. Another little gotcha here is that you should set both sides of the bidirectional connection, due to some bugs creeping around with properties not being set (could be https://github.com/grails/grails-core/issues/9290).

Machine machine
Machine.withTransaction {
    machine = new Machine(name: 'My Machine', purchaseDate: new Date()).save(failOnError: true, flush: true)
}
 
Part.withTransaction {
    Part p = new Part(name: 'My Part')
    machine.addToParts(p)
    p.save(failOnError: true, flush: true)
}

The code above creates revisions of a machine, with the parts added in the last revision.

Log4j 2 filtering with multiple filters

Filtering in Log4j 2 is an extra way to control which messages make it to the appender. The documentation states that:

Filters may be configured in one of four locations

Filters at these locations do not behave the same but this is explained right afterwards.

This post is about the fact that the configuration does not really work the same in each of these locations. Most often I started out with a filter in the appender section because this is what the examples show you. Take the MarkerFilter example:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn" name="MyApp" packages="">
    <Appenders>
        <RollingFile name="RollingFile" fileName="logs/app.log"
                     filePattern="logs/app-%d{MM-dd-yyyy}.log.gz">
            <MarkerFilter marker="FLOW" onMatch="ACCEPT" onMismatch="DENY"/>
            <PatternLayout>
                <pattern>%d %p %c{1.} [%t] %m%n</pattern>
            </PatternLayout>
            <TimeBasedTriggeringPolicy/>
        </RollingFile>
    </Appenders>
    <Loggers>
        <Root level="error">
            <AppenderRef ref="RollingFile"/>
        </Root>
    </Loggers>
</Configuration>

Adding a second MarkerFilter seems a no-brainer. Just duplicate the line and change some attributes:

<RollingFile name="RollingFile" fileName="logs/app.log"
             filePattern="logs/app-%d{MM-dd-yyyy}.log.gz">
    <MarkerFilter marker="FLOW" onMatch="ACCEPT" onMismatch="NEUTRAL"/>
    <MarkerFilter marker="FLOW2" onMatch="ACCEPT" onMismatch="DENY"/>
    <PatternLayout>
        <pattern>%d %p %c{1.} [%t] %m%n</pattern>
    </PatternLayout>
    <TimeBasedTriggeringPolicy/>
</RollingFile>

Oddly enough, this corrupts the whole configuration. You get:

ERROR appender RollingFile has no parameter that matches element MarkerFilter

This log message is easily missed, as happened to me. How to fix this? Use a CompositeFilter, which is a surrounding element named Filters.

<RollingFile name="RollingFile" fileName="logs/app.log"
             filePattern="logs/app-%d{MM-dd-yyyy}.log.gz">
    <Filters>
        <MarkerFilter marker="FLOW" onMatch="ACCEPT" onMismatch="NEUTRAL"/>
        <MarkerFilter marker="FLOW2" onMatch="ACCEPT" onMismatch="DENY"/>
    </Filters>
    <PatternLayout>
        <pattern>%d %p %c{1.} [%t] %m%n</pattern>
    </PatternLayout>
    <TimeBasedTriggeringPolicy/>
</RollingFile>

The logic here is quite simple. Only one filter is allowed per appender. If you want more, you have to wrap them in a CompositeFilter. Now, at the beginning I told you that there are four locations where filters are permitted. Another location is context-wide. These are filters which are direct sub-elements of the Configuration element. And there it is totally fine to add more than one filter. You get no error from Log4j and it works as expected:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn" name="MyApp" packages="">
    <MarkerFilter marker="FLOW" onMatch="ACCEPT" onMismatch="NEUTRAL"/>
    <MarkerFilter marker="FLOW2" onMatch="ACCEPT" onMismatch="DENY"/>
    ...
</Configuration>

It is also ok to wrap them in a CompositeFilter:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn" name="MyApp" packages="">
    <Filters>
        <MarkerFilter marker="FLOW" onMatch="ACCEPT" onMismatch="NEUTRAL"/>
        <MarkerFilter marker="FLOW2" onMatch="ACCEPT" onMismatch="DENY"/>
    </Filters>
    ...
</Configuration>

The best practice I derive from this weird behavior is to always wrap filters in a Filters element. Then I can copy and paste between locations without thinking too much.

Provided Dependencies in Gradle

To this day Gradle is lacking a so-called provided scope. This scope is probably best known from Maven, which offers it out of the box. It lets you declare dependencies which are provided by the runtime environment the program will later run on. Examples of such a runtime environment are an application server like Wildfly or a distributed computing platform like Hadoop. Dependencies which are classified as provided are still needed to compile, yet you don’t need and don’t want them to ship in your deployable package. You don’t need them because they will be in your target environment anyway. And you don’t want them to ship, to avoid conflicts between different versions of the same library.

To be honest, Gradle does not totally lack provided. The war plugin offers providedCompile and providedRuntime. So when you are building a web application you can just use that.

In case you build a jar, you need to handcraft a provided scope. There have been discussions about that already. If you’re interested in the progress of that matter, you can watch GRADLE-784.

The basis for the following code was a nice article by Mick Brooks. Here we go:

apply plugin: 'java'
 
repositories {
    mavenCentral()
}
 
configurations {
    provided
}
 
sourceSets {
    main {
        compileClasspath += configurations.provided
        runtimeClasspath += configurations.provided
    }
    test {
        compileClasspath += configurations.provided
        runtimeClasspath += configurations.provided
    }
}
 
dependencies {
     provided '...'
}

The fundamental idea is to create a new configuration and add it to all classpaths of main and test. There are other smaller variations of this code out there. They tend to change the compileClasspath of main only. This will work most of the time. It depends on your needs. For example, when you write integration tests where you need the provided dependencies you have to add them to your tests, too.

But why add it to the runtimeClasspath? Because there are relationships between the runtime and compile configurations. The runtime configuration extends compile, which means that any dependency in compile also appears in runtime. The two classpath properties use the respective configurations. So compileClasspath uses the compile configuration and runtimeClasspath uses the runtime configuration. In short: I consider it good practice to add provided to both compileClasspath and runtimeClasspath because it would naturally happen with dependencies anyway. If your needs differ, you can still remove these.

Our favorite build tool is now set up. How about our IDE? I like to generate the correct dependencies from our single source of truth – the build script. For IntelliJ IDEA this is achieved by:

apply plugin: 'idea'
 
idea {
    module {
        scopes.PROVIDED.plus += [ configurations.provided ]
    }
}
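
As an alternative to the handcrafted provided configuration above, Gradle 2.12 and later ship a built-in compileOnly configuration that covers the most common case: a dependency that is needed to compile but must not end up in the package. The following is a minimal sketch under that version assumption. The dependency coordinate is purely illustrative and not taken from the original build.

```groovy
apply plugin: 'java'

repositories {
    mavenCentral()
}

dependencies {
    // Illustrative coordinate (an assumption): needed for compilation,
    // expected to be supplied by the target environment at runtime.
    compileOnly 'org.apache.hadoop:hadoop-client:2.7.1'
}
```

Unlike the provided configuration above, compileOnly dependencies are not added to the runtime classpath, and as far as I can tell they are not visible to the tests either. If your tests need them, they have to be declared again for the test source set.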

Gradle and IntelliJ IDEA – JUnit System Properties

In several unit test scenarios I needed to inject system properties to make the tests work. This is done easily with Gradle:

test {
    systemProperty "mySystemProperty", "myValue"
}

This is straightforward. To make it a little bit more useful, we can read system properties given to Gradle and pass them on to the test like this:

ext {
    testProperties = [
            'javax.net.ssl.keyStore',
            'javax.net.ssl.keyStorePassword',
            'javax.net.ssl.keyStoreType',
            'javax.net.ssl.trustStore',
            'javax.net.ssl.trustStorePassword',
            'javax.net.ssl.trustStoreType'
    ]
}
 
test {
    project.testProperties.each {
        systemProperty it, System.getProperty(it)
    }
}

This is nice if your build environment defines these properties so you can pass them. Also, there are passwords within and we don’t want to put them into the build in cleartext.
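
One detail to keep in mind: System.getProperty returns null for a property that was not passed to Gradle, and the loop above hands that null on to the test task. A slightly more defensive variant, sketched here as an assumption about what you probably want, forwards only the properties that are actually set. It relies on the testProperties list defined above.

```groovy
test {
    project.testProperties.each { name ->
        // Look up the property given to the Gradle JVM, e.g. via -D on the command line.
        def value = System.getProperty(name)
        // Forward only properties that are actually set.
        if (value != null) {
            systemProperty name, value
        }
    }
}
```

This keeps the test JVM free of entries for properties that the build environment did not define.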

The next thing to do is to incorporate the properties into IntelliJ IDEA. I really like it when I can execute tests in my IDE, and so do others:
http://blog.proxerd.pl/article/setting-system-properties-for-the-default-junit-run-configuration-in-intellij-from-gradle

Basically, we add the idea plugin to our Gradle script and hook into the project file generation:

String createVmParameters(List keys) {
    keys.collect { key ->
        def value = System.getProperty(key)
        "-D$key=$value"
    }.join(" ")
}
 
idea.workspace.iws.withXml { XmlProvider provider ->
    Node node = provider.asNode()
    def runManager = node.component.find { it.'@name' == 'RunManager' }
    def defaultJUnitConf = runManager.configuration.find { it.'@default' == 'true' && it.'@type' == 'JUnit' }
    def vmParametersOption = defaultJUnitConf.option.find { it.'@name' == 'VM_PARAMETERS' }
    vmParametersOption.'@value' = createVmParameters(project.testProperties)
}

This hooks into the generation of the workspace file and modifies the XML. If all works ok, you can generate your project files with a call to gradle idea.