First, some facts.
- Terraform Enterprise is a wonderful product. It's the self-hosted distribution of Terraform Cloud for organizations that want the privacy and scale of an enterprise-grade installation.
- There exists a Terraform Cloud/Enterprise Provider that can easily template and manage how an organization creates Workspaces (and other TFC/TFE resources).
Given the generally painful user experience of entering dozens of Workspace Variables into TFE, it makes sense that nearly everyone I've worked with has stated a desire to use the tfe provider to manage Workspaces. I'm here to tell you why this ends up being a rougher idea than you hoped for.
Terraform Enterprise is priced by the number of Workspaces you have.
They're not cheap.
If you start dedicating Workspaces to creating and managing other Workspaces, you're effectively shorting yourself on your own licenses.
On the other hand, if you're on Terraform Cloud, you're not paying per Workspace, so feel free to use this method if the following hiccups don't pertain to you.
For what it's worth, if HashiCorp had a concept of "Configuration Workspaces" that didn't count against your Workspace total, then this would obviously not be an issue.
Let's say you can get over the pricing issue. Talking through how the Configuration Workspaces would be architected and deployed brings up some other potential issues.
First, a quick overview of how we tried this out.
The most important piece of functionality here is the Terraform module that uses the tfe provider to create the Workload Workspaces. That module lives in what we called the Workspace Repository.
Everything about the Workload Workspace is contained within the Workspace Repository, including the Workspace Variables.
Another name for this repository could be a "Configuration Repository," as it configures the implementation of the Workload Repository.
The Workload Repository defines the Terraform resources to create the workloads that you are configuring. For us, this is a bunch of resources from the aws provider, but it could be anything.
The Configuration Workspace in Terraform Enterprise is pointed to the Workspace Repository. When it executes a Run, it generates Terraform Enterprise resources, such as the Workload Workspaces and the requisite Workspace Variables in each one.
The Workload Workspace is created by the Configuration Workspace and pointed to the Workload Repository. When it executes a Run, it generates Workload-specific resources. In our case, these are generally AWS resources such as EC2 instances, EBS volumes, etc.
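To make the layout a little more concrete, here is a minimal sketch of what a Workspace Repository could contain, using the tfe provider's tfe_workspace and tfe_variable resources. The organization, workspace name, repository identifier, variable key, and the oauth_token_id input are all placeholders made up for this example, not values from our setup. It also assumes the provider's hostname and API token are supplied outside this snippet.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical inputs; every name and default here is a placeholder.
variable "organization" {
  type    = string
  default = "example-org"
}

variable "oauth_token_id" {
  type        = string
  description = "ID of a VCS OAuth token already configured in TFE"
}

# The Workload Workspace, pointed to the Workload Repository.
resource "tfe_workspace" "workload" {
  name         = "example-workload"
  organization = var.organization

  vcs_repo {
    identifier     = "example-org/workload-repository"
    oauth_token_id = var.oauth_token_id
  }
}

# One Workspace Variable set on the Workload Workspace.
resource "tfe_variable" "instance_type" {
  workspace_id = tfe_workspace.workload.id
  key          = "instance_type"
  value        = "t3.micro"
  category     = "terraform"
  description  = "EC2 instance type consumed by the Workload Repository"
}
```

A Run in the Configuration Workspace applies this configuration, which is what creates the Workload Workspace and its variables in TFE.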
The biggest problem with this setup is the back-and-forth you have to do in order to make changes.
Let's say you want to add a resource to the Workload Repository. This is straightforward, and it doesn't cause many issues. You would just commit your changes to the Workload Repository, and the Workload Workspace would pick those up.
What if, however, that introduces a new variable to the repository? The changes you would have to make in order to get it through the system look something like this:
- Add the Workspace Variable resource to the Workspace Repository.
- Push the Run through the Configuration Workspace to add the TFE Workspace Variable to the Workload Workspace.
- Add the variable to the Workload Repository.
- Push the Run through the Workload Workspace.
Four steps for a simple variable addition is a tough process to roll out to any team.
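The steps above can be sketched as the two halves of a single change. This is only an illustration: the volume_size variable, its value, the availability zone, and the tfe_workspace.workload reference (a hypothetical resource defined elsewhere in the Workspace Repository) are assumptions for the example. The two halves live in separate repositories and are shown together here only for comparison.

```hcl
# --- Workspace Repository (applied by the Configuration Workspace) ---
# Steps 1 and 2: declare the new Workspace Variable, then push a Run
# through the Configuration Workspace so it exists on the Workload Workspace.
resource "tfe_variable" "volume_size" {
  workspace_id = tfe_workspace.workload.id
  key          = "volume_size"
  value        = "100"
  category     = "terraform"
}

# --- Workload Repository (applied by the Workload Workspace) ---
# Steps 3 and 4: declare the matching input variable, use it, then push
# a Run through the Workload Workspace.
variable "volume_size" {
  type = number
}

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = var.volume_size
}
```

The variable name has to match in both places, and the two Runs have to happen in order, which is where the back-and-forth comes from.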
What Should You Do Instead?
Let's be honest. I'm certain there are some really creative ways to work around the limitations of managing Terraform at scale. Feel free to use them and tell me what they are! Hit me up on Twitter if you find a really neat solution!
For our use cases, we are building out a Control Plane that abstracts the business-level functions away from TFE itself. So, instead of thinking "I need to create these TFE Workspaces with these Workspace Variables," we are now thinking "This team needs to create this application cluster."
This abstraction is not unique. I've talked to many people in the community who do a similar thing.
This solution works really well for us because we have a number of backend systems that we need to integrate during an "application cluster spin-up." These include SaaS tools, such as PagerDuty, Splunk, New Relic, and many others. Because Terraform Enterprise has a robust API, our Control Plane is much more straightforward to implement.