How does Terraform work?
Terraform is an automation tool that excels at provisioning infrastructure. How do Terraform providers work? Let’s take a close look at the example hashicups provider.
Terraform providers are Go programs. After cloning the repository and setting up a Go environment, the provider can be compiled as usual:
git clone https://github.com/hashicorp/terraform-provider-hashicups.git
cd terraform-provider-hashicups
go build -o terraform-provider-hashicups
If we try to run the provider by itself we get this message:
$ ./terraform-provider-hashicups
This binary is a plugin. These are not meant to be executed directly.
Please execute the program that consumes these plugins, which will
load any plugins automatically
Exit code 1
Terraform providers are launched as subprocesses of the main Terraform process. There must be something in the environment that lets the subprocess know it is being called by Terraform.
Terraform has a nice plugin architecture where the main type of plugin is a provider. Plugin architectures are a great fit for many applications: design a simple core program that can load, execute, and communicate with plugins, then extend the core in whatever domain you are working with. This is especially good for teams of developers because most of them will be writing plugins for the core.
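To make the idea concrete, here is a minimal sketch in Python (an illustration only, not HashiCorp's go-plugin) of a core that launches a plugin as a subprocess, hands it a shared secret through the environment, and reads a one-line handshake from its stdout. The environment variable name and cookie value are made up for the example:
#!/usr/bin/env python3
# Toy core/plugin loader (illustration only, not go-plugin).
import os
import subprocess

def load_plugin(path: str):
    env = dict(os.environ)
    env["TOY_PLUGIN_COOKIE"] = "let-me-in"  # hypothetical handshake secret
    proc = subprocess.Popen([path], stdout=subprocess.PIPE, env=env, text=True)
    handshake = proc.stdout.readline().strip()  # plugin announces how to reach it
    return proc, handshake

# proc, handshake = load_plugin("./some-plugin-binary")  # hypothetical plugin path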
HashiCorp wrote a plugin library for Go called go-plugin, which Terraform providers use. Looking through this code we see a hint in go-plugin/server.go:
// Validate the handshake config
if opts.MagicCookieKey == "" || opts.MagicCookieValue == "" {
    fmt.Fprintf(os.Stderr,
        "Misconfigured ServeConfig given to serve this plugin: no magic cookie\n"+
            "key or value was set. Please notify the plugin author and report\n"+
            "this as a bug.\n")
    exitCode = 1
    return
}
It is looking for a MagicCookieKey and MagicCookieValue. In Terraform core we can find the values for these:
var Handshake = plugin.HandshakeConfig{
    // The ProtocolVersion is the version that must match between TF core
    // and TF plugins.
    ProtocolVersion: DefaultProtocolVersion,

    // The magic cookie values should NEVER be changed.
    MagicCookieKey:   "TF_PLUGIN_MAGIC_COOKIE",
    MagicCookieValue: "d602bf8f470bc67ca7faa0386276bbdd4330efaf76d1a219cb4d6991ca9872b2",
}
Now if we set this magic cookie as an environment variable and run the provider again we get:
$ export TF_PLUGIN_MAGIC_COOKIE=d602bf8f470bc67ca7faa0386276bbdd4330efaf76d1a219cb4d6991ca9872b2
$ ./terraform-provider-hashicups
{"@level":"debug","@message":"plugin address","@timestamp":"2025-04-15T05:34:18.840723-04:00","address":"/var/folders/fn/w9s_vznd7qs71qz8rxv3mfb80000gn/T/plugin2463171971","network":"unix"}
1|6|unix|/var/folders/fn/w9s_vznd7qs71qz8rxv3mfb80000gn/T/plugin2463171971|grpc|
This is a mix of stderr and stdout. The first line is a log line sent to stderr. The second line is a protocol descriptor sent to stdout; it tells Terraform core how to communicate with this plugin. Here we have:
- 1: probably a schema version for the message
- 6: the plugin protocol version
- /var/folders/fn/w9s_vznd7qs71qz8rxv3mfb80000gn/T/plugin2463171971: a unix socket
- grpc: the remote procedure call protocol to use. Here it is gRPC.
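As a quick sketch of the parsing we will do later in Python, the descriptor line can simply be split on the | separator, and the socket address turned into a gRPC target string:
# Parse the go-plugin protocol descriptor line printed above.
line = "1|6|unix|/var/folders/fn/w9s_vznd7qs71qz8rxv3mfb80000gn/T/plugin2463171971|grpc|"
message_version, tf_proto_version, socket_type, socket_address, protocol = line.split("|")[:5]
# gRPC can dial the unix socket using a "unix://" target string.
target = f"{socket_type}://{socket_address}"
print(tf_proto_version, protocol, target)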
gRPC is a really nice RPC protocol with libraries for nearly every language. This means we can write a client in Python to explore how Terraform providers work in more detail.
Now we know how they communicate, but what do they communicate? gRPC uses protocol buffers, which require that the endpoints agree on the schema of the binary communication in advance. This is not as easy as inspecting the JSON communication that Ansible modules use; we will need to dive into the code a bit more to find the schema. Protocol buffers have a schema file that describes all the messages that can be sent between RPC peers. We are looking for version 6 of this schema, which we can find in the Terraform repository:
// Copyright (c) HashiCorp, Inc.
// SPDX-License-Identifier: MPL-2.0
// Terraform Plugin RPC protocol version 6.9
//
// This file defines version 6.9 of the RPC protocol. To implement a plugin
// against this protocol, copy this definition into your own codebase and
// use protoc to generate stubs for your target language.
//
// This file will not be updated. Any minor versions of protocol 6 to follow
// should copy this file and modify the copy while maintaining backwards
// compatibility. Breaking changes, if any are required, will come
// in a subsequent major version with its own separate proto definition.
//
// Note that only the proto files included in a release tag of Terraform are
// official protocol releases. Proto files taken from other commits may include
// incomplete changes or features that did not make it into a final release.
// In all reasonable cases, plugin developers should take the proto file from
// the tag of the most recent release of Terraform, and not from the main
// branch or any other development branch.
//
We can use this file to generate a Python client that communicates with the hashicups provider. Download the file and save it as tfplugin6.proto. Because the schema defines a gRPC service as well as messages, run it through the protoc wrapper in the grpcio-tools package so that the service stubs are generated too:
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. tfplugin6.proto
This produces tfplugin6_pb2.py and tfplugin6_pb2_grpc.py, which we will use in our Python client.
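As a quick sanity check (assuming both generated files are in the current directory and the grpcio package is installed), we can confirm that the generated modules expose the request messages and the Provider stub we will need:
# Verify the generated bindings import and expose what we need.
import tfplugin6_pb2
import tfplugin6_pb2_grpc

print(tfplugin6_pb2.GetMetadata.Request)   # empty request message type
print(tfplugin6_pb2_grpc.ProviderStub)     # gRPC client stub for service Provider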
If we look at the tfplugin6.proto protobuf schema file we can see a service section that lists all the RPC endpoints we can connect to:
service Provider {
    //////// Information about what a provider supports/expects

    // GetMetadata returns upfront information about server capabilities and
    // supported resource types without requiring the server to instantiate all
    // schema information, which may be memory intensive. This RPC is optional,
    // where clients may receive an unimplemented RPC error. Clients should
    // ignore the error and call the GetProviderSchema RPC as a fallback.
    rpc GetMetadata(GetMetadata.Request) returns (GetMetadata.Response);

    // GetSchema returns schema information for the provider, data resources,
    // and managed resources.
    rpc GetProviderSchema(GetProviderSchema.Request) returns (GetProviderSchema.Response);
    rpc ValidateProviderConfig(ValidateProviderConfig.Request) returns (ValidateProviderConfig.Response);
    rpc ValidateResourceConfig(ValidateResourceConfig.Request) returns (ValidateResourceConfig.Response);
    rpc ValidateDataResourceConfig(ValidateDataResourceConfig.Request) returns (ValidateDataResourceConfig.Response);
    rpc UpgradeResourceState(UpgradeResourceState.Request) returns (UpgradeResourceState.Response);
If we dig into the message types we can see that some of the request messages do not take any arguments, so we can call these easily:
message GetMetadata {
    message Request {
    }

    message Response {
        ServerCapabilities server_capabilities = 1;
        repeated Diagnostic diagnostics = 2;
        repeated DataSourceMetadata data_sources = 3;
        repeated ResourceMetadata resources = 4;
        // functions returns metadata for any functions.
        repeated FunctionMetadata functions = 5;
        repeated EphemeralMetadata ephemeral_resources = 6;
    }
Let’s write a little Python script to call the GetMetadata gRPC endpoint:
#!/usr/bin/env python3
from subprocess import Popen, PIPE
import io

import grpc

import tfplugin6_pb2
import tfplugin6_pb2_grpc


def main():
    # The magic cookie go-plugin expects from a real Terraform core process.
    env = {
        "TF_PLUGIN_MAGIC_COOKIE": "d602bf8f470bc67ca7faa0386276bbdd4330efaf76d1a219cb4d6991ca9872b2"
    }
    proc = Popen(
        "./terraform-provider-hashicups",
        stdout=PIPE,
        stderr=PIPE,
        shell=True,
        env=env,
    )

    message_version = None
    tf_proto_version = None
    socket_type = None
    socket_address = None
    protocol = None

    # The first line on stdout is the protocol descriptor, e.g.
    # 1|6|unix|/tmp/plugin2003463635|grpc|
    for line in io.TextIOWrapper(proc.stdout, encoding="utf-8"):
        line = line.rstrip("\n")
        print(line)
        data = line.split("|")
        message_version = data[0]
        tf_proto_version = int(data[1])
        socket_type = data[2]
        socket_address = data[3]
        protocol = data[4]
        break
    print(message_version, tf_proto_version, socket_type, socket_address, protocol)

    # If we never saw a descriptor, dump stderr so we can see what went wrong.
    if message_version is None:
        for line in io.TextIOWrapper(proc.stderr, encoding="utf-8"):
            line = line.rstrip("\n")
            print(line)

    if tf_proto_version == 5:
        raise Exception("Not supported in version 5")
    elif tf_proto_version == 6:
        # Connect to the unix socket the provider is listening on and call
        # the GetMetadata RPC with an empty request.
        with grpc.insecure_channel(f"{socket_type}://{socket_address}") as channel:
            stub = tfplugin6_pb2_grpc.ProviderStub(channel)
            response = stub.GetMetadata(tfplugin6_pb2.GetMetadata.Request())
            print("received: " + str(response))
    else:
        raise Exception(f"Unsupported protocol version {tf_proto_version}")


if __name__ == "__main__":
    main()
If we put the hashicups Terraform provider binary in the same directory as this script and run it, we get:
$ python get_metadata.py
1|6|unix|/tmp/plugin2003463635|grpc|
1 6 unix /tmp/plugin2003463635 grpc
received: server_capabilities {
plan_destroy: true
get_provider_schema_optional: true
move_resource_state: true
}
data_sources {
type_name: "hashicups_coffees"
}
resources {
type_name: "hashicups_order"
}
Success! We can retrieve some data from the provider. We can see that it has data sources and resources and that it provides some capabilities.
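The same pattern works for the other RPCs whose request messages are empty. For example, inside the version-6 branch of the script above we could also fetch the full schemas; this is only a sketch that reuses the stub variable and assumes the provider implements GetProviderSchema:
# Sketch: add this after the GetMetadata call in the version-6 branch above.
schema = stub.GetProviderSchema(tfplugin6_pb2.GetProviderSchema.Request())
print(list(schema.resource_schemas))     # managed resource names, e.g. ['hashicups_order']
print(list(schema.data_source_schemas))  # data source names, e.g. ['hashicups_coffees']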
So we have learned that Terraform providers are self-contained programs that run as plugins to the main Terraform process. They communicate with the main process over Unix sockets, using gRPC and protobuf messages to exchange data.
In future posts we will explore the other services that providers have to better understand the Terraform provider lifecycle.