How Conduit uses Buf to work with Protobuf

Conduit, our Kafka Connect alternative written in Go, uses Protobuf on two fronts:

  1. To define the gRPC API.
  2. As the protocol for communicating with standalone connectors.


However, we started facing challenges with Protobuf that impacted our developer experience, so we began looking for solutions. Continue reading to learn more about the challenges we faced and how we addressed them with Buf.

What is Protobuf?

Protobuf is the short-hand term for “protocol buffers”, a data format with an accompanying interface definition language. You can think of it as an alternative to XML or JSON, the difference being that the same data encoded with Protobuf generally results in a smaller encoded payload and faster (de)serialization. Protobuf is commonly used as the data format in gRPC.
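As a minimal illustration, a Protobuf definition might look like the sketch below; the message and its fields are made up for this example and are not part of Conduit:

syntax = "proto3";

package example.v1;

// A hypothetical message, purely for illustration.
message Record {
  string key = 1;        // unique identifier of the record
  bytes payload = 2;     // raw record data
  int64 created_at = 3;  // creation time as a Unix timestamp
}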

Challenges working with Protobuf

While Protobuf solves a whole set of problems, it also introduces some challenges. These are the ones we ran into:

  • Managing Tools: To use Protobuf, you need to write a .proto file that describes the data structure you intend to serialize. Once you have a Protobuf file, you can run the Protobuf compiler to generate code in any of the supported languages. This in turn means you need to make sure you have the correct version of the compiler, as well as the correct version of any plugins you might need (see the sketch after this list). Managing these tools quickly becomes a problem when multiple developers are involved, since they all need to ensure their environments are configured the same way.
  • Managing Dependencies: Protobuf files can import dependencies that need to be provided to the compiler at compile time. Developers are left on their own to figure out how to find existing Protobuf definitions, manage the dependencies, and ensure they are up to date.
  • Evolving the Schema: Data structures evolve, and so do Protobuf files. When you need to change the data structures in a Protobuf file there are rules you have to follow to ensure the new schema is backwards compatible. These rules are not enforced and are easy to miss.
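To make the tooling problem concrete, a classic compiler invocation for Go looks roughly like this; it assumes protoc and the Go plugins (protoc-gen-go, protoc-gen-go-grpc) are installed locally at compatible versions, and the file path is illustrative:

# Generates Go structs and gRPC stubs next to the proto file.
# Output differs (or generation fails) if the compiler or plugin
# versions are not the same on every developer machine.
protoc --go_out=. --go-grpc_out=. path/to/service.proto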


What is Buf and how are we leveraging it?

Buf is a set of tools that aims to alleviate the challenges of working with Protobuf. We leverage the following tools to solve the above problems when developing Conduit:

  • Buf CLI comes with a built-in Protobuf compiler, a linter, breaking-change detection, and a formatter (a few representative commands are shown after this list).
  • Buf provides GitHub Actions for setting up Buf, running the linter, detecting breaking changes, and pushing Protobuf files to their schema registry from the GitHub CI/CD system.
  • Buf Schema Registry is an online registry where you can push your Protobuf schemas. It automatically generates a nice UI for browsing your schema’s documentation, makes the schema easy for consumers to import as a dependency, and even generates code remotely so that consumers can skip the compilation step entirely (currently only available for Go).
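For reference, this is roughly what those Buf CLI commands look like when run locally; the branch reference in the breaking check is an example:

buf lint                                    # check proto files against the configured style guide
buf format -w                               # rewrite proto files into the canonical format
buf breaking --against '.git#branch=main'   # compare the current schema against the main branch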

Conduit Connector Protocol

Conduit has the ability to run connectors as plugins that don’t have to be included in the Conduit binary. Standalone connectors are invoked by Conduit and run in their own process, communicating with Conduit through gRPC (see this document for more information). The gRPC service definitions and data structures are defined in the GitHub repository ConduitIO/conduit-connector-protocol, which uses Buf to manage the Protobuf definitions. Here we will describe how we structured our workflow.
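To give a feel for what such definitions look like, here is a deliberately simplified, hypothetical sketch of a gRPC service described in Protobuf; the real definitions live in the ConduitIO/conduit-connector-protocol repository and are more involved:

syntax = "proto3";

package connector.v1;

// Hypothetical plugin service, for illustration only.
service SourcePlugin {
  // Streams records from the standalone connector to Conduit.
  rpc Read(ReadRequest) returns (stream ReadResponse);
}

message ReadRequest {}

message ReadResponse {
  bytes record = 1; // serialized record produced by the connector
}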

CI Actions

We use GitHub Actions provided by Buf to lint our proto files, detect breaking changes, and upload the files to the Buf Schema Registry. You can find the full workflow file here.

Let’s first look at the validate job, which contains the first two steps: linting and breaking-change detection.
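Here is a sketch of what that job looks like; the action versions, directory layout, and branch handling are illustrative, so refer to the linked workflow file for the exact configuration:

validate:
  runs-on: ubuntu-latest
  steps:
    # Check out the repository and install the Buf CLI.
    - uses: actions/checkout@v3
    - uses: bufbuild/buf-setup-action@v1
    # Lint the proto files against the configured style guide.
    - uses: bufbuild/buf-lint-action@v1
      with:
        input: proto
    # Fetch main so we can compare the schema against it.
    - name: Fetch main branch
      run: git fetch origin main
    # Fail if the schema is not backwards compatible with main.
    - uses: bufbuild/buf-breaking-action@v1
      with:
        input: proto
        against: '.git#branch=origin/main,subdir=proto'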


First, we need to do some setup — we check out the repository (actions/checkout) and install the latest Buf CLI (bufbuild/buf-setup-action). After that, we are ready to call the lint action (bufbuild/buf-lint-action) that ensures our proto files follow the defined style guide.

After the lint succeeds, we execute an action ensuring the new schema is backwards compatible with the old one. We achieve this by first fetching the main branch and then running the breaking action (bufbuild/buf-breaking-action) against its current content.

If the validate job succeeds and the action is being executed on a commit to the main branch, we trigger the push job.
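Sketched below with illustrative versions; again, the linked workflow file has the exact configuration:

push:
  runs-on: ubuntu-latest
  needs: validate
  # Only push schemas from commits on the main branch.
  if: github.ref == 'refs/heads/main'
  steps:
    - uses: actions/checkout@v3
    - uses: bufbuild/buf-setup-action@v1
    # Push the proto files to the Buf Schema Registry,
    # authenticating with a token stored as a repository secret.
    - uses: bufbuild/buf-push-action@v1
      with:
        input: proto
        buf_token: ${{ secrets.BUF_TOKEN }}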


You’ll notice this job also starts with the checkout and Buf setup actions, followed by the push action (bufbuild/buf-push-action) that takes a secret token to authenticate with the Buf Schema Registry and pushes the new Protobuf definitions.

These GitHub Actions result in a workflow that doesn’t rely on the developer having their local environment set up correctly, as the CI/CD is the single place where all Protobuf files are validated. Additionally, we don’t need to share secrets between developers; the CI/CD takes care of pushing schemas to the registry.

Schema Registry

We use the Buf Schema Registry to host the Protobuf definitions and get a UI for our docs. The registry also tracks old versions of the same schema file so anyone referencing an older version can keep using it or update to the new version using buf mod update.
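A consumer would then reference the schema as a dependency in its buf.yaml, roughly like this; the consumer module name is hypothetical, and the dependency name assumes the repository is pushed under the conduitio organization on buf.build:

version: v1
name: buf.build/acme/my-connector  # hypothetical consumer module
deps:
  # Pulled from the Buf Schema Registry; pinned in buf.lock
  # and updated with `buf mod update`.
  - buf.build/conduitio/conduit-connector-protocol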

Remote Code Generation

Pushing our Protobuf definitions to the Buf Schema Registry opens up the possibility of using remote code generation. The registry takes care of generating the Go code for us and exposes it as a Go module, ready to be imported. This feature allows us to skip the manual compilation step entirely and simply import the generated code as a dependency.

For instance, to fetch the latest Conduit connector protocol code we can invoke this command:

go get go.buf.build/protocolbuffers/go/conduitio/conduit-connector-protocol

Every time we update the Protobuf definitions and push them to the registry, the code will be remotely generated and ready to be used in any dependent code.

Local Development

Our workflow leans heavily on hosted services like GitHub Actions and the Buf Schema Registry, so the natural question is: how do we do local development? The answer is Go’s module replace directives.

To switch to locally generated Protobuf code, we follow these steps (summarized in the sketch after this list):

  • buf generate — executing this in the proto folder compiles the proto files and generates Go code locally in the internal folder
  • go mod init github.com/conduitio/conduit-connector-protocol/internal — executing this in the internal folder initializes a (temporary) Go module in the newly generated code
  • go mod edit -replace go.buf.build/library/go-grpc/conduitio/conduit-connector-protocol=./internal — executing this at the root of the repository replaces any references to the remotely generated code with the locally generated code (we can do the same in other repositories that depend on the remotely generated code)
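Put together, the switch looks roughly like this; it assumes the proto files live in the proto folder at the repository root, as described above:

# 1. Compile the proto files; buf.gen.yaml directs the
#    generated Go code into the internal folder.
cd proto && buf generate && cd ..

# 2. Turn the generated code into a temporary Go module.
cd internal && go mod init github.com/conduitio/conduit-connector-protocol/internal && cd ..

# 3. Point references to the remote module at the local code instead.
go mod edit -replace go.buf.build/library/go-grpc/conduitio/conduit-connector-protocol=./internal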


Conclusion

Buf is a great tool that allows us to streamline the management of our Protobuf files, ensures we follow our style guide, and keeps us from unknowingly introducing breaking changes. It solves these problems in an elegant way and enhances the developer experience.

You know what else enhances the developer experience? Conduit! We’re still very much in the early stages and rely on feedback from our community to steer the project in the right direction. Try it out… if you like it, join the discussion and show us some love!
