Go into AI with Go - The SP Blog

# Python in AI

These days, everything from prompting to training to inference is done in Python. Other languages also see usage, such as Rust (see [HuggingFace tokenizers](https://github.com/huggingface/tokenizers) or [Text Generation Inference](https://github.com/huggingface/text-generation-inference)) or C++ ([TensorRT](https://github.com/NVIDIA/TensorRT)), but these languages augment Python rather than replace it.

Interestingly, Python seems to be the primary language used for prompting as well, even though nothing unique to the Python language or runtime provides an advantage for prompt engineering. There are many libraries meant to aid in prompt engineering, such as [DSPy](https://github.com/stanfordnlp/dspy), [LangChain](https://github.com/langchain-ai/langchain), and [Instructor](https://github.com/jxnl/instructor). While the necessity of these libraries is debatable, I would argue that their core functionality can be built in any language. Prompt engineering, at its core, boils down to sending specially crafted strings to a text generation server. How fast these strings can be sent to the server and a response returned is often the bottleneck.

# Prompt Engineering is Language Agnostic

Prompt engineering is programming language agnostic, since the actual inference of the model sits behind what is essentially a REST interface, which can be called from any programming language. If you really wanted to, you could do prompt engineering with just `curl` and Bash.

## Convergence on a single spec

Whether it be a foundation model hosted by one of the giants in the AI field, or one of the GPU clouds offering inference on popular open source models, nearly everyone has converged on a single API spec: the OpenAI spec. This makes it easy to re-use the code scaffolding across providers and models. In addition, there are model inference projects that offer the OpenAI spec for *local* inference (such as [ollama](https://github.com/ollama/ollama)).
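
To make the point about the shared spec concrete, here is a minimal sketch in Go using only the standard library. It posts a single user message to a `/chat/completions` route and reads back the first choice, which is the request and response shape the OpenAI chat API uses. The names `Complete`, `newStubServer`, and `stub-model`, as well as the echo behavior, are my own illustrative choices, and the stub server stands in for a real provider so the program runs without an API key.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// These types mirror only the fields of the OpenAI chat completions
// format that this sketch needs; the real API has many more.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
}

type choice struct {
	Message Message `json:"message"`
}

type chatResponse struct {
	Choices []choice `json:"choices"`
}

// Complete sends one user prompt to baseURL + "/chat/completions" and
// returns the text of the first choice.
func Complete(baseURL, apiKey, model, prompt string) (string, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []Message{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return "", err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out chatResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if len(out.Choices) == 0 {
		return "", fmt.Errorf("response contained no choices")
	}
	return out.Choices[0].Message.Content, nil
}

// newStubServer is a hypothetical local stand-in for a provider: it
// echoes the first message back in the same response shape.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var in chatRequest
		if r.URL.Path != "/chat/completions" || json.NewDecoder(r.Body).Decode(&in) != nil || len(in.Messages) == 0 {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		reply := chatResponse{Choices: []choice{{
			Message: Message{Role: "assistant", Content: "echo: " + in.Messages[0].Content},
		}}}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(reply)
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	answer, err := Complete(srv.URL, "test-key", "stub-model", "Say hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(answer) // prints "echo: Say hello"
}
```

Pointing `Complete` at a real provider should only require swapping in that provider's base URL (for OpenAI, `https://api.openai.com/v1`), API key, and model name. The rest of the scaffolding stays the same, which is the appeal of a shared spec.
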
## Structured Prompting is just JSON (Or YAML or XML)

Libraries offer structured prompting in one of two ways. They either do constrained decoding, which requires access to the model's logits (and its tokenizer!), or they simply specify *a JSON schema in the prompt*. The latter is very simple and universal, and most reasonably large, capable models can output valid, schema-following JSON with just some handcrafted prompting. This means that any language with some kind of support for reading JSON, which is pretty much every language, can be used to parse the model's output. In addition, language models can be finetuned to simply output valid JSON when given a schema, as OpenAI's recent JSON mode and the inference platforms that followed ([Together](https://docs.together.ai/docs/json-mode) and [Anyscale](https://docs.endpoints.anyscale.com/guides/json_mode)) have shown.

# Go for AI (for prompting)

Despite prompt engineering being inherently programming language agnostic, I want to make the case that everyone is using the wrong language for it. Golang is built for this type of workload, where there is minimal compute but lots of IO-bound operations and concurrency juggling to do. I will introduce what Go is, and the specific features of Go that help with prompt engineering.

## Meet Go

For those that don't know, here is a brief introduction. Go is a garbage-collected language with a simple, C-like syntax and an M:N threading model (otherwise known as green threads). Garbage collection means you don't have to manage memory by hand or worry much about memory safety, being a typed, compiled language means it's pretty fast, and best of all, the M:N threading model means that you can run lots of IO-bound concurrent operations with minimal overhead.
There is no async function coloring syntax, there is built-in syntactic sugar for best-practice concurrency (channels), and most of the day-to-day functionality for building server apps is part of the standard library. All of this leads me to believe that the prompting language of choice should be Go.

## Syntax

While the syntax of a programming language is not directly helpful to prompt engineering, it does help if the language is readable and simple to write. Golang syntax is pretty damn simple, modeled after C. One of the core tenets of the language is that code is read way more often than it is written, so its creators made sure the syntax is basic enough that even a junior programmer who knows a few mainstream languages can pick it up very fast.

## Templates

Golang has a built-in text template library (`text/template`) similar in functionality to [Jinja](https://jinja.palletsprojects.com) templates in Python. Templates provide a powerful and flexible way to generate text output, making them particularly useful for prompt engineering. Golang templates can insert values and use control structures like `if` conditionals and `range` loops, so prompts can be generated dynamically based on user input or context.

## Built-in testing framework

This is one of my favorite things about Golang. The language has a built-in testing framework that supports parallel tests, table-driven tests, and even benchmarking. You can even pass command line arguments to your tests to change their behavior on the fly. With table tests, we can create different versions of the prompts we want to test while reusing the evaluation test code. You can also provide flags to run only a subset of tests (for example, `go test -run`).

## Concurrency

A lot of prompt engineering is rewriting prompts and testing them against some evaluation or benchmark. Ideally, we want the feedback loop to be as fast as possible, which means running evaluations as fast as possible.
With Golang, we can use channels and goroutines to run the evals as fast as our eval endpoint will allow. Goroutines are green threads, scheduled with what is known as an M:N threading model. This allows for many more "virtual" threads than actual CPU cores. Since goroutines are managed by the Golang runtime, creating, destroying, and context switching between them is extremely cheap. The runtime also handles smart context switching with a work-stealing scheduler and epoll and kqueue integration. This means that Golang is particularly well equipped to handle IO-bound workloads while utilizing all of the cores in a machine.

## Fully featured OpenAI client

Of course, without an OpenAI-compatible client, none of the above would really matter. The [go-openai library](https://github.com/sashabaranov/go-openai) covers a wide range of the API, from chat completions to speech endpoints to image generation endpoints.

## Conclusion

Hopefully I've convinced you to try out Golang during prompt engineering experiments. There are a lot of benefits to using the language for prompt engineering, especially since it can be used for both experiments and server-side production applications.