C2go converts (some) C programs to Go.
It was developed by Russ Cox to convert the Go runtime and compiler to Go.
It is progressing toward becoming a general-purpose translation tool,
but it does not (and never will) translate all possible C programs.
You will need to refactor your C program into the subset of C that c2go
can handle, and give it translation hints in the configuration file.
Goals
- Generate Go that is readable, maintainable,
  and similar to the original C code.
- Generate type-safe and memory-safe code;
  it should not use the "unsafe" package.
- Handle as much as possible of the subset of C that is compatible
  with the other goals.
Non-Goals
Limitations
Some limitations are virtually inherent in the goals of producing safe, readable code;
others are simply things that haven’t been implemented yet.
- Pointer arithmetic: Pointer arithmetic is translated as slice operations when possible.
  But slices can only be moved forward, not backward.
  (p + n becomes p[n:]; p - n doesn’t work.)
- Some constructs that are equivalent in C are only recognized one way by c2go.
  For example, when a count of items is multiplied by a sizeof expression,
  the sizeof needs to come last.
  In C, &p[n] and p + n are equivalent, but c2go translates them differently,
  as &p[n] and as p[n:] respectively.
- System headers: System headers generally are difficult to parse; c2go doesn’t even try.
  If you use a preprocessor, it
  will expand macros that were defined in system headers,
  but c2go itself ignores the contents of system headers.
- goto: The goto statement in Go is much more limited than in C.
  So some uses of goto will need to be refactored.
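To make the pointer arithmetic limitation concrete, here is a small hand-written Go sketch (not actual c2go output) of what the two translations mean once they are Go values. The helper names advance and elementAt are made up for illustration.

```go
package main

import "fmt"

// advance mirrors the C expression p + n: the result starts n elements
// further along the same backing array.
func advance(p []int, n int) []int {
	return p[n:]
}

// elementAt mirrors the C expression &p[n]: a pointer to a single element.
func elementAt(p []int, n int) *int {
	return &p[n]
}

func main() {
	buf := []int{10, 20, 30, 40, 50}

	tail := advance(buf, 2)   // C: tail = buf + 2
	elem := elementAt(buf, 2) // C: elem = &buf[2]

	*elem = 99                      // both views share the same storage
	fmt.Println(tail[0], len(tail)) // 99 3

	next := advance(tail, 1) // moving forward again is fine
	fmt.Println(next[0])     // 40

	// There is no counterpart to tail - 1: a slice cannot reach
	// elements that come before its own start.
}
```

If your C code walks a pointer backward, it will probably be easier to rewrite it to keep the base pointer and an integer index.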
How to Use c2go
Before you even try running c2go on your program,
you should refactor it into the kind of C that c2go is likely to be able to understand.
Throughout the refactoring process,
run tests regularly to make sure you haven’t introduced any bugs.
Remove CPU-Dependent Code
Many C projects contain code that depends on the CPU’s byte order or word size.
Replace that code with something that will translate cleanly to Go.
If the program has a header like "platform.h" or "port.h", start there.
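As a rough illustration of code that "translates cleanly", the sketch below reads a 32-bit little-endian value using only shifts and ORs, so it gives the same answer on any CPU. It is written in Go to show the target shape; the C version would be the same expression applied to an unsigned char array, instead of casting the buffer to a wider pointer type. The function name readUint32LE is hypothetical.

```go
package main

import "fmt"

// readUint32LE assembles a little-endian 32-bit value from the first four
// bytes of b. It never reinterprets memory, so the result does not depend
// on the byte order or word size of the machine it runs on.
func readUint32LE(b []byte) uint32 {
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

func main() {
	data := []byte{0x78, 0x56, 0x34, 0x12}
	fmt.Printf("%#x\n", readUint32LE(data)) // 0x12345678
}
```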
Simplify Memory Management
If you allocate memory with malloc, c2go will generally be able to translate
it to make or new. But that won’t happen with a custom memory allocator.
So if the program uses a custom allocator, replace it with malloc and free.
Reduce Preprocessor Use
C2go can run directly on un-preprocessed C source files,
or it can use GCC or Clang to preprocess the source (with the -cpp flag).
In either case, the less the preprocessor is used, the better the output will be.
If you don’t use a preprocessor,
you should list the project’s .h files on the c2go command line before the .c files.
Macro constants will be translated into Go constants,
but function-like macros will be ignored (and probably cause errors where they are used).
In some cases, function-like macros can be replaced with Go functions,
especially now that Go 1.18 has generics.
Macros can also be defined in the config file,
but they are more limited than C macros,
because they act on the AST rather than on a stream of tokens:
macro square(x) x * x
macro ERRORF(format, ...) fmt.Fprintf(os.Stderr, format, __VA_ARGS__)
macro PRINT_NEGATIVE(n) {
    if n < 0 {
        fmt.Println(n)
    }
}
(Note that the square macro doesn’t need parentheses like its C equivalent,
because c2go will automatically insert parentheses as needed.
Working on the AST instead of on a stream of tokens has its advantages…)
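For the other route, replacing a function-like macro with a Go function, a generic function can cover the several numeric types a macro would have been used with. This is only a sketch: the Number constraint and the Square function are hypothetical, and how you combine a hand-written Go file with the generated ones is up to your project.

```go
package main

import "fmt"

// Number is a hypothetical constraint listing the types the macro
// was used with in the C code.
type Number interface {
	~int | ~int32 | ~int64 | ~float64
}

// Square stands in for a C macro such as:
//     #define SQUARE(x) ((x) * (x))
// Unlike the macro, it evaluates its argument exactly once.
func Square[T Number](x T) T {
	return x * x
}

func main() {
	fmt.Println(Square(7))   // 49
	fmt.Println(Square(1.5)) // 2.25
}
```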
If you use a preprocessor,
all macros in the C code will already be expanded when c2go itself processes them.
If you want to preserve the macro names in your Go code,
you will need to define them some other way than as macros.
(Replace #define constants with enum constants, for example.)
Create a Config File
The config file is an important part of the process of using c2go.
It can contain various types of translation hints and fixups,
but we’ll start by just specifying what package your Go code will be in.
Create a file called c2go.cfg.
If your project is a command, put the line package * main in the config file;
otherwise you should replace main with your package’s import path.
(You can also specify multiple packages, each with its own pattern that specifies which files go in that package.
But most likely all your files will go in the same package, and one line with a * is all you need.)
When you run the c2go tool, always point it to your config file with the -c option.
You will probably be adding a lot more to the file before you are done.
Start Translating
I like to start by translating just one file;
then once it translates and compiles successfully,
I add another file to the set of files that I’m translating,
until eventually I am translating the whole project.
When you first run c2go, you will probably wonder where the output went.
By default, the Go files are created in subdirectories of /tmp/c2go.
You can send them somewhere else (such as your default GOPATH) with the -dst option.
Fix Errors
You will probably have two types of translation errors to deal with:
errors from c2go itself, and errors from the Go compiler when you try to compile the output.
Errors from c2go generally should be fixed by editing the C code to make it easier for
c2go to parse or translate it.
In my experience, the majority (though not all, by any means) of errors from the Go compiler
can be fixed by adding type hints to c2go.cfg.
Type hints look like this:
slice ReplicateValue.table
This indicates that the parameter or local variable table in the function ReplicateValue
should be translated as a slice instead of as a pointer.
(The default is to translate char * as []byte, and other pointer types as pointers.)
You will need a slice type hint for each pointer variable
that is used to point to an array of objects rather than a single object.
The other type hints available for pointer variables are ptr and string.
Besides the type hints for pointers, there is also bool
to indicate that an int variable should be translated as a bool.
The variable reference in a type hint is usually of the form
functionName.variableName or typeName.memberName.
But for global variables, or for variables in blocks nested inside a function,
you just use the variable name by itself.
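To see why a slice hint can make a compile error go away, consider what a function like ReplicateValue might look like in Go. This is a hand-written sketch, not c2go output: the HuffmanCode fields and the function body are assumptions made for illustration. With the default pointer translation, table would have type *HuffmanCode, and indexing it would not compile.

```go
package main

import "fmt"

// HuffmanCode is a simplified stand-in for the C struct of the same name.
type HuffmanCode struct {
	Bits  uint8
	Value uint16
}

// ReplicateValue stores code at every step-th position below end.
// Default translation (assumed):  table *HuffmanCode   -> table[end] is an error
// With "slice ReplicateValue.table": table []HuffmanCode -> table[end] is valid
func ReplicateValue(table []HuffmanCode, step int, end int, code HuffmanCode) {
	for {
		end -= step
		table[end] = code
		if end <= 0 {
			break
		}
	}
}

func main() {
	table := make([]HuffmanCode, 8)
	ReplicateValue(table, 2, 8, HuffmanCode{Bits: 3, Value: 7})
	fmt.Println(table[0].Value, table[1].Value, table[6].Value) // 7 0 7
}
```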
If your error can’t be fixed with a type hint,
you can usually fix it by editing the C code.
Make sure it still compiles and passes the tests.
If the error can’t be fixed by editing the C code,
you will need to fix it in the Go output by putting a diff block
in the config file. A diff block replaces one or more
lines of Go output, like this:
diff {
- htrees []*HuffmanCode
+ htrees [][]HuffmanCode
}