Undefined C

Each piece of code can be run in an online C compiler such as https://www.onlinegdb.com/online_c_compiler or locally. All examples were checked with the gcc compiler. Many small tricks involved in running C code in practice aren't covered by generic tutorials, so here is a list of topics that may come up when writing real C code outside of tutorials. Each case gets only a small example, although each could fill a whole chapter on its own.

Compile

hello_world.c

#include <stdio.h>
int main() {
    printf("Hello world\n");
    return 0;
}
gcc hello_world.c -o hello_world
gcc -m32 hello_world.c -o hello_world_32 #for 32bit target

Syntax

Variables

Standard list of available types

Check type size

Every type has a size measured in bytes. Some types, such as int and long, are machine dependent. When machine-independent sizes are needed, use the fixed-width types int32_t/uint32_t/int64_t/uint64_t from <stdint.h>.

The machine-dependent types may have different sizes on 8-bit, 16-bit, 32-bit and 64-bit architectures.

Use sizeof()

Running on x86 machine

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
int main() {
    printf("Sizeof int %zu\n",sizeof(int)); /* %zu is the format for size_t */
    printf("Sizeof int32_t %zu\n",sizeof(int32_t));
    printf("Sizeof int64_t %zu\n",sizeof(int64_t));
    printf("Sizeof long %zu\n",sizeof(long));
    printf("Sizeof long long %zu\n",sizeof(long long));
}

The safest and most portable choice is the [u]int[8/16/32/64]_t types.

Macros that give the minimum and maximum values of each type are defined in <limits.h>:

https://en.cppreference.com/w/c/types/limits

#include <limits.h>
#include <stdio.h>
int main() {
    printf("INT_MIN %d\n",INT_MIN);
    printf("INT_MAX %d\n", INT_MAX);
    printf("LONG_MIN %ld\n",LONG_MIN);
}

Example from AVR stdint.h
https://github.com/avrdudes/avr-libc/blob/main/include/stdint.h
Example from Libc
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/stdint.h

How to shoot yourself in the foot

When code is supposed to run on both 32-bit and 64-bit platforms, the size of a type such as long may differ between them, so the code must not assume a particular size.

Functions

Function syntax is straightforward; there is nothing surprising about functions

<RETURN_TYPE> <NAME>(<TYPE> <NAME>,..) {
    <EXPR>
}

Write simple function

int fun1() {
    return -1;
}

A function can have multiple return statements. Here is an example where the function has three return statements.

int fun2(int i) {
    if (i<0) return -1;
    if (i>0) return 1;
    return 0;
}

Get address of function

printf("fun1 address %p\n",(void *)fun1); /* %p expects void *; the cast from a function pointer works on POSIX systems */

If statement

if () ;
if () {}

One common way to check the value returned by a function for an error is

if ((c = getfun()) == 0) {
}

The simplest, old-fashioned example of this pattern is reading input from standard input

#include <stdio.h>
int main() {
    int c;
    char ch;
    while ((c = getchar()) != EOF ) {
        ch = c;
        printf("Typed character %c\n",c);
    }
}

For loop

The for loop may involve some trickery, but its basic form is as simple as

for (<INITIAL>;<TERMINATE CONDITION>;<AFTER CYCLE>) {
}

Go over values from 1 to 10

int i=0;
for (i=1;i<=10;i++) {
    printf("%d\n",i);
}

Now let's do it from 10 down to 1

int i=0;
for (i=10;i>0;i--) {
    printf("%d\n",i);
}

Now let's make it a one-liner

for (i=0;i<10;i++,printf("%d\n",i));

Yes, it is possible to write as many comma-separated expressions as needed. Note that this version prints 1 to 10, because i++ runs before printf.

Structure

A structure combines several types into one new type. It is a convenient way to group a set of values and reuse them as a single unit.

struct struct1 {
    uint8_t a;
    uint16_t b;
    uint32_t c;
    uint64_t d;
};

Intuitively, the total size of the structure would be the sum of its members, but the real size can be larger

size_t total_size = sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) + sizeof(uint64_t);
size_t real_size = sizeof(struct struct1);

Members are placed inside the structure at aligned offsets so that they can be accessed quickly. Some CPU instructions require aligned memory addresses, or are slower on unaligned ones, so the compiler inserts padding between members.

To control the alignment of a type directly, use the attribute

__attribute__ ((aligned (8)))

Use the packed attribute to remove the padding, so that the layout does not depend on the architecture's alignment rules.

struct struct2 {
    uint8_t a;
    uint16_t b;
    uint32_t c;
    uint64_t d;
} __attribute__((packed));

Now let's check the size of the structure after it is packed

size_t new_size = sizeof(struct struct2);

It is also possible to set the alignment of each member in the structure

struct struct3 {
    uint8_t a __attribute__((aligned (8)));
    uint16_t b __attribute__((aligned (8)));
    uint32_t c __attribute__((aligned (8)));
    uint64_t d __attribute__((aligned (8)));
} __attribute__((aligned (8)));

Now the size of the structure will be 32.

All results are from amd64; other architectures may differ.

How to shoot yourself in the foot

Forgetting that the size of a struct is not the same on every platform.

Recursion

Recursion is a technique that can make code shorter and replace loops. Its drawback is that every call consumes stack memory, and the stack has a default size limit on each platform.

#include <stdio.h>
#include <stdlib.h>

int fun_r(int i) {
    printf("val %d\n",i);
    fun_r(i+1);
    return 0;
}

int main() 
{
    fun_r(0);
}

The program will crash once it runs out of stack space. When the default stack limit is increased, it gets further before crashing.

Check default stack size

ulimit -s

Set stack size

ulimit -s 16384

Macro

Macros are useful for many things, and there are many macro tricks for generating useful pieces of code.

Define values, similar to an enum.

#define VAL_0 0
#define VAL_1 1
#define VAL_LAST VAL_1

Multiline macro

#define INC_FUN(TYPE) TYPE inc_##TYPE(TYPE a){\
    TYPE c=1;\
    return a + c;\
}

INC_FUN(int)
INC_FUN(char)
INC_FUN(double)
INC_FUN(notype) /* expands like the others, but fails to compile unless notype is a defined type */

To check the macro expansion, run

gcc -E <SOURCE_FILE>

http://main.lv/writeup/c_macro_tricks.md

https://jadlevesque.github.io/PPMP-Iceberg/

Pointers

One of C's most loved features is pointers. They give access to memory addresses without any sanity checks and have no lifetime tracking, so anything is possible with them.

A pointer contains an address, which is interpreted according to the pointer's type

int c;
int *ptr=&c;

Go over array of chars

#include <stdio.h>
#include <stdlib.h>

int main() {
    char s[]="asd";
    char *c=s;
    while (*c != 0) {
        printf("Next char %c addr %p\n",*c,(void *)c);
        c++;
    }
}

Go over array of ints

    int i=0;
    int arr[] = {9,7,5,3,1};
    int *ptr = arr;
    while (i<5) {
        printf("Number value %d addr %p\n",*ptr,(void *)ptr);
        ptr++;
        i++;
    }

Pointer arithmetic such as +1 moves the pointer to the next address, offset by the size of the pointed-to type. In the example below the structure size is 12, so incrementing a pointer to that structure increases the address by the size of the structure. The resulting addresses point to unmapped memory, so dereferencing them will segfault.

#include <stdio.h>

struct size12 {
    int a,b,c;
};

int main() {
    struct size12 *s=0;
    s++;
    printf("%p\n",(void *)s);
    s++;
    printf("%p\n",(void *)s);
}

Double pointers are pointers to pointers

#include <stdio.h>

int main(int argc, char **argv) {
    char *arg = argv[0];
    printf("Program name %s\n",arg);
}

How to shoot the leg

Increment a pointer in a while loop without a stop condition. It will stop only when it segfaults.

Leave a pointer uninitialized and it will hold an indeterminate value.

Allocate memory

From the program's perspective, memory allocation adds an address range that the process is allowed to access.

Every malloc should be paired with a free; otherwise the program leaks memory.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char *c = malloc(16);
    memset(c,0,16);
    int *arr = malloc(16*sizeof(int));
    memset(arr,0,16*sizeof(int));
    free(c);
    free(arr);
}

Signed/Unsigned

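Signed and unsigned variables differ only in how one bit is interpreted, but they behave differently at their minimum and maximum values. Unsigned arithmetic wraps around, while overflowing a signed int (the i++ on INT_MAX below) is undefined behaviour, even though it typically wraps on common hardware.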

#include <stdio.h>
#include <limits.h>
int main() 
{
    int i=INT_MAX;
    unsigned int u=UINT_MAX;

    printf("i=%d\n",i);
    printf("u=%u\n",u);

    i++;
    u++;
    printf("i=%d\n",i);
    printf("u=%u\n",u);
    i=0;
    u=0;
    i--;
    u--;
    printf("i=%d\n",i);
    printf("u=%u\n",u);

}

Endianness

#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

int main() {
    uint32_t arr[4] = {0x00112233,0x44556677,0x8899AABB, 0xCCDDEEFF}; /* unsigned, so the shifts below are well defined */
    printf("%08x\n",arr[0]);
    printf("%08x\n",arr[1]);
    printf("%08x\n",arr[2]);
    printf("%08x\n",arr[3]);

    FILE *f = fopen("int.hex","w+");
    fprintf(f,"%08x",arr[0]);
    fprintf(f,"%08x",arr[1]);
    fprintf(f,"%08x",arr[2]);
    fprintf(f,"%08x",arr[3]);
    fclose(f);

    int fd=open("int.bin",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
    write(fd,arr,sizeof(arr));
    close(fd);

    int i;
    fd = open("int.bin2",O_CREAT|O_RDWR,S_IWUSR|S_IRUSR|S_IRGRP|S_IRWXO);
    for (i=0;i<4;i++) {
        uint32_t val = (arr[i]>>16) &0x0000ffff;
        val += (arr[i]<<16)&0xffff0000;
        write(fd,&val,sizeof(uint32_t));
    }
    close(fd);
}

When saving formatted values to a file you get what you expect

$ cat int.hex 
00112233445566778899aabbccddeeff

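Saving a raw memory dump of the values gives a different result, because a little-endian machine such as x86 stores the least significant byte first and hexdump prints 16-bit words by default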

$ hexdump int.bin 
0000000 2233 0011 6677 4455 aabb 8899 eeff ccdd
0000010

The 16-bit halves of each value need to be swapped for the dump to look the same as the values

$ hexdump int.bin2
0000000 0011 2233 4455 6677 8899 aabb ccdd eeff
0000010

Compiler flags

The compiler has a long list of command line options that can be enabled for different purposes. Let's look at some of them: https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html

Let's try to apply some of the flags to the examples above.

The best options to start with are these, which enable more warnings.

-Wall -Wextra

Most of the examples here were written in a sloppy style, so adding extra checks will find more issues in the code. Probably every example provided will show issues with these extra compiler flags

-Wformat-security -Wduplicated-cond -Wfloat-equal -Wshadow -Wconversion -Wjump-misses-init -Wlogical-not-parentheses -Wnull-dereference

To get all macros expanded in the C code, add the following compiler flag. The output will be C source with every macro expanded

-E

To output generated assembly instead of a binary, add

-S

More readable output can be obtained with

gcc FILE.c -Wa,-adhln=FILE.S -g -fverbose-asm -masm=intel

Basic compiler optimisation flags that can speed up a program or make it smaller

-O  -O0  -O1  -O2  -O3  -Os  -Ofast  -Og  -Oz

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options

https://panthema.net/2013/0124-GCC-Output-Assembler-Code/
https://blogs.oracle.com/linux/post/making-code-more-secure-with-gcc-part-1

Shared library

A shared library is a common way to reuse big chunks of code.

#include <stdio.h>
int fun1() {
    return 1;
}

int fun2() {
    printf("Function name fun2\n");
    return 0;
}

int fun3(int a, int b) {
    return a+b;
}
$ gcc -c -fPIC lib_share.c
$ gcc -shared -o libshare.so lib_share.o
$ ldd libshare.so
    linux-vdso.so.1 (0x00007ffdb994d000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f0c39400000)
    /usr/lib64/ld-linux-x86-64.so.2 (0x00007f0c39835000)

Now let's link it to our binary

#include <stdio.h>

//functions that are implemented in shared lib
int fun1();
int fun2();
int fun3(int a, int b);

int main() {
    fun1();
    fun2();
    fun3(1, 2);
}
$ gcc use_share.c -o use_share -L. -lshare
./use_share 
./use_share: error while loading shared libraries: libshare.so: cannot open shared object file: No such file or directory
ldd ./use_share
    linux-vdso.so.1 (0x00007ffedcad5000)
    libshare.so => not found
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f7b99a00000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f7b99c90000)

The library is not in the search path

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`
$ ./use_share
$ ldd use_share
    linux-vdso.so.1 (0x00007fffc415c000)
    libshare.so => /your/path/libshare.so (0x00007f48b03c6000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f48b0000000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f48b03d2000)

Another way is to embed a custom library search location in the executable. Let's set it to search the current directory, so there is no need to modify LD_LIBRARY_PATH

$ gcc use_share.c -o use_share -L. -lshare -Wl,-rpath=./
$ ldd ./use_share
    linux-vdso.so.1 (0x00007fff5c964000)
    libshare.so => ./libshare.so (0x00007f791000f000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f790fc00000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f791001b000)

So now the executable loads libshare from the local directory. Of course, it is also possible to install the shared library into the system's /usr/lib

Static library

Static binary

A static binary doesn't use any shared libraries, so it is possible to build it once and distribute it to other systems of the same architecture without installing dependencies.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    return 0;
}

The first step is to compile the file and see that it is dynamically linked

$ gcc static_elf.c -o static_elf
$ file static_elf
static_elf: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bc6ac706075874858e1c4a8accf77e704f4ea25a, for GNU/Linux 4.4.0, with debug_info, not stripped
$ ldd ./static_elf
    linux-vdso.so.1 (0x00007ffccef49000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fcbb8800000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fcbb8b63000)

After adding the -static option, the tools report it as statically linked. The size of the binary increased, as all functions required to run the executable are now contained in the binary.

$ gcc static_elf.c -static -o static_elf
$ file static_elf
static_elf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=c54d2e4d2a3d11fe920bee9a44af045c6f67ab56, for GNU/Linux 4.4.0, with debug_info, not stripped
$ ldd static_elf 
    not a dynamic executable

A statically compiled file should work on most Linux systems of the same architecture.

Atomic

HERE

Multithreading

HERE

Basic usage

File manipulation with libc

Create a file and write data using libc functions

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    FILE *f = fopen("file.txt","w+");
    char *s = "Hello";
    fwrite(s,1,strlen(s),f);
    fclose(f);
}

Open file and read data back

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    FILE *f = fopen("file.txt","r");
    char buf[128];
    size_t r;
    r = fread(buf,1,sizeof(buf)-1,f); /* leave room for the terminating 0 */
    buf[r] = 0;
    printf("->%s\n",buf);
    fclose(f);
}

File manipulation with syscalls

Now let's do the same without the libc file functions, using the syscall function to invoke system calls directly. It is also straightforward to rewrite this example in assembly.

#include <unistd.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <string.h>

int main(void) {
  int fd = syscall(SYS_open, "sys.txt", O_CREAT|O_WRONLY, S_IRWXU|S_IRGRP|S_IXGRP);
  char s[] = "hello sycall\n";
  syscall(SYS_write, fd, s, strlen(s));
  syscall(SYS_close, fd);
  return 0;
}

Read data from file

#include <unistd.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <string.h>

int main(void) {
  int fd = syscall(SYS_open, "sys.txt", O_RDONLY);
  char s[128];
  int r = syscall(SYS_read, fd, s, sizeof(s)-1); /* leave room for the terminating 0 */
  if (r < 0) r = 0;
  s[r] = 0;
  syscall(SYS_close, fd);
  syscall(SYS_write, 1, s, r); /* file descriptor 1 is stdout */
  return 0;
}

Advanced topics

Kernel module

The Linux kernel, the macOS kernel and the *BSD kernels are written in C, so kernel modules for some of them can be written in C.

The example may need adjustments specific to your local distribution.


http://main.lv/writeup/kernel_hello_world.md

Linking

Linking is one of the most interesting parts of compiling C code. An object file contains functions and variables of different kinds, some defined and some only referenced, and linking tries to resolve all of those references. So it is possible to have fun with linking and the content of object files.

The first example is a piece of C code that can be compiled to an object file but cannot be linked into an executable on its own.

gcc -c link_elf.c
/* declared here, defined in another object file */
void fun1();
void fun2();

int main() {
    fun1();
    fun2();
}

We can see that fun1 and fun2 are marked as undefined in the object file. If we try to link it into an executable, the linker will not be able to find them. So let's create one more object file

$ readelf -a link_elf.o

Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_elf.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000    31 FUNC    GLOBAL DEFAULT    1 main
     4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND fun2

link_fun1.c

#include <stdio.h>

void fun1() {
    printf("Hello fun1\n");
}
void fun2() {
    printf("Hello fun2\n");
}

Now we have an object file in which the functions are defined, and we can see that it in turn has an undefined reference to puts (the compiler replaced the printf call with puts).

readelf -a link_fun1.o
Symbol table '.symtab' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
     6: 0000000000000016    22 FUNC    GLOBAL DEFAULT    1 fun2

We can link both of those files together

gcc -o link_elf link_elf.o link_fun1.o

Functions in object files carry no information about their argument or return types. That is why anything whose name matches can be linked. Let's rewrite the code like this

int fun1(int i) {
    printf("Hello fun1\n");
}
int fun2(int i) {
    printf("Hello fun2\n");
}

And this links without issue. Think of it as two sets of symbols that are merged together; the linker knows only a few things about each symbol. Return types and function arguments aren't recorded when the object file is created.

Functions can have aliases.

link_fun2.c

static void fun2()  {
    printf("hello 2\n");
} __attribute__ ((alias("fun1")));

Now function is local.

Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    LOCAL  DEFAULT    1 fun2
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts

Let's link all objects into an executable. The local function fun2 from link_fun2.o isn't used in this case,

$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
$ ./link_elf
Hello fun1
Hello fun2

Let's switch the aliasing between the two fun2 functions

link_fun1.o
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun1.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 000000000000001d    29 FUNC    LOCAL  DEFAULT    1 fun2
     5: 0000000000000000    29 FUNC    GLOBAL DEFAULT    1 fun1
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts

link_fun2.o
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS link_fun2.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    22 FUNC    GLOBAL DEFAULT    1 fun2
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
$ gcc link_fun1.o link_fun2.o link_elf.o -o link_elf
$ ./link_elf
Hello fun1
hello 2

So all of this plays a role in linking object files. There is also an interesting utility called ld, which works at a lower level than gcc.

Extern

Attributes

PASS

Creating shared library

PASS

Create static libraries

PASS

Join all objects together

PASS

Compile with musl

glibc is not the only option for a standard C library; there are a few others, and one of them is musl

$ musl-gcc hello_world.c -o hello_world_musl
$ file ./hello_world_musl
hello_world_musl: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, not stripped

Inspect elf files

There are a few utilities that help to check whether an ELF file is OK.

ldd shows which shared libraries the ELF file will try to load

$ ldd hello_world
    linux-vdso.so.1 (0x00007fffcb2ae000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007ffb80c00000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffb80fb9000)

readelf allows inspecting the content of ELF files, including the headers and the interpretation of their values. In a few examples above we already used it to check the content of compiled object files.

$ readelf -s ./hello_world
Symbol table '.symtab' contains 37 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS abi-note.c
     2: 000000000000039c    32 OBJECT  LOCAL  DEFAULT    4 __abi_tag
     3: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS init.c
     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
     5: 0000000000001070     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
     6: 00000000000010a0     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
     7: 00000000000010e0     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
     8: 0000000000004030     1 OBJECT  LOCAL  DEFAULT   25 completed.0
     9: 0000000000003df0     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtor[...]
    10: 0000000000001130     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
    11: 0000000000003de8     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_in[...]
    12: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello_world.c
    13: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    14: 00000000000020b0     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
    15: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 
    16: 0000000000003df8     0 OBJECT  LOCAL  DEFAULT   21 _DYNAMIC
    17: 0000000000002010     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
    18: 0000000000004000     0 OBJECT  LOCAL  DEFAULT   23 _GLOBAL_OFFSET_TABLE_
    19: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_mai[...]
    20: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterT[...]
    21: 0000000000004020     0 NOTYPE  WEAK   DEFAULT   24 data_start
    22: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5
    23: 0000000000004030     0 NOTYPE  GLOBAL DEFAULT   24 _edata
    24: 0000000000001154     0 FUNC    GLOBAL HIDDEN    15 _fini
    25: 0000000000004020     0 NOTYPE  GLOBAL DEFAULT   24 __data_start
    26: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    27: 0000000000004028     0 OBJECT  GLOBAL HIDDEN    24 __dso_handle
    28: 0000000000002000     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
    29: 0000000000004038     0 NOTYPE  GLOBAL DEFAULT   25 _end
    30: 0000000000001040    38 FUNC    GLOBAL DEFAULT   14 _start
    31: 0000000000004030     0 NOTYPE  GLOBAL DEFAULT   25 __bss_start
    32: 0000000000001139    26 FUNC    GLOBAL DEFAULT   14 main
    33: 0000000000004030     0 OBJECT  GLOBAL HIDDEN    24 __TMC_END__
    34: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMC[...]
    35: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@G[...]
    36: 0000000000001000     0 FUNC    GLOBAL HIDDEN    12 _init

No standard library

Let's write hello world without libc.

noc.c

void _start() {

}
$ gcc -c noc.c
$ ld -dynamic-linker /lib/ld-linux.so.2 noc.o -o noc
$ file noc
noc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

The next step is to make it do something other than segfault.

void _start() {
 asm ( \
      "movl $1,%eax\n" \
      "xor %ebx,%ebx\n" \
      "int $128\n" \
    );
}

Now it is all about calling syscalls

Let's print a message

signed int write(int fd, const void *buf, unsigned int size)
{
    signed int ret;
    asm volatile
    (
        "syscall"
        : "=a" (ret)
        //                 EDI      RSI       RDX
        : "0"(1), "D"(fd), "S"(buf), "d"(size)
        : "rcx", "r11", "memory"
    );
    return ret;
}

void _start() {
    write(1,"no libc",8);
 asm ( \
      "movl $1,%eax\n" \
      "xor %ebx,%ebx\n" \
      "int $128\n" \
    );
}

http://main.lv/writeup/making_c_executables_smaller.md

Memory leaks

Memory leaks are a crucial topic in C. The typical case is allocated memory that wasn't freed after use. If the amount of such memory keeps growing, it can eventually fill the whole memory and make the system unresponsive. Here is a simple example of how a memory leak is created and how to detect it.

#include <stdlib.h>

int main() {

    char *ptr = malloc(12);

    return 0;
}

The best way to detect it is to use valgrind.

$ valgrind ./malloc

==778== HEAP SUMMARY:
==778==     in use at exit: 12 bytes in 1 blocks
==778==   total heap usage: 2 allocs, 1 frees, 1,036 bytes allocated

Here we see 2 allocs and 1 free, with 12 bytes still in use at exit, so the leak we created is detected. Now a more complex example: a leaking function that is called 5 times. In a larger code base it would be nice to also see the location of the leaks.

#include <stdlib.h>

int* mem_alloc(int sz) {
    int *ret=NULL;

    if (sz < 0) {
        return NULL; 
    }

    ret = malloc(sz*sizeof(int));

    if (sz>10) {
        return NULL;
    }

    return ret;

}

int main() {

    mem_alloc(0);

    free(mem_alloc(1));

    mem_alloc(100); 

    free(mem_alloc(2));

    mem_alloc(10);

    return 0;
}

There are 3 blocks that leak. It is possible to guess where they come from, but it would be better to have the exact position of each leak.

valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2

==4974== HEAP SUMMARY:
==4974==     in use at exit: 440 bytes in 3 blocks
==4974==   total heap usage: 5 allocs, 2 frees, 452 bytes allocated
==4974== 
==4974== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3
==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
==4974==    by 0x10919E: main (in /home/fam/prog/c/undefined_c/memleak2)
==4974== 
==4974== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3
==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
==4974==    by 0x1091D6: main (in /home/fam/prog/c/undefined_c/memleak2)
==4974== 
==4974== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3
==4974==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4974==    by 0x109179: mem_alloc (in /home/fam/prog/c/undefined_c/memleak2)
==4974==    by 0x1091BA: main (in /home/fam/prog/c/undefined_c/memleak2)
==4974== 
==4974== LEAK SUMMARY:
==4974==    definitely lost: 440 bytes in 3 blocks
==4974==    indirectly lost: 0 bytes in 0 blocks
==4974==      possibly lost: 0 bytes in 0 blocks
==4974==    still reachable: 0 bytes in 0 blocks
==4974==         suppressed: 0 bytes in 0 blocks

Add the compilation option -g3

gcc -g3 memleak2.c -o memleak2

Now it shows source lines and the call trace that led to the leaking code. That looks better.

valgrind --leak-check=full --track-origins=yes --log-file=log.txt ./memleak2

==5073== HEAP SUMMARY:
==5073==     in use at exit: 440 bytes in 3 blocks
==5073==   total heap usage: 5 allocs, 2 frees, 452 bytes allocated
==5073== 
==5073== 0 bytes in 1 blocks are definitely lost in loss record 1 of 3
==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5073==    by 0x109179: mem_alloc (memleak2.c:10)
==5073==    by 0x10919E: main (memleak2.c:22)
==5073== 
==5073== 40 bytes in 1 blocks are definitely lost in loss record 2 of 3
==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5073==    by 0x109179: mem_alloc (memleak2.c:10)
==5073==    by 0x1091D6: main (memleak2.c:30)
==5073== 
==5073== 400 bytes in 1 blocks are definitely lost in loss record 3 of 3
==5073==    at 0x4841888: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5073==    by 0x109179: mem_alloc (memleak2.c:10)
==5073==    by 0x1091BA: main (memleak2.c:26)
==5073== 
==5073== LEAK SUMMARY:
==5073==    definitely lost: 440 bytes in 3 blocks
==5073==    indirectly lost: 0 bytes in 0 blocks
==5073==      possibly lost: 0 bytes in 0 blocks
==5073==    still reachable: 0 bytes in 0 blocks
==5073==         suppressed: 0 bytes in 0 blocks
==5073==

Code coverage

Compile the file with extra flags, run the program once so that the coverage data is written, and generate the gcov output. Only one branch is not used, and the coverage report should show which part that is.

#include <stdio.h>

int fun1(int a) {
    if (a < 0) {
        printf("Smaller then zero\n");
    } 
    if (a==0) {
        printf("Equails to zero\n");
    }
    if (a>0) {
        printf("Bigger then zero\n");
    }
}

int main() {

    printf("Start\n");
    fun1(0);
    fun1(1);

    return 0;
}
$ gcc -fprofile-arcs -ftest-coverage coverage.c -o coverage
$ gcov ./coverage
File 'coverage.c'
Lines executed:92.31% of 13
Creating 'coverage.c.gcov'

Lines executed:92.31% of 13

Content of the gcov file. We can see which line wasn't executed.

        -:    0:Source:coverage.c
        -:    0:Graph:coverage.gcno
        -:    0:Data:coverage.gcda
        -:    0:Runs:1
        -:    1:#include <stdio.h>
        -:    2:
        2:    3:int fun1(int a) {
        2:    4:    if (a < 0) {
    #####:    5:        printf("Smaller then zero\n");
        -:    6:    } 
        2:    7:    if (a==0) {
        1:    8:        printf("Equails to zero\n");
        -:    9:    }
        2:   10:    if (a>0) {
        1:   11:        printf("Bigger then zero\n");
        -:   12:    }
        2:   13:}
        -:   14:
        1:   15:int main() {
        -:   16:
        1:   17:    printf("Start\n");
        1:   18:    fun1(0);
        1:   19:    fun1(1);
        -:   20:
        1:   21:    return 0;
        -:   22:}

Profiling

Some parts of the code can take a substantial amount of time, and those parts need to be identified.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

void slow_sin() {
    float r=0.0f;
    for (int i=0;i<10000000;i++) {
        r += sinf(M_PI/8);
    }
}

void slower_sin() {
    double r=0.0f;
    for (int i=0;i<10000000;i++) {
        r += sin(M_PI/8);
    }
}

void fast_sin() {
    float pre_calc = sinf(M_PI/8);
    float r = 0.0f;
    for (int i=0;i<10000000;i++) {
        r += pre_calc;
    }
}

int main() {
    slow_sin();
    slower_sin();
    fast_sin();
}

Compile and run with profiling

gcc -pg perf_speed.c -o perf_speed -lm
./perf_speed
gprof perf_speed gmon.out

Sanitizer

C, being a great language, has great features in the standard such as undefined behaviour. It is also possible to overwrite any data you want from your code. One of the favourite mistakes is a buffer overrun. This type of error can be caught with stack protection.

In the code below it is possible to write more than 8 characters into an array of size 8, because there is no bounds check. With stack protection enabled, the runtime is able to detect this kind of overwrite.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void fun(char *str,int size) {
    char local_var[8];
    memcpy(local_var, str, size);
    printf("Whats inside a stack? %s\n",local_var);
}

int main() {
    char some_str1[] = "Hello!";
    char some_str2[] = "Hello all!!!";

    fun(some_str1,strlen(some_str1));
    fun(some_str2,strlen(some_str2));
}
Whats inside a stack? Hello!
Whats inside a stack? Hello all!!!
*** stack smashing detected ***: terminated
fish: Job 1, './stack_overrun' terminated by signal SIGABRT (Abort)

If this does not happen, add -fstack-protector to the compile flags.

C has a whole list of undefined behaviours incorporated in the standard: https://en.cppreference.com/w/c/language/behavior

In function f the variable a is not initialized when x is 0, so reading it is undefined behaviour, but there will still be some value. Run it a few times and f(0) may return a different value each time.

#include <stdio.h>

size_t f(int x)
{
    size_t a;
    if(x) // either x nonzero or UB
        a = 42;
    return a; 
}

int main() {
    printf("%zu\n",f(0)); // size_t needs %zu, not %d
    printf("%zu\n",f(1));
    printf("%zu\n",f(42));
}

Division by zero. Function f does not check whether the divisor is 0, so the program is going to abort. Add the flag -fsanitize=integer-divide-by-zero and the error will be reported at runtime.

#include <stdio.h>

size_t f(int x)
{
    return 10/x; 
}

int main() {
    printf("%zu\n",f(0)); // size_t needs %zu, not %d
    printf("%zu\n",f(1));
    printf("%zu\n",f(42));
}
undefined_b.c:5:14: runtime error: division by zero
fish: Job 1, './undefined_b' terminated by signal SIGFPE (Floating point exception)

Write plugins

Preload library

Embedding C

Most programming languages support embedding C. C function names stay very simple when compiled to object format (there is no name mangling), which makes C an easy target to link against from other languages. Most other languages use incompatible naming for functions when compiled to binary.

Embed in C++

lib.h

#include <stdlib.h>
#include <stdio.h>

int fun_secret_1();

lib.c

#include "lib.h"

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}

The first thing to notice is that when the file is compiled as C++, the function name is in a different format than when it is compiled as C.

$ g++ -c lib.c
$ readelf -s lib.o
Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS lib.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    26 FUNC    GLOBAL DEFAULT    1 _Z12fun_secret_1v
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts

Let's tell C++ that the file is C by adding extern "C".

lib.h

#include <stdlib.h>
#include <stdio.h>

extern "C" {

int fun_secret_1();

}

lib.c

#include "lib.h"

extern "C" {

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}

}

Now the compiled object file has C function names.

$ g++ lib.c -c
$ readelf -s lib.o

Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS lib.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .rodata
     4: 0000000000000000    26 FUNC    GLOBAL DEFAULT    1 fun_secret_1
     5: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND puts

cppembed.cpp

#include "lib.h"

int main() {
    fun_secret_1();
}

Doing it the opposite way, running C++ from C:
/writeup/wraping_c_plus_plus_exceptions_templates_and_classes_in_c.md

Embed in Go

lib.h

#include <stdlib.h>
#include <stdio.h>

int fun_secret_1();

lib.c

#include "lib.h"

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}

main.go

package main
// #cgo CFLAGS: -g -Wall
// #include <stdlib.h>
// #include "lib.h"
import "C"
import (
    "fmt"

)
func main() {
    fmt.Println("Start program")
    C.fun_secret_1()
    fmt.Println("End program")
}
go build

https://karthikkaranth.me/blog/calling-c-code-from-go/

Embed in Swift

/writeup/linux_hello_world_in_swift.md

Embed in Rust

lib.c

#include <stdio.h>
#include <stdlib.h>

int fun_secret_1() {
    printf("Hello from C\n");
    return -1;
}
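
main.rs

use std::os::raw::c_int;

extern "C" {
    // Defined in lib.c, where it returns a C int.
    fn fun_secret_1() -> c_int;
}

//rustc main.rs -o hello
fn main() {
    println!("Start program");
    // Calling a foreign function is unsafe; the return value is ignored here.
    unsafe { fun_secret_1(); }
    println!("End program");
}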

Compile with

gcc -c lib.c
gcc -shared lib.o -o liblib.so
rustc main.rs -l lib -L . -o hello -C link-arg="-Wl,-rpath=./"

https://dev.to/xphoniex/how-to-call-c-code-from-rust-56do

Lua in C

/writeup/embedding_lua_in_c.md

Python in C

Multiplatform

Different flags

Check architecture

AArch64

https://snapshots.linaro.org/gnu-toolchain/13.0-2022.08-1/aarch64-linux-gnu/

Download any version of gcc from there and extract it.

Add the bin directory location to the PATH environment variable (run this from inside the bin directory):

export PATH=$PATH:$(pwd)

main.c

#include <stdio.h>

int main() {
    printf("Hello world arm64\n");
}
$ aarch64-linux-gnu-gcc main.c -o main
$ ./main
qemu-aarch64: Could not open '/lib/ld-linux-aarch64.so.1': No such file or directory
$ file ./main
./main: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=12448d90030e2ad23dbe6b7bc82a4fa7b7de9659, for GNU/Linux 3.7.0, with debug_info, not stripped

Download the sysroot image from the Linaro page. Running

strace ./main

showed that the libraries are searched for in

/usr/gnemul/qemu-aarch64/lib/

The missing libc and ld-linux-aarch64 were found inside the sysroot archive and copied to the searched location, and now the AArch64 binary runs.

$ ./main
Hello world arm64

AVR8

AVR is an 8-bit CPU that is quite popular among hobbyists. As a bare-metal device it does not have full libc support, and it needs some setup before it is possible to do basic things with it.

avr_echo.c

#include <avr/io.h>

#define FOSC 16000000UL
#define BAUD 9600
#define MYUBRR (FOSC/16/BAUD-1)

void USART_Init( unsigned int ubrr)
{
    UBRRH = (unsigned char)(ubrr>>8);
    UBRRL = (unsigned char)ubrr;
    UCSRB = (1<<RXEN)|(1<<TXEN);
    UCSRC = (1<<URSEL)|(1<<USBS)|(3<<UCSZ0);
}

int main()
{
    char c;
    USART_Init( MYUBRR );
    while(1)
    {
        while ( !(UCSRA & (1<<RXC))){};
        c = UDR;
        while (!(UCSRA & (1<<UDRE))){};
        UDR = c;
    }
    return 0;
}
avr-gcc avr_echo.c -mmcu=atmega16 -Wall -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -o avr_echo.out

The next step would be to program it, in case you have an ISPv2 programmer and an ATmega16 chip.

avr-objdump -s --disassemble avr_echo.out > avr_echo.s
avr-objcopy -j .text -O ihex avr_echo.out avr_echo.hex
avrdude -pm16 -cavrispv2 -Pusb -U flash:w:avr_echo.hex

Emscripten

/writeup/web_assembly_sdl_example.md