October 25th, 2017

Reinout van Rees: PyCon.de: the borgbackup project – Thomas Waldmann

Programming, Python, by admin.

(One of my summaries of a talk at the
2017 PyCon.de conference).

Borgbackup is 2.5 years old, but the code is
older: it is a fork of Attic. Thomas discovered
Attic after someone blogged about it. They forked it to get more collaboration
and quicker releases.

Borg backup is a backup tool. There are 1000 backup tools. So what’s
different? Borg is one you might actually enjoy using. The features
sound logical: simple, efficient, safe, secure. How borg sees this:

  • Simple. Each backup is a full backup. Restore? Just do a FUSE mount. Easy
    pruning of old backups.

    Tooling: it is just borg, ssh and a shell. It is a single-file
    binary. There’s good filesystem and OS support.

    There’s good documentation.

  • Efficient. It is very fast for unchanged files. Every backup is a full
    backup, but unchanged files don’t need to be handled a second time.

    Chunk deduplication, sparse file support, flexible compression scheme.

    Compression is chunk-based: it doesn’t compress the whole file at once.

  • Safe. Checksums, transactions, filesystem syncing, atomic
    operations. Checkpoints while backing up.

    You can have off-site remote repositories.

  • Secure. Authenticated encryption. There’s nothing to see in the repo:
    borg doesn’t trust the backup host, everything is encrypted.

    Tampering/corruption detection. SSH transport for remote
    connections. Append-only mode repos.

    It is open source: you can see the code.
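
To make the "compression is chunk-based" point a bit more concrete, here is a minimal sketch in Python. It is not borg's code: the function names, the fixed chunk size and the choice of zlib are my own assumptions, purely for illustration. The idea is only that every chunk is compressed on its own, so a chunk can be stored and restored without touching the rest of the file.

```python
import zlib


def compress_chunks(data: bytes, chunk_size: int = 1024) -> list:
    """Compress each chunk independently (hypothetical fixed chunk size)."""
    return [
        zlib.compress(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def decompress_chunks(chunks: list) -> bytes:
    """Decompress every chunk on its own and glue the results back together."""
    return b"".join(zlib.decompress(chunk) for chunk in chunks)


data = b"hello world " * 500  # 6000 bytes of sample data
compressed = compress_chunks(data)
restored = decompress_chunks(compressed)
```

Because each chunk is self-contained, a different compression setting per chunk would also be possible, which is probably what "flexible compression scheme" hints at.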

About deduplication: borg deduplicates in several ways. Similar files
don’t need to be stored twice. Unchanged files don’t need to be stored a
second time. And there’s chunking: bigger files are chopped up and the parts
get the deduplication treatment.
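
A rough sketch of the chunking idea, in Python. This is a toy, not how borg does it: I'm assuming fixed-size chunks and SHA-256 ids to keep it short, and `ChunkStore` is a made-up name. Every chunk is stored under the hash of its content, so a chunk that shows up a second time (in the same file or another one) costs no extra space.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: every unique chunk is kept only once."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size  # hypothetical size, tiny for the demo
        self.chunks = {}              # chunk id (SHA-256 hex digest) -> bytes

    def add_file(self, data: bytes) -> list:
        """Split data into chunks, store the new ones, return the chunk ids."""
        ids = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(chunk_id, chunk)  # only stored if unseen
            ids.append(chunk_id)
        return ids

    def read_file(self, ids: list) -> bytes:
        """Rebuild a file from its list of chunk ids."""
        return b"".join(self.chunks[chunk_id] for chunk_id in ids)


store = ChunkStore(chunk_size=4)
first = store.add_file(b"aaaabbbbcccc")
second = store.add_file(b"aaaabbbbdddd")  # shares two chunks with the first
```

The two files together consist of six chunks, but the store only holds four, and both files can still be restored in full from their id lists.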

About the code: 90% is Python, the high-level logic. Cython is 5% and 5% is
pure C. Testing: pytest and tox (for testing with various Python
versions). They use pyenv.

PyInstaller makes a single-file binary. It bundles your Python/Cython/C code
with the Python binary of your choice. Only glibc needs to come from the OS.

They use GPG to sign the releases. Even all the commits in git are signed.

Documentation: Sphinx. They also reuse the argparse output for man pages and
the Sphinx documentation. The README is included in Sphinx. Make the effort to
write a good README: it is your "elevator pitch".
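
Reusing argparse output for documentation could look something like the sketch below. This is my own minimal guess at the general idea, not borg's actual tooling: the `mytool` parser and the `help_as_rst` helper are hypothetical. It takes the help text argparse already generates and wraps it in a reStructuredText literal block that Sphinx can include.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """A hypothetical command line interface, just for the example."""
    parser = argparse.ArgumentParser(
        prog="mytool", description="Hypothetical example CLI."
    )
    parser.add_argument("--verbose", action="store_true", help="print more output")
    return parser


def help_as_rst(parser: argparse.ArgumentParser) -> str:
    """Wrap the argparse help text in a reST section with a literal block."""
    lines = parser.format_help().splitlines()
    # Literal blocks need indentation; empty lines stay empty.
    body = "\n".join("    " + line if line else "" for line in lines)
    title = parser.prog
    return f"{title}\n{'=' * len(title)}\n\n::\n\n{body}\n"


rst = help_as_rst(build_parser())
```

The nice part of this approach is that the command line help and the documentation cannot drift apart, as both come from the same parser definition.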

Tip: https://asciinema.org/ for documenting how your CLI app works.


Photo explanation: simply a picture from my train trip (with a nice
planned detour through the Eifel) from Utrecht (NL) to Karlsruhe (DE). Station
hall in Trier with rail lines. Half of them aren’t operational anymore.
