
What is Base64?


Base64 is an encoding that represents arbitrary bytes using only 64 ASCII characters: A-Z, a-z, 0-9, +, and /, with = as padding. The result is plain text that survives any system designed for ASCII — HTTP headers, email, JSON strings, URL query parameters (usually with the URL-safe variant).

It’s not encryption. It’s not compression either: the output is about 33% larger than the input. It’s just a way to put binary data into text-only contexts.

The 64 characters

The standard alphabet from RFC 4648:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

That’s 26 + 26 + 10 + 2 = 64. Each character represents 6 bits of input data, so 4 base64 characters encode 3 input bytes (24 bits).

How it works

Three input bytes (24 bits) split into four 6-bit groups, each mapped to a character:

Input bytes:    01001000  01101001  00100001     "Hi!"  (3 bytes)
                  =72       =105     =33

Re-grouped:     010010   000110   100100   100001   (four 6-bit groups)
                =18      =6       =36      =33

Lookup:         S        G        k        h
                                           (alphabet position 33)

Output:         "SGkh"   (4 ASCII characters)

If the input length isn’t a multiple of 3, padding fills the gap:

Input bytes     Output        Notes
0               ""            empty
1 (H)           SA==          one byte → two output chars + two =
2 (Hi)          SGk=          two bytes → three output chars + one =
3 (Hi!)         SGkh          full block, no padding
4 (Hi!H)        SGkhSA==      three bytes + one byte + padding

The = characters aren’t part of the alphabet — they’re a structural hint that says “the last group only had 1 or 2 bytes, not 3.”
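The regrouping and padding rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production encoder; the names ALPHABET and encode_base64 are made up for this example, and the result is cross-checked against the standard library's base64 module.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes as standard Base64, three input bytes at a time."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack the chunk into 24 bits, zero-filling any missing bytes.
        bits = int.from_bytes(chunk.ljust(3, b"\x00"), "big")
        # Slice the 24 bits into four 6-bit indexes, most significant first.
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 real characters; the rest is "=".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


for sample in (b"", b"H", b"Hi", b"Hi!", b"Hi!H"):
    encoded = encode_base64(sample)
    # Cross-check against the standard library implementation.
    assert encoded == base64.b64encode(sample).decode("ascii")
    print(sample, "->", encoded)
```

In real code, base64.b64encode does the same job faster; the hand-written version is only useful for seeing where each output character comes from.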

Where it came from

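Base64 dates from the early Internet. The encoding first appeared in RFC 989 (1987), the Privacy Enhanced Mail specification, and MIME (RFC 1341, 1992) later adopted it under the name “base64” for sending binary attachments over mail transports that were strictly 7-bit. Mail servers couldn’t be relied on to pass bytes above 127, so binary data had to be encoded as ASCII. That requirement is mostly historical now, but base64 has stayed in wide use.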

The current spec is RFC 4648 (October 2006), which standardized both the original (“base64”) and URL-safe (“base64url”) variants.

Where it’s used today

Headers and credentials

Embedding binary in text

Tokens

Configuration

What it is NOT

The 33% overhead in detail

For input of n bytes, output is ceil(n / 3) * 4 characters. Concrete examples:

Input size     Base64 output       Padding    Overhead
1 byte         4 chars             ==         +300%
2 bytes        4 chars             =          +100%
3 bytes        4 chars             none       +33%
1 KB           ~1.33 KB            varies     +33%
1 MB image     ~1.33 MB string     varies     +33%

Whenever bandwidth or storage matters, prefer binary transport. Use base64 only when binary isn’t an option (headers, JSON strings, email).
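As a quick check of the length formula, the sketch below computes the padded output size for a few input sizes and compares it with what Python's base64 module actually produces. The helper name encoded_length is illustrative only.

```python
import base64


def encoded_length(n: int) -> int:
    """Characters of padded Base64 output for n input bytes."""
    # Integer form of ceil(n / 3) * 4, avoiding floating point.
    return 4 * ((n + 2) // 3)


for n in (1, 2, 3, 1024, 1_000_000):
    # bytes(n) is n zero bytes; the content does not affect the length.
    actual = len(base64.b64encode(bytes(n)))
    assert actual == encoded_length(n)
    overhead = (encoded_length(n) - n) / n * 100
    print(f"{n:>9} bytes -> {encoded_length(n):>9} chars (+{overhead:.1f}%)")
```

The overhead approaches 33% from above as the input grows, because the padding cost of the final block becomes negligible.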

Common pitfalls

1. btoa doesn’t handle Unicode

In JavaScript:

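btoa("café")
// → "Y2Fm6Q=="  (é became the single Latin-1 byte 0xE9, not UTF-8)

btoa("✓")
// throws InvalidCharacterError (U+2713 is above 255)

btoa only accepts strings in which every character has a code ≤ 255, and it encodes each one as a single Latin-1 byte. That is why "café" succeeds but produces Latin-1 rather than UTF-8 output, while any character above U+00FF throws. To handle arbitrary Unicode, encode to UTF-8 bytes first: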

const bytes = new TextEncoder().encode("café");
const b64 = btoa(String.fromCharCode(...bytes));
// → "Y2Fmw6k="

The main encoder does this for you.
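The same issue exists outside JavaScript: Base64 encodes bytes, so the output depends on which text encoding turned the string into bytes first. A small Python sketch, using the same "café" string, shows two different results for the same text.

```python
import base64

text = "café"

utf8_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
latin1_b64 = base64.b64encode(text.encode("latin-1")).decode("ascii")

print(utf8_b64)    # Y2Fmw6k=  (é as the two UTF-8 bytes C3 A9)
print(latin1_b64)  # Y2Fm6Q==  (é as the single Latin-1 byte E9)

# Decoding only round-trips if the same text encoding is used on the way back.
assert base64.b64decode(utf8_b64).decode("utf-8") == text
```

Both outputs are valid Base64; they simply describe different byte sequences, which is why both sides need to agree on UTF-8.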

2. URL-safe vs standard

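Standard base64 uses + and /, both of which have special meaning in URLs: + is commonly decoded as a space in query strings, and / is the path separator. Pasting standard base64 into a URL parameter without percent-encoding it gives ambiguous results. Use URL-safe base64 (base64url), which substitutes - for + and _ for /, in any URL context.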
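To see the two alphabets side by side, here is a short Python sketch. The two input bytes are chosen only because they happen to encode to the characters that differ between the variants.

```python
import base64

# 0xFB 0xFF regroups into the 6-bit values 62, 63, 60.
raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# The URL-safe form decodes back to the same bytes.
assert base64.urlsafe_b64decode(url_safe) == raw
```

Note that Python's urlsafe_b64encode keeps the = padding; whether to strip it is a separate decision.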

3. Padding inconsistency

RFC 4648 says padding is required for standard base64 and optional for URL-safe. Some decoders are strict, some lenient. When you control both sides, pick a convention; when you don’t, the decoder accepts both.
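When the producer of the data is outside your control, one option is a lenient decoder that normalizes its input before handing it to a strict one. The sketch below is one possible approach, with the function name decode_lenient invented for illustration; it accepts either alphabet, with or without padding.

```python
import base64


def decode_lenient(text: str) -> bytes:
    """Decode standard or URL-safe Base64, with or without trailing '='."""
    stripped = text.rstrip("=")
    # Map the URL-safe characters back to the standard alphabet.
    standard = stripped.replace("-", "+").replace("_", "/")
    # Restore the padding that a strict decoder expects.
    padded = standard + "=" * (-len(standard) % 4)
    # validate=True rejects characters outside the alphabet instead of
    # silently skipping them.
    return base64.b64decode(padded, validate=True)


print(decode_lenient("SGk="))  # b'Hi'
print(decode_lenient("SGk"))   # b'Hi'
print(decode_lenient("-_8"))   # b'\xfb\xff'
```

Input whose unpadded length leaves a remainder of 1 when divided by 4 can never be valid, and this sketch lets the strict decoder raise an error in that case.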

4. Base64 isn’t a transport security boundary

A base64-encoded password in a YAML file or HTTP header is as exposed as plain text to anyone who can read it. Don’t conflate encoding with security.
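A short Python sketch makes the point concrete. The credentials are hypothetical, written in the user:password form that HTTP Basic authentication encodes; reversing the encoding takes one call and no key.

```python
import base64

# Hypothetical credentials in "user:password" form.
header_value = base64.b64encode(b"alice:s3cret").decode("ascii")
print(header_value)

# Anyone who can read the value can reverse it; no secret is involved.
recovered = base64.b64decode(header_value).decode("ascii")
print(recovered)  # alice:s3cret
assert recovered == "alice:s3cret"
```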

In code

// JavaScript / Node
btoa("hello")                                      // "aGVsbG8="
atob("aGVsbG8=")                                   // "hello"
Buffer.from("hello").toString("base64")            // Node-only, cleaner

# Python
import base64
base64.b64encode(b"hello").decode()                # "aGVsbG8="
base64.b64decode("aGVsbG8=").decode()              # "hello"

# Bash
echo -n "hello" | base64                           # aGVsbG8=
echo -n "aGVsbG8=" | base64 -d                     # hello

// Go
import "encoding/base64"
base64.StdEncoding.EncodeToString([]byte("hello")) // "aGVsbG8="
