What is Base64?

Updated April 26, 2026 5 min read by Samuel Yoo

Base64 is an encoding that represents arbitrary bytes using only 64 ASCII characters: A-Z, a-z, 0-9, +, and /, with = as padding. The result is plain text that survives any system designed for ASCII — HTTP headers, email, JSON strings, URL query parameters (usually with the URL-safe variant).

It’s not encryption. It’s not compression — it actually adds 33% overhead. It’s just a way to put binary data into text-only contexts.

The 64 characters

The standard alphabet from RFC 4648:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

That’s 26 + 26 + 10 + 2 = 64. Each character represents 6 bits of input data, so 4 base64 characters encode 3 input bytes (24 bits).

How it works

Three input bytes (24 bits) split into four 6-bit groups, each mapped to a character:

Input bytes:    01001000  01101001  00100001     "Hi!"  (3 bytes)
                  =72       =105     =33

Re-grouped:     010010   000110   100100   100001   (four 6-bit groups)
                =18      =6       =36      =33

Lookup:         S        G        k        h
                                           (alphabet position 33)

Output:         "SGkh"   (4 ASCII characters)
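
The regrouping above can be reproduced by hand in a few lines. This is a minimal sketch that only handles a full 3-byte block; the names `ALPHABET` and `encode_block` are illustrative, not part of any library.

```python
# Standard base64 alphabet from RFC 4648: index 0 is "A", index 63 is "/".
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_block(block: bytes) -> str:
    """Encode exactly three bytes into four base64 characters."""
    if len(block) != 3:
        raise ValueError("this sketch only handles a full 3-byte block")
    # Pack the three bytes into one 24-bit integer.
    n = (block[0] << 16) | (block[1] << 8) | block[2]
    # Peel off four 6-bit groups, most significant first.
    indices = [(n >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_block(b"Hi!"))  # SGkh
```

For "Hi!" the four indices are 18, 6, 36, and 33, matching the lookup row above.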

If the input length isn’t a multiple of 3, padding fills the gap:

Input bytes	Output	Notes
0	`""`	empty
1 (`H`)	`SA==`	one byte → two output chars + two `=`
2 (`Hi`)	`SGk=`	two bytes → three output chars + one `=`
3 (`Hi!`)	`SGkh`	full block, no padding
4 (`Hi!H`)	`SGkhSA==`	three bytes + one byte + padding

The = characters aren’t part of the alphabet — they’re a structural hint that says “the last group only had 1 or 2 bytes, not 3.”
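
The padding table can be checked against Python's standard `base64` module. A small sketch; `padding_for` is a hypothetical helper name that computes how many `=` characters a given input length needs.

```python
import base64


def padding_for(n_bytes: int) -> int:
    """Number of '=' characters standard base64 appends for n_bytes of input."""
    # 0 leftover bytes -> 0 pads, 1 leftover -> 2 pads, 2 leftover -> 1 pad.
    return (3 - n_bytes % 3) % 3


samples = [b"", b"H", b"Hi", b"Hi!", b"Hi!H"]
rows = [
    (len(data), base64.b64encode(data).decode("ascii"), padding_for(len(data)))
    for data in samples
]

for length, encoded, pads in rows:
    print(length, repr(encoded), f"padding={pads}")
```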

Where it came from

Base64 dates to the early Internet. RFC 989 (1987) defined the encoding for Privacy Enhanced Mail, and MIME (RFC 1341, 1992) later adopted it for email attachments. Mail servers of the time were strictly 7-bit and couldn’t handle bytes above 127, so binary data had to be encoded as ASCII. That requirement is mostly historical now, but base64 stayed because:

It’s well-defined and trivial to implement.
The 33% overhead is small enough to not matter for most use cases.
Almost every system speaks ASCII fluently.

The current spec is RFC 4648 (October 2006), which standardized both the original (“base64”) and URL-safe (“base64url”) variants.

Where it’s used today

Headers and credentials

HTTP Basic Auth. Authorization: Basic <base64(user:pass)>.
API tokens in some systems (though most modern APIs prefer JWT or opaque random strings).
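
As a sketch of the Basic Auth format, the header value can be built as follows. The function name `basic_auth_header` is illustrative, and encoding the credentials as UTF-8 is an assumption (servers may expect a different charset). The sample credentials are the ones used in RFC 7617.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic Auth."""
    # Assumption: credentials are encoded as UTF-8 before base64.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("Aladdin", "open sesame")
print(header)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# Anyone who sees the header can reverse it in one call.
print(base64.b64decode(header.split(" ", 1)[1]))  # b'Aladdin:open sesame'
```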

Embedding binary in text

Data URIs. data:image/png;base64,... inlines an image directly in HTML or CSS. See the file converter.
Email MIME attachments. Every email with an attached image, PDF, or document uses base64.
JSON payloads with binary data. JSON has no binary type, so APIs that need to carry binary content inside a JSON body typically send it as a base64-encoded string.
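
The data URI and JSON cases look like this in practice. This is a sketch only: `to_data_uri`, the `filename` and `content_b64` field names, and the JSON shape are made up for illustration, and the 8-byte PNG signature stands in for a real image.

```python
import base64
import json


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a base64 data URI."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


# The 8-byte PNG file signature, used here instead of a full image.
png_signature = b"\x89PNG\r\n\x1a\n"

uri = to_data_uri(png_signature, "image/png")
print(uri)  # data:image/png;base64,iVBORw0KGgo=

# Hypothetical JSON message carrying the same bytes as a string field.
message = json.dumps({
    "filename": "logo.png",
    "content_b64": base64.b64encode(png_signature).decode("ascii"),
})
restored = base64.b64decode(json.loads(message)["content_b64"])
print(restored == png_signature)  # True
```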

Tokens

JWT (JSON Web Tokens). A signed JWT has three dot-separated parts: the header and payload are base64url-encoded JSON, and the signature is base64url-encoded raw bytes. See our JWT decoder.
OAuth state, CSRF tokens, session IDs sometimes use base64 of random bytes.
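
To see how a JWT comes apart, the sketch below splits a token on dots and decodes each segment. It only inspects the token and does not verify the signature. The helper names, the sample claims, and the placeholder signature bytes are all hypothetical.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """base64url without padding, as JWT segments are written."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


def peek_jwt(token: str) -> tuple:
    """Return (header, payload, signature_bytes) WITHOUT verifying anything."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload, b64url_decode(signature_b64)


# Build a demo token; the signature is placeholder bytes, not a real HMAC.
compact = {"separators": (",", ":")}
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}, **compact).encode()),
    b64url_encode(json.dumps({"sub": "user-123"}, **compact).encode()),
    b64url_encode(b"\x00\x01\x02\x03"),
])

header, payload, signature = peek_jwt(demo_token)
print(header, payload, signature)
```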

Configuration

Kubernetes secrets are stored as base64 in YAML. Note: this is encoding, not encryption — anyone with read access to the secret can decode it.
Environment variables containing certificates, keys, or other binary data are often base64-encoded to fit on a single line.
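
To make the "encoding, not encryption" point concrete, here is a sketch that decodes values shaped like the `data` field of a Secret manifest. The keys and values are invented for the example.

```python
import base64

# Hypothetical contents of a Secret manifest's `data` field.
secret_data = {
    "username": "YWRtaW4=",
    "password": "aHVudGVyMg==",
}

# No key or passphrase is involved: decoding is a single library call.
decoded = {
    name: base64.b64decode(value).decode("utf-8")
    for name, value in secret_data.items()
}
print(decoded)  # {'username': 'admin', 'password': 'hunter2'}
```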

What it is NOT

Not encryption. Anyone can decode base64. Don’t use it to hide data from anyone; a single atob() call reveals your content.
Not compression. It makes data ~33% larger, not smaller. If you need both compression and ASCII-safe output, gzip first, then base64.
Not a hash. Hashes are one-way; base64 roundtrips perfectly.
Not a checksum. Errors during transmission won’t be caught — the receiver will just decode different bytes.
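
Two of these points are easy to demonstrate with the standard library. The sketch below compresses before encoding, then shows that a corrupted character decodes without any error. The sample text is arbitrary, and exact compressed sizes will vary, so none are shown.

```python
import base64
import gzip

# Repetitive sample text, so gzip has something to compress.
text = ("base64 is an encoding, not compression. " * 50).encode("utf-8")

plain_b64 = base64.b64encode(text)                  # larger than the input
packed_b64 = base64.b64encode(gzip.compress(text))  # gzip first, then base64
restored = gzip.decompress(base64.b64decode(packed_b64))

print(len(text), len(plain_b64), len(packed_b64))

# Not a checksum: changing one character still decodes, just to other bytes.
good = "aGVsbG8="
corrupted = "b" + good[1:]
print(base64.b64decode(good))       # b'hello'
print(base64.b64decode(corrupted))  # b'lello'
```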

The 33% overhead in detail

For input of n bytes, output is ceil(n / 3) * 4 characters. Concrete examples:

Input size	Base64 output	Padding	Overhead
1 byte	4 chars	`==`	+300%
2 bytes	4 chars	`=`	+100%
3 bytes	4 chars	none	+33%
1 KB (1,024 bytes)	1,368 chars (~1.34 KB)	`==`	+33.6%
1 MB image	~1.33 MB string	varies	+33%

Whenever bandwidth or storage matters, prefer binary transport. Use base64 only when binary isn’t an option (headers, JSON strings, email).
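
The length formula is simple enough to check directly. A sketch with hypothetical helper names; `encoded_length` uses integer arithmetic that is equivalent to ceil(n / 3) * 4.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Padded base64 output length: ceil(n / 3) * 4, in integer math."""
    return (n_bytes + 2) // 3 * 4


def overhead_percent(n_bytes: int) -> float:
    """Size increase relative to the raw input, in percent."""
    return (encoded_length(n_bytes) - n_bytes) / n_bytes * 100


for n in (1, 2, 3, 1024):
    # Cross-check the formula against the real encoder.
    actual = len(base64.b64encode(bytes(n)))
    print(n, encoded_length(n), actual, f"+{overhead_percent(n):.1f}%")
```

The overhead approaches 33.3% from above as the input grows, because the padding cost is spread over more bytes.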

Common pitfalls

1. `btoa` doesn’t handle Unicode

In JavaScript:

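btoa("café ✓")
// throws InvalidCharacterError: "✓" (U+2713) is outside the Latin-1 range

btoa("café")
// → "Y2Fm6Q==" (no error, but "é" is encoded as the single Latin-1 byte 0xE9, not as UTF-8)

btoa only accepts strings in which every code point is ≤ 255, and it treats each code point as one byte. To handle arbitrary Unicode, encode to UTF-8 bytes first: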

const bytes = new TextEncoder().encode("café");
const b64 = btoa(String.fromCharCode(...bytes));
// → "Y2Fmw6k="

The main encoder does this for you.
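
For comparison, Python avoids this trap by refusing to encode text at all: `base64.b64encode` takes bytes, so the text encoding has to be chosen explicitly. A short sketch of how that choice changes the output:

```python
import base64

text = "café"

# Passing a str is rejected outright; base64 operates on bytes.
try:
    base64.b64encode(text)
except TypeError as exc:
    print("TypeError:", exc)

# The same text yields different base64 depending on the byte encoding.
as_utf8 = base64.b64encode(text.encode("utf-8")).decode("ascii")
as_latin1 = base64.b64encode(text.encode("latin-1")).decode("ascii")

print(as_utf8)    # Y2Fmw6k=
print(as_latin1)  # Y2Fm6Q==
```

Sender and receiver need to agree on the text encoding; UTF-8 is the usual choice.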

2. URL-safe vs standard

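Standard base64 uses + and /, which both have special meaning in URLs: + is decoded as a space in form-encoded query strings, and / is a path separator. Standard base64 pasted into a URL without percent-encoding can therefore be corrupted or misparsed. The URL-safe variant (base64url) replaces + with - and / with _; use it for any URL context.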
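
A sketch of the difference, using two bytes chosen because they encode to both special characters. The `token` parameter name is made up for the example.

```python
import base64
from urllib.parse import parse_qs, quote

data = b"\xfb\xff"  # encodes to both "+" and "/" in the standard alphabet

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")
print(standard)  # +/8=
print(url_safe)  # -_8=

# Dropped raw into a query string, the "+" is read back as a space.
mangled = parse_qs(f"token={standard}")["token"][0]
print(repr(mangled))  # ' /8='

# Percent-encoding the standard form also works, but is longer.
escaped = quote(standard, safe="")
print(escaped)  # %2B%2F8%3D

# The URL-safe form survives the query-string round trip unchanged.
round_trip = parse_qs(f"token={url_safe}")["token"][0]
print(round_trip == url_safe)  # True
```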

3. Padding inconsistency

RFC 4648 says padding is required for standard base64 and optional for URL-safe. Some decoders are strict, some lenient. When you control both sides, pick a convention; when you don’t, the decoder accepts both.
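
When you don’t control the sender, one option is a decoder that tolerates both alphabets and both padding conventions. The following is a minimal sketch of that idea, not a reference implementation; `decode_lenient` is a hypothetical name.

```python
import base64


def decode_lenient(text: str) -> bytes:
    """Decode standard or URL-safe base64, with or without '=' padding."""
    # Map the URL-safe characters onto the standard alphabet.
    normalized = text.strip().replace("-", "+").replace("_", "/")
    # Drop whatever padding was supplied, then rebuild it.
    normalized = normalized.rstrip("=")
    if len(normalized) % 4 == 1:
        # A single leftover character cannot encode a whole byte.
        raise ValueError("invalid base64 length")
    padded = normalized + "=" * (-len(normalized) % 4)
    # validate=True rejects characters outside the alphabet.
    return base64.b64decode(padded, validate=True)


print(decode_lenient("SGk="))  # b'Hi'
print(decode_lenient("SGk"))   # b'Hi'
print(decode_lenient("-_8"))   # b'\xfb\xff'
```

Accepting both forms is convenient for reading input; when producing output, it is still worth picking one convention and sticking to it.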

4. Base64 isn’t a transport security boundary

A base64-encoded password in a YAML file or HTTP header is as exposed as plain text to anyone who can read it. Don’t conflate encoding with security.

In code

// JavaScript / Node
btoa("hello")                                      // "aGVsbG8="
atob("aGVsbG8=")                                   // "hello"
Buffer.from("hello").toString("base64")            // Node-only, cleaner

# Python
import base64
base64.b64encode(b"hello").decode()                # "aGVsbG8="
base64.b64decode("aGVsbG8=").decode()              # "hello"

# Bash
echo -n "hello" | base64                           # aGVsbG8=
echo -n "aGVsbG8=" | base64 -d                     # hello

// Go
import "encoding/base64"
base64.StdEncoding.EncodeToString([]byte("hello")) // "aGVsbG8="

Try the tools

Encoder/decoder — paste either side, the other auto-updates
Base64url — URL-safe variant
File converter — file → data URI
Cheat sheet of formats — when each variant is right

Related across the network

jwt.tooljo.com — JWT segments are base64url-encoded JSON; the decoder shows the raw bytes you’d be encoding here.
hash.tooljo.com — hash output formats (hex / base64 / base64url) — compare side-by-side.
base64.tooljo.com/blog/base64-is-not-encryption — base64 is encoding, not encryption. Five places engineers still confuse them.