Reverse Engineering Level 1: Strings - Your First Step into Binary Analysis

fr4nk
Software Engineer
Hugging Face

Reverse engineering can seem intimidating at first. You might imagine hackers hunched over complex assembly code, decoding encrypted messages, or cracking sophisticated protections. But here's a secret: a lot of real-world reverse engineering starts with something incredibly simple, just reading text.

Welcome to Level 1 of reverse engineering: Strings Analysis. This is the most accessible entry point into understanding how programs work, and it's more powerful than you might think.

What Are Strings?

When programmers write code, they often include human-readable text directly in their programs. These might be:

  • Error messages ("Connection failed", "Invalid input")
  • File paths ("/etc/config.txt", "C:\Windows\System32")
  • URLs and IP addresses ("http://api.example.com", "192.168.1.1")
  • Configuration keys ("username", "password", "debug_mode")
  • Debugging information
  • API endpoints and commands
  • Copyright notices and version strings

When the code is compiled into a binary (the executable file), this text usually remains embedded inside, typically in a read-only data section (.rodata in ELF files, .rdata in Windows PE files). It's like leaving a treasure map inside a locked box: the text is still readable even though the rest of the code has been converted to machine instructions.

[Diagram: From Source Code to Binary]

Why Strings Matter

Developers often assume that once code is compiled into binary form, it's hidden and secure. This is a common misconception. The strings utility and similar tools can extract all this readable text without even understanding the rest of the binary. It's like finding a note left inside a wall—you don't need to understand the entire architecture of the building to read what's written on the paper.

The Strings Tool

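On most Linux and macOS systems, the strings command is already installed or easy to add (it is part of GNU binutils on Linux and of the Xcode command line tools on macOS):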

strings /path/to/binary

On Windows, you can use tools like strings.exe (from Sysinternals) or online hex viewers.

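By default, this command prints every run of at least four printable characters it finds in the file. With GNU strings, -n changes the minimum length, -t x prefixes each string with its hexadecimal file offset, and -e l searches for 16-bit little-endian (UTF-16LE) text, which is common in Windows binaries.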
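To make the idea concrete, here is a minimal Python sketch of what a strings-like scan does: it looks for runs of printable ASCII bytes of a minimum length. The function name extract_strings and the sample blob are made up for illustration, and the real tool handles more cases (other encodings, offsets, section selection).

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII bytes found in `data`."""
    # Printable ASCII is 0x20 (space) through 0x7e (~); tab is accepted as well.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Hypothetical blob: binary noise surrounding two embedded strings.
blob = (
    b"\x7fELF\x02\x01\x00\x00Connection failed\x00"
    b"\x90\x90abc\x00/etc/config.txt\x00"
)
print(extract_strings(blob))  # ['Connection failed', '/etc/config.txt']
```

The short runs "ELF" and "abc" are dropped because they are under four characters, which is the same reason real strings output skips most random byte noise.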

[Diagram: Strings Analysis Workflow]

Real-World Example 1: Finding Hidden Secrets

Let's say you've downloaded a suspicious program and want to know what it does before running it. Here's what you might discover:

Original Binary File (Unreadable):

4d5a 9000 0300 0000 0400 0000 ffff 0000
b800 0000 0000 0000 4000 0000 0000 0000

After Running Strings:

!This program cannot be run in DOS mode.
server.malicious-domain.com
POST /upload
password123
C:\Users\Admin\Documents\credentials.txt
Failed to connect to server
Retrying connection...

What We Learn:

  • The program tries to connect to server.malicious-domain.com
  • It uses HTTP POST requests to upload something
  • It has hardcoded credentials ("password123")
  • It references a local credentials file
  • It has retry logic built in

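From just reading strings, we have good reason to suspect that this is malware attempting to steal credentials and send them to a remote server. Strings alone don't prove behavior (they can be unused or deliberately planted), but they tell us exactly where to look next, and no complex analysis was needed to get this far.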
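When the output runs to thousands of lines, reading it by eye gets tedious. The sketch below sorts strings into a few indicator categories with regular expressions. The category names and patterns are illustrative assumptions, not a vetted rule set, and the sample list simply reuses the strings from the example above.

```python
import re

# Illustrative patterns only; real triage would use broader, better-tested rules.
PATTERNS = {
    "domain": re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE),
    "http_request": re.compile(r"^(?:GET|POST|PUT|DELETE) /\S*"),
    "windows_path": re.compile(r"^[A-Za-z]:\\"),
}

def triage(strings: list[str]) -> dict[str, list[str]]:
    """Group strings by the first indicator pattern they match."""
    found: dict[str, list[str]] = {name: [] for name in PATTERNS}
    for s in strings:
        for name, pattern in PATTERNS.items():
            if pattern.search(s):
                found[name].append(s)
                break  # one category per string is enough for a first pass
    return found

sample = [
    "!This program cannot be run in DOS mode.",
    "server.malicious-domain.com",
    "POST /upload",
    "password123",
    r"C:\Users\Admin\Documents\credentials.txt",
    "Failed to connect to server",
]
for kind, hits in triage(sample).items():
    print(kind, hits)
```

Strings that match no pattern (such as "password123") are not reported here, so a pass like this complements reading the full output rather than replacing it.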

Real-World Example 2: The Huawei Router Vulnerability

The video mentioned a real example: analyzing a malware sample that was exploiting a Huawei device. Here's how strings analysis could have revealed the attack:

Suspicious Binary Strings:

/cgi-bin/luci/;stok=%s/admin/system/test?test=1
$(telnetd -l /bin/sh)
Huawei D100
192.168.1.1
admin
admin

What We Learn:

  • The malware targets Huawei routers specifically
  • It exploits the CGI interface at /cgi-bin/luci/
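  • It uses command injection: $(telnetd -l /bin/sh) is shell command substitution, so a vulnerable router would run telnetd and expose a shell (/bin/sh) over telnet without a login prompt
  • Default credentials (admin/admin) are hardcoded
  • It targets 192.168.1.1, a common default gateway address for home routers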

A security researcher could immediately recognize this as a command injection exploit without even looking at the assembly code.
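A pattern like this can also be flagged automatically. The following sketch searches strings for shell command substitution, written either as $(...) or with backticks, and returns the embedded command. It is a heuristic under simple assumptions (no nested parentheses), and the function name is hypothetical.

```python
import re

# Shell command substitution: $(...) or `...`. Heuristic only, no nesting.
SUBSTITUTION = re.compile(r"\$\(([^()]+)\)|`([^`]+)`")

def find_injected_commands(strings: list[str]) -> list[str]:
    """Return the commands found inside shell command-substitution syntax."""
    commands = []
    for s in strings:
        for m in SUBSTITUTION.finditer(s):
            # Exactly one of the two groups is set, depending on the syntax used.
            commands.append(m.group(1) or m.group(2))
    return commands

sample = [
    "/cgi-bin/luci/;stok=%s/admin/system/test?test=1",
    "$(telnetd -l /bin/sh)",
    "Huawei D100",
    "192.168.1.1",
]
print(find_injected_commands(sample))  # ['telnetd -l /bin/sh']
```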

[Diagram: Attack Flow Visualization]

Practical Example 3: Finding Configuration Secrets

Imagine you're analyzing a web application's executable:

Strings Output (Partial):

Database Connection
Server: db.internal.company.com
Username: webapp_user
Password: 5up3rS3cur3P@ssw0rd
API_KEY: sk_live_1234567890abcdef
AWS_ACCESS_KEY: AKIAIOSFODNN7EXAMPLE
DEBUG_MODE: true
log_file: /var/log/app.log
version: 2.1.3-beta

Findings:

  • Database server hostname and credentials
  • An API key and an AWS access key ID are exposed
  • Debug mode appears to be enabled (worth confirming whether this build runs in production)
  • Application version (useful for finding known vulnerabilities)
  • Log file location

Even if you can't read the program's logic, you've discovered serious security misconfigurations and exposed credentials.
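Findings like these lend themselves to simple pattern matching. The sketch below scans a list of strings with three illustrative rules. The rule names and regular expressions are assumptions made for this example: the AWS rule relies on the commonly seen AKIA prefix followed by 16 uppercase letters or digits, and the sk_live_ rule simply mirrors the sample output above. Dedicated secret scanners ship far larger rule sets.

```python
import re

# Illustrative rules; each maps a label to a pattern for one kind of secret.
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password_field": re.compile(r"(?i)\bpassword\s*[:=]\s*(\S+)"),
    "live_api_key": re.compile(r"\bsk_live_[0-9A-Za-z]+"),
}

def scan_for_secrets(strings: list[str]) -> list[tuple[str, str]]:
    """Return (rule_name, matched_text) pairs for every rule that hits."""
    hits = []
    for s in strings:
        for name, rule in RULES.items():
            m = rule.search(s)
            if m:
                hits.append((name, m.group(0)))
    return hits

sample = [
    "Server: db.internal.company.com",
    "Username: webapp_user",
    "Password: 5up3rS3cur3P@ssw0rd",
    "API_KEY: sk_live_1234567890abcdef",
    "AWS_ACCESS_KEY: AKIAIOSFODNN7EXAMPLE",
    "DEBUG_MODE: true",
]
for name, text in scan_for_secrets(sample):
    print(f"{name}: {text}")
```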

Practical Example 4: You Try It!

Here's a simple example you can test yourself:

Create a test program (test.c):

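#include <stdio.h>

int main(void) {
    /* Hardcoded values: unused here, but still stored in the binary
       when compiled without optimization */
    const char *password = "SuperSecret123";
    const char *api_url = "https://api.example.com/login";

    printf("Hello, World!\n");
    return 0;
}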

Compile it:

gcc -o test test.c

Extract strings:

strings test

You'll see:

[... system strings ...]
SuperSecret123
https://api.example.com/login
Hello, World!

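Notice that anyone holding only the compiled binary, without the source code, can recover the hardcoded password and API URL just by running strings! (The two variables are never used, so an optimizing build such as gcc -O2 may drop their strings; the default unoptimized build keeps them.)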

Tools Beyond Strings

While strings is the classic tool, here are others that accomplish similar tasks:

Tool | Platform | Purpose
strings | Linux/macOS | Extract ASCII text from binaries
strings.exe (Sysinternals Strings) | Windows | Extract ASCII and Unicode text from Windows executables
Ghidra | Cross-platform | Includes strings view with filtering and searching
IDA Pro | Cross-platform | Shows strings with cross-references to code
Hex editors | Any | Manual string extraction with visual inspection
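Unicode support matters because Windows programs often store text as UTF-16LE, where each ASCII character is followed by a zero byte. A scan that only looks for consecutive printable bytes will miss such text. The sketch below shows one way to look for it; the function name and sample bytes are invented for illustration, and it only handles characters in the ASCII range.

```python
import re

def extract_utf16le_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII characters stored as UTF-16LE."""
    # Each character is one printable byte followed by a zero byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]

# Hypothetical blob: a wide-character path surrounded by non-text bytes.
wide = "C:\\Windows\\System32".encode("utf-16-le")
blob = b"\x01\x02" + wide + b"\x00\x00\xff"
print(extract_utf16le_strings(blob))  # ['C:\\Windows\\System32']
```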

Why Developers Leave Strings Unencrypted

You might wonder: "Why don't programmers encrypt their strings?" Good question! Here are the reasons:

  1. Performance: Decrypting strings constantly would slow down the program
  2. Convenience: Hardcoding strings is easier during development
  3. Debugging: Encrypted strings make debugging harder
  4. Oversight: Many developers simply don't think about it
  5. Legacy Code: Old programs were written before security was a priority

Best Practices: Protecting Against String Analysis

If you're a developer wanting to protect your code:

  1. Never hardcode secrets - Use environment variables, secure configuration files, or secret management systems
  2. Encrypt sensitive strings - If strings must be included, encrypt them with keys loaded at runtime
  3. Use obfuscation - Tools can scramble string references, making analysis harder
  4. Minimize metadata - Remove debug symbols and unnecessary strings from production builds
  5. Think security-first - Assume your binary will be analyzed; design accordingly
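As a small illustration of the first practice, a program can read its secret from the environment at startup so that the value never appears in the shipped file. This is a minimal sketch: the variable name APP_API_KEY and the function get_api_key are hypothetical, and a real deployment might use a secret management system instead.

```python
import os

def get_api_key(env_var: str = "APP_API_KEY") -> str:
    """Read the API key from the environment instead of embedding it in the program."""
    key = os.environ.get(env_var)
    if not key:
        # Fail loudly rather than falling back to a hardcoded default.
        raise RuntimeError(f"{env_var} is not set; refusing to start without it")
    return key
```

The secret is then supplied when the program is launched (for example, by exporting the variable in the shell or the service configuration), and running strings on the program reveals only the variable name, not its value.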

Limitations of Strings Analysis

It's important to know what strings analysis can't do:

  • Can't read obfuscated or encrypted strings - If a developer encrypted their strings, you won't see them
  • Can't understand program logic - You see text but not what the program does with it, and complex algorithms remain hidden
  • Can't find every vulnerability - You might find clues, but many vulnerabilities leave no string evidence
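The first limitation is easy to demonstrate. In the sketch below, a URL is XORed with a single-byte key, which is a deliberately simple stand-in for whatever encoding a developer might use; the key 0xAA and the helper names are assumptions for this example. A strings-style scan of the encoded bytes finds nothing, while anyone who knows (or guesses) the key recovers the text immediately.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (the same call encodes and decodes)."""
    return bytes(b ^ key for b in data)

secret = b"https://api.example.com/login"
obfuscated = xor_bytes(secret, 0xAA)  # hypothetical key

print(extract_strings(obfuscated))           # [] : nothing readable remains
print(xor_bytes(obfuscated, 0xAA).decode())  # https://api.example.com/login
```

Because the program itself must decode the text before using it, the key and the decoding routine are still in the binary, which is why this kind of obfuscation slows analysis down rather than preventing it.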

These limitations lead us to the next levels of reverse engineering: deeper static analysis (disassembly and decompilation) and dynamic analysis.

[Diagram: Reverse Engineering Skill Progression]

Conclusion

Strings analysis is the first rung on the reverse engineering ladder, but it's not a trivial one. Many security researchers have found critical vulnerabilities, exposed APIs, and identified malware simply by running strings and carefully reading the output.

The key insights are:

  1. Binaries aren't magical black boxes - They contain human-readable information
  2. Strings reveal intent - URLs, credentials, and messages show what a program is trying to do
  3. It's accessible - You don't need advanced skills or expensive tools to start
  4. It's practical - Real-world vulnerabilities are regularly discovered this way

As you move forward in your reverse engineering journey, remember: always start with strings. It might be the first level, but it's often the most revealing.


Try It Yourself

Pick any legitimate program already installed on your system and run strings on it:

# Find error-related strings in Firefox (example)
strings /Applications/Firefox.app/Contents/MacOS/firefox | grep -i error

# Find all URLs in a binary
strings /usr/local/bin/someapp | grep http

# Find database references
strings /opt/application/binary | grep -i database

What will you discover?