
How to Regain SSH Access to Your AWS EC2 Instance When Locked Out

Ever locked yourself out of your own EC2 instance? I did exactly that this week while "improving" my SSH configuration. The good news is I was able to get back in with an EC2 trick. I'll show you a proven method to regain access: EC2 user data scripts that run on every boot.

[Illustration: the embarrassment and eventual triumph of getting locked out of your own SSH server]

How I Managed to Lock Myself Out (And How You Can Get Back In)

Let me start with a confession: I'm supposed to know better. After >20 years of software engineering and preaching best practices about security and backup/recovery, I still managed to make a spectacularly dumb mistake. I was working on my remote development box, trying to rejigger git commit signing verification to work just the way I want... You can probably guess what happened next.

One nano ~/.ssh/authorized_keys session later, I had successfully locked myself out of my own EC2 instance with the classic Permission denied (publickey) error staring back at me mockingly. After a brief (albeit audible) facepalm, I managed to get back in. I also realized this was actually a great opportunity to document the recovery process. Because if I can make this mistake, so can others.

The Problem: When SSH Goes All "Permission denied (publickey)"

SSH authentication failures typically show up as Permission denied (publickey) errors when you try to connect. In my case, I had mangled the ~/.ssh/authorized_keys file while trying to add keys in the wrong format: the entries that work for Git commit-signature verification (the gpg.ssh.allowedSignersFile format) are not the same format that SSH's authorized_keys file expects.

Prerequisites

Before we dive into the recovery methods, make sure you have:

  • AWS Console access to your EC2 instance
  • Appropriate IAM permissions to modify EC2 instances
  • A healthy sense of humor about your mistakes 🙃

User Data Script (The Solution)

This method uses EC2 user data with a special MIME multipart format to fix SSH access on every boot. It's the approach that saved me, and it's surprisingly clean once you know the right format. I figured out the right format from this article.

Step 1: Get Your Public Key

On your local machine, get your public key:

ssh-add -L

Copy the entire key line that looks like:

ssh-rsa AAAAB3NabcC1xyzEAAAADAQAB... your-key-name
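
If ssh-add doesn't list anything, you can read the key file directly instead (assuming a default key location):

cat ~/.ssh/id_ed25519.pub  # or ~/.ssh/id_rsa.pub, depending on your key type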

Step 2: Create the User Data Script

Here's where it gets interesting. AWS has specific requirements for making user data run on every boot, not just the first time. You need this exact MIME multipart format (again, my reference for this was this article):

Content-Type: multipart/mixed; boundary="//"
MIME-Version: 1.0

--//
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config.txt"

#cloud-config
cloud_final_modules:
- [scripts-user, always]

--//
Content-Type: text/x-shellscript; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata.txt"

#!/bin/bash
mkdir -p /home/YOUR_USERNAME/.ssh
echo "YOUR_PUBLIC_KEY_HERE" > /home/YOUR_USERNAME/.ssh/authorized_keys
chown -R YOUR_USERNAME:YOUR_USERNAME /home/YOUR_USERNAME/.ssh
chmod 700 /home/YOUR_USERNAME/.ssh
chmod 600 /home/YOUR_USERNAME/.ssh/authorized_keys

--//--

The MIME format is what triggers the script to run on every boot rather than just the first one (i.e. via the [scripts-user, always] entry in cloud_final_modules).

Replace:

  • YOUR_USERNAME with your SSH username (common ones: ec2-user, ubuntu, vagrant)
  • YOUR_PUBLIC_KEY_HERE with the full public key from step 1
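
To avoid copy-paste slips, you can let sed fill in the placeholders (the template filename is my own convention, and this assumes the first key ssh-add lists is the one you want):

sed -e "s|YOUR_USERNAME|ec2-user|g" \
    -e "s|YOUR_PUBLIC_KEY_HERE|$(ssh-add -L | head -n 1)|g" \
    user-data-template.txt > user-data.txt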

Step 3: Apply the User Data

  1. Stop your EC2 instance in the AWS Console
  2. Select your instance → Actions → Instance Settings → Edit User Data
  3. Paste the MIME multipart content exactly as formatted above
  4. Click Save
  5. Start the instance
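
If you prefer the CLI over the console, a rough equivalent looks like this (the instance ID is a placeholder; as I understand it, this attribute expects the user data base64-encoded, so encode it first):

aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
base64 user-data.txt > user-data.b64  # modify-instance-attribute wants base64
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
    --attribute userData --value file://user-data.b64
aws ec2 start-instances --instance-ids i-0123456789abcdef0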

Step 4: Test SSH Access

Wait for the instance to fully boot (grab some coffee), then test:

ssh your-username@your-instance-ip

If it works, you should feel that familiar rush of relief. If not, don't worry - there is another option...
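
Before jumping to the fallback, verbose SSH output will show exactly which keys are being offered and rejected, and once you do get in you can confirm the user data script actually ran:

ssh -v your-username@your-instance-ip
sudo tail -n 50 /var/log/cloud-init-output.log  # on the instance, after logging in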

Step 5: Clean Up (Critical!)

Once SSH works, immediately remove the user data script:

  1. Stop your instance
  2. Actions → Instance Settings → Edit User Data
  3. Clear the entire user data field
  4. Save

This prevents the script from running on every future boot and potentially overwriting intentional SSH changes.
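
You can double-check that the field is actually empty from the CLI (instance ID is again a placeholder):

aws ec2 describe-instance-attribute \
    --instance-id i-0123456789abcdef0 --attribute userData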

Fallback Method: EBS Volume Recovery (A Nuclear Option)

If the user data method doesn't work, this more involved approach should. I didn't have to try it myself, so I'm not sure of the exact details, but the high-level steps are:

  1. Stop the instance you're locked out of and detach its root volume
  2. Launch a new "rescue" EC2 instance that you can access
  3. Attach the detached volume to the rescue instance (right-click the detached volume → Attach Volume)
  4. SSH to the rescue instance and mount the volume, something like:
    ssh ec2-user@rescue-instance-ip
    sudo mkdir /mnt/broken-disk
    sudo mount /dev/xvdf1 /mnt/broken-disk  # might be /dev/nvme1n1p1
    
  5. Fix the SSH configuration (this is basically updating /mnt/broken-disk/home/YOUR_USERNAME/.ssh/authorized_keys as we did in the script above; see the sketch after this list)
  6. Unmount and detach the volume from the rescue instance
  7. Reattach it to the original instance as /dev/sda1
  8. Start the original instance
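
For step 5, here's a minimal sketch of the fix, run from the rescue instance. The username, key, and mount point carry over from earlier; the 1000:1000 ownership is an assumption, so use whatever UID:GID actually owns the home directory on the broken volume (usernames on the rescue instance may not line up):

sudo mkdir -p /mnt/broken-disk/home/YOUR_USERNAME/.ssh
echo "YOUR_PUBLIC_KEY_HERE" | sudo tee /mnt/broken-disk/home/YOUR_USERNAME/.ssh/authorized_keys
sudo chmod 700 /mnt/broken-disk/home/YOUR_USERNAME/.ssh
sudo chmod 600 /mnt/broken-disk/home/YOUR_USERNAME/.ssh/authorized_keys
sudo chown -R 1000:1000 /mnt/broken-disk/home/YOUR_USERNAME/.ssh  # assumption: check with ls -ln /mnt/broken-disk/home
sudo umount /mnt/broken-disk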

That's it? Again, I didn't try this myself, but if you did and it worked (or not), let me know with a comment!

Lessons Learned (The Hard Way)

After getting back into my instance and fixing my original Git signing issue, I realized a few things:

  1. Test SSH in a second terminal before closing your current session!!
  2. Backup authorized_keys before modifying:
    cp ~/.ssh/authorized_keys ~/.ssh/authorized_keys.backup
    
  3. Enable EC2 Instance Connect for emergency access when possible (see the sketch after this list)
  4. Consider AWS Session Manager for instances in private subnets
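
For lesson 3, a quick sketch of what break-glass access via EC2 Instance Connect looks like (instance ID, availability zone, and key file are placeholders; the pushed key is only valid for about 60 seconds):

aws ec2-instance-connect send-ssh-public-key \
    --instance-id i-0123456789abcdef0 \
    --availability-zone us-east-1a \
    --instance-os-user ec2-user \
    --ssh-public-key file://my_key.pub
ssh ec2-user@your-instance-ip  # connect within the 60-second window

Note this only works if the EC2 Instance Connect agent is installed on the instance (it is by default on Amazon Linux 2 and recent Ubuntu AMIs).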

The Real Fix for Git Signing vs SSH

Since my original problem was confusing Git signing key formats with SSH authentication, here's the distinction:

SSH authorized_keys format:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAXMo4T... comment

Git allowed_signers format:

user@example.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAXMo4T... comment

Different files, different formats. Who knew?
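
And since this whole saga started with commit signing, here's a minimal sketch of wiring up SSH-based signing and verification (key paths are assumptions; needs Git 2.34 or newer):

git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers
echo "user@example.com $(cat ~/.ssh/id_ed25519.pub)" >> ~/.ssh/allowed_signers
git commit -S -m "signed commit"
git log --show-signature -1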

Conclusion

SSH lockouts happen to the best of us - even those of us who should definitely know better. The user data method with MIME multipart format is the solution that worked for me, while EBS volume recovery is a more involved option.

Remember to clean up temporary fixes and implement prevention measures. And maybe, just maybe, don't edit critical SSH configuration files when you're feeling eager to get Git commit signing working "real quick". Or at least test it in a second terminal before disconnecting!

Trust me on that last one.
