Prompt Injection Challenge — How far can you go?

Deniz Sivas
4 min readNov 21, 2023

--

With the increase of Gen AI usage, new challenges are being introduced to IT industry. The simplest and foundational one is “prompting”. It basically defines the ability to convey the correct sentence structure to the LLM for the optimum/desired response. I won’t be going into details of prompting in this article but I wanted to highlight another problem and a relative challenge built around this by Immersive Labs. What I want to highlight is that, there is a dark side of prompting which can lead to data leaks, exposition of personal information or potential security breach. If your LLM model does not react according to this “malicious prompting”, it can damage your business/reputation.

This attack is being called “Prompt Injection”. I practiced this with the help of Immersive Labs’ Prompt Injection Challenge here.

If you are interested, please take your time and try it yourself. If you want to see what I have done, let’s start.

# SPOILER ALERT#

LEVEL 1

For the level one, just ask the password. It is a friendly assistant :)

LEVEL 2

For second level, it will refuse to answer your direct requests, so we will trick it by asking the password indirectly.

It gave me EAGLET so it was an easy guess.

LEVEL 3

For level 3 the first sentence from the bot is the following:

“Hmmm, looks like you’re still tricking me. This time I’ve also been told that I’m not allowed to generate any code or poems.”

So our previous trick won’t work this time.

I just asked about the details of the password and it just slipped out.

LEVEL 4

Starting from level 4, DLP is introduced. I asked what is DLP and LLM told me that it is the abbreviation for Data Loss Prevention, so we need to be careful to supply the correct prompt which goes through the DLP filter.

I just kept asking for the individual letters until I find out the password is DRAGONFLY.

LEVEL 5

For the fifth level, word filtering is applied.

“Well, you are clever! The DLP is still in place, but I’ve also been given a list of words that I’ll block if found in my answers.”

We need to prevent it to write those blocked words.

LEVEL 6

BOT: Some of you figured out I can speak other languages. Not this time! For now, I won’t translate the password into other spoken languages.

Oh crap, I missed that there can be a workaround by using translation of the password. Whatever let’s exploit our previous tactic.

LEVEL7

For level 7, bot tells you that it will respond with dinosaur facts if you ask about passwords. So we know that dinosaur names will not be filtered.

I will use my old trick again with a twist.

I checked the consistency of the responses by asking the known letters.

I tried to squeeze out each letter but DLP caught me several times. Then I asked for the last one and it gave me N. Not many words match with MEGA-something-N so after a lucky guess, I passed the level.

LEVEL 8

So for level 8 I have spent many hours with no luck. Then I stopped asking about the “password” and changed it to a word that must not be named. It understood that it should not tell it to me but I managed to trick the AI.

LEVEL 9

Same trick works for level 9.

LEVEL 10

This is it. We reached the last level. I will try to exploit the same prompt one more time.

So I guessed as VOLCANIC and ta-da!

I managed to complete the challenge and prepared this guide for the interested audience.

I hope it helps you out.

--

--

Deniz Sivas

Software tester by profession, software developer by passion