[Development] Internet Safety and Embedded Content


Internet Safety and Embedded Content - Lain - May 9th, 2020

I've been meaning to make this thread for a while just to show people how unsafe embedded content can possibly be, but I didn't want it to be abused on this site, since of course this site allows embedded content like images and videos. But, in any case, I've asked Darth for permissions to post this and even though he didn't give a clear 'yes' or 'no' I took the liberty of interpreting his answer as 'sure whatever.'

I would like to note that if you are caught doing the following on the site, you can certainly expect a ban.
Even if you don't do anything actually malicious, like launching an attack with whatever you learn. Just grabbing data is a breach of privacy.
And privacy risks are to be taken seriously, not to mention I probably take them the most seriously out of anyone on this site.

Without further ado, let's talk about why typical internet safety rules don't quite make the cut anymore.

Internet Safety and Embedded Content

Table of Contents:

0x0: Theory
0x1: Practice
0x2: Mitigation
0x3: Final Thoughts

0x0: The Risk, and the Theory behind it.

Now that we're living in a hyperconnected time, i.e. everyone's staying at home working, watching Netflix, playing games, or browsing the web on their computer, there's been an incredible influx of internet traffic all over the world, and probably every site with an existing userbase has seen that userbase grow. No doubt, more people are using the internet now than before, and those who already used it are online for much longer every day.

But when more people access a website or online resource, it becomes a much more attractive target for hackers. Maybe the hacker will try to hack the site so they can get advertising revenue. Maybe they'll want everyone's account info. Maybe they want to find who's accessing that site, then try to hack the users so that they can get all those always-online computers to mine Bitcoin for them. Anything goes.

The point I'm making is that when something is more popular online, it becomes a much bigger target for hackers or anyone with malicious intent.

For the last decade or more, most hacks you've heard about (companies getting hacked, that is) have happened because of employee negligence. They might have opened an email that contained a virus in the attachment, then run the virus. Or maybe they clicked a link to a webpage with a Flash/Java/Silverlight app on it, which then hacked their computer. Or perhaps it was a site that looked like a login screen, and they tried to log in (a phishing attack).

To stop this from happening as much as it did, many companies in the last few years started offering training that included simple tips like 'Don't click links you don't recognize. Don't open attachments from people you can't verify.' And so on.

Unfortunately, this is just too little, too late.

The point I'm going to make here is that these types of tips and tricks are severely outdated in the modern world. There are all sorts of features in software that people haven't properly explored or just aren't aware of, which can also be used to hack users.

And one of them I'm going to demonstrate here is the use of Embedded Content on websites.

Q: What is Embedded Content?

A:

Embedded content is how you might display an image on a webpage. For instance, instead of making someone click a link to see an image, you can render it in the web page itself. As an example, the Makestation Logo at the top of this page is embedded onto the site in the header, instead of displaying as a link. But, the link to the Logo file on the site is http://makestation.net/images/makestation/ymes_a.png. The content in that link gets embedded using an HTML tag as such:

Code:
<img src="http://makestation.net/images/makestation/ymes_a.png" />

Optional attributes can be added as well. This tag tells your web browser to (in the background when the page is loading) automatically go to that link as well, find the file, and display it where the tag is located on the web page.
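To make the "automatically" part concrete, here is a rough sketch that lists every URL a browser would request on its own for the <img> tags in a chunk of HTML. It's written in Python rather than the PHP used later in this thread, purely as a convenient way to illustrate. The names ImgCollector and auto_fetched_images are made up for this example, and a real browser also fetches stylesheets, scripts, iframes, and so on.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src of every <img> tag, i.e. the URLs a browser
    would request automatically while rendering the page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <img ... /> also end up here by default.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


def auto_fetched_images(html):
    collector = ImgCollector()
    collector.feed(html)
    return collector.urls


page = '<p>Hello</p><img src="http://makestation.net/images/makestation/ymes_a.png" />'
print(auto_fetched_images(page))
```

Nothing in that list needed a click. Whoever controls those URLs sees a request from every visitor who renders the page.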

A marvel of modern technology, allowing us to do this.

There's one glaring flaw, though. The web browser does all of this automatically without you knowing or expecting it. I mean, I guess you expect it to load automatically since it's not like you want to do it yourself, but what if we played with the concept a little more and pushed its limits...

0x1: Building a Practical Example for Demonstration.

You see, as I mentioned before, people used to get infected (and still do) because they opened a file/web page that had a virus in/on it. That made it relatively difficult for attackers to hack people: they might have had a 1-3% chance of infecting any given person. That's why lots of spammer services offer to send malicious data to thousands upon thousands of emails at a time, and it's part of the reason why people try to get email data from websites when they get hacked.

But as we've covered before, the browser sometimes opens files on its own as long as it can recognize the type of file it's trying to open.

So in short, we're going to try and use that concept to do something that we shouldn't.

We're going to grab an IP address of a user just by displaying an image.

I've set up an instance of Debian Linux inside a virtual machine running in VMware Player.
On this VM, I've also installed Apache2 as a web server, and PHP7 to write our code.
In the screenshot, I'm showing my LOCAL address, 192.168.198.128, on the virtual network interface ens33.
Note: that's an INTERNAL/LOCAL address. Trying anything with it won't do anything, so don't waste your time, skids.

[Image: SDC36hj.png]

To make sure my web server and connectivity was working, I went to that IP address on my actual computer (not the VM.)

[Image: tLwOBv4.png]

So, I can connect to the web server hosted on my VM from my actual machine. Perfect.

Now, what we want to do is to serve an image to a user and make that image do something when it's opened.
Well, first, I need an image file. So I head over to 4chan's /c/ and find a picture of a cute anime girl, then wget it to my web server folder (/var/www/html)

[Image: AjkqtTv.png]

Oh look, it's my daughter Cirno from Touhou Project, who made her first appearance in the franchise as the First Boss in Touhou 6: Embodiment of Scarlet Devil!
I'd switch to Intel for her because her ice-fairy powers can keep its temperature below 200 degrees, and I can use the money saved from buying a fancy cooling system to pay for the power bill that a 500W CPU uses :)

f*** Intel. Whatever.

First, I'll change 1589002139579.png to cirno.png and then, let's try to access it from my computer.
I won't take a screenshot to preserve the formatting of this post, but it worked.

Now, we need to actually write some code that will be executed when this image gets loaded. We're going to use PHP since it's the easiest to install with Apache.
First, gotta make a new file, and we can name it after my daughter, so cirno.php, and one more where we can store the log, log.txt.
Be sure to adjust file permissions accordingly. 777 works fine for the demo, but at a minimum the web server user (www-data on Debian) needs write permission on log.txt and read permission on cirno.php and cirno.png. PHP scripts run through Apache don't need the execute bit.
In cirno.php file, we can start writing our code:

PHP Code:
<?php
    //RELEASED UNDER GPLv2 BY Lain AT MAKESTATION.NET.
    //THIS CODE IS PURELY FOR EDUCATIONAL PURPOSES.
    //ANY ATTEMPT TO USE THIS CODE MALICIOUSLY WILL BE REPORTED AND HAVE THIS CODE WITHDRAWN FROM THE PROJECT.

    // Gets the IP address of the user viewing the web page
    $ip = $_SERVER['REMOTE_ADDR'];

    //Gets the time (in UTC) at which the web page is accessed
    $date_time = gmdate("l j F Y  g:ia");

    //Returns a file pointer for PHP to open the file in append mode
    $fp = fopen("log.txt", "a");

    //Writes to the file pointer
    //Double quotes are needed so \t and \n become a real tab and newline
    //xxx.xxx.xxx.xxx        Saturday 9 May 2020  12:00am
    fputs($fp, $ip . "\t" . $date_time . "\n");

    //Close the file
    fclose($fp);

    //===========================

    //Open the image file as read + binary format
    $fimg = fopen('cirno.png', 'rb');

    //Set the proper return headers
    header("Content-Type: image/png");
    header("Content-Length: " . filesize('cirno.png'));

    //Return the file to the user and exit
    fpassthru($fimg);
    exit;

?>

Forgive the code quality. I don't know how to optimize PHP.

Now, to make sure the script works, we can navigate to the web server and find the cirno.php file.

If it's set up correctly, it should just return the image without any other HTML or formatting. We can verify the headers for this:

[Image: jMeIA3T.png]

As you can see, Content-Type is set to image/png. For binary formats like images, it's also good practice to send a Content-Length header, which is also there. And, of course, the image is displayed.

But let's check if the script did anything:
[Image: fjPq7r4.png]

Sure enough, my IP (as a VM network gateway) got logged along with the time and date. As anticipated, I wouldn't ever give you guys a broken code sample ;O

So, now you can embed that URL into anything, right?! Then start getting IPs?

Maybe not.

Now, if you were to embed it into an <img> tag, I'm pretty sure it would actually work.

But in some cases, you don't have access to writing HTML code. MyBB forums are a good example of this.
I can write something <img src="https://i.imgur.com/UPKqp7a.png" /> and as you can see, the HTML tag is displayed in text instead of as an image.
But, I can use [img] tags to embed images!
[img]https://i.imgur.com/UPKqp7a.png[/img]
[Image: UPKqp7a.png]

There she is!

But there's another catch: I can try to go to a url with a different extension, and watch what happens:
[Image: UPKqp7a.php]

Note: dead link anyway, imgur doesn't have PHP pages. But in MyBB's case, it actually tries to load it, because the image BBcode parser doesn't verify file types.

But what about an email provider?
The vast majority of providers, like Gmail, Outlook/Hotmail, and Yahoo!, have very strict rules on what can be sent in an email. This includes attachment file types (only a few are allowed; anything that might contain binary/executable/compressed data is blocked). Only some HTML tags are allowed as well (no scripts, etc.). Alongside the HTML filter is another filter for image types when embedding an image, and it won't allow .php extensions for this very reason.

So how do we get around that limitation, as an attacker?

We make that PHP file look like a PNG.

Apache (and basically every other web server like nginx) has a cool feature called Rewriting, or specifically mod_rewrite.
By default when you install apache2, the module isn't enabled, and the apache2 configuration doesn't allow using rewrite rules easily. So we need to do a couple things first.

On debian this is how you'd enable mod_rewrite (as root): a2enmod rewrite
Then, you'll be prompted to restart apache2, so systemctl restart apache2

Next, you need to change apache's config file. It's usually located at /etc/apache2/sites-available/000-default.conf

It's very minimal as a default, mainly filled with comments. But inside <VirtualHost *:80> [...] </VirtualHost> we need to add a little block of code:

Code:
#Change /var/www/html to your web server's root
<Directory /var/www/html>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride All
        Require all granted
</Directory>

So now we can override what the server returns based on the request it receives.
And to do that, we use a different configuration file, back in the root of our web server (so /var/www/html for me)

nano /var/www/html/.htaccess

The .htaccess file is a per-directory config file for Apache. It can live in any directory and applies to that directory and everything below it; here we put it in the root of the web server files. It's hidden because of the dot in front, so if you want to see if it's there, you need to use ls -a instead of just ls.
Inside, we only need to add two lines:

Code:
RewriteEngine on
# The dot is escaped so it only matches a literal dot; [L] stops processing further rules
RewriteRule ^cirno\.png$ cirno.php [L]

You would replace the filenames in your specific case, but the general rule for RewriteRule is that the first parameter is the pattern to find (a regular expression), and the second parameter is what to replace it with. In a .htaccess file, the pattern is matched against the path relative to the directory the file is in, so the ^ means 'right after domain.name/' or in my case, 192.168.198.128/ and the $ means that there should be nothing after the pattern, or in other words, the path ends right there. So, a request for 192.168.198.128/cirno.png matches, and the server returns the output of 192.168.198.128/cirno.php instead.
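If the anchors are hard to picture, here is a small sketch of what they do. Apache uses PCRE for its patterns, but for something this simple Python's re module behaves the same way, so that's what is used here. The rewrite helper is made up for illustration and is not how Apache implements rewriting; it only mimics one rule with a fixed replacement.

```python
import re


def rewrite(path, pattern, target):
    """Return target if pattern matches path, otherwise path unchanged."""
    return target if re.search(pattern, path) else path


print(rewrite("cirno.png", r"^cirno\.png$", "cirno.php"))      # cirno.php
print(rewrite("img/cirno.png", r"^cirno\.png$", "cirno.php"))  # img/cirno.png (stopped by ^)
print(rewrite("cirno.png.bak", r"^cirno\.png$", "cirno.php"))  # cirno.png.bak (stopped by $)

# An unescaped dot matches any character, so this gets rewritten too:
print(rewrite("cirnoXpng", r"^cirno.png$", "cirno.php"))       # cirno.php
```

The last line is why escaping the dot is the tidier habit, even though it makes no practical difference for this demo.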

So let's save that file, and run back over to the web browser, and see if we can open cirno.png:
[Image: 2VPtm8I.png]

And there she is! Notice the URL and the headers.

So let's check our log.txt again...

[Image: 82Is6m9.png]

Sure enough, 41 more bytes added, and a new entry is appended to the file. Seems like our work here is done, we've successfully 'converted' a PHP file to a PNG, or at least we can fool a filter into thinking that we only have an image.

So what does this mean?

Well, we could try to send a link to someone if we wanted their IP address, and hope that they click it. But if they don't trust the link, they won't. So success rates for this have been going down in recent years.

But we can force the user to give us that data (or rather, force the BROWSER that the user is using to send us that data) without them even realizing it, just by embedding a seemingly harmless image in something like a forum post and them reading the forum post. No extra interaction needed.

Now, the after-effects of this can vary from environment to environment, and from case to case. The only thing someone could do with an IP address really is to try and geolocate it. In other words, they can find who the ISP company is that owns the IP, then find where that ISP is based, and get a rough estimate of which city you live in at best. It's not very accurate. In some cases, many people online claim they can call the ISP about the IP address, claiming to be a detective or PI, and try and get more data about it, such as who the client is, what sort of traffic is sent, where they live, etc. although I've never actually seen a reliable instance of this happening, and it's never been documented properly. In other words, it's all just rumours.

Networks are pretty secure in their default settings. Your home network is connected through a modem/router, right? Well, for someone to try and connect to your computer using your IP, that connection would first reach your router. Since the router has no rule telling it which device an unsolicited incoming connection belongs to, it will simply drop it. Now, if your router was misconfigured to allow remote access, then that would become a problem. But in the vast majority of cases, nothing will happen.

If someone sent thousands of connections to your IP (i.e. a DDoS attack) then your internet might go out for a bit. That's about it.

But if your IP address was logged in a website's database, and that website got hacked and all the user data stolen, then someone with that stolen data might be able to look up the IP they just found, and find an old username/password that you used in the past, or any other data that website stored, like if it was a shopping site, your address would also be there, as well as email, maybe phone number, and in the worst case a credit card number.

But as you can see, all of the above risks are relatively minor, the last one being major but extremely unlikely, and in each case, we're talking about a specific user/IP. Let's say someone were to use this tactic on the forum here, embedding an image in their signature or a thread.

Well, everyone who reads that thread would get their IP logged, yeah.
How many people is that? 10? 20? 1000?
Who knows. Guests browse the site as well. So do bots like Google for indexing web pages so we can show up in their search.
All those IPs will also be stored, and it would be very difficult to find a specific user, unless you simply guessed or tried to find other data from their old posts, like where they're from so you can geolocate again, etc.

In other words, the risk is minimal.
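To get a feel for how anonymous such a log is, here is a sketch that tallies requests per IP from lines in the tab-separated format the script's comment describes. It's in Python only for illustration, the function name hits_per_ip is made up, and the sample addresses come from ranges reserved for documentation, not from real visitors.

```python
from collections import Counter


def hits_per_ip(log_text):
    """Count log entries per IP; each line is an IP, a tab, then a timestamp."""
    counts = Counter()
    for line in log_text.splitlines():
        ip, sep, _timestamp = line.partition("\t")
        if sep and ip:  # skip blank or malformed lines
            counts[ip] += 1
    return counts


sample = (
    "203.0.113.7\tSaturday 9 May 2020  1:02am\n"
    "198.51.100.23\tSaturday 9 May 2020  1:02am\n"
    "203.0.113.7\tSaturday 9 May 2020  1:05am\n"
)
print(hits_per_ip(sample).most_common())
```

All you get is a pile of addresses and times. Nothing in a line says which forum user, guest, or bot made the request.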

0x2: Stop It from Happening to Your Site

So the risks might be minor, but you still want to prevent this from happening. What do you do?

Ideally you'd disable embedding data. But that's stupid since now your website looks boring and people need to click links to see any images. It's a hassle, so although it might be the 'safest' solution, it's probably the worst solution as well.

Like I mentioned before in the part about filtering, make sure your website does not render non-whitelisted file types. At the very least, users will not be able to embed URLs that end in .php.

.gif, .png, .jpg, .svg, and maybe .webp should be the only filetypes allowed to be embedded in image tags.
So, at least, if someone did want to embed a malicious image, they would have to do the whole rewrite process as well, so you're just making them do extra work even if it's not perfect.
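A minimal sketch of such a whitelist check, assuming the parser has the raw URL from the [img] tag in hand. It's in Python for illustration (MyBB itself is PHP), and the names ALLOWED_EXTENSIONS and is_allowed_image_url are made up for this example.

```python
import posixpath
from urllib.parse import urlparse

# The file types listed above
ALLOWED_EXTENSIONS = {".gif", ".png", ".jpg", ".svg", ".webp"}


def is_allowed_image_url(url):
    """True if the URL is http(s) and its path ends in a whitelisted extension."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Only the path is checked, so a fake extension in the query string doesn't count.
    extension = posixpath.splitext(parts.path)[1].lower()
    return extension in ALLOWED_EXTENSIONS


print(is_allowed_image_url("https://i.imgur.com/UPKqp7a.png"))   # True
print(is_allowed_image_url("https://i.imgur.com/UPKqp7a.php"))   # False
print(is_allowed_image_url("https://example.com/a.php?x=.png"))  # False
print(is_allowed_image_url("http://192.168.198.128/cirno.png"))  # True
```

The last line is the catch: the rewritten cirno.png from earlier passes, because the extension says nothing about what the server does behind it.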

Alternatively, instead of making the user's browser load the file, have your image parser retrieve the file on the server, read it into a binary object, convert it to base64, then send the base64 data to the user like so: <img src='data:image/png;base64,[ALL THAT BASE64 DATA]' />

So your server's IP is the one that gets logged, and not the user's, thus preventing the data leak. Bonus points if your server uses a proxy or CDN to fetch the data. Then not even your server's IP gets exposed.

This would be the most ideal option with the bonus IMO.
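Roughly, the server-side step could look like the sketch below. It's in Python for illustration, with made-up function names and an arbitrary size limit. A real implementation would also need to validate the URL and the returned content type before fetching anything on a user's behalf.

```python
import base64
import urllib.request


def to_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data: URI usable in an <img> src."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def fetch_as_data_uri(url, mime_type="image/png", max_bytes=2_000_000):
    """Fetch the image server-side, so the remote host sees the server's IP
    instead of the visitor's. Not called below since it needs network access."""
    with urllib.request.urlopen(url, timeout=5) as response:
        data = response.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError("image too large to inline")
    return to_data_uri(data, mime_type)


# The 8-byte PNG signature stands in for a whole image here.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_uri(png_signature))  # data:image/png;base64,iVBORw0KGgo=
```

The trade-off is page size: base64 makes the data about a third larger, and it gets sent inline with every page view instead of being cached as a separate file.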

Do not allow cross-origin resources.

This one is easy to implement, but your site becomes harder to maintain/use. Simply put, if any URL on your site's embedded content is not pointing to content on your site, then do not render it. So for instance, all the images on this thread are hosted on Imgur and embedded from i.imgur.com. But if Same-Origin Policies were in place, those images would not render. Instead, for my images to render, I would need to use Makestation's very own Official Image host because then the URL would still include makestation.net instead of imgur.com. But, of course, that would mean you need to create your own image host, and since all your users would be using it, your disk space might get eaten up very quickly.

So again. Not ideal. Tough to maintain. But of course it would still allow full functionality as long as people followed the guidelines.
But people can't always follow the guidelines and if everything is too hard to use, then you're going to have problems with people leaving.
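For what it's worth, the check itself is short. Here is a sketch in Python, for illustration only, with a made-up function name. It compares hosts only, which is looser than a strict same-origin comparison (that would also compare scheme and port).

```python
from urllib.parse import urlparse


def is_same_site(url, site_host="makestation.net"):
    """True if url points at site_host or one of its subdomains over http(s)."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return host == site_host or host.endswith("." + site_host)


print(is_same_site("https://makestation.net/images/makestation/ymes_a.png"))  # True
print(is_same_site("https://i.imgur.com/UPKqp7a.png"))                        # False
print(is_same_site("https://makestation.net.evil.example/x.png"))             # False
```

The third case is why the comparison is done on the parsed hostname rather than by searching the URL string for the site's name.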

0x3: Final Thoughts.

I thought this would be a fun experiment just for me to not only get some of my PHP knowledge back, but also do something security related again since it's been a while since I've done any major threads/projects that weren't directly related to the site.

I think that human error is inevitable, and I also think that any effort to try and 'educate' people about the risks is futile. But that only means we simply need better measures in place to stop malicious actors from doing what they want. The recent push in the web world has been to try and essentially 'decentralize' most resources on the web by allowing for AJAX, cross-origin resources, etc. but very little has been done to actually mitigate the risks that these technologies bring along with them. At best, the HttpOnly flag keeps certain cookies out of reach of JavaScript so they can't be read and leaked by a script, but that's about it.

Any other way to fix data leaks is only based on whatever implementation the website can come up with. By default, the browser offers no protection from what this guide covered, and as a matter of fact, this guide shows how the browser ENABLED all this to happen. We have special HTTP headers for Cross-Origin policies, and a site can opt in to restricting its own embeds with a Content-Security-Policy header (img-src 'self', for example), but nothing is restricted unless the site sets that up itself. For AJAX, the browser doesn't refuse to send a request to a cross-origin source, rather it says 'Alright let's go!' and ONLY IF THE RESPONSE HEADER FOR ORIGINS ALLOWED IS MISSING OR DOESN'T MATCH does the browser say 'alright, nevermind' and keep the response away from the script. By then the request has already reached the other server. Not to mention that embedded content like images skips that check entirely, meaning 'sure, load it on whatever f*** webpage you want'

You see, when Flash did all sorts of shady shit outside the browser sandbox, all browser developers agreed that Flash had to be removed.

But now, shady shit happens INSIDE the browser sandbox, and instead they come up with a new way for more shady shit to happen. It's sickening.

We're currently considering whether we should make any changes to the current security measures, such as disabling image rendering in private messages (rather, converting them to a link, perhaps) as well as modifications to the video BBcode since it's always been prone to bugs. We are taking your security and privacy seriously here and we've even put in a way to have your personal data exported or erased. If you visit the UserCP, you'll see a link to do so.

But until we decide on what measures to implement, we'll most likely be banning/warning any users who we suspect of trying to tamper with data like this. This thread is for educational purposes, and the code can be withdrawn at any time due to its GPLv2 licensing. Not going to tolerate any users getting their data leaked.

Cheers kids. Hope you learned something.