Using Facebook Notes to DDoS any website

Facebook Notes allows users to include <img> tags. Whenever an <img> tag is used, Facebook crawls the image from the external server and caches it. Facebook caches each image only once; however, by appending random GET parameters the cache can be bypassed, and the feature can be abused to cause a huge HTTP GET flood.

Steps to re-create the bug as reported to Facebook Bug Bounty on March 03, 2014.
Step 1. Create a list of unique <img> tags, since each tag is crawled only once:

        <img src=http://targetname/file?r=1></img>
        <img src=http://targetname/file?r=2></img>
        ..
        <img src=http://targetname/file?r=1000></img>
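
For reference, here is a minimal Python sketch of how such a list could be generated; the script name, output file, and target URL are placeholders of my own and not part of the original report:

    # generate_tags.py - write 1000 <img> tags, each with a unique GET parameter,
    # so that Facebook treats every URL as a separate, uncached resource.
    base_url = "http://targetname/file"   # placeholder target

    with open("tags.txt", "w") as out:
        for r in range(1, 1001):
            out.write("<img src=%s?r=%d></img>\n" % (base_url, r))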

Step 2. Use m.facebook.com to create the notes. Notes are silently truncated to a fixed length, which limits how many tags fit in a single note.

Step 3. Create several notes from the same user or from different users. Each note is now responsible for 1,000+ HTTP requests.

Step 4. View all the notes at the same time. The target server is hit with a massive HTTP GET flood: thousands of GET requests reach a single server within a couple of seconds, with 100+ Facebook servers fetching in parallel.

Initial response: the bug was denied because Facebook misread the report, concluding it would only cause 404 requests and could not have a high impact.

After exchanging a few emails, I was asked to prove that the impact would be high. I fired up a target VM in the cloud and, using only browsers on three laptops, was able to sustain 400+ Mbps of outbound traffic for 2-3 hours.

Number of Facebook Servers: 127

Of course, the impact could be higher than 400 Mbps, since I was only using browsers for this test and was limited by the number of browser threads per domain fetching the images. I created a proof-of-concept script that could cause even greater impact and sent it, along with the traffic graph, to Facebook.

On April 11, I got a reply that said:

Thank you for being patient and I apologize for the long delay here. This issue was discussed, bumped to another team, discussed some more, etc.

In the end, the conclusion is that there’s no real way to us fix this that would stop “attacks” against small consumer grade sites without also significantly degrading the overall functionality.

Unfortunately, so-called “won’t fix” items aren’t eligible under the bug bounty program, so there won’t be a reward for this issue. I want to acknowledge, however, both that I think your proposed attack is interesting/creative and that you clearly put a lot of work into researching and reporting the issue last month. That IS appreciated and we do hope that you’ll continue to submit any future security issues you find to the Facebook bug bounty program.

I’m not sure why they are not fixing this. Supporting dynamic links in image tags could be a problem, and I’m not a big fan of it. I think a manual upload would satisfy users who want a dynamically generated image in their notes.

I also see a couple of other problems with this type of abuse:

  • A scenario of traffic amplification: when the image is replaced by a much larger PDF or video, Facebook crawls the huge file, but the user viewing the note receives nothing, so the attacker's own cost is negligible.
  • Each note supports 1,000+ links, and Facebook blocks a user only after around 100 notes are created in a short span. Since there is no captcha for note creation, all of this can be automated: an attacker could easily prepare hundreds of notes across multiple users ahead of time, then view all of them at once when the attack starts.

Although a sustained 400 Mbps could already be dangerous, I wanted to test this one last time to see if it could indeed have a larger impact. Getting rid of the browser and using the PoC script, I was able to reach ~900 Mbps of outbound traffic.

I used an ordinary 13 MB PDF file, which Facebook fetched 180,000+ times; 112 Facebook servers were involved.

The traffic graph stays almost constant at 895 Mbps. This is likely the cap on my cloud VM, which uses a shared Gbps Ethernet port. There seems to be no restriction on the Facebook servers themselves, and with so many servers crawling at once we can only imagine how high this traffic could get.

After finding and reporting this issue, I found a similar issue with Google, which I blogged about here. Combining Google and Facebook, it seems one could easily generate a multi-Gbps GET flood.

The Facebook crawler identifies itself as facebookexternalhit. Right now, there seems to be no choice other than blocking it to avoid this nuisance.

[Update 1]

https://developers.facebook.com/docs/ApplicationSecurity/ mentions a way to get the list of IP addresses that belong to the Facebook crawler:

whois -h whois.radb.net -- '-i origin AS32934' | grep ^route

Blocking these IP addresses could be more effective than blocking the user agent.
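
As a rough illustration of that approach (the script below, the use of iptables, and the output handling are my own assumptions, not an official recommendation), the route list could be turned into firewall rules like this:

    # block_fb_routes.py - query the routes advertised by AS32934 (Facebook) and
    # print iptables DROP rules for the IPv4 ranges. Review the output before
    # applying it; IPv6 entries ("route6:") would need ip6tables instead.
    import subprocess

    query = "whois -h whois.radb.net -- '-i origin AS32934' | grep ^route"
    output = subprocess.check_output(query, shell=True).decode()

    for line in output.splitlines():
        # each line looks like: "route:      x.x.x.x/yy"
        if line.startswith("route:"):
            cidr = line.split()[-1]
            print("iptables -A INPUT -s %s -j DROP" % cidr)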

I’ve been getting a lot of responses to this blog and would like to thank the DOSarrest team for acknowledging the finding with an appreciation token.

[Update 2]

The PoC scripts and access logs can now be accessed on GitHub. The script is very simple and is a mere rough draft. Please use them for research and analysis purposes only.

The access logs are the exact logs from the ~900 Mbps test. In them you will find 300,000+ requests from Facebook. Previously, I had only counted facebookexternalhit/1.1; it turns out that for each img tag there are two hits, one from externalhit version 1.0 and one from version 1.1. I also tried Google during the test, and you will find around 700 requests from Google.
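
For anyone digging through those logs, here is a small Python sketch that tallies hits per crawler; it assumes a combined log format where the user agent is the last quoted field, and the log path is a placeholder:

    # count_crawlers.py - tally requests per crawler user agent in an access log.
    import re
    from collections import Counter

    counts = Counter()
    with open("access.log") as log:            # placeholder path
        for line in log:
            quoted = re.findall(r'"([^"]*)"', line)
            if not quoted:
                continue
            user_agent = quoted[-1]            # combined format: UA is last quoted field
            for crawler in ("facebookexternalhit/1.0",
                            "facebookexternalhit/1.1",
                            "Feedfetcher-Google"):
                if crawler in user_agent:
                    counts[crawler] += 1

    for crawler, hits in counts.most_common():
        print(crawler, hits)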

A Bug in the Bug Bounty Program

Recently, I wrote a post called Using Google to DDoS any website, where I outlined how someone could use Google Spreadsheets to DDoS a website. In this post, I will share my experience with the bug bounty program and point out something that, at least to me, seems to be a major flaw in it.

First, let’s have a look at the timeline for the bug.

March 8, 2014: Bug report filed. Got automated response on the same day.

March 9, 2014: Response from Google Security.

Thanks for your bug report. We’ve taken a look at your submission and can confirm this is not a security vulnerability. There are many ways to perform a traffic amplification attack and we do employ some automated behavior analysis to prevent widespread abuse of these tactics. As this issue involves brute force denial of service it is not eligible for inclusion in our award program.

Bug denied initially.

March 11, 2014: Response from Google Security.

My apologies I misread your initial report. I am currently investigating this, and will let you know if and when we decide to file a security bug, if we do file a bug you will be entered into our award program.

So, it may be a bug?

March 21, 2014: Response from Google Security.

I met with the Spreadsheets product manager, this is an issue they are aware of however your method of adding different parameters to evade their solution was new. Additionally, they already had a more robust fix ready to be put in place to prevent this behavior (including your previously unrecognized attack), because we are not making any changes as a result of your report we can’t include it in the award program for credit or an award. Thank you for bringing this to our attention however.

So this is an issue, it is NOT FIXED in production, and yet it won't be included in the program?

Let’s have a look at what’s happening here:

Deniability
We have already noticed through various bug bounty programs that there is a tendency to deny bugs initially. If it is not a widespread class of bug like XSS or CSRF, these programs tend to reject weird, unknown bugs that they think have a lower chance of occurring. Anyone remember XSS being denied as a security issue in the early 2000s? Initial denial can also lead to a bug becoming public without being fixed. In some cases that can be a disaster, while in others it helps the public protect themselves.

Bug Severity
Bounty programs have their own way of defining how severe a bug is. We have also seen a lot of marketing go into this, with people saying a bug can amount to high rewards. Some programs even take creativity into account. I think a bug should be measured solely on severity, because even a small flaw can cause a lot of damage.
Google has a pre-defined bounty amount for certain classes of attacks and says the award can be higher based on severity. Google's application security FAQ says:

Q: How do I demonstrate the severity of the bug if I'm not supposed to snoop around?
A: Please submit your report as soon as you have discovered a potential security issue. The panel will consider the maximum impact and will chose the reward accordingly. We have routinely paid higher rewards in cases where the attacker didn’t even begin to suspect the full consequences of a particular flaw.

I believe they mean maximum impact on Google, perhaps? What about the impact on others, do they take that into consideration? In the case of DDoS, most of the impact is inflicted via Google on external parties. It seems a lot of theoretical assumptions about impact go into evaluating a bug. However, as long as they have a defined process to evaluate the bug, this should be fine, and it should only affect the final bounty amount.

The Flaw in Bug Bounty

It is all good if a bug is measured on its merit, but could there be other factors that affect the severity of the bug? I think the organization's release plan affects the bug's severity and is the undocumented factor organizations use to evaluate a security bug. If a fix for the bug is already in the release plan, the bug is not measured on what it can do; it is considered non-severe and can ultimately be denied. Not treating the bug on its merit is the flaw in the bounty program. The bug bounty program is supposed to create an environment of trust between security researchers and organizations; this flaw creates mistrust instead. Security researchers have no way of knowing what these organizations are working on. If there is a bug in production, it is a bug from the researcher's perspective. If a fix existed, wouldn't it be applied to production? And if they are late deploying it, shouldn't they still tag it as a valid bug in the bounty program?

Here is a one-line explanation from Google's FAQ of what happens when two people find the same bug. It does not go into detail about what happens if Google is already working on the bug themselves.

Q: What if somebody else also found the same bug?
A: First in, best dressed. You will qualify for a reward only if you were the first person to alert us to a previously unknown flaw.

Note that in this case the bug had not been filed by any other security researcher. So, as the bug bounty program currently stands, even a valid bug with low to high impact that is not fixed in production can be denied.

I emailed Google Security to ask whether the spreadsheet DDoS issue had been fixed in production and got a reply that the fix has not been deployed yet.

Using Google to DDoS any website

Google uses its FeedFetcher crawler to cache anything that is put inside =image("link") in a spreadsheet.
For instance, if we put =image("http://example.com/image.jpg") in one of the cells of a Google spreadsheet, Google will send the FeedFetcher crawler to grab the image and cache it for display.

However, one can append a random request parameter to the filename and make FeedFetcher crawl the same file multiple times. Say a website hosts a 10 MB file.pdf; pasting a list like the one below into a spreadsheet will cause Google's crawler to fetch that same file 1,000 times.

=image("http://targetname/file.pdf?r=0")
=image("http://targetname/file.pdf?r=1")
=image("http://targetname/file.pdf?r=2")
=image("http://targetname/file.pdf?r=3")
...
=image("http://targetname/file.pdf?r=1000")

Because a random parameter is appended, each link is treated as distinct, so Google crawls the same file again and again, wasting the website owner's outbound traffic. Anyone with a browser and just a few open tabs can therefore send a huge HTTP GET flood to a web server.
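
A minimal Python sketch of how such a column of formulas could be generated for pasting into a spreadsheet; the file name and target URL are placeholders of mine:

    # generate_formulas.py - print =image() formulas with unique GET parameters,
    # one per line, ready to paste into a single spreadsheet column.
    base_url = "http://targetname/file.pdf"    # placeholder target

    for r in range(0, 1001):
        print('=image("%s?r=%d")' % (base_url, r))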

Here, the attacker does not need much bandwidth at all. The attacker asks Google to put the image link in the spreadsheet, Google fetches 10 MB of data from the server, but since it is a PDF (not an image), the attacker only gets N/A back from Google. A formula of a few dozen bytes triggers a 10 MB fetch, so this traffic flow clearly allows amplification and can be a disaster.

Using just a single laptop with multiple tabs open and merely copy-pasting multiple links to a 10 MB file, I got Google to crawl the same file at 700+ Mbps. This 600-700 Mbps crawling continued for 30-45 minutes, at which point I shut down the server. If I do the math correctly, that is about 240 GB of traffic in 45 minutes (700 Mbps ≈ 87.5 MB/s, and 87.5 MB/s × 2,700 s ≈ 236 GB).

I was surprised to get such high outbound traffic. With only a little more load, I think the outbound traffic would reach Gbps levels and the inbound traffic 50-100 Mbps. I can only imagine the numbers when multiple attackers get involved. Google crawls from multiple IP addresses, and although one can block the FeedFetcher user agent to avoid these attacks, the victim has to edit the server config, and in many cases it might be too late if the attack goes unnoticed. The attack can easily be prolonged for hours simply because it is so easy to mount.
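
Since the danger is precisely that the attack goes unnoticed until it is too late, a simple log-monitoring sketch along these lines could give an early warning; the log path, threshold, and interval below are arbitrary assumptions of mine:

    # watch_feedfetcher.py - follow the access log and warn when Feedfetcher-Google
    # requests exceed a threshold within a fixed interval.
    import time

    LOG_PATH = "access.log"    # placeholder path
    THRESHOLD = 500            # requests per interval before warning
    INTERVAL = 60              # seconds

    with open(LOG_PATH) as log:
        log.seek(0, 2)                         # jump to the end; watch only new entries
        while True:
            deadline = time.time() + INTERVAL
            hits = 0
            while time.time() < deadline:
                line = log.readline()
                if not line:
                    time.sleep(1)
                    continue
                if "Feedfetcher-Google" in line:
                    hits += 1
            if hits > THRESHOLD:
                print("warning: %d Feedfetcher requests in the last %d seconds"
                      % (hits, INTERVAL))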

After I found this bug, I started Googling for any such incidents, and I found two:

The Google Attack explains how the blogger attacked himself and got a huge traffic bill.

Another article, Using Spreadsheet as a DDoS Weapon, describes a similar attack but points out that an attacker would first have to crawl the entire target website and keep the links in spreadsheets using multiple accounts, and so on.

I found it a bit odd that nobody had tried appending random request variables. Even if the target website has only one file, adding random request variables allows thousands of requests to be sent to it. It's a bit scary, actually. Nobody should be able to do this just by copy-pasting a few links into browser tabs.

I submitted this bug to Google yesterday and got a response today that it is not a security vulnerability: it is a brute-force denial of service and will not be included in the bug bounty.
Maybe they knew about this beforehand and don't really consider it a bug?
I hope they fix this issue, though. It's simply annoying that anyone can manipulate Google's crawlers this way. A simple fix would be to crawl the links without the request parameters, so that we don't have to suffer.

Join the discussion on HN