
Saturday, November 30, 2013

Squid Worm Discovered

This squid-like worm is new to science.
As usual, you can also use this squid post to talk about the security stories in the news that I haven't covered.

More on Stuxnet

Ralph Langner has written the definitive analysis of Stuxnet: short, popular version, and long, technical version.
Stuxnet is not really one weapon, but two. The vast majority of the attention has been paid to Stuxnet's smaller and simpler attack routine -- the one that changes the speeds of the rotors in a centrifuge, which is used to enrich uranium. But the second and "forgotten" routine is about an order of magnitude more complex and stealthy. It qualifies as a nightmare for those who understand industrial control system security. And strangely, this more sophisticated attack came first. The simpler, more familiar routine followed only years later -- and was discovered in comparatively short order.
Stuxnet also provided a useful blueprint to future attackers by highlighting the royal road to infiltration of hard targets. Rather than trying to infiltrate directly by crawling through 15 firewalls, three data diodes, and an intrusion detection system, the attackers acted indirectly by infecting soft targets with legitimate access to ground zero: contractors. However seriously these contractors took their cybersecurity, it certainly was not on par with the protections at the Natanz fuel-enrichment facility. Getting the malware on the contractors' mobile devices and USB sticks proved good enough, as sooner or later they physically carried those on-site and connected them to Natanz's most critical systems, unchallenged by any guards. Any follow-up attacker will explore this infiltration method when thinking about hitting hard targets. The sober reality is that at a global scale, pretty much every single industrial or military facility that uses industrial control systems at some scale is dependent on its network of contractors, many of which are very good at narrowly defined engineering tasks, but lousy at cybersecurity. While experts in industrial control system security had discussed the insider threat for many years, insiders who unwittingly helped deploy a cyberweapon had been completely off the radar. Until Stuxnet.
And while Stuxnet was clearly the work of a nation-state -- requiring vast resources and considerable intelligence -- future attacks on industrial control and other so-called "cyber-physical" systems may not be. Stuxnet was particularly costly because of the attackers' self-imposed constraints. Damage was to be disguised as reliability problems. I estimate that well over 50 percent of Stuxnet's development cost went into efforts to hide the attack, with the bulk of that cost dedicated to the overpressure attack which represents the ultimate in disguise -- at the cost of having to build a fully-functional mockup IR-1 centrifuge cascade operating with real uranium hexafluoride. Stuxnet-inspired attackers will not necessarily place the same emphasis on disguise; they may want victims to know that they are under cyberattack and perhaps even want to publicly claim credit for it.

How to Verify the Rails CookieStore Session Termination Weakness

"I want to try it out myself," you say.
Here is a video explanation using as an example:

And here are the steps you take to verify the weakness yourself–using, as well as on other websites you suspect are using Rails’ CookieStore (such as those on this list):
  1. Install a Chrome plugin such as Edit This Cookie to make viewing and editing cookies easier.
  2. Go to a site such as (no SSL!) or one you suspect is using Rails’ CookieStore.
  3. Find a cookie whose value starts with “BAh7”. That’s a good indicator of Ruby on Rails CookieStore-based websites before version 4.0 of Rails, or those that don’t encrypt their CookieStore values. The session cookie will have a value starting with “BAh7”, then a separator of “--”, then a hash digest.
  4. Open Edit This Cookie using the little icon in the top right of your browser. Find the cookie whose value starts with “BAh7” and take note of the cookie’s name. In the case of it’s “_ksr_session”. Copy the entire cookie data (“BAh7…”).
  5. Log out of the website.
  6. Open Edit This Cookie and overwrite the session cookie’s current value (“_ksr_session” in this case) with the data you copied previously.
  7. Go to the website. You should be in again!
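
Why the old cookie keeps working can be sketched in a few lines of Python. This is a minimal model, not Rails' actual code: the secret, the `sign` and `verify` helpers, and the empty-hash payload are all assumptions made for illustration. It assumes the cookie is a Base64 payload, then "--", then an HMAC-SHA1 digest, and that the server stores nothing about the session.

```python
import base64
import hashlib
import hmac

# Hypothetical server-side secret; a real application has its own.
SECRET = b"hypothetical-server-secret"


def sign(session_bytes: bytes) -> str:
    """Build a cookie value: base64(payload) + "--" + HMAC-SHA1 hex digest."""
    payload = base64.b64encode(session_bytes).decode("ascii")
    digest = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha1).hexdigest()
    return payload + "--" + digest


def verify(cookie: str):
    """Return the session bytes if the digest matches, else None.

    Nothing is looked up on the server: any cookie with a valid digest
    is accepted, including one copied before logging out.
    """
    payload, _, digest = cookie.rpartition("--")
    expected = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected, digest):
        return None
    return base64.b64decode(payload)


# Bytes 0x04 0x08 are the Ruby Marshal version header and "{" marks a Hash,
# which is why such cookies start with "BAh7" once Base64-encoded.
session = b"\x04\x08{\x00"
cookie = sign(session)
saved_copy = cookie          # the value copied in the steps above
print(cookie.startswith("BAh7"))       # True
# "Logging out" only clears the browser's cookie; the saved copy still verifies.
print(verify(saved_copy) == session)   # True
```

In this model the only ways to invalidate the copied value are to change the secret or to keep some server-side record of ended sessions.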

Thursday, November 28, 2013

Distributed SQL Query Engine for Big Data

Presto is 10 times faster than Hive for most queries.
Source: Facebook

Technologically, Hive and Presto are very different, namely because the former relies on MapReduce to carry out its processing and the latter does not. This is by and large the difference that makes Presto suitable for low-latency queries while the MapReduce-based Hive can take a long time — especially over Facebook’s many petabytes of data — because it must scan everything in the cluster and requires lots of disk writes. Presto also works with a variety of non-Hadoop-Distributed-File-System data sources and uses ANSI SQL compared with Hive’s SQL-like language.
Presto is currently running in numerous Facebook data centers and the company has scaled a single cluster up to 1,000 nodes. More than 1,000 employees run queries on Presto, and they do more than 30,000 of them per day over a petabyte of data. Traverso’s post gives a lot more details about how Presto works and how Facebook plans to improve its speed and functionality in the near term.
A Presto screenshot
However, I think the most interesting part about Presto might be less technological and more about its effects on the Hadoop industry, which is projected to be worth tens of billions of dollars in the next few years. The mere fact that Facebook chose to create a website for the project says something about how seriously the company takes it. And although Facebook has technically open sourced quite a few Hadoop improvements over the years, this is the first since Hive where I’ve noticed such fast (if any) uptake from external companies.
It will be interesting to watch how, if at all, Presto affects adoption of Cloudera’s Impala, Hortonworks’ Stinger project, Pivotal’s HAWQ or any other of the myriad SQL-on-Hadoop engines currently fighting for mindshare. The fact that Presto is open source and ready to use certainly has to be a big draw for some users, and could help it establish a solid user base while other technologies are still coming to be.
Facebook isn’t looking to compete with other projects and doesn’t have a horse in the race from a business perspective (it will likely go along using and improving Presto at its own pace regardless of what happens), but serious uptake could inspire the Hadoop vendors to change their strategies when it comes to the SQL engines they support. Much of the early innovation from Hadoop came from power users (including Yahoo and Facebook) rather than software companies, and it’s possible we haven’t seen the end of that trend.

RSA algorithm (Rivest-Shamir-Adleman)

RSA is an Internet encryption and authentication system that uses an algorithm developed in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman. The RSA algorithm is the most commonly used encryption and authentication algorithm and is included as part of the Web browsers from Microsoft and Netscape. It's also part of Lotus Notes, Intuit's Quicken, and many other products. The encryption system is owned by RSA Security. The company licenses the algorithm technologies and also sells development kits. The technologies are part of existing or proposed Web, Internet, and computing standards.

How the RSA System Works

The mathematical details of the algorithm used in obtaining the public and private keys are available at the RSA Web site. Briefly, the algorithm involves multiplying two large prime numbers (a prime number is a number divisible only by itself and 1) and through additional operations deriving a set of two numbers that constitutes the public key and another set that is the private key. Once the keys have been developed, the original prime numbers are no longer important and can be discarded. Both the public and the private keys are needed for encryption/decryption, but only the owner of a private key ever needs to know it. Using the RSA system, the private key never needs to be sent across the Internet.
The private key is used to decrypt text that has been encrypted with the public key. Thus, if I send you a message, I can find out your public key (but not your private key) from a central administrator and encrypt a message to you using your public key. When you receive it, you decrypt it with your private key. In addition to encrypting messages (which ensures privacy), you can authenticate yourself to me (so I know that it is really you who sent the message) by using your private key to encrypt a digital certificate. When I receive it, I can use your public key to decrypt it. A table might help us remember this.

To do this | Use whose | Kind of key
Send an encrypted message | Use the receiver's | Public key
Send an encrypted signature | Use the sender's | Private key
Decrypt an encrypted message | Use the receiver's | Private key
Decrypt an encrypted signature (and authenticate the sender) | Use the sender's | Public key
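
The four rows of the table can be tried out with a toy key pair. The sketch below is for illustration only: the primes 61 and 53, the exponent 17, and the helper names are assumptions chosen to keep the numbers small, and real RSA uses primes hundreds of digits long plus a padding scheme. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy RSA: tiny primes, no padding. Not secure; for illustration only.
p, q = 61, 53                  # the two primes (discarded after key generation)
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: modular inverse of e

public_key = (e, n)
private_key = (d, n)


def encrypt(message: int, key) -> int:
    """Send an encrypted message: use the receiver's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key) -> int:
    """Decrypt an encrypted message: use the receiver's private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


def sign(message: int, key) -> int:
    """Send an encrypted signature: use the sender's private key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def verify(message: int, signature: int, key) -> bool:
    """Decrypt the signature with the sender's public key and compare."""
    exponent, modulus = key
    return pow(signature, exponent, modulus) == message


ciphertext = encrypt(65, public_key)
print(ciphertext, decrypt(ciphertext, private_key))   # 2790 65
signature = sign(65, private_key)
print(verify(65, signature, public_key))              # True
```

Encryption and signing are the same modular exponentiation; only which key is used, and by whom, differs.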

What is Information Governance:

Information governance is a holistic approach to managing corporate information by implementing processes, roles, controls and metrics that treat information as a valuable business asset.
The goal of a holistic approach to information governance is to make information assets available to those who need them, while streamlining management, reducing storage costs and ensuring compliance. This, in turn, allows the company to reduce the legal risks associated with unmanaged or inconsistently managed information and be more agile in response to a changing marketplace.
An important goal of information governance is to provide employees with data they can trust and easily access while making business decisions. In many organizations, responsibilities for data governance tasks are split among security, storage and database teams. Often, the need for a holistic approach to managing information does not become evident until a major event occurs such as a lawsuit, compliance audit or corporate merger.

Tuesday, November 26, 2013

The FBI Might Do More Domestic Surveillance than the NSA

This is a long article about the FBI's Data Intercept Technology Unit (DITU), which is basically its own internal NSA.
It carries out its own signals intelligence operations and is trying to collect huge amounts of email and Internet data from U.S. companies -- an operation that the NSA once conducted, was reprimanded for, and says it abandoned. [...]
The unit works closely with the "big three" U.S. telecommunications companies -- AT&T, Verizon, and Sprint -- to ensure its ability to intercept the telephone and Internet communications of its domestic targets, as well as the NSA's ability to intercept electronic communications transiting through the United States on fiber-optic cables.
After Prism was disclosed in the Washington Post and the Guardian, some technology company executives claimed they knew nothing about a collection program run by the NSA. And that may have been true. The companies would likely have interacted only with officials from the DITU and others in the FBI and the Justice Department, said sources who have worked with the unit to implement surveillance orders.
Recently, the DITU has helped construct data-filtering software that the FBI wants telecom carriers and Internet service providers to install on their networks so that the government can collect large volumes of data about emails and Internet traffic.
The software, known as a port reader, makes copies of emails as they flow through a network. Then, in practically an instant, the port reader dissects them, removing only the metadata that has been approved by a court.
The FBI has built metadata collection systems before. In the late 1990s, it deployed the Carnivore system, which the DITU helped manage, to pull header information out of emails. But the FBI today is after much more than just traditional metadata -- who sent a message and who received it. The FBI wants as many as 13 individual fields of information, according to the industry representative. The data include the route a message took over a network, Internet protocol addresses, and port numbers, which are used to handle different kinds of incoming and outgoing communications. Those last two pieces of information can reveal where a computer is physically located -- perhaps along with its user -- as well as what types of applications and operating system it's running. That information could be useful for government hackers who want to install spyware on a suspect's computer -- a secret task that the DITU also helps carry out.
Some federal prosecutors have gone to court to compel port reader adoption, the industry representative said. If a company failed to comply with a court order, it could be held in contempt.
It's not clear how many companies have installed the port reader, but at least two firms are pushing back, arguing that because it captures an entire email, including content, the government needs a warrant to get the information. The government counters that the emails are only copied for a fraction of a second and that no content is passed along to the government, only metadata. The port reader is designed also to collect information about the size of communications packets and traffic flows, which can help analysts better understand how communications are moving on a network. It's unclear whether this data is considered metadata or content; it appears to fall within a legal gray zone, experts said.
The Operational Technology Division also specializes in so-called black-bag jobs to install surveillance equipment, as well as computer hacking, referred to on the website as "covert entry/search capability," which is carried out under law enforcement and intelligence warrants.
But having the DITU act as a conduit provides a useful public relations benefit: Technology companies can claim -- correctly -- that they do not provide any information about their customers directly to the NSA, because they give it to the DITU, which in turn passes it to the NSA.
There is an enormous amount of information in the article, which exposes yet another piece of the vast US government surveillance infrastructure. It's good to read that "at least two" companies are fighting at least a part of this. Any legislation aimed at restoring security and trust in US Internet companies needs to address the whole problem, and not just a piece of it.

Inside America's Plan to Kill Online Privacy Rights Everywhere

The United States and its key intelligence allies are quietly working behind the scenes to kneecap a mounting movement in the United Nations to promote a universal human right to online privacy, according to diplomatic sources and an internal American government document obtained by The Cable.
The diplomatic battle is playing out in an obscure U.N. General Assembly committee that is considering a proposal by Brazil and Germany to place constraints on unchecked internet surveillance by the National Security Agency and other foreign intelligence services. American representatives have made it clear that they won't tolerate such checks on their global surveillance network. The stakes are high, particularly in Washington -- which is seeking to contain an international backlash against NSA spying -- and in Brasilia, where Brazilian President Dilma Rousseff is personally involved in monitoring the U.N. negotiations.
The Brazilian and German initiative seeks to apply the right to privacy, which is enshrined in the International Covenant on Civil and Political Rights (ICCPR), to online communications. Their proposal, first revealed by The Cable, affirms a "right to privacy that is not to be subjected to arbitrary or unlawful interference with their privacy, family, home, or correspondence." It notes that while public safety may "justify the gathering and protection of certain sensitive information," nations "must ensure full compliance" with international human rights laws. A final version of the text is scheduled to be presented to U.N. members on Wednesday evening and the resolution is expected to be adopted next week.
A draft of the resolution, which was obtained by The Cable, calls on states "to respect and protect the right to privacy," asserting that the "same rights that people have offline must also be protected online, including the right to privacy." It also requests the U.N. high commissioner for human rights, Navi Pillay, present the U.N. General Assembly next year with a report on the protection and promotion of the right to privacy, a provision that will ensure the issue remains on the front burner.

Publicly, U.S. representatives say they're open to an affirmation of privacy rights. "The United States takes very seriously our international legal obligations, including those under the International Covenant on Civil and Political Rights," Kurtis Cooper, a spokesman for the U.S. mission to the United Nations, said in an email. "We have been actively and constructively negotiating to ensure that the resolution promotes human rights and is consistent with those obligations."
But privately, American diplomats are pushing hard to kill a provision of the Brazilian and German draft which states that "extraterritorial surveillance" and mass interception of communications, personal information, and metadata may constitute a violation of human rights. The United States and its allies, according to diplomats, outside observers, and documents, contend that the Covenant on Civil and Political Rights does not apply to foreign espionage.
In recent days, the United States circulated to its allies a confidential paper highlighting American objectives in the negotiations, "Right to Privacy in the Digital Age -- U.S. Redlines." It calls for changing the Brazilian and German text so "that references to privacy rights are referring explicitly to States' obligations under ICCPR and remove suggestion that such obligations apply extraterritorially." In other words: America wants to make sure it preserves the right to spy overseas.
The U.S. paper also calls on governments to promote amendments that would weaken Brazil's and Germany's contention that some "highly intrusive" acts of online espionage may constitute a violation of freedom of expression. Instead, the United States wants to limit the focus to illegal surveillance -- which the American government claims it never, ever does. Collecting information on tens of millions of people around the world is perfectly acceptable, the Obama administration has repeatedly said. It's authorized by U.S. statute, overseen by Congress, and approved by American courts.
"Recall that the USG's [U.S. government's] collection activities that have been disclosed are lawful collections done in a manner protective of privacy rights," the paper states. "So a paragraph expressing concern about illegal surveillance is one with which we would agree."
The privacy resolution, like most General Assembly decisions, is neither legally binding nor enforceable by any international court. But international lawyers say it is important because it creates the basis for an international consensus -- referred to as "soft law" -- that over time will make it harder and harder for the United States to argue that its mass collection of foreigners' data is lawful and in conformity with human rights norms.
"They want to be able to say ‘we haven't broken the law, we're not breaking the law, and we won't break the law,'" said Dinah PoKempner, the general counsel for Human Rights Watch, who has been tracking the negotiations. The United States, she added, wants to be able to maintain that "we have the freedom to scoop up anything we want through the massive surveillance of foreigners because we have no legal obligations."
The United States negotiators have been pressing their case behind the scenes, raising concerns that the assertion of extraterritorial human rights could constrain America's effort to go after international terrorists. But Washington has remained relatively muted about its concerns in the U.N. negotiating sessions. According to one diplomat, "the United States has been very much in the backseat," leaving it to its allies, Australia, Britain, and Canada, to take the lead.
There is no extraterritorial obligation on states "to comply with human rights," explained one diplomat who supports the U.S. position. "The obligation is on states to uphold the human rights of citizens within their territory and areas of their jurisdictions."
The position, according to Jamil Dakwar, the director of the American Civil Liberties Union's Human Rights Program, has little international backing. The International Court of Justice, the U.N. Human Rights Committee, and the European Court have all asserted that states do have an obligation to comply with human rights laws beyond their own borders, he noted. "Governments do have obligation beyond their territories," said Dakwar, particularly in situations, like the Guantanamo Bay detention center, where the United States exercises "effective control" over the lives of the detainees.
Both PoKempner and Dakwar suggested that courts may also judge that the U.S. dominance of the Internet places special legal obligations on it to ensure the protection of users' human rights.
"It's clear that when the United States is conducting surveillance, these decisions and operations start in the United States, the servers are at NSA headquarters, and the capabilities are mainly in the United States," he said. "To argue that they have no human rights obligations overseas is dangerous because it sends a message that there is a void in terms of human rights protection outside a country's territory. It's going back to the idea that you can create a legal black hole where there is no applicable law."
There were signs emerging on Wednesday that America may have been gaining ground in pressing the Brazilians and Germans to back down on one of their toughest provisions. In an effort to address the concerns of the U.S. and its allies, Brazil and Germany agreed to soften the language suggesting that mass surveillance may constitute a violation of human rights. Instead, it simply expresses deep "concern at the negative impact" that extraterritorial surveillance "may have on the exercise of and enjoyment of human rights." The U.S., however, has not yet indicated it would support the revised proposal.
The concession "is regrettable. But it’s not the end of the battle by any means," said Human Rights Watch’s PoKempner. She added that there will soon be another opportunity to corral America's spies: a U.N. discussion on possible human rights violations as a result of extraterritorial surveillance will soon be taken up by the U.N. high commissioner.

Here is the document: Right to Privacy in the Digital Age.

Monday, November 25, 2013

CAMP, a different approach

CAMP protects users from malware binaries without requiring a priori knowledge of the binary, augmenting whitelists and blacklists with a content-agnostic reputation system.

CAMP is composed of two parts: the client (the Google Chrome web browser) and Google's servers, to which the client connects to download the blacklist and whitelist and to send requests to CAMP's reputation service.

How the client works

  1. The browser tries to determine if a download came from a malicious site by checking the download URL against a list of URLs known to distribute malware, using Google's SafeBrowsing API.
  2. The browser checks locally against a dynamically updated list of trusted domains and trusted binary signers to determine if the download is benign.
  3. The browser extracts content-agnostic features from the download and sends a request to CAMP's reputation service for downloads that don't match any of the local lists.
  4. If a malicious download is requested and detected, Google Chrome warns the user and offers two options: block or pass the download.
The features sent to Google's CAMP server are:
  • The URL and IP of the server hosting the download.
  • Any referrer URL and IP encountered when starting the download.
  • The size of the download and its hash.
  • The signature attached to the download, including the signer and any certificate chain leading to it.
  • The browser never sends the binary itself, which reduces the privacy impact.
How the CAMP servers work
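
The client-side steps above can be sketched as a small decision function. This is only an illustration of the order of the checks, not Chrome's implementation: the `Download` fields, the list arguments, the verdict strings, and the `reputation_service` callback are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class Download:
    url: str
    host: str                 # domain serving the file
    server_ip: str
    referrer_url: Optional[str]
    size: int
    content_hash: str         # hash of the binary, computed locally
    signer: Optional[str]     # signer name, if the binary is signed


def check_download(
    d: Download,
    malware_urls: Set[str],
    trusted_domains: Set[str],
    trusted_signers: Set[str],
    reputation_service: Callable[[dict], str],
) -> str:
    """Return a verdict, following the order of checks described above."""
    # Step 1: URL blacklist of known malware distribution sites.
    if d.url in malware_urls:
        return "malicious"
    # Step 2: local whitelist of trusted domains and trusted signers.
    if d.host in trusted_domains or (d.signer is not None and d.signer in trusted_signers):
        return "benign"
    # Step 3: send content-agnostic features only; the binary never leaves the client.
    features = {
        "url": d.url,
        "server_ip": d.server_ip,
        "referrer_url": d.referrer_url,
        "size": d.size,
        "hash": d.content_hash,
        "signer": d.signer,
    }
    return reputation_service(features)


# Example with a stand-in reputation service that flags one known hash.
def fake_service(features: dict) -> str:
    return "malicious" if features["hash"] == "deadbeef" else "unknown"


sample = Download("http://example.test/a.exe", "example.test", "203.0.113.5",
                  None, 1024, "deadbeef", None)
print(check_download(sample, set(), set(), set(), fake_service))
```

Step 4, warning the user, would act on the returned verdict.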

For each client request, the reputation system makes a decision based on whether either the URL or the content hash is known to be malicious.

"The reputation verdict is computed by a reputation metric calculated by a binary circuit from the client request and any reputation data that is referenced by the reputation system, e.g. how many known benign or malicious binaries are hosted on a given IP address, etc"


Google selected 2,200 binaries that were unknown to VirusTotal and had been processed by CAMP on a single day. As you know, VirusTotal can check a file with more than 40 antivirus solutions.

Of these 2,200, 1,100 were labeled malicious by CAMP. The researchers submitted the binaries to VirusTotal and waited 10 days.

After 10 days, 99% of the binaries that CAMP had labeled malicious were detected by 20% or more of the AV engines on VirusTotal. Only 12% of the binaries that CAMP had labeled clean were detected by 20% or more of the AV engines.

"The URL classification services mostly disagreed with CAMP when presented with the set of malicious URLs. Trend-Micro identified about 11% as malicious, Safe Browsing about 8.5%, Symantec about 8% and Site Advisor about 2.5%. The Malware Domain List did not flag any of them as malicious. However, as with the benign URLs, many of the malicious URLs were not known to the web services. For example, TrendMicro did not know 65% of the URLs that CAMP found to be malicious."

Our personal conclusion

In our opinion, CAMP is a really interesting project with a new approach to fighting malware, but we think it is not an antivirus as we know it. CAMP cannot prevent infection from malicious attachments sent by email, USB infections, frontal attacks using exploits...

It only works with Google Chrome... What happens if we use Mozilla, Internet Explorer or Safari in our companies or our homes?

Despite this, we believe that preventing millions of infections is a great achievement and a step forward in the fight against malware.

Understanding the Various Compression, Encryption and Archive Formats

In computing terms, an archive is a single file that stores other files and folders within itself. There are several archive formats available and each comes with its own pros and cons. Some archive formats come with compression support (which makes your file size smaller) while others support encryption. And yes, you guessed it, some archive formats support both compression and encryption. Let’s find out more about the compression and encryption algorithms used and the various archive formats.
A compression algorithm is the method an archive format uses to compress files and make the overall file size smaller.

1. LZMA/LZMA2


The Lempel–Ziv–Markov chain algorithm (LZMA) is a lossless data compression algorithm. LZMA uses a dictionary compression scheme similar to LZ77, whose output is then encoded with a range encoder that uses a probability model to encode one bit at a time.
LZMA2 is a container format that can hold both uncompressed and LZMA-compressed data. It supports multi-threaded compression and decompression of data. It also handles partially incompressible data more efficiently, because chunks that do not compress can be stored uncompressed.

2. Burrows-Wheeler Transform Algorithm (BWT)

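BWT works by rearranging the characters of a string so that identical characters tend to end up next to each other. The transform itself does not shrink the data, but the resulting runs of repeated characters are much easier to compress in a later step.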
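
To see the transform in action, here is a naive sketch that sorts all rotations of the input. It is meant for short strings only; real compressors use far more efficient constructions. The `$` end marker and the function names are assumptions made for this example, and the marker must not occur in the input.

```python
def bwt(text: str, end: str = "$") -> str:
    """Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + end                      # unique end marker makes the transform reversible
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, end: str = "$") -> str:
    """Rebuild the original text by repeatedly prepending and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]                # drop the end marker


print(bwt("banana"))                    # annb$aa
print(inverse_bwt(bwt("banana")))       # banana
```

Note how the output groups the two "n" characters and the trailing "a" characters together, which is what a later run-length or entropy coding step exploits.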

3. PPM

Prediction by partial matching (PPM) is a statistical data compression method which works by using a set of previous symbols in the uncompressed symbol stream to predict the next symbol in the stream.

4. Deflate

Deflate is a popular data compression algorithm which uses a combination of LZ77 and Huffman coding to compress data: LZ77 replaces repeated sequences with references to earlier occurrences, and Huffman coding then assigns shorter codes to the more frequent symbols. Since Deflate does not contain implementations restricted by patents, it has become very popular and is widely used, especially in Linux.
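
Deflate is easy to try from Python's standard `zlib` module, which wraps a Deflate stream in a small zlib header and checksum. The sample data and the helper name below are made up for illustration; highly repetitive input like this compresses far better than typical files.

```python
import zlib


def deflate_roundtrip(data: bytes, level: int = 6):
    """Compress with Deflate (zlib container), then decompress again."""
    compressed = zlib.compress(data, level)
    restored = zlib.decompress(compressed)
    return compressed, restored


sample = b"squid worm " * 200            # repetitive input, chosen to compress well
compressed, restored = deflate_roundtrip(sample)
print(len(sample), "->", len(compressed), "bytes")
print(restored == sample)                 # True: the compression is lossless
```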
Now let’s go through some of the popular encryption methods:

1. DES

The Data Encryption Standard (DES) is a symmetric cipher: the same secret key is used to encrypt and decrypt data. The key is 64 bits long, but only 56 bits are effective because the remaining 8 bits are used for parity.

2. AES

The Advanced Encryption Standard (AES) is an encryption algorithm used by US government agencies to secure sensitive data. You can encrypt data using 128-, 192- or 256-bit keys. AES is a symmetric key algorithm, which means that a common key is used for encrypting and then decrypting the data.

3. Blowfish

The Blowfish encryption algorithm encrypts data in 64-bit blocks with a variable key length of 32 to 448 bits.
Note: There are several other encryption algorithms but the above-mentioned three are the most used ones.
There are various archive formats available. Below, we evaluate each archive format on three parameters: whether it supports compression and encryption, which operating systems it is used on, and which software is available for it.

1. Tar

Tape Archive (Tar) is one of the oldest archive formats. Initially, it was used to combine and write data to sequential tape drives but was later standardized as an archive format. Tar is mostly used in Linux and it doesn’t support compression or encryption by itself. You can also use it on Windows with installation of additional software. Most of the modern archiving utilities support this format. The exceptions include Disk Archiver and KGB Archiver.

2. GZ

GZ or GZip is one of the most popular compression formats used in both Windows and Linux. GZip uses the Deflate compression algorithm to compress the archived files. GZip also supports multi-part file transfers, meaning that you can create smaller parts of a large GZip file for easy sharing and transfer. Since GZip is quite popular, most of the modern archiving utilities have support for compressing and decompressing files using the GZip format, including 7-Zip, BetterZip, PKZip, WinZip and WinRAR.
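
Because Tar only bundles files and GZip only compresses a single stream, the two are usually combined as tar.gz. The sketch below uses Python's standard `tarfile` module and works entirely in memory; the file names and helper functions are invented for the example.

```python
import io
import tarfile


def make_tar_gz(files: dict) -> bytes:
    """Bundle {name: bytes} into a tar archive and gzip-compress it."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)            # tar headers need the size up front
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def read_tar_gz(blob: bytes) -> dict:
    """Decompress and unpack a tar.gz blob back into {name: bytes}."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        for member in archive.getmembers():
            result[member.name] = archive.extractfile(member).read()
    return result


files = {"notes.txt": b"hello " * 100, "data/log.txt": b"line\n" * 50}
blob = make_tar_gz(files)
print(blob[:2] == b"\x1f\x8b")            # True: gzip magic bytes
print(read_tar_gz(blob) == files)         # True
```

Swapping the mode strings to "w:bz2" and "r:bz2" would produce a tar.bz2 archive instead.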

3. BZ/BZ2

BZ is very similar to GZ but uses the Burrows-Wheeler Transform algorithm, which results in a little more compression and smaller file size. Although the compression is slow, decompression is quite fast. Most of the software which supports GZ also supports BZ.

4. Zip

Zip is probably the most well-known and widely used archiving format. Zip uses the Deflate algorithm and supports lossless compression. It also supports AES and DES encryption. Most modern operating systems come with built-in support for the Zip format, so you don’t need separate software for archiving and un-archiving Zip files.

5. 7Z

The 7Z archiving format was introduced with a free and open source utility called 7-Zip. It is the most advanced general compression and archiving format and supports most of the data compression and encryption algorithms, including the ones we have discussed above. The 7Z format typically compresses files more than the other formats discussed here but is relatively slow in processing. Another limitation is that the 7-Zip graphical software is only available for Windows; there is no GUI version for Mac or Linux. 7Z also supports multi-part archiving.

6. RAR

RAR is a proprietary archiving format. While it can be read and extracted by other utilities like 7-Zip and WinZip, it can only be created using WinRAR utility. RAR was the most popular format for multi-part archiving before 7Z was released. Now 7Z can do the same task for free which RAR does by making its users pay for the WinRAR software. RAR supports AES encryption.
Here are some of the relatively lesser known formats:
XZ is a lossless data compression format which uses the LZMA2 compression algorithm. It can be thought of as a stripped-down version of 7Z.
LHA, previously known as LHarc, is primarily used for compressing installation files and games (mostly in Japan). Interestingly, the Japanese version of Windows 7 comes with built-in support for LHA archives.
ACE is a proprietary data compression archive file format which was a competitor to the RAR format in the early 2000s.
StuffIt was primarily released for Mac but versions for Windows, Linux and Solaris were released afterwards. This is a proprietary compression format used by StuffIt utilities.
In Linux, the most commonly used format is gz (or tar.gz), followed by bz, whereas in Windows or Mac, the most commonly used format is Zip. For cross-platform compatibility, Zip format is the one to go for. If you want features like security, high compression and multi-part archiving, go for 7Z format. RAR is similar to 7Z except that it comes with a price tag. Avoid it as much as possible.

Which file format and utility do you use for compression?

Wednesday, November 20, 2013

How to Avoid Getting Arrested

The tips are more psychological than security.

Tuesday, November 19, 2013


Fokirtor is a Linux Trojan that exfiltrates traffic by inserting it into SSH connections. It looks very well-designed and -constructed.

Monday, November 18, 2013

Explaining and Speculating About QUANTUM

Nicholas Weaver has a great essay explaining how the NSA's QUANTUM packet injection system works, what we know it does, what else it can possibly do, and how to defend against it. Remember that while QUANTUM is an NSA program, other countries engage in these sorts of attacks as well. By securing the Internet against QUANTUM, we protect ourselves against any government or criminal use of these sorts of techniques.

How to improve IRCTC

IRCTC is the only available option for people to book train tickets online in India. Other good websites exist, but they are not allowed to book tickets until 12 noon.
The website is poorly engineered and can’t handle the load it receives, especially around 10:00 AM when tatkal ticket booking opens. The site stops responding in the middle of a booking, and it doesn’t allow the user to press the browser’s back button and retry.

The numbers

Let’s try to estimate how many requests the site gets per second.
There are about 5000 trains in India. Assuming 20 coaches per train and 100 seats per coach, it is 10 million people travelling per day. Assuming all of them are booking tickets online, there are about 10M reservations/day or 115 reservations/second. These numbers are on the higher side and the actual numbers will be far less than this.
Assuming 4 page views per reservation, it is about 500 requests/second.
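The estimate above can be reproduced in a few lines. This is only a sketch of the post's own arithmetic; the constant names are invented here, and the result is a daily average, not the load at the 10:00 AM peak.

```javascript
const TRAINS = 5000;
const COACHES_PER_TRAIN = 20;
const SEATS_PER_COACH = 100;
const PAGE_VIEWS_PER_RESERVATION = 4;
const SECONDS_PER_DAY = 24 * 60 * 60;

// Upper bound: every seat on every train is booked online each day.
const reservationsPerDay = TRAINS * COACHES_PER_TRAIN * SEATS_PER_COACH;
const reservationsPerSecond = reservationsPerDay / SECONDS_PER_DAY;
const requestsPerSecond = reservationsPerSecond * PAGE_VIEWS_PER_RESERVATION;

console.log(reservationsPerDay);                // 10000000
console.log(Math.floor(reservationsPerSecond)); // 115
console.log(Math.round(requestsPerSecond));     // 463
```

The exact figure of about 463 requests/second is what the post rounds up to 500.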

How to improve

Here are my suggestions to improve the performance and stability of the IRCTC website without modifying too much of the existing setup.

1. Serve static resources using a different webserver

Even static resources like images, CSS and JavaScript take a very long time to load. They are not the main part of the app and can be moved to a separate webserver.
It would also help to make the homepage a static resource and serve it the same way.
I suggest using Nginx. Nginx is a lightweight webserver and is very efficient at serving static files.

2. Make searching for trains independent of ticket booking

Booking the ticket is a very small part of the whole process. Most of the time is spent finding which train to take. This can be handled completely independently. The information about trains and stations is small enough to keep entirely in memory.
Since this data is not very big, most of it can be packed into JavaScript, and station autocomplete and train search can be implemented entirely in JavaScript. That will reduce the load on the server substantially.
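A minimal sketch of what client-side station autocomplete could look like. The station list, the `suggestStations` name and the prefix-matching rule are all assumptions made for illustration; a real page would ship the full station list and probably a smarter ranking.

```javascript
// Hypothetical sample data; a real list would cover every station.
const STATIONS = [
  { code: 'NDLS', name: 'New Delhi' },
  { code: 'NZM', name: 'Hazrat Nizamuddin' },
  { code: 'HWH', name: 'Howrah' },
  { code: 'MAS', name: 'Chennai Central' },
  { code: 'SBC', name: 'Bangalore City' },
];

// Return stations whose code or name starts with the typed text.
function suggestStations(query, limit = 10) {
  const q = query.trim().toLowerCase();
  if (q === '') return [];
  return STATIONS
    .filter(s => s.code.toLowerCase().startsWith(q) ||
                 s.name.toLowerCase().startsWith(q))
    .slice(0, limit);
}

console.log(suggestStations('n').map(s => s.code)); // [ 'NDLS', 'NZM' ]
```

Because the lookup runs entirely in the browser, each keystroke costs the server nothing.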

3. Cache the availability info in memory

Cache the seat availability info in memory.
The seat availability information changes so fast that it is fine to show slightly stale info. By the time the user fills in the passenger details, the availability will have changed anyway.
To make sure the availability info is not too stale, run a background job to update it frequently, or update it whenever a ticket is booked.
This will make sure that searching for trains and their availability will not hit the reservation system.
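One way to sketch the caching idea is below. Note the substitutions: instead of a background job, this version uses a maximum-age check on read, plus explicit invalidation when a ticket is booked. `createAvailabilityCache`, `fetchFromReservationSystem` and the cache key format are hypothetical names, not part of any real IRCTC system.

```javascript
// fetchFromReservationSystem is a hypothetical stand-in for the real backend call.
function createAvailabilityCache(fetchFromReservationSystem, maxAgeMs) {
  const entries = new Map(); // key -> { seats, fetchedAt }

  function refresh(key, now) {
    const seats = fetchFromReservationSystem(key);
    entries.set(key, { seats, fetchedAt: now });
    return seats;
  }

  return {
    // Serve from memory unless the entry is missing or older than maxAgeMs.
    get(key, now = Date.now()) {
      const entry = entries.get(key);
      if (entry && now - entry.fetchedAt <= maxAgeMs) return entry.seats;
      return refresh(key, now);
    },
    // Call after a booking so the next reader sees fresh numbers.
    invalidate(key) {
      entries.delete(key);
    },
  };
}

// Usage with a fake backend that counts how often it is hit.
let backendCalls = 0;
let seatsLeft = 42;
const cache = createAvailabilityCache(() => { backendCalls++; return seatsLeft; }, 30000);

cache.get('train-123:2013-12-01', 0);
cache.get('train-123:2013-12-01', 1000);  // served from memory
seatsLeft = 41;
cache.invalidate('train-123:2013-12-01'); // a ticket was booked
console.log(cache.get('train-123:2013-12-01', 2000), backendCalls); // 41 2
```

Three reads reach the backend only twice here; under real search traffic the ratio would be far more lopsided.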

Friday, November 15, 2013

Security Tents

The US government sets up secure tents for the president and other officials to deal with classified material while traveling abroad.
Even when Obama travels to allied nations, aides quickly set up the security tent -- which has opaque sides and noise-making devices inside -- in a room near his hotel suite. When the president needs to read a classified document or have a sensitive conversation, he ducks into the tent to shield himself from secret video cameras and listening devices.

Following a several-hundred-page classified manual, the rooms are lined with foil and soundproofed. An interior location, preferably with no windows, is recommended.

Thursday, November 14, 2013

A Fraying of the Public/Private Surveillance Partnership

The public/private surveillance partnership between the NSA and corporate data collectors is starting to fray. The reason is sunlight. The publicity resulting from the Snowden documents has made companies think twice before allowing the NSA access to their users' and customers' data.
Pre-Snowden, there was no downside to cooperating with the NSA. If the NSA asked you for copies of all your Internet traffic, or to put backdoors into your security software, you could assume that your cooperation would forever remain secret. To be fair, not every corporation cooperated willingly. Some fought in court. But it seems that a lot of them, telcos and backbone providers especially, were happy to give the NSA unfettered access to everything. Post-Snowden, this is changing. Now that many companies' cooperation has become public, they're facing a PR backlash from customers and users who are upset that their data is flowing to the NSA. And this is costing those companies business.
How much is unclear. In July, right after the PRISM revelations, the Cloud Security Alliance reported that US cloud companies could lose $35 billion over the next three years, mostly due to losses of foreign sales. Surely that number has increased as outrage over NSA spying continues to build in Europe and elsewhere. There is no similar report for software sales, although I have attended private meetings where several large US software companies complained about the loss of foreign sales. On the hardware side, IBM is losing business in China. The US telecom companies are also suffering: AT&T is losing business worldwide.
This is the new reality. The rules of secrecy are different, and companies have to assume that their responses to NSA data demands will become public. This means there is now a significant cost to cooperating, and a corresponding benefit to fighting.
Over the past few months, more companies have woken up to the fact that the NSA is basically treating them as adversaries, and are responding as such. In mid-October, it became public that the NSA was collecting e-mail address books and buddy lists from Internet users logging into different service providers. Yahoo, which didn't encrypt those user connections by default, allowed the NSA to collect much more of its data than Google, which did. That same day, Yahoo announced that it would implement SSL encryption by default for all of its users. Two weeks later, when it became public that the NSA was collecting data on Google users by eavesdropping on the company's trunk connections between its data centers, Google announced that it would encrypt those connections.
We recently learned that Yahoo fought a government order to turn over data. Lavabit fought its order as well. Apple is now tweaking the government. And we think better of those companies because of it.
Now Lavabit, which closed down its e-mail service rather than comply with the NSA's request for the master keys that would compromise all of its customers, has teamed with Silent Circle to develop a secure e-mail standard that is resistant to these kinds of tactics.
The Snowden documents made it clear how much the NSA relies on corporations to eavesdrop on the Internet. The NSA didn't build a massive Internet eavesdropping system from scratch. It noticed that the corporate world was already eavesdropping on every Internet user -- surveillance is the business model of the Internet, after all -- and simply got copies for itself.
Now, that secret ecosystem is breaking down. Supreme Court Justice Louis Brandeis wrote about transparency, saying "Sunlight is said to be the best of disinfectants." In this case, it seems to be working.
These developments will only help security. Remember that while Edward Snowden has given us a window into the NSA's activities, these sorts of tactics are probably also used by other intelligence services around the world. And today's secret NSA programs become tomorrow's PhD theses, and the next day's criminal hacker tools. It's impossible to build an Internet where the good guys can eavesdrop, and the bad guys cannot. We have a choice between an Internet that is vulnerable to all attackers, or an Internet that is safe from all attackers. And a safe and secure Internet is in everyone's best interests, including the US's.

Gmail Automation: 5 Useful Google Scripts to Automate Your Gmail

Gmail, by itself, is already a very powerful email client. With the help of filters, you can even set up automation to better organize your inbox. However, for power users, filters are not sufficient. Here are 5 Google scripts that you can use to further automate your Gmail.

Very often, after we read an email, we just keep it in our inbox, regardless of whether it is useful or not. While Google gives you tons of space to store your emails, you might still want to clean up your inbox and get rid of those useless emails. The following script can check emails with the “Delete Me” label and delete them after “x” number of days.
1. Go to Google Scripts and create a blank project (make sure you are logged into your Google account).
2. Paste the following script and save it.
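function auto_delete_mails() {
  var label = GmailApp.getUserLabelByName("Delete Me");
  if (label == null) {
    // First run: create the label so emails can be tagged with it
    GmailApp.createLabel('Delete Me');
  } else {
    var delayDays = 2; // Enter # of days before messages are moved to trash
    var maxDate = new Date();
    maxDate.setDate(maxDate.getDate() - delayDays);
    var threads = label.getThreads();
    for (var i = 0; i < threads.length; i++) {
      // Trash threads whose last message is older than the cutoff date
      if (threads[i].getLastMessageDate() < maxDate) {
        threads[i].moveToTrash();
      }
    }
  }
}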
You can change the number of days (delayDays) that must pass before an email is deleted. Set a trigger (Resources -> Current Project’s Triggers -> Add one now) to run it daily.
Once activated, it will create a label “Delete Me” in your Gmail account. All you have to do is tag the unwanted emails with this label and they will be moved to the trash after the number of days set in delayDays.
Sometimes, after reading an email, you want it to return to your inbox after a few days. With the following Google script, you can do so:
1. Create a new Google script with the following code:
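var MARK_UNREAD = true;
var ADD_UNSNOOZED_LABEL = false;
function getLabelName(i) {
  return "Snooze/Snooze " + i + " days";
}
function setup() {
  // Create the labels we’ll need for snoozing
  GmailApp.createLabel("Snooze");
  for (var i = 1; i <= 7; ++i) {
    GmailApp.createLabel(getLabelName(i));
  }
  if (ADD_UNSNOOZED_LABEL) {
    GmailApp.createLabel("Unsnoozed");
  }
}
function moveSnoozes() {
  var oldLabel, newLabel, page;
  for (var i = 1; i <= 7; ++i) {
    newLabel = oldLabel;
    oldLabel = GmailApp.getUserLabelByName(getLabelName(i));
    page = null;
    // Get threads in "pages" of 100 at a time
    while(!page || page.length == 100) {
      page = oldLabel.getThreads(0, 100);
      if (page.length > 0) {
        if (newLabel) {
          // Move the threads into "today’s" label
          newLabel.addToThreads(page);
        } else {
          // Unless it’s time to unsnooze it
          GmailApp.moveThreadsToInbox(page);
          if (MARK_UNREAD) {
            GmailApp.markThreadsUnread(page);
          }
          if (ADD_UNSNOOZED_LABEL) {
            GmailApp.getUserLabelByName("Unsnoozed").addToThreads(page);
          }
        }
        // Move the threads out of "yesterday’s" label
        oldLabel.removeFromThreads(page);
      }
    }
  }
}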
Next, save it and run the “setup” function. This will add several new labels to your Gmail (such as “Snooze 2 days”, “Snooze 7 days”, etc.). Lastly, add a trigger for “moveSnoozes” to run every day. Now, emails marked with a “Snooze” label will return to the inbox with unread status after the number of days has passed. (via Gmail blog)
This Google script makes use of Google Calendar’s SMS feature to send you an SMS for important emails.
1. Create a new Google script with the following code:
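function Gmail_send_sms(){
  var label = GmailApp.getUserLabelByName("Send Text");
  if(label == null){
    // First run: create the label that the Gmail filter will apply
    GmailApp.createLabel('Send Text');
  } else {
    var threads = label.getThreads();
    var now = new Date().getTime();
    for (var i = 0; i < threads.length; i++) {
      var message = threads[i].getMessages()[0];
      var from = message.getFrom();
      var subject = message.getSubject();
      // Create an event one minute from now; its SMS reminder fires immediately
      CalendarApp.createEvent(subject, new Date(now+60000), new Date(now+60000), {location: from}).addSmsReminder(0);
    }
    // Remove the label so the same emails do not trigger another SMS
    label.removeFromThreads(threads);
  }
}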
2. Save it and set a trigger for it to run every 5 minutes.
3. Lastly, you have to set a filter to add the “Send Text” label to all important incoming emails. The script will scan your inbox every 5 minutes and when it detects an email with the “Send Text” label, it will create an immediate event in Google Calendar which will then trigger the SMS.
Boomerang is one web service that you can use to schedule emails to send at a later date, but that requires you to install a browser extension. Gmail Delay Send is a Google Script that can do the same task.
1. Go to this link and click the “Install” link. Once you have authorized the script to access your Gmail, it will redirect you to another page where you can configure the script.
2. Once configured, you can then proceed to draft an email and include the future date/time for it to send and save it as draft with the “GmailDelaySend/ToSend” label.
If you have an email that you want to archive in Google Drive, you can use Google script to save it as PDF in your Google Drive account. The following script will save all the messages in an email thread as one PDF file in your Google Drive. If it comes with attachments, it will create a folder and store the messages and attachments within.
1. Create a new Google script with the following code:
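// DocsList was the Drive service available when this was written; current Apps Script uses DriveApp instead
function save_Gmail_as_PDF(){
  var label = GmailApp.getUserLabelByName("Save As PDF");
  if(label == null){
    GmailApp.createLabel('Save As PDF');
  } else {
    var threads = label.getThreads();
    for (var i = 0; i < threads.length; i++) {
      var messages = threads[i].getMessages();
      var message = messages[0];
      var body    = message.getBody();
      var subject = message.getSubject();
      var attachments  = message.getAttachments();
      // Append the remaining messages and collect their attachments
      for(var j = 1;j<messages.length;j++){
        body += messages[j].getBody();
        var temp_attach = messages[j].getAttachments();
        for(var k =0;k<temp_attach.length;k++){
          attachments.push(temp_attach[k]);
        }
      }
      // Create an HTML File from the Message Body
      var bodydochtml = DocsList.createFile(subject+'.html', body, "text/html");
      var bodyId=bodydochtml.getId();
      // Convert the HTML to PDF
      var bodydocpdf = bodydochtml.getAs('application/pdf');
      if(attachments.length > 0){
        // Store the PDF and the attachments together in a folder named after the subject
        DocsList.createFolder(subject);
        var folder = DocsList.getFolder(subject);
        for (var j = 0; j < attachments.length; j++) {
          folder.createFile(attachments[j]);
        }
        folder.createFile(bodydocpdf);
      } else {
        DocsList.createFile(bodydocpdf);
      }
      // Trash the temporary HTML file and untag the thread so it is not saved twice
      DocsList.getFileById(bodyId).setTrashed(true);
      label.removeFromThread(threads[i]);
    }
  }
}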
2. Save it and set a trigger for it to run at a regular interval. Whenever you want to save an email and its attachments to Google Drive, simply tag it with the “Save As PDF” label.
With Google Script, there are tons of things that you can do with your Gmail, Google Docs, Calendar and various Google Apps. If you have any other Google script that you use to make your life better, feel free to share it with us in the comments.

Microsoft Retiring SHA-1 in 2016

I think this is a good move on Microsoft's part:
Microsoft is recommending that customers and CA's stop using SHA-1 for cryptographic applications, including use in SSL/TLS and code signing. Microsoft Security Advisory 2880823 has been released along with the policy announcement that Microsoft will stop recognizing the validity of SHA-1 based certificates after 2016.
More news.
SHA-1 isn't broken yet in a practical sense, but the algorithm is barely hanging on and attacks will only get worse. Migrating away from SHA-1 is the smart thing to do.

When Will We See Collisions for SHA-1?

On a NIST-sponsored hash function mailing list, Jesse Walker (from Intel) did some back-of-the-envelope calculations to estimate how long it will be before we see a practical collision attack against SHA-1. I'm reprinting his analysis here, so it reaches a broader audience.
According to E-BASH, the cost of one block of a SHA-1 operation on already deployed commodity microprocessors is about 2^14 cycles. If Stevens' attack of 2^60 SHA-1 operations serves as the baseline, then finding a collision costs about 2^14 * 2^60 ~ 2^74 cycles. A core today provides about 2^31 cycles/sec; the state of the art is 8 = 2^3 cores per processor for a total of 2^3 * 2^31 = 2^34 cycles/sec. A server typically has 4 processors, increasing the total to 2^2 * 2^34 = 2^36 cycles/sec. Since there are about 2^25 sec/year, this means one server delivers about 2^25 * 2^36 = 2^61 cycles per year, which we can call a "server year."
There is ample evidence that Moore's law will continue through the mid 2020s. Hence the number of doublings in processor power we can expect between now and 2021 is:
3/1.5 = 2 times by 2015 (3 = 2015 - 2012)
6/1.5 = 4 times by 2018 (6 = 2018 - 2012)
9/1.5 = 6 times by 2021 (9 = 2021 - 2012)
So a commodity server year should be about:
2^61 cycles/year in 2012
2^2 * 2^61 = 2^63 cycles/year by 2015
2^4 * 2^61 = 2^65 cycles/year by 2018
2^6 * 2^61 = 2^67 cycles/year by 2021
Therefore, on commodity hardware, Stevens' attack should cost approximately:
2^74 / 2^61 = 2^13 server years in 2012
2^74 / 2^63 = 2^11 server years by 2015
2^74 / 2^65 = 2^9 server years by 2018
2^74 / 2^67 = 2^7 server years by 2021
Today Amazon rents compute time on commodity servers for about $0.04 / hour ~ $350 /year. Assume compute rental fees remain fixed while server capacity keeps pace with Moore's law. Then, since log2(350) ~ 8.4 the cost of the attack will be approximately:
2^13 * 2^8.4 = 2^21.4 ~ $2.77M in 2012
2^11 * 2^8.4 = 2^19.4 ~ $700K by 2015
2^9 * 2^8.4 = 2^17.4 ~ $173K by 2018
2^7 * 2^8.4 = 2^15.4 ~ $43K by 2021
A collision attack is therefore well within the range of what an organized crime syndicate can practically budget by 2018, and a university research project by 2021.
Since this argument only takes into account commodity hardware and not instruction set improvements (e.g., ARM 8 specifies a SHA-1 instruction), other commodity computing devices with even greater processing power (e.g., GPUs), and custom hardware, the need to transition from SHA-1 for collision resistance functions is probably more urgent than this back-of-the-envelope analysis suggests.
Any increase in the number of cores per CPU, or the number of CPUs per server, also affects these calculations. Also, any improvements in cryptanalysis will further reduce the complexity of this attack.
The point is that we in the community need to start the migration away from SHA-1 and to SHA-2/SHA-3 now.
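Walker's back-of-the-envelope estimate can be reproduced with a short script. This is only a sketch of the arithmetic quoted above: the constant and function names are invented here, and every input (cycle counts, the 1.5-year doubling period, the $350 per server year) is taken from the analysis rather than measured.

```javascript
// All hardware quantities are base-2 exponents taken from the analysis above.
const LOG2_ATTACK_CYCLES = 14 + 60;            // cycles per SHA-1 block * SHA-1 operations
const LOG2_SERVER_YEAR_2012 = 31 + 3 + 2 + 25; // cycles/sec/core * cores * processors * sec/year
const DOLLARS_PER_SERVER_YEAR = 350;
const DOUBLING_PERIOD_YEARS = 1.5;             // Moore's law assumption

// Estimated dollar cost of one collision on rented commodity servers in a given year.
function attackCostDollars(year) {
  const doublings = (year - 2012) / DOUBLING_PERIOD_YEARS;
  const log2ServerYears = LOG2_ATTACK_CYCLES - (LOG2_SERVER_YEAR_2012 + doublings);
  return Math.pow(2, log2ServerYears) * DOLLARS_PER_SERVER_YEAR;
}

for (const year of [2012, 2015, 2018, 2021]) {
  console.log(year, Math.round(attackCostDollars(year)));
}
```

Because the script multiplies by $350 directly rather than by the rounded 2^8.4, its figures come out slightly higher than the quoted ones, for example about $2.87M for 2012 instead of $2.77M.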

Wednesday, November 13, 2013

Another QUANTUMINSERT Attack Example

Der Spiegel is reporting that the GCHQ used QUANTUMINSERT to direct users to fake LinkedIn and Slashdot pages run by -- this code name is not in the article -- FOXACID servers. There's not a lot technically new in the article, but we do get some information about popularity and jargon.
According to other secret documents, Quantum is an extremely sophisticated exploitation tool developed by the NSA and comes in various versions. The Quantum Insert method used with Belgacom is especially popular among British and US spies. It was also used by GCHQ to infiltrate the computer network of OPEC's Vienna headquarters. The injection attempts are known internally as "shots," and they have apparently been relatively successful, especially the LinkedIn version. "For LinkedIn the success rate per shot is looking to be greater than 50 percent," states a 2012 document.
Slashdot has reacted to the story.

Cryptographic Blunders Revealed by Adobe's Password Leak

Adobe lost 150 million customer passwords. Even worse, they had a pretty dumb cryptographic hash system protecting those passwords.

One month ago today, we wrote about Adobe's giant data breach.
As far as anyone knew, including Adobe, it affected about 3,000,000 customer records, which made it sound pretty bad right from the start.
But worse was to come, as recent updates to the story bumped the number of affected customers to a whopping 38,000,000.
We took Adobe to task for a lack of clarity in its breach notification.

Our complaint

One of our complaints was that Adobe said that it had lost encrypted passwords, when we thought the company ought to have said that it had lost hashed and salted passwords.
As we explained at the time:
[T]he passwords probably weren't encrypted, which would imply that Adobe could decrypt them and thus learn what password you had chosen.
Today's norms for password storage use a one-way mathematical function called a hash that [...] uniquely depends on the password. [...] This means that you never actually store the password at all, encrypted or not.
[...And] you also usually add some salt: a random string that you store with the user's ID and mix into the password when you compute the hash. Even if two users choose the same password, their salts will be different, so they'll end up with different hashes, which makes things much harder for an attacker.
It seems we got it all wrong, in more than one way.
Here's how, and why.

The breach data

A huge dump of the offending customer database was recently published online, weighing in at 4GB compressed, or just a shade under 10GB uncompressed, listing not just 38,000,000 breached records, but 150,000,000 of them.
As breaches go, you may very well see this one in the book of Guinness World Records next year, which would make it astonishing enough on its own.
But there's more.
We used a sample of 1,000,000 items from the published dump to help you understand just how much more.
→ Our sample wasn't selected strictly randomly. We took every tenth record from the first 300MB of the compressed dump until we reached 1,000,000 records. We think this provided a representative sample without requiring us to fetch all 150 million records.
The dump looks like this:
By inspection, the fields are as follows:
Fewer than one in 10,000 of the entries have a username - those that do are almost exclusively limited to accounts at and (a web analytics company).
The user IDs, the email addresses and the usernames were unnecessary for our purpose, so we ignored them, simplifying the data as shown below.
We kept the password hints, because they were very handy indeed, and converted the password data from base64 encoding to straight hexadecimal, making the length of each entry more obvious, like this:
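The base64-to-hexadecimal step can be sketched in a couple of lines of JavaScript. The function name is made up for this example, and the sample value is simply the base64 form of the block 110edf2294fb8bf4 that the article quotes later, not a line copied from the dump.

```javascript
// Convert one base64-encoded password field to lowercase hexadecimal.
function base64ToHex(b64) {
  return Buffer.from(b64, 'base64').toString('hex');
}

const hex = base64ToHex('EQ7fIpT7i/Q=');
console.log(hex, hex.length / 2); // hex string and its length in bytes
```

In hex, every two characters are one byte, so the multiple-of-eight pattern in the lengths is easy to spot.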

Encryption versus hashing

The first question is, "Was Adobe telling the truth, after all, calling the passwords encrypted and not hashed?"
Remember that hashes produce a fixed amount of output, regardless of how long the input is, so a table of the password data lengths strongly suggests that they aren't hashed:
The password data certainly looks pseudorandom, as though it has been scrambled in some way, and since Adobe officially said it was encrypted, not hashed, we shall now take that claim at face value.
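The fixed-length property of hashes mentioned above is easy to check. The sketch below uses SHA-256 from Node.js's crypto module purely as an illustrative hash; nothing in the article says which hash Adobe might otherwise have used, and the sample passwords are arbitrary.

```javascript
const crypto = require('crypto');

// Hex SHA-256 digest of a string; SHA-256 is an illustrative choice of hash.
function sha256Hex(text) {
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

const lengths = ['123456', 'password', 'a much longer passphrase than the others']
  .map(p => sha256Hex(p).length / 2);

console.log(lengths); // [ 32, 32, 32 ]
```

A column of hashed passwords would therefore show a single length, not the spread of lengths seen in the dump.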

The encryption algorithm

The next question is, "What encryption algorithm?"
We can rule out a stream cipher such as RC4 or Salsa-20, where encrypted strings are the same length as the plaintext.
Stream ciphers are commonly used in network protocols so you can encrypt one byte at a time, without having to keep padding your input length to a multiple of a fixed number of bytes.
With all data lengths a multiple of eight, we're almost certainly looking at a block cipher that works eight bytes (64 bits) at a time.
That, in turn, suggests that we're looking at DES, or its more resilient modern derivative, Triple DES, usually abbreviated to 3DES.
→ Other 64-bit block ciphers, such as IDEA, were once common, and the ineptitude we are about to reveal certainly doesn't rule out a home-made cipher of Adobe's own devising. But DES or 3DES are the most likely suspects.
The use of a symmetric cipher here, assuming we're right, is an astonishing blunder, not least because it is both unnecessary and dangerous.
Anyone who computes, guesses or acquires the decryption key immediately gets access to all the passwords in the database.
On the other hand, a cryptographic hash would protect each password individually, with no "one size fits all" master key that could unscramble every password in one go - which is why UNIX systems have been storing passwords that way for about 40 years already.
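For contrast, here is a minimal sketch of salted password hashing with Node.js's crypto module. The choice of scrypt, the 16-byte salt, the 32-byte output and the function names are all assumptions for illustration, not a description of what any particular system does.

```javascript
const crypto = require('crypto');

// Derive a salted hash; scrypt and the 16/32-byte sizes are illustrative choices.
function hashPassword(password, salt = crypto.randomBytes(16)) {
  const hash = crypto.scryptSync(password, salt, 32);
  return { salt: salt.toString('hex'), hash: hash.toString('hex') };
}

// Re-derive the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const candidate = crypto.scryptSync(password, Buffer.from(stored.salt, 'hex'), 32);
  return crypto.timingSafeEqual(candidate, Buffer.from(stored.hash, 'hex'));
}

const alice = hashPassword('123456');
const bob = hashPassword('123456');
console.log(alice.hash === bob.hash);           // false: different salts
console.log(verifyPassword('123456', alice));   // true
console.log(verifyPassword('12345678', alice)); // false
```

There is no key to steal here: an attacker with the database still has to guess each password separately, salt by salt.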

The encryption mode

Now we need to ask ourselves, "What cipher mode was used?"
There are two modes we're interested in: the fundamental 'raw block cipher mode' known as Electronic Code Book (ECB), where patterns in the plaintext are revealed in the ciphertext; and all the others, which mask input patterns even when the same input data is encrypted by the same key.
The reason that ECB is never used other than as the basis for the more complex encryption modes is that the same input block encrypted with the same key always gives the same output.
Even repetitions that aren't aligned with the blocksize retain astonishingly recognisable patterns, as the following images show.
We took an RGB image of the Sophos logo, where each pixel (most of which are some sort of white or some sort of blue) takes three bytes, divided it into 8-byte blocks, and encrypted each one using DES in ECB mode.
Treating the resulting output file as another RGB image delivers almost no disguise:
Cipher modes that disguise plaintext patterns require more than just a key to get them started - they need a unique initialisation vector, or nonce (number used once), for each encrypted item.
The nonce is combined with the key and the plaintext in some way, so that the same input leads to a different output every time.
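The difference between ECB and a mode that takes an initialisation vector can be seen in a short experiment. As a stated substitution, this sketch uses AES-128 (16-byte blocks) rather than DES or 3DES, because AES is reliably available in current Node.js builds; the behaviour being shown is a property of the mode, not of the cipher.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(16);
const block = Buffer.from('0123456789abcdef', 'utf8'); // exactly one 16-byte AES block
const plaintext = Buffer.concat([block, block]);       // the same block twice

function encrypt(algorithm, iv) {
  const cipher = crypto.createCipheriv(algorithm, key, iv);
  cipher.setAutoPadding(false); // input is already a whole number of blocks
  return Buffer.concat([cipher.update(plaintext), cipher.final()]);
}

// ECB: no IV, so identical plaintext blocks give identical ciphertext blocks.
const ecb = encrypt('aes-128-ecb', null);
const ecbRepeats = ecb.subarray(0, 16).equals(ecb.subarray(16, 32));

// CBC with a fresh random IV: the repetition is masked.
const cbc = encrypt('aes-128-cbc', crypto.randomBytes(16));
const cbcRepeats = cbc.subarray(0, 16).equals(cbc.subarray(16, 32));

console.log(ecbRepeats, cbcRepeats); // true false
```

The same effect is what makes a repeated password stand out in an ECB-encrypted database.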
If the shortest password data length above had been, say, 16 bytes, a good guess would have been that each password data item contained an 8-byte nonce and then at least one block's worth - another eight bytes - of encrypted data.
Since the shortest password data blob is exactly one block length, leaving no room for a nonce, that clearly isn't how it works.
Perhaps the encryption used the User ID of each entry, which we can assume is unique, as a counter-type nonce?
But we can quickly tell that Adobe didn't do that by looking for plaintext patterns that are repeated in the encrypted blobs.
Because there are 2^64 - close to 20 million million million - possible 64-bit values for each ciphertext block, we should expect no repeated blocks anywhere in the 1,000,000 records of our sample set.
That's not what we find, as the following repetition counts reveal:
Remember that if ECB mode were not used, each block would be expected to appear just once every 2^64 times, for a minuscule prevalence of about 5 x 10^-18%.
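Counting repeated blocks is straightforward once the data is in hexadecimal. The sketch below is an assumption about how such a tally might be done; the function name and the mini-sample are hypothetical, and only the two block values quoted in the article are real.

```javascript
// Count how often each leading 8-byte block (16 hex digits) occurs.
function countFirstBlocks(hexRecords) {
  const counts = new Map();
  for (const hex of hexRecords) {
    const block = hex.slice(0, 16);
    counts.set(block, (counts.get(block) || 0) + 1);
  }
  // Most frequent first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical mini-sample; only the two block values quoted in the article are real.
const sample = [
  '110edf2294fb8bf4',
  '110edf2294fb8bf4',
  '0123456789abcdefe2a311ba09ab4707',
  '110edf2294fb8bf4',
  'fedcba9876543210',
];

console.log(countFirstBlocks(sample)[0]); // [ '110edf2294fb8bf4', 3 ]

// Chance that a given 64-bit block matches another one by accident, as a percentage.
console.log(100 / 2 ** 64); // roughly 5.4e-18
```

Any block that tops this tally across a million records cannot be a coincidence, which is what gives ECB away.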

Password recovery

Now let's work out, "What is the password that encrypts as 110edf2294fb8bf4 and the other common repeats?"
If the past, all other things being equal, is the best indicator of the present, we might as well start with some statistics from a previous breach.
When Gawker Media got hacked three years ago, for example, the top passwords that were extracted from the stolen hashes came out like this:
(The word lifehack is a special case here - Lifehacker being one of Gawker's brands - but the others are easily-typed and commonly chosen, if very poor, passwords.)
This previous data combined with the password hints leaked by Adobe makes building a crib sheet pretty easy:
Note that the 8-character passwords 12345678 and password are actually encrypted into 16 bytes, denoting that the plaintext was at least 9 bytes long.
A highly likely explanation for this is that the input text consisted of: the password, followed by a zero byte (ASCII NUL), used to denote the end of a string in C, followed by seven NUL bytes to pad the input out to a multiple of 8 bytes to match the encryption's block size.
In other words, we are on safe ground if we infer that e2a311ba09ab4707 is the ciphertext that signals an input block of eight zero bytes.
That data shows up in the second ciphertext block in a whopping 27% of all passwords, which, if our assumption is correct, immediately leaks to us that all those 27% are exactly eight characters long.
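The length leak can be written down as a small function. This sketch assumes the plaintext layout inferred above (password, a NUL terminator, then NUL padding to a multiple of eight bytes) and ECB mode; `paddedLength` and `inferLength` are hypothetical helper names, and the first block in the usage example is a made-up value.

```javascript
const BLOCK_BYTES = 8;
// Ciphertext the article infers for a block of eight zero bytes.
const ZERO_BLOCK_HEX = 'e2a311ba09ab4707';

// Plaintext layout assumed in the article: password, NUL terminator, NUL padding.
function paddedLength(password) {
  const withTerminator = Buffer.byteLength(password, 'utf8') + 1;
  return Math.ceil(withTerminator / BLOCK_BYTES) * BLOCK_BYTES;
}

// Range of password lengths (in bytes) consistent with a ciphertext.
function inferLength(cipherHex) {
  const blocks = cipherHex.length / (2 * BLOCK_BYTES);
  const full = (blocks - 1) * BLOCK_BYTES; // bytes held by all blocks but the last
  const lastIsZeroBlock = cipherHex.slice(-2 * BLOCK_BYTES) === ZERO_BLOCK_HEX;
  // An all-zero final block means the earlier blocks were completely full.
  return lastIsZeroBlock
    ? { min: full, max: full }
    : { min: full + 1, max: full + BLOCK_BYTES - 1 };
}

console.log(paddedLength('password'));                        // 16
console.log(inferLength('0123456789abcdef' + ZERO_BLOCK_HEX)); // { min: 8, max: 8 }
console.log(inferLength('110edf2294fb8bf4'));                  // { min: 1, max: 7 }
```

Even without the zero-block trick, the number of blocks alone narrows every password to a window of at most eight possible lengths.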

The scale of the blunder

With very little effort, we have already recovered an awful lot of information about the breached passwords, including: identifying the top five passwords precisely, plus the 2.75% of users who chose them; and determining the exact password length of nearly one third of the database.
So, now we've shown you how to get started in a case like this, you can probably imagine how much more is waiting to be squeezed out of "the greatest crossword puzzle in the history of the world," as satirical IT cartoon site XKCD dubbed it.
Bear in mind that salted hashes - the recommended programmatic approach here - wouldn't have yielded up any such information - and you appreciate the magnitude of Adobe's blunder.
There's more to concern yourself with.
Adobe also described the customer credit card data and other PII (Personally Identifiable Information) that was stolen in the same attack as "encrypted."
And, as fellow Naked Security writer Mark Stockley asked, "Was that data encrypted with similar care and expertise, do you think?"
If you were on Adobe's breach list (and the silver lining is that all passwords have now been reset, forcing you to pick a new one), why not get in touch and ask for clarification?