This post is the 5th in a series that started with the 10 practices that every developer needs to start right now
When writing software, we often don’t think about the security implications of our actions. Probably because we write software to do something, we’re not always aware of what it shouldn’t do. Their are a lot of guidelines for writing secure code, and designing secure systems. Rather than going in to all of the areas, let me just hit on some of the especially important topics that I’ve come across…
In addition to this post, I’ve included a slide deck that I use when I give talks about writing secure code. A lot of the original slides I got from a talk that Ron Jacobs did at TechEd. I hope you enjoy both!
Buffer Overflows and Overruns
OK… so I’m mostly going to deal with issues that affect .NET developers. .NET prevents Buffer overflows by not giving your code direct access to memory addresses and instead by managing memory access for you and by making sure that everything is type safe.
Here’s my non-technical version of what a Buffer Overflow is. First, a Buffer overflow is something that affects unmanaged code (or unsafe C#). Let’s say that a memory address is designed to hold 9 bits of user input, and instead the user forces 10 bits a information in to it. Normally, the last bit of memory is a return address and tells the code where to go next. In a Buffer overflow attack, a different return address is forced in to that last slot so that the attack can control the flow of the code.
For example, the code might say something like, “If the user is not authorized return to login” and instead the attack forces a return code so that it ends up doing something like this “If the user is not authorized go to the bank account withdrawal screen”. By simply changing the flow of an application, and attacker can do really bad things.
- Use .NET (and get out of that unmanaged C++ code )
- Use safe libraries. Many of the C++ common libraries have been re-written to help prevent Buffer Overflow exposure. Make sure that you are using the updated libraries.
- Check out the “Banned.h” header file from Microsoft. It’s is a sanitizing resource which supports the SDL requirement to remove banned functions from a code. It lists all banned APIs Download.
- Use the /GS Compiler switch. This was introduced by Microsoft to automatically add safety checking to your code when it compiles.
XSS is an abbreviation for Cross Site Scripting attack. (I know, but CSS was already taken ) XSS attacks are something that has affected every major web site at one time or another.
Background: When you connect to a website, like amazon or ebay (or any other site that you log in to) it often uses a session cookie to know who you are, and what you are allowed to see (your account info for example). Cookies are not a problem in and of them selves, in fact, your browser makes sure that it only send cookies to the web site that it was issued from. See – your browser trusts the site that you are on.
How it works: Have you ever searched for a random product on a site, like foo, and received a response message that said something like “your search for foo was not found.”? Try searching for “<b>foo</b>”. What happened? If the message looks like this: “your search for foo was now found.”, then they are probably not sufficiently checking the user input. Now image searching for this:
Fix: All user input is considered evil until proven otherwise. The problem is that we haven’t traditionally considered search forms and product reviews as user input in to our systems, but they are. You can scrub user input easily enough by doing something like string SafeToDisplay = Server.HTMLEncode(userInput); but really you should look at incorporating some of the libraries that are specifically designed to handle these scenarios. Check out Microsoft’s Anti-Cross Site Scripting Library, it’s very comprehensive and covers many more scenarios.
All user input should be considered evil until proven otherwise. This has never been more true than it is with SQL Injection vulnerabilities.
How it works: Imagine that you have an application with a log in: User name and password.
Pretty simple so for huh? Now think about the SQL that you would write to validate a user… It *might* start off looking something like this:
So far so good… as long as everyone enters a user name and password in to the correct textbox on the screen this should validate them perfectly…
the problem here is that the text from the form is now being executed as part of the SQL statement itself. SQL injection just allowed this person to operate this application with the username BillG… I’m sure that wouldn’t be a problem!
Not just Log In Screens. Any place that user input is translated in to a query to the database is open to attack. Search fields are a notoriously overlooked place for SQL Injections, and not just for logging, at this point the attack can do anything that the application can do. Even worse, many application run as SA (Sql Administrator) just to make “life easy” on the developer. That opens up a whole new problem. Imagine a random user being able to log in to your system, add themselves as an administrator, shut down your server, rewrite your website, reformat your hard rive all from a search box. See the problem?
Fix: the fix is easy, don’t let user input run as SQL. You can prevent SQL Injection by moving away from concatenated string for building sql queries. if you need the flexibility of ad-hoc sql, then write your adhoc sql using parameterized SQL. Otherwise you can move to stored procedures or an ORM like Linq to SQL, Microsoft’s Entity Framework, or nHibernate that will automatically use parameterized sql for you.
White List vs Black List Principle
One thing that I want to call out at this point. It is very tempting to try and “sanitized” every user input instead of moving to one of the more robust solutions mentioned above. I had a friend (a very good developer) that was sanitizing all user input for “bad” words before he would process it. In his words, “we don’t have any products called drop, delete, execute… so I should be able to do a string.Replace on those words and then be fine.
Here’s the problem with that. In Security there is a concept of White List vs Black Lists. A white list approach says, here is what I will allow, and throw away anything else. A Black list take the approach that says, “here is what I won’t allow, I’ll let in anything else that’s not on this list.” The problem with the black list approach is that security is a moving target, there are vulnerabilities today that we didn’t know about yesterday, there will be more tomorrow that I don’t know about today. Just because something isn’t on my “bad” list today, doesn’t mean that it shouldn’t be.
I went to my friends bad word scrubber and entered this : ‘deldeleteete” do you see the word delete in there? What will happen after your scrubber removes it… you’ll be left with “delete”. He started using parameterized SQL.
I once interviewed a really smart computer science guy that wanted to come work for our consulting company (primarily focus on business applications). Saying this guy was smart is an understatement. He was a Computer Science PhD candidate with cross disciplines in artificial intelligence and game theory. wow! The problem was that he had very little knowledge or experience writing actual applications. When I asked him about if we should write out own encryption for some application that we were working on, he got all excited and started to go in the details about what it would take to implement out own encryption. I’m pretty sure he had taken a class on this, wrote some thesis on it or something because he was really excited that I had asked him about this topic. Here the thing, never write your own encryption.
Getting encryption done right is hard.. like really hard. In fact, if you are good at it, maybe you should go work for the government, or a university, or RSA directly, but you have no business trying to do that for a business application. Use the tried and true, multiple encryption, publically available libraries to do it right.
3-Types of encryption.
Private Private Key, also known as a symmetrical encryption uses the same key to encrypt and unencrypt. Symmetrical encryption is very fast, so it’s great for encryption transmissions and it used for things like secure communication and SSL. the problem is that it’s less secure because you have to have a secure way to hand out the private key.
Public Private Key, also known as asymmetrical encryption uses two different keys. One key (the public key) is used for encryption, the private key is used for decryption. The benefit here, is that you can yell out for the world to here your public key, but only the person with the private key can do the decrypting. The problem is that it’s very slow and computationally expensive.
So how can two computers talk securely to each other in an open environment like the Internet? The answer is a combination of the above. SSL, or secure socket layer uses a public key to securely transmit a “session” key that will be used for symmetrical encryption for the rest of the communication.
1-Way Hashes cannot be encrypted. How is that helpful? It’s very helpful. A hash can me used to make sure that two values are equal without actually knowing what the values are. For example, my application should never store plain text passwords. If they did it might be possible for those passwords to become compromised. By storing a 1-way hash instead the password cannot be retrieved even if the database (or a backup of the database) is compromised. How do I log user in to the system then? Simple, I take the password they give me, I has it using the same method and compare the two results.
Digital Certificates – Digital Certificate use a combination of the above concepts to support secure communication and identification. We’re just not going to go in to all of that now.
Least Privilege Principle
Reduce your Attack Surface – if you don’t need a service, turn it off. If your application doesn’t need permission to do something, don’t supply it. By limiting the scope of what can be done, you also limit what can be broken if and when things go bad.
Default to Fail – Here’s an example.
What’s wrong with this code? (ok, a lot ) But what we want to focus on is that if there is an exception along the way, it will default to the user being authenticated. The [valid] should have been set to false, until proven otherwise.
Don’t reveal more than is helpful to the user. Be helpful, but you don’t have to thrown up every SQL exception on your users.. log that stuff, let the debug team look at it, but knowing what version of SQL server you’re running or what the stack address is completely useless to your users… but bad people love that stuff.
Don’t give away your system internals, at the same time make sure that your user errors are helpful!
OK.. as you can imagine there’s a lot more that we could cover, but instead take a look at some of these resources.
Threat Modeling Tool – Threat modeling is all about identifying assets, vulnerabilities, resolutions, evaluating a business value on assets, then balance the cost of a resolution with real business value.
SDL Process Template for Team System – SDL is the Security Development Lifecycle, it’s a set of practices and tools that help integrate secure development in to all aspects of SDLC (Software Development Life Cycle). This is Microsoft’s template to integrate SDL with Team Foundation Server (Microsoft’s Application Lifecycle Management Server).
SDL Process Tools – There’s a ton here, check it out.
MSDN Security – Read the blogs, latest news, and downloads regarding Microsoft security and development.
Happy Coding (securely)!
Images Credit: http://www.flickr.com/photos/carbonnyc/2294144289/