Code Review Guidelines (draft)
A few weeks back, I posted a request for source code review guidelines. I got about 50 me-toos, but no guidelines. So I wrote some I think are decent. They're still in draft format. I'd appreciate feedback & commentary on them. http://www.homeport.org/~adam/review.html Adam PS: Someone did pay me to do this, but doesn't want their name associated with it, because there are shoulds and musts in it. I have their permission to post the anonymized document. -- "It is seldom that liberty of any kind is lost all at once." -Hume
From: ichudov@algebra.com (Igor Chudov @ home) Date: Tue, 27 Aug 1996 11:20:56 -0500 (CDT)
Look at your sendmail.cf file for a humongous amount of email parsing rules.
Much better, look at rfc822. (I wouldn't consider *anything* that has the word "sendmail" in it a good reference). Hostnames will match the regexp [-A-Za-z0-9.]; those are the only legal characters in the hostname portion. Usernames ("domain-dependent local string") are much harder to what is and isn't legal. Read rfc822.
Hi,
Much better, look at rfc822. (I wouldn't consider *anything* that has the word "sendmail" in it a good reference).
its much better if you dont rely on the content of the string at all. Dont use sh -c or system and you will be save. Simply asume that all characters are valid in user suplied strings and treat them exactly that way... If they need to be exporeted then unfortunately they need to be 'untainted' and this should be done by positive not negative lists as mentioned in the guidelines. Greetings Bernd PS: I have collected the references on http://www.inka.de/sites/lina/freefire-l/ -- (OO) -- Bernd_Eckenfels@Wittumstrasse13.76646Bruchsal.de -- ( .. ) ecki@{lina.inka.de,linux.de} http://home.pages.de/~eckes/ o--o *plush* 2048/A2C51749 eckes@irc +4972573817 *plush* (O____O) If privacy is outlawed only Outlaws have privacy
Bernd Eckenfels wrote: | > Much better, look at rfc822. (I wouldn't consider *anything* that | > has the word "sendmail" in it a good reference). | | its much better if you dont rely on the content of the string at all. Dont | use sh -c or system and you will be save. Simply asume that all characters | are valid in user suplied strings and treat them exactly that way... If they | need to be exporeted then unfortunately they need to be 'untainted' and this | should be done by positive not negative lists as mentioned in the | guidelines. Not passing untainted data to system is clearly a very good idea. Less clear is how much other parsing should be done. I like extreme parsing (when its cheap; as Marcus Watts pointed out, verifying remote hostnames & usernames can be expensive). If you don't strongly verify data on the way in, it can get to other places not so careful about its contents. This is why I chose to recommend against accepting a wide variety of legit email address formats; because they will be passed back to a database that assumes that the addresses have been sanitized, and are in user@foo.net format, and not treat them with the care they deserve. In an ideal world, programmers would be careful with the data they get, but we don't live in an ideal world. I choose to suggest paranoia over inclusiveness, but am adding an appendix discussing issues of mail addressing. Adam -- "It is seldom that liberty of any kind is lost all at once." -Hume
Adam Shostack wrote:
A few weeks back, I posted a request for source code review guidelines. I got about 50 me-toos, but no guidelines. So I wrote some I think are decent. They're still in draft format. I'd appreciate feedback & commentary on them.
Thanks for an interesting paper. In part " V.Code (Security Issues)/3.Data Checking" you say the following: `` Data coming in to Acme Widgets should be checked very carefully for appropriateness. This check should be to see if the data is what is expected (length, characters). Making a list of bad characters is not the way to go; the lists are rarely complete. A secure program should know what it expects, and reject other input. (For example, if you are looking for an email address, don't check to see if it contains a semi-colon or a newline, check to see if it contains anything other than a [A-Za-z0-9._] followed by an @, followed by a hostname [A-Za-z0-9._].)'' END QUOTE That is not entirely correct. An email address is much more than that, it can contain "!", several "@" characters (not next to each other though), "%", and so on. x400 mail addresses (?) can contain "/", "=", and all emails can have "+" and "-" and "_" in them. Some of the valid email addresses are user_name@company.com alex+@pitt.edu mi%aldan.UUCP@algebra.com user%host.domain@anon.penet.fi host1!host2!user Look at your sendmail.cf file for a humongous amount of email parsing rules. Thanks for an excellent document though, I put a link to it from my intranet page. - Igor "Code Obscurity Creates Job Security" Chudov.
On Tue, 27 Aug 1996, Igor Chudov @ home wrote:
Adam Shostack wrote:
A few weeks back, I posted a request for source code review guidelines. I got about 50 me-toos, but no guidelines. So I wrote some I think are decent. They're still in draft format. I'd appreciate feedback & commentary on them.
Sorry. I missed your first post. The Security Engineering CMM effort has also been looking at methods that are used to create assurances in trusted systems/components/products. One of these is, of course, code examination and quality reviews. You may want to check out what they've done. There are not necessarily "steps" to be followed, but rather how the PA (process area) relates to the ability of an organization to perform security engineering (i.e., it's maturity). I haven't been in the PA's for awhile, but there *may* be something there that you can use. GRCI sits on both the authoring group and the steering committee for the SSE CMM. If you need more info, let me know and I'll hook you up with someone. The group is always looking for someone to test the implementation of the security engineering CMM products through pilot testing. Point your browser at http://www.ssecmm.ashton.csc.com/ and then rummage. There's stuff buried all over the server, but you probably will be most interested in the peer review, security vulnerability analysis, and quality management portions. As I recall (I can't get to the site right now), a lot of stuff is in RTF and not HTML, so you may have to DL it instead of look at it online. ------------------------------------------------------------------------- |And if Dole wins and dies in office, they| Mark Aldrich | |could just pickle him and no one would | GRCI INFOSEC Engineering | |notice. It wouldn't be the first time we| maldrich@grci.com | |had a dill-dole running the country. | MAldrich@dockmaster.ncsc.mil| | -- Alan Olsen | | |_______________________________________________________________________| |The author is PGP Empowered. Public key at: finger maldrich@grci.com | | The opinions expressed herein are strictly those of the author | | and my employer gets no credit for them whatsoever. | -------------------------------------------------------------------------
Igor, and many others who commented on the fact that many characters are legal in email are correct. However, with the exception of '-' and '+', I'm not sure if I'll be changing the body of the guidelines. My issue is that dealing with a wide variety of characters that are legitamate, such as "cat ../../../etc/passwd"@foo.com is more dangerous than only accepting the common case of user@host.net. The number of addresses such as harvard!adam is dropping as the number of 'normal' addresses grows. Igor Chudov @ home wrote: | Adam Shostack wrote: | > http://www.homeport.org/~adam/review.html | In part " V.Code (Security Issues)/3.Data Checking" you say the following: | | `` Data coming in to Acme Widgets should be checked very carefully for | appropriateness. This check should be to see if the data is what | is expected (length, characters). Making a list of bad | characters is not the way to go; the lists are rarely complete. | A secure program should know what it expects, and reject other | input. (For example, if you are looking for an email address, | don't check to see if it contains a semi-colon or a newline, | check to see if it contains anything other than a [A-Za-z0-9._] | followed by an @, followed by a hostname [A-Za-z0-9._].)'' | END QUOTE | | That is not entirely correct. An email address is much more than | that, it can contain "!", several "@" characters (not next to each other | though), "%", and so on. x400 mail addresses (?) can contain "/", "=", | and all emails can have "+" and "-" and "_" in them. | | Some of the valid email addresses are | | user_name@company.com | alex+@pitt.edu | mi%aldan.UUCP@algebra.com | user%host.domain@anon.penet.fi | host1!host2!user | | Look at your sendmail.cf file for a humongous amount of | email parsing rules. | | Thanks for an excellent document though, I put a link to it from my | intranet page. You're welcome. | - Igor "Code Obscurity Creates Job Security" Chudov. | Adam -- "It is seldom that liberty of any kind is lost all at once." -Hume
Adam, The decision that have just made is not a technical decision, it is a business decision. You just decided that the needs of security outweight the need to be able to deal with 100% of potential customers. For example, suppose that you wrote your report for Gizmo International, a company that sells a variety of widgets and gadgets to users in the world. Their current setup is that the users can visit www.gizmo.com and ask the server to send them notifications about new products. Based on your report's suggestions, Gizmo will have to cut off all users with x.400 mail addresses, all UUCP users with bangs in their addresses, all people with funky addresses provided by SPRINT, and so on. For example, my moderation bot received a message from the following person: From: /G=JAMBYL/S=KIWANIS/O=CUSTOMER/ADMD=KAZMAIL/C=KZ/@gateway.sprint.com (my eyes just popped when I saw such address) There are a lot of international people using this sprint gateway. This would potentially represent a loss of s significant number of customers who will be bitching about gizmo.com to all their friends. This is a bad decision from the marketing standpoint. I see this as a compelling reason to allow all possible email addresses to be processed correctly, even if it means that there is more work for code proofreading. At least the management responsible for marketing must understand and approve your email handling guidelines. A computer programmer cannot make such decisions himself. igor Adam Shostack wrote:
Igor, and many others who commented on the fact that many characters are legal in email are correct. However, with the exception of '-' and '+', I'm not sure if I'll be changing the body of the guidelines. My issue is that dealing with a wide variety of characters that are legitamate, such as "cat ../../../etc/passwd"@foo.com is more dangerous than only accepting the common case of user@host.net.
The number of addresses such as harvard!adam is dropping as the number of 'normal' addresses grows.
Igor Chudov @ home wrote: | Adam Shostack wrote: | > http://www.homeport.org/~adam/review.html
| In part " V.Code (Security Issues)/3.Data Checking" you say the following: | | `` Data coming in to Acme Widgets should be checked very carefully for | appropriateness. This check should be to see if the data is what | is expected (length, characters). Making a list of bad | characters is not the way to go; the lists are rarely complete. | A secure program should know what it expects, and reject other | input. (For example, if you are looking for an email address, | don't check to see if it contains a semi-colon or a newline, | check to see if it contains anything other than a [A-Za-z0-9._] | followed by an @, followed by a hostname [A-Za-z0-9._].)'' | END QUOTE | | That is not entirely correct. An email address is much more than | that, it can contain "!", several "@" characters (not next to each other | though), "%", and so on. x400 mail addresses (?) can contain "/", "=", | and all emails can have "+" and "-" and "_" in them. | | Some of the valid email addresses are | | user_name@company.com | alex+@pitt.edu | mi%aldan.UUCP@algebra.com | user%host.domain@anon.penet.fi | host1!host2!user | | Look at your sendmail.cf file for a humongous amount of | email parsing rules. | | Thanks for an excellent document though, I put a link to it from my | intranet page.
You're welcome.
| - Igor "Code Obscurity Creates Job Security" Chudov. |
Adam
-- "It is seldom that liberty of any kind is lost all at once." -Hume
- Igor.
Igor Chudov @ home wrote: | The decision that have just made is not a technical decision, it is | a business decision. You just decided that the needs of security | outweight the need to be able to deal with 100% of potential customers. You're mostly right. (I happen to know that we're expecting all customers to have IP based connectivity for the suite of applications these guidelines are being written for, but you're right that this is a business decision). | For example, suppose that you wrote your report for Gizmo International, | a company that sells a variety of widgets and gadgets to users in the | world. Their current setup is that the users can visit www.gizmo.com | and ask the server to send them notifications about new products. | | Based on your report's suggestions, Gizmo will have to cut off | all users with x.400 mail addresses, all UUCP users with bangs in their | addresses, all people with funky addresses provided by SPRINT, | and so on. For example, my moderation bot received a message | from the following person: | | From: /G=JAMBYL/S=KIWANIS/O=CUSTOMER/ADMD=KAZMAIL/C=KZ/@gateway.sprint.com | | (my eyes just popped when I saw such address) | | There are a lot of international people using this sprint gateway. | | This would potentially represent a loss of s significant number of | customers who will be bitching about gizmo.com to all their friends. | This is a bad decision from the marketing standpoint. | | I see this as a compelling reason to allow all possible email addresses | to be processed correctly, even if it means that there is more work | for code proofreading. At least the management responsible for | marketing must understand and approve your email handling guidelines. A | computer programmer cannot make such decisions himself. You're again correct; the document is undergoing review internally. May I have permission to quote you? I'm a big advocate of open debate when things are in a draft stage. Also, there are issues of what happens if an unusual address gets past the firewall and mishandled by some legacy code. Adam | igor | | | Adam Shostack wrote: | > | > Igor, and many others who commented on the fact that many characters | > are legal in email are correct. However, with the exception of '-' | > and '+', I'm not sure if I'll be changing the body of the guidelines. | > My issue is that dealing with a wide variety of characters that are | > legitamate, such as "cat ../../../etc/passwd"@foo.com is more | > dangerous than only accepting the common case of user@host.net. | > | > The number of addresses such as harvard!adam is dropping as the number | > of 'normal' addresses grows. | > | > | > Igor Chudov @ home wrote: | > | Adam Shostack wrote: | > | > http://www.homeport.org/~adam/review.html | > | > | In part " V.Code (Security Issues)/3.Data Checking" you say the following: | > | | > | `` Data coming in to Acme Widgets should be checked very carefully for | > | appropriateness. This check should be to see if the data is what | > | is expected (length, characters). Making a list of bad | > | characters is not the way to go; the lists are rarely complete. | > | A secure program should know what it expects, and reject other | > | input. (For example, if you are looking for an email address, | > | don't check to see if it contains a semi-colon or a newline, | > | check to see if it contains anything other than a [A-Za-z0-9._] | > | followed by an @, followed by a hostname [A-Za-z0-9._].)'' | > | END QUOTE | > | | > | That is not entirely correct. An email address is much more than | > | that, it can contain "!", several "@" characters (not next to each other | > | though), "%", and so on. x400 mail addresses (?) can contain "/", "=", | > | and all emails can have "+" and "-" and "_" in them. | > | | > | Some of the valid email addresses are | > | | > | user_name@company.com | > | alex+@pitt.edu | > | mi%aldan.UUCP@algebra.com | > | user%host.domain@anon.penet.fi | > | host1!host2!user | > | | > | Look at your sendmail.cf file for a humongous amount of | > | email parsing rules. | > | | > | Thanks for an excellent document though, I put a link to it from my | > | intranet page. | > | > You're welcome. | > | > | - Igor "Code Obscurity Creates Job Security" Chudov. | > | | > | > Adam | > | > -- | > "It is seldom that liberty of any kind is lost all at once." | > -Hume | > | | | | - Igor. | -- "It is seldom that liberty of any kind is lost all at once." -Hume
participants (5)
-
Adam Shostack -
Daniel Hagerty -
ichudov@algebra.com -
lists@lina.inka.de -
Mark O. Aldrich