Spring Web MVC Form Tags and Cross-site Scripting ================================================= Sverre H. Huseby shh@thathost.com 2007-03-06 Test code is available at http://shh.thathost.com/secadv/spring-form-xss/spring-form-xss.tar.gz [ Update 2008-01-04: Both problems appear to be fixed. The problem described under form:input was fixed in verison 2.1 M4 on 2007-09-09 by making htmlEscape="true" the default. Hurray! The one described under form:errors was fixed in version 2.0.3 on 2007-03-09. ] I just started playing with the Spring Framework 2.0 Form tags that are supposed to simplify Web MVC. Maybe I misunderstand something, but I get the impression that the designers haven't given enough attention to the Cross-site Scripting (XSS) problem. Let me give you a couple of examples, both of them based on the following form inside a JSP document:
Your name
form:input ---------- First, let's look at the form:input tag. What happens if the name property of the userData object contains the following String value? "> The form:input tag will be expanded to the following (line break added for readability): "/> The result is that what should be an editable string in fact makes the browser run a small JavaScript. A web application using the above form code would risk being vulnerable to XSS. This particular problem can be solved by adding htmlEscape="true" to the form:input tag, or by setting the defaultHtmlEscape context-param to true in web.xml. But then my question is: When would anyone _ever_ want to allow the value of a command object field to interfere with the HTML of the input field? I guess the answer would be "never", or "almost never". The question that follows naturally then, is: Why isn't htmlEscape="true" the default for this tag? Why should the programmer _always_ have to remember to add it in order to avoid XSS? form:errors ----------- So for the next example: The SimpleFormController that handles the incoming form, uses a Validator that populates an Errors object with I18N-ified messages. Here's a snippet from the validator: if (name.length() > MAX_NAME_LEN) { errors.rejectValue("name", "err.tooLong", new Object[] { name, new Integer(MAX_NAME_LEN) }, "Too long"); } The messages.properties file contains the following property: err.tooLong=The name "{0}" must be at most {1} characters long Note how the invalid input is included as part of the error message. Now, let's say that MAX_NAME_LEN is 10, so that the following 29 character long input would lead to the error message being displayed: The message included in the generated HTML will look like this: The name "" must be at most 10 characters long Again, note how HTML that is dictated by the user (or an attacker) is included in the web page without being escaped. The funny thing is that even though the ErrorsTag class responsible for rendering the form:errors indirectly extends HtmlEscapingAwareTag, it doesn't appear to do any escaping of the messages even if htmlEscape="true" is given as an attribute, or if defaultHtmlEscape is set to true in web.xml. I suspect that this problem is a bug rather than "by design", since if form:errors doesn't do any escaping whatsoever, who is responsible for avoiding XSS? My Dream -------- I have a dream that one day the Spring Framework will add "secure by default" to its list of fundamental design principles. In particular, I have a dream that one day defaultHtmlEscape will be true by default, so that programmers will have to switch security off, rather than to switch it on. This makes sense (to me), since programmers will easily see when they need to disable it, but most of them won't realize when they need to enable it. It also makes sense (to me) because the model doesn't deal with HTML. The model contains person names, email addresses, customer numbers, textual descriptions, and stuff like that. The view is responsible for correctly presenting the model data on the selected target view platform. Often this view platform is a browser, meaning that HTML is used for rendering. But it need not be. When we have to keep telling the view that "this is _not_ HTML", it means (as I see it) that the designers of the view technology thinks that most of the model data contains HTML. Which makes no sense, particularly if we strive for a clean separation between the model and the view. Or something like that.