-
Notifications
You must be signed in to change notification settings - Fork 47.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How Much XSS Vulnerability Protection is React Responsible For? #3473
Comments
Hi, Albert from Yahoo Security team here. I was working with @mridgway on studying this issue. Just wanna share more thoughts. Re: disabling dangeriouslySetInnerHTML, it is a good move, but it would not be sufficient. React allow inject "script" and "style" tag, or "style" attributes. All of those would allow script execution. Also, in general, we worried about UI redressing attack that attackers can create arbitrary overlay on a page that could steal all the input submitted. |
I don't think "script" tags will execute because we use innerHTML to create them. Do the "style" tag and "style" attributes allow you to execute (in context) scripts in modern browsers? I think it is only older versions of IE. Regardless, the UI redressing attack is still an equally valid concern. |
I was also having the same doubt on script tag when I saw that it was supported in facebook.github.io/react/docs/tags-and-attributes.html, and then I figured out that script tag is used in server side react and it would be executable. And indeed most style attacks only works in IE8 or minor. But it still contributes to quite some amount of traffic. |
I am more inclined to see _isReactElement be used as a validation mechanism. The issue here is that we can't easily distinguish if an JSON object would have execution context or data context. While it is true that server side validation could help, it only works if every time we get an JSON we have a schema to validate it. The nature of JSON, as you suggested, is plain and free form. Otherwise it will be falling back as XML validation that would discourage developers to use it. In a nutshell, we really need a way to distinguish if a JSON object has executable context (i.e. _isReactElement). Thanks! |
None... I think React should stay as light and lean as it can be. Unix philosophy fan here. |
Note that JSON rendering of a virtual DOM tree is a perfectly valid use case. E.g. you can prerender a high level component into a smaller level virtual DOM tree on the server and then render the result on the client. This is also one of the ideas we had for web worker rendering where it would construct a virtual DOM that can be postMessaged and rendered on the other side. Therefore, we also have to consider what options we have if we chose not to protect ourselves against JSON data. How much can we mitigate that scenario? |
Possible solution, use different syntax for inserting strings as children:
Feels ugly but would solve the problem, wouldn't it? |
Coercion doesn't catch mistakes where you miss it. We would need to make that a requirement, e.g. by wrapping strings in some placeholder that is distinct from elements: |
That's why I suggested making it an error to pass a string to the non-coercing syntax (plain |
I see. That would be difficult to enforce since they can come nested arrays or non-JSX sources. |
|
@sebmarkbage As schemaless storages are becoming somewhat popular, this is going to become more prevalent. At some level it is a repeat of inserting foreign HTML markup as a result of lack of escaping (React made it a goal to make it a thing of the past), but instead the vulnerability moved to React components where wrongly typed values can lead to insertion of foreign hierarchies. I'd argue that even if we could somehow guarantee that insertion of foreign hierarchies wouldn't be dangerous, it would always be very bad. Malicious data will always wreak havoc on a system and React is just one part of it. The danger I see with React being targeted (apart from being easily exploited due to being debuggable client-side) is malicious data being rendered for other users of the same system, it's hard to guarantee that the malcious data won't be able to act freely under the assumed identity of those users. I think the way forward is to drop implicit wrapping of primitive values. Printable values should be elements like everything else from React's perspective, they should not be implicitly wrapped like they are today. A printable value should look something like While it may seem rather draconian in a sense, I think this is the sensible way forward, HTML introduced a lot of conviences that turned out to be really inconvenient (for user interfaces), this is another one. I think the practical implication of this is not as bad as it seems; any sensible user interface backend would expose labels as |
I think to provide safety it would actually have to be the reverse - it should be easier to render a primitive value than to render a component. This is the approach that other front-end frameworks have converged on over time. For instance
These are conflicting goals, so I think React would have to make a clear choice here, the options I can think of are:
For point 3, to support the JSON cases, as well as As proved by the bug which prompted the discussion, a boolean property is not enough to mark some data as executable, as I understand it, the only "safe" option would be an object reference, perhaps something like A bit of a brain dump, (1) seems simplest but unsafe. (2) is safest but a large change, (3) seems fiddly and error prone. Hopefully someone else has a better idea! |
Having discussed this with several people now, I'm leaning towards the "m 8000 ark as trusted" solution. It doesn't have to be a reference identity. It can be a unique enough token string or number. This token has to be shared between environments in a similar way as CSRF. By default we may share it with the realm through the global. It could also be completely disabled if you chose to.
|
By explicitly having to put all primitive values in a value element it becomes inherently safe, primitive values cannot render outside of it and non-primitive values are stringified inside of it (or w/e). Personally this makes a lot of sense to me, we already implicitly wrap primitive values internally, this is also how any non-markup/style-inheriting backend would work. There's no such thing as rendering inline text in iOS and most other user interface libraries, it only really makes sense for rendering rich-text content. This is also how you'd do it if it was implemented in say C/C++ where you would want a single type for all children. |
I see what you mean now, as long as the primitive and non-primitive syntaxes are separate then this is safe. Which is subtly different from HTML escaping of strings, where most strings work correctly when escaping is forgotten. |
Yeah, but then again, @sebmarkbage So it seems weird to me to add another "feature" to work around this when it isn't an inherent issue, it's introduced by implicit wrapping of primitive values (which we borrow from HTML, it still doesn't make sense for me for the purpose of user interfaces). I'm quite sure that performance implications of this should be minimal, but it obviously does significantly affect the code in one way or another. That is no doubt a very important consideration, but React has challenged many of the wrong-doings and inconvenient conveniences of HTML so far, I think this is another one that should be. |
This sounds like it should work - is the idea that by default it would be randomly generated on the client. but there'd be a top-level API to read/write it? |
I like the idea of trust, and React.createElement being the mechanism to declare trust in the current environment. const trust = element => typeof element === "object" && element
? React.createElement(element.type, element.props, ...[].concat(element.children))
: !element ? false
: String(element); // throw if not a string or trusted element
<div>{stringOrDie}</div>
// this span is trusted because React.createElement
<div><span>hey</span></div>
// explicitly React.createElement the POJO
<div>{trust(stringOrElement)}</div> The simplest way to do this is to have each element have a common property on the prototype. This can be exposed globally and/or allowed to be set externally for multiple React instances and integration with other libraries. Alternatively it could be an own non-enumerable property, or R.cE could set a toJSON that excludes this property. It'll cause confusion if you end up sending this to the server, and get an error about an incorrect 'token', rather than a missing one. Adding extra syntax for this would be unfortunate. A simple helper function exposed on React would be good, because this is an explicit opt-out to a (future) react security feature, like angular's $sce.trustAsHtml. The case where you have a serialized element is cool, but I don't think asking for a little more explicitness will harm anyone... users, or the person maintaining the code. "Oh, hey this is rendering something we get from the server, I need to be careful with this code." Also... please throw, don't warn or String(it). |
I might be missing something, but why does React need to complicate inserting stuff like conditional JSX elements by forcing users to wrap them, when it's not issue of React nor of 99% of use-cases when server is written properly. As far as we understood from original report, the "issue" was that server was accepting JSON object of any schema (wat?) and saving/sending it as-is (wat??), while most use-cases pick needed fields and, of course, validate them against type/range expectations. That's the golden rule of code security - never trust data from client as-is and perform server validation. If your server accepts any random data objects and sends them over to clients, you have much bigger problems than you think, whether you're using React or not. |
Also, @sebmarkbage are there any links with demo code to reproduce the issue? |
There is my attempt to reproduce this dangerous issue. As you can see, HTML code is added to body, but script doesn't work. Client-side rendering is protected. Ignore that difficult string concatenation, JSFiddle uses client-side version of JSXTransformer. |
@RReverser That applies to HTML as well and look at how that turned out :) React is obviously less susceptible as it relies on objects, but one missing/insufficient check somewhere (and a susceptible backend, say JSON schema) and an attacker can potentially act on behalf of all users visiting the site, that's the only reason I see why React should make an effort to prevent it (like it did with HTML markup). |
@alexeyraspopov |
@brigand okay :) And what can we tell about backend developer which allows server to receive JSON without fixed schema? UPD: this type of XSS works in every JS framework/lib which allows you to render HTML. |
Consider mustache, it provides two flavours of template iterpolation: <p>{{name}}</p> <!-- no XSS if name contains HTML -->
<p>{{{name}}}</p> <!-- XSS if name contains HTML --> The default is safe, and template authors can opt-in to unsafe behaviour, with the understanding that the variable used in the unsafe portion will be checked more stringently. React also has a feature, where the unsafe version is even more awkward - so it cannot be missed: <p>{name}</p> {// safe}
<p dangerouslySetInnerHTML={{__html: name}} /> {// unsafe} However, because React is now translating createElement calls into literals, the "safe" version is not safe at all. This creates a false sense of security. Good security is like an onion, it should have many layers. So yes, developers shouldn't allow arbitrary data - but there's a reason the top entries on the OWASP top 10 are related to untrusted data passing through the system - it happens in practice. Any reasonable steps popular libraries can take which makes it hard to makes these mistakes has a very high net gain. |
@sebmarkbage Perhaps I wasn't entirely clear (I think). It seems to me that "everything is rich-text" model of HTML is flawed; it's great for documents but it's immensely fragile for user interfaces. Documents can be rendered within a user interface, but you don't render a user interface within a document, so it doesn't make sense to me why we should be using the document model of HTML as the fundamental design language if it weren't for it being the current render target. I would argue that any mature React user interface should do its best to avoid the HTML style inheritance/document model. I.e.
...and not...
In a more simpler more traditional setup I would write that as:
...or if you want a more generic button implementation (whatever is in children is the label)...
There is no longer a universal understanding of how to render a string. Components are inherently isolated and reusable from a style perspective, primitives cannot be rendered so the this XSS issue goes out the window and even white-space rules become irrelevant. There's an inherent technical beauty in that if you will that's as unambigious as it gets (for whatever it's worth), it could be a very formal no nonsense syntax that would be hard to argue with... it could even be marketed as a new general purpose primitive type that's not limited to user interfaces (ahem, we're basically talking about an XML subset now...). That should be our starting point in my opinion. The only thing that's stands out is rich-text components which usually finds themselves useful in most user interfaces, so they should definitely not be forgotten. But it doesn't necessarily mean that rich-text should have a dedicated syntax or perhaps not the current syntax. React is currently targeting HTML which makes this a daunting aspect, but that's only because HTML provides all these conveniences, if we would have targeted anything else I'm quite sure React/JSX wouldn't have walked down this path without significant consideration. I'm not sure what the solution is, but HTML sets itself squarely apart from all other user interface targets and doesn't really prove itself exceptionally competent so I think it's only sane to question whether HTML is really an ideal role model for React/JSX user interfaces. |
React Native's |
@syranide Mid/Long term I think you're right that this is the kind of architecture that React and its users should be moving towards. However it is not an easy shift to do immediately. I don't see that the trusted source solution needs to block any important features that we would've gotten, by shifting to the newer architecture. So it seems like a safe short term solution that also doesn't block anything we were planning to do anyway. |
@sebmarkbage Agree and agree. |
I've been thinking about the priorities here. As I see it there are three threats: A) XSS through dangerouslySetInnerHTML, If we agree that React is the secondary layer to an already broken issue, and the real solution always is to fix the JSON pass-through hole. Then it follows that React is not responsible for providing the permanent solution. React is only responsible for mitigating the impact, IF this hole occurs. Full XSS (Threat A) is clearly a too high of a cost of this bug and even if you fix it quickly the harm is already done. Getting an embarrassing redressing message or phishing attempt through a static message (Threat C) might not be as bad. Even if they do happen, there's no guarantee that there is any significant harm done. They can then be fixed by addressing the pass-through hole, before any real harm is done. If we agree that we only need to protect against A and possibly B, then we have a lot more possibilities to limit the scope of the problem to a few sensitive attributes. Another thing that I would like to start thinking about is.... What if we ran React in a worker? What if we allowed a cross-origin script to communicate through the worker and render arbitrary The point of that scenario is explicitly to allow arbitrary UI elements in a subtree of the page (probably with |
@sebmarkbage As far as I'm concerned those are all related to HTML and not React (except for this particular issue). So while they should be addressed, doing so without implementing a much safer abstraction of HTML (which it seems we won't) doesn't seem worthwhile and I'm sure there would still be many tricks that ReactDOM might simply not be able to guard against. I think publicly documenting and explaining all the known pitfalls would go a long long way. It seems to me that the biggest issue right now is the general lack of awareness, not mistakes. E.g. what are all the dangers of a user supplied href? It's also important to document that |
Fixes facebook#3473 I tag each React element with `$$typeof: Symbol.for('react.element')`. We need this to be able to safely distinguish these from plain objects that might have come from user provided JSON. The idiomatic JavaScript way of tagging an object is for it to inherent some prototype and then use `instanceof` to test for it. However, this has limitations since it doesn't work with value types which require `typeof` checks. They also don't work across realms. Which is why there are alternative tag checks like `Array.isArray` or the `toStringTag`. Another problem is that different instances of React that might have been created not knowing about eachother. npm tends to make this kind of problem occur a lot. Additionally, it is our hope that ReactElement will one day be specified in terms of a "Value Type" style record instead of a plain Object. This Value Types proposal by @nikomatsakis is currently on hold but does satisfy all these requirements: https://github.com/nikomatsakis/typed-objects-explainer/blob/master/valuetypes.md#the-typeof-operator Additionally, there is already a system for coordinating tags across module systems and even realms in ES6. Namely using `Symbol.for`. Currently these objects are not able to transfer between Workers but there is nothing preventing that from being possible in the future. You could imagine even `Symbol.for` working across Worker boundaries. You could also build a system that coordinates Symbols and Value Types from server to client or through serialized forms. That's beyond the scope of React itself, and if it was built it seems like it would belong with the `Symbol` system. A system could override the `Symbol.for('react.element')` to return a plain yet cryptographically random or unique number. That would allow ReactElements to pass through JSON without risking the XSS issue. The fallback solution is a plain well-known number. This makes it unsafe with regard to the XSS issue described in facebook#3473. We could have used a much more convoluted solution to protect against JSON specifically but that would require some kind of significant coordination, or change the check to do a `typeof element.$$typeof === 'function'` check which would not make it unique to React. It seems cleaner to just use a fixed number since the protection is just a secondary layer anyway. I'm not sure if this is the right tradeoff. In short, if you want the XSS protection, use a proper Symbol polyfill. Finally, the reason for calling it `$$typeof` is to avoid confusion with `.type` and the use case is to add a tag that the `typeof` operator would refer to. I would use `@@typeof` but that seems to deopt in JSC. I also don't use `__typeof` because this is more than a framework private. It should really be part of the polyfilling layer.
Fixes facebook#3473 I tag each React element with `$$typeof: Symbol.for('react.element')`. We need this to be able to safely distinguish these from plain objects that might have come from user provided JSON. The idiomatic JavaScript way of tagging an object is for it to inherent some prototype and then use `instanceof` to test for it. However, this has limitations since it doesn't work with value types which require `typeof` checks. They also don't work across realms. Which is why there are alternative tag checks like `Array.isArray` or the `toStringTag`. Another problem is that different instances of React that might have been created not knowing about eachother. npm tends to make this kind of problem occur a lot. Additionally, it is our hope that ReactElement will one day be specified in terms of a "Value Type" style record instead of a plain Object. This Value Types proposal by @nikomatsakis is currently on hold but does satisfy all these requirements: https://github.com/nikomatsakis/typed-objects-explainer/blob/master/valuetypes.md#the-typeof-operator Additionally, there is already a system for coordinating tags across module systems and even realms in ES6. Namely using `Symbol.for`. Currently these objects are not able to transfer between Workers but there is nothing preventing that from being possible in the future. You could imagine even `Symbol.for` working across Worker boundaries. You could also build a system that coordinates Symbols and Value Types from server to client or through serialized forms. That's beyond the scope of React itself, and if it was built it seems like it would belong with the `Symbol` system. A system could override the `Symbol.for('react.element')` to return a plain yet cryptographically random or unique number. That would allow ReactElements to pass through JSON without risking the XSS issue. The fallback solution is a plain well-known number. This makes it unsafe with regard to the XSS issue described in facebook#3473. We could have used a much more convoluted solution to protect against JSON specifically but that would require some kind of significant coordination, or change the check to do a `typeof element.$$typeof === 'function'` check which would not make it unique to React. It seems cleaner to just use a fixed number since the protection is just a secondary layer anyway. I'm not sure if this is the right tradeoff. In short, if you want the XSS protection, use a proper Symbol polyfill. Finally, the reason for calling it `$$typeof` is to avoid confusion with `.type` and the use case is to add a tag that the `typeof` operator would refer to. I would use `@@typeof` but that seems to deopt in JSC. I also don't use `__typeof` because this is more than a framework private. It should really be part of the polyfilling layer.
Fixes facebook#3473 I tag each React element with `$$typeof: Symbol.for('react.element')`. We need this to be able to safely distinguish these from plain objects that might have come from user provided JSON. The idiomatic JavaScript way of tagging an object is for it to inherent some prototype and then use `instanceof` to test for it. However, this has limitations since it doesn't work with value types which require `typeof` checks. They also don't work across realms. Which is why there are alternative tag checks like `Array.isArray` or the `toStringTag`. Another problem is that different instances of React that might have been created not knowing about eachother. npm tends to make this kind of problem occur a lot. Additionally, it is our hope that ReactElement will one day be specified in terms of a "Value Type" style record instead of a plain Object. This Value Types proposal by @nikomatsakis is currently on hold but does satisfy all these requirements: https://github.com/nikomatsakis/typed-objects-explainer/blob/master/valuetypes.md#the-typeof-operator Additionally, there is already a system for coordinating tags across module systems and even realms in ES6. Namely using `Symbol.for`. Currently these objects are not able to transfer between Workers but there is nothing preventing that from being possible in the future. You could imagine even `Symbol.for` working across Worker boundaries. You could also build a system that coordinates Symbols and Value Types from server to client or through serialized forms. That's beyond the scope of React itself, and if it was built it seems like it would belong with the `Symbol` system. A system could override the `Symbol.for('react.element')` to return a plain yet cryptographically random or unique number. That would allow ReactElements to pass through JSON without risking the XSS issue. The fallback solution is a plain well-known number. This makes it unsafe with regard to the XSS issue described in facebook#3473. We could have used a much more convoluted solution to protect against JSON specifically but that would require some kind of significant coordination, or change the check to do a `typeof element.$$typeof === 'function'` check which would not make it unique to React. It seems cleaner to just use a fixed number since the protection is just a secondary layer anyway. I'm not sure if this is the right tradeoff. In short, if you want the XSS protection, use a proper Symbol polyfill. Finally, the reason for calling it `$$typeof` is to avoid confusion with `.type` and the use case is to add a tag that the `typeof` operator would refer to. I would use `@@typeof` but that seems to deopt in JSC. I also don't use `__typeof` because this is more than a framework private. It should really be part of the polyfilling layer.
How about taking another approach here? Instead of wrapping createElement wrap component functions and classes. that way the output of these functions will be plain objects and in big trees of elements it will significantly reduce the function calls. const MyComponent = (props) => <div />; to const MyComponent = component((props) => <div />); component() will handle adding signatures for the generated objects. |
The problem is that wrapping after |
When is this needed @spicyj ? |
I think the original post explains the issue in detail. If this is problematic for you for some reason, please open a new issue. |
Do you mind explaining why other, modern browsers cannot be coerced into XSS via CSS styles? |
There was a security hack that mentions React here: http://danlec.com/blog/xss-via-a-spoofed-react-element
Ultimately this is a server-side bug and NOT a bug in React itself. This issue is about figuring out if there is something we can do to mitigate issues when you have a JSON parsing bug or some server-side issue.
isValidElement
React is designed to work with plain objects as input, and in fact, we're even getting rid of the
_isReactElement
as a way to verify this. We'll allow any JSON object. IMO, there is no problem with the verification here.All string values are sanitized before inserted into the DOM (except for CSS styles which is a known wontfix issue for IE8).
In earlier versions we used
instanceof
checks but that didn't work well with multiple Reacts, it makes it difficult to optimize (inline objects are much faster) and couples JSX permanently to React, which we would like to avoid.dangerouslySetInnerHTML
One possible solution is to disable this feature and require it to be used imperatively (
React.findDOMNode(ref).innerHTML = ''
) which makes for worse performance at insertion time.However, I don't believe this is the only bad thing once you can insert arbitrary HTML tags. It is certainly the easiest way to gain access to XSS though. You can also insert arbitrary Web Components which could expose data. You can render form elements that can potentially pass data.
What else can we do?
Ultimately this is an issue where
<div>{userData}</div>
seems like a valid use case, but if your userData is compromised, it becomes dangerous.Should React be responsible for protecting itself against arbitrary JSON as children?
The text was updated successfully, but these errors were encountered: