AWS, S3 Signing: Fix leaked credentials when contacting multiple catalogs#14178
Conversation
|
FYI @nastra |
| sessionCache = newSessionCache(name, properties); | ||
| } | ||
|
|
||
| String oauth2ServerUri = properties.get(OAuth2Properties.OAUTH2_SERVER_URI); |
There was a problem hiding this comment.
My rationale here is: two auth sessions for 2 different catalogs, but with the same credential, would be wrongly conflated since they would share the same cache key.
There was a problem hiding this comment.
isn't that a separate issue from the signer one?
There was a problem hiding this comment.
It's connected imo: in the original issue reported by @c-thiel, if both catalogs use the same credential we would still get a 401 from the second one without this change.
There was a problem hiding this comment.
is this something we could reproduce in a small test?
| sessionCache = newSessionCache(name, properties); | ||
| } | ||
|
|
||
| String oauth2ServerUri = properties.get(OAuth2Properties.OAUTH2_SERVER_URI); |
There was a problem hiding this comment.
this should be falling back to ResourcePaths.tokens() because this property is not mandatory
| void standaloneTableSessionEmptyProperties() { | ||
| Map<String, String> properties = Map.of(); | ||
| Map<String, String> properties = | ||
| Map.of(OAuth2Properties.OAUTH2_SERVER_URI, "https://auth-server.com/v1/token"); |
There was a problem hiding this comment.
I guess this can be reverted, since the test says "empty properties" and we should be falling back to the tokens endpoint if the property isn't specified
| } | ||
|
|
||
| String oauth2ServerUri = | ||
| properties.getOrDefault(OAuth2Properties.OAUTH2_SERVER_URI, ResourcePaths.tokens()); |
There was a problem hiding this comment.
@nastra while adapting the Dremio AuthManager to this, I realized that there might be a better cache key:
the signer endpoint URI itself.
String cacheKey = properties.getOrDefault("s3.signer.uri", properties.get(CatalogProperties.URI));
if (config.token() != null) {
return sessionCache.cachedSession(
cacheKey, k -> newSessionFromAccessToken(config.token(), properties, parent));
}
if (config.credential() != null && !config.credential().isEmpty()) {
return sessionCache.cachedSession(cacheKey, k -> newSessionFromTokenResponse(config, parent));
}WDYT? Caching by signer endpoint seems more natural to me, since that's what we want conceptually: one session for each signing catalog.
There was a problem hiding this comment.
I guess the only issue with that is that we're leaking something s3-specific into this
There was a problem hiding this comment.
True. I'll leave the decision up to you 🙂
There was a problem hiding this comment.
I'm not sure that signer endpoint works in isolation. You can have two different tokens/sessions from two table loads and they're going to have the same sigher uri (the uri can be the same for all signing requests).
There was a problem hiding this comment.
The signer endpoint could be just the first part of the key. The full cache key could be <uri>:<token> or <uri>:<credential>.
e7c33df to
56b1aa7
Compare
| if (refreshClient == null) { | ||
| synchronized (this) { | ||
| if (refreshClient == null) { | ||
| refreshClient = sharedClient.withAuthSession(session); |
There was a problem hiding this comment.
| refreshClient = sharedClient.withAuthSession(session); | |
| this.refreshClient = sharedClient.withAuthSession(session); |
| } | ||
|
|
||
| private AuthSessionCache getOrCreateSessionCache(Map<String, String> properties) { | ||
| AuthSessionCache cache = this.sessionCache; |
There was a problem hiding this comment.
I think this can be simplified to
private AuthSessionCache sessionCache(Map<String, String> properties) {
if (sessionCache == null) {
synchronized (this) {
if (sessionCache == null) {
this.sessionCache = newSessionCache(name, properties);
}
}
}
return sessionCache;
}
There was a problem hiding this comment.
I was trying to save one volatile read.
There was a problem hiding this comment.
also all places need to call sessionCache()
There was a problem hiding this comment.
This synchronization is not needed when calling catalogSession().
There was a problem hiding this comment.
I did some simplification, PTAL again.
| properties.getOrDefault(OAuth2Properties.OAUTH2_SERVER_URI, ResourcePaths.tokens()); | ||
|
|
||
| if (config.token() != null) { | ||
| refreshClient(sharedClient, parent); |
There was a problem hiding this comment.
sorry I'm just realizing now that the refresh client and the cache is used differently than I was expecting, which actually makes just calling refreshClient(..) here weird (since I was assuming we would update all places to just call refreshClient(..) and that method would return the initialized client.
Let me think a bit about this
|
|
||
| protected OAuth2Util.AuthSession newSessionFromTokenResponse( | ||
| AuthConfig config, OAuth2Util.AuthSession parent) { | ||
| RESTClient client = refreshClient; |
There was a problem hiding this comment.
nit: seems like an unrelated change
There was a problem hiding this comment.
Was just trying to avoid 2 volatile reads while 1 is enough. I will revert.
| } | ||
|
|
||
| @Value.Lazy | ||
| public String endpointUri() { |
There was a problem hiding this comment.
Why aren't we just changing the endpoint() behavior? It looks like it's now only being used by endpointUri(), so can't we just update how we return the endpoint?
There was a problem hiding this comment.
This method is public, so I didn't want to change its semantics.
There was a problem hiding this comment.
Is this actually changing the behavior though? We're just returning the endpoint constructed at request time rather than at creation time? I think the expectation is that whatever comes from the endpoint() call should reflect the url to be called.
If we think it's a significant behavior change, then we should deprecate the other so that we don't have two different paths for getting the endpoint (that's awkward to carry forward)
|
@danielcweeks PTAL again 🙏 |
|
Thanks @adutra ! |
|
@adutra Could you please back-port this fix to 1.10.x? Thanks! |
|
@huaxingao here it is: #14586 |
…logs (apache#14178) * AWS, S3 Signing: Fix leaked credentials when contacting multiple catalogs Fixes apache#14100. * review * make refresh client and auth cache volatile * review * review * fix test * revert * review
…logs (apache#14178) * AWS, S3 Signing: Fix leaked credentials when contacting multiple catalogs Fixes apache#14100. * review * make refresh client and auth cache volatile * review * review * fix test * revert * review
Fixes #14100.