8000 Clean out all STs immediately when TGT is destroyed by mduszyk · Pull Request #1189 · apereo/cas · GitHub

Clean out all STs immediately when TGT is destroyed #1189


Merged
mmoayyed merged 6 commits into apereo:4.1.x on Oct 11, 2015

Conversation

@mduszyk commented Oct 6, 2015

Closes https://github.com/Jasig/cas/issues/1183

Changes:

  • added updateTicket to the registry API (made it public; it was already there)
  • deleteTicket now deletes service tickets as well (implemented for hazelcast, memcache, ehcache)
  • added tests for updateTicket and deleteTicket
  • removed the service ticket registry from CentralAuthenticationServiceImpl
  • removed the local deletion of the TGT's service tickets in the performLogout method
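The cascading delete described in the second bullet can be sketched as follows. This is a hypothetical, heavily simplified model for illustration only; InMemoryRegistry and the ticket classes below are stand-ins, not the actual CAS API:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CascadingDeleteSketch {

    static class Ticket {
        final String id;
        Ticket(String id) { this.id = id; }
    }

    static class TicketGrantingTicket extends Ticket {
        final List<String> serviceTicketIds = new ArrayList<>();
        TicketGrantingTicket(String id) { super(id); }
    }

    static class ServiceTicket extends Ticket {
        ServiceTicket(String id) { super(id); }
    }

    /** Minimal registry: deleteTicket removes a TGT's child STs as well. */
    static class InMemoryRegistry {
        final Map<String, Ticket> tickets = new HashMap<>();

        void addTicket(Ticket t) { tickets.put(t.id, t); }

        Ticket getTicket(String id) { return tickets.get(id); }

        boolean deleteTicket(String id) {
            Ticket t = tickets.remove(id);
            if (t instanceof TicketGrantingTicket) {
                // Cascade: clean out all STs immediately when the TGT is destroyed.
                for (String stId : ((TicketGrantingTicket) t).serviceTicketIds) {
                    tickets.remove(stId);
                }
            }
            return t != null;
        }
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        TicketGrantingTicket tgt = new TicketGrantingTicket("TGT-1");
        registry.addTicket(tgt);
        ServiceTicket st = new ServiceTicket("ST-1");
        tgt.serviceTicketIds.add(st.id);
        registry.addTicket(st);

        registry.deleteTicket("TGT-1");
        // Both the TGT and its child ST are gone.
        System.out.println(registry.getTicket("TGT-1") == null
                && registry.getTicket("ST-1") == null);
    }
}
```

The real change spans the hazelcast, memcache and ehcache registries listed above; the sketch only captures the contract that deleting a TGT also removes its child STs.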

@@ -328,7 +316,8 @@ public ServiceTicket grantServiceTicket(
this.serviceTicketExpirationPolicy,
currentAuthentication != null);

this.serviceTicketRegistry.addTicket(serviceTicket);
this.ticketRegistry.updateTicket(ticketGrantingTicket);
Contributor:

Indeed, if we have a new ST, the TGT must be updated, but I thought that was done internally by the AbstractDistributedTicketRegistry using the delegators: https://github.com/Jasig/cas/blob/master/cas-server-core/src/main/java/org/jasig/cas/ticket/registry/AbstractDistributedTicketRegistry.java#L230. Am I missing the point?

Author:


You are right, I had missed this. The problem was that the hazelcast registry I was testing wasn't returning a proxied ticket instance from its getTicket method. I fixed that and rolled back the changes related to the addition of updateTicket:
mduszyk@3496d4f#diff-8f60343d0dc3c9f1ea4d75ddcfef314fR127
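The delegator mechanism referenced above works roughly like the sketch below. The names are simplified stand-ins, not the real AbstractDistributedTicketRegistry code; the idea is that getTicket returns a wrapper whose mutating methods write the updated ticket state back to the registry, so callers never need an explicit updateTicket:

```java
import java.util.HashMap;
import java.util.Map;

public class DelegatorSketch {

    interface ServiceTicketIssuer {
        String grantServiceTicket(String service);
        int countOfUses();
    }

    static class TicketGrantingTicket implements ServiceTicketIssuer {
        int uses;
        public String grantServiceTicket(String service) {
            uses++;
            return "ST-" + uses + "-" + service;
        }
        public int countOfUses() { return uses; }
    }

    /** Simplified distributed registry: the map stands in for the cache;
     *  a real registry would serialize the state on each write-back. */
    static class DistributedRegistry {
        final Map<String, TicketGrantingTicket> store = new HashMap<>();

        void addTicket(String id, TicketGrantingTicket t) { store.put(id, t); }

        void updateTicket(String id, TicketGrantingTicket t) { store.put(id, t); }

        ServiceTicketIssuer getTicket(final String id) {
            final TicketGrantingTicket raw = store.get(id);
            // The delegator: mutate, then push the new state back to the
            // store, so the caller does not call updateTicket itself.
            return new ServiceTicketIssuer() {
                public String grantServiceTicket(String service) {
                    String st = raw.grantServiceTicket(service);
                    updateTicket(id, raw);
                    return st;
                }
                public int countOfUses() { return raw.countOfUses(); }
            };
        }
    }

    public static void main(String[] args) {
        DistributedRegistry registry = new DistributedRegistry();
        registry.addTicket("TGT-1", new TicketGrantingTicket());
        registry.getTicket("TGT-1").grantServiceTicket("app");
        // The usage-count update was propagated through the delegator.
        System.out.println(registry.getTicket("TGT-1").countOfUses());
    }
}
```

With proxied instances in place, CentralAuthenticationServiceImpl does not need an explicit updateTicket call after granting an ST, which is why that part of the change was rolled back.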

@@ -384,7 +372,7 @@ public ServiceTicket grantServiceTicket(final String ticketGrantingTicketId,
public TicketGrantingTicket delegateTicketGrantingTicket(final String serviceTicketId, final Credential... credentials)
throws AuthenticationException, TicketException {

-final ServiceTicket serviceTicket = this.serviceTicketRegistry.getTicket(serviceTicketId, ServiceTicket.class);
+final ServiceTicket serviceTicket = this.ticketRegistry.getTicket(serviceTicketId, ServiceTicket.class);
Contributor:


OK. So the idea here is to have only one ticket registry.

Member:


Yes. When I traced the feature, the separate registry for STs turned out to have been added only as part of a JBoss ticket registry; we no longer support that option, and I have never seen this separate registry used anywhere. It's a safe bet to remove it.

@mmoayyed (Member) commented Oct 7, 2015

This is looking very good. Thank you. We do seem to have a build failure here, but I am inclined to blame it on Travis. I restarted the build, so we'll see what happens :)

@mduszyk By the way, have you tested this changeset under load in an HA environment? It would be great to have some real test results.

@mduszyk (Author) commented Oct 7, 2015

I haven't tested it in an HA environment yet, only local tests on a dev machine. I am going to push this to a two-instance CAS environment with a hazelcast cache as the first stage of testing.

@mmoayyed (Member) commented Oct 7, 2015

Very good. I think we'd want to leave the change open here until we have some solid results from you or others. Since it's a substantial change to the ticket data model, it would be great to confirm the behavior under load.

}
ticket.markTicketExpired();

final Map<String, Service> services = ticket.getServices();
Member:


I also think this deserves a comment explaining why the sync block was removed.

Member:


Minor issue though. No biggie :)

Author:


Sure. I removed it because, first, it only deletes service tickets on the local copy of the TGT, which affects the registry only in the case of the in-memory registry (direct modification of the object kept in the registry map). Second, the sync block works as intended only for the in-memory registry, because with a cache-based registry the TGT pulled from the registry will be a different object each time getTicket is called: a cache-based registry pulls serialized data from the cache and then builds a new object. This behaviour may differ depending on the actual cache used, but I think it is safest to assume this scenario.
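The point about distinct instances can be demonstrated with plain Java serialization standing in for whatever wire format a real cache uses: two getTicket calls rebuild two distinct objects, so a synchronized block on one of them never contends with a lock held on the other. A minimal sketch (the CacheRegistry below is hypothetical, not the CAS API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CacheCopySketch {

    static class Tgt implements Serializable {
        private static final long serialVersionUID = 1L;
        String id = "TGT-1";
    }

    /** Simulated cache-backed registry: stores serialized bytes and
     *  rebuilds a fresh object on every getTicket call. */
    static class CacheRegistry {
        private byte[] bytes;

        void addTicket(Tgt t) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(t);
            oos.flush();
            bytes = bos.toByteArray();
        }

        Tgt getTicket() throws IOException, ClassNotFoundException {
            return (Tgt) new ObjectInputStream(
                    new ByteArrayInputStream(bytes)).readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CacheRegistry registry = new CacheRegistry();
        registry.addTicket(new Tgt());

        Tgt a = registry.getTicket();
        Tgt b = registry.getTicket();
        // Same logical ticket, but two distinct instances: synchronizing
        // on one gives no mutual exclusion against code holding the other.
        System.out.println(a.id.equals(b.id) + " " + (a == b));
    }
}
```

This is why the synchronized block only works as intended for the in-memory registry, where getTicket hands out the one shared instance.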

Author:


A similar locking issue exists here:
https://github.com/Jasig/cas/blob/4.1.x/cas-server-core/src/main/java/org/jasig/cas/CentralAuthenticationServiceImpl.java#L437
But it seems to be a separate topic, and in my opinion it does no harm, except that the locking doesn't serialize any access in the case of cache-based registries.

Member:


Appreciate the comment. Would you mind turning this into a Java comment, or better yet, pushing a commit that turns it into javadoc?

Author:


Added javadoc for the 2 methods I found which synchronize on tickets pulled from the registry.

@mmoayyed (Member) commented Oct 8, 2015

@leleuj you OK with the changeset?

@mmoayyed mmoayyed changed the title 4.1.x - fix different behaviour for memory ticket registry and cache based registry Clean out all STs immediately when TGT is destroyed Oct 8, 2015
@leleuj (Contributor) commented Oct 8, 2015

I'll be OK once load testing proves that performance is not compromised by the change (I'm thinking of the cleanup of the children tickets).

@mmoayyed (Member) commented Oct 8, 2015

Great. So @mduszyk, the sooner you can post results, the better. We have other pending pulls that are going to affect this changeset, and we would like to finalize it before moving on; otherwise it will be more work for you to merge with master later.

Thanks again.

@mduszyk (Author) commented Oct 9, 2015

I tested this in an HA environment: 2 CAS servers using the hazelcast registry. I simulated 100 login sessions; each created an ST and validated it on both servers. Then each session performed a logout and validated again, expecting failure. It worked fine for every session.

@leleuj (Contributor) commented Oct 9, 2015

That sounds like good news! Did you also run the benchmarks without the change? What maximums did you reach?

@mmoayyed (Member) commented Oct 9, 2015

Could you actually post the results? As in: how did you simulate the logins? Which tools? Which parameters? What sort of environment? Etc.

@mmoayyed (Member) commented Oct 9, 2015

I found out that some cache-based registries support bulk updates and removals, so performance could potentially be improved further. Also, even if we weren't removing all STs when the TGT goes away, the cache would be doing the same exact thing at a later interval anyway, so whatever performance we have today remains the same with this change.

@mmoayyed (Member) commented Oct 9, 2015

P.S.: We can work on bulk operations later for those caches that support them.

@mduszyk (Author) commented Oct 9, 2015

The test was a python script which did the following:

1. Login:
   POST https://host1/cas/v1/tickets
2. Generate an ST (this step was repeated when more STs were generated):
   POST https://host1/cas/v1/tickets/TGT-1192-7L07Zqkh5bq0WigEHVfKopyvoRXkvQXi1QXtCAcqCIXNJjjCQD-cas
3. Validate the ST (expect success):
   POST https://host2/cas/p3/serviceValidate
4. Logout (this request's time was measured):
   DELETE https://host1/cas/v1/tickets/TGT-1192-7L07Zqkh5bq0WigEHVfKopyvoRXkvQXi1QXtCAcqCIXNJjjCQD-cas
5. Try to generate an ST (expect failure):
   POST https://host1/cas/v1/tickets/TGT-1192-7L07Zqkh5bq0WigEHVfKopyvoRXkvQXi1QXtCAcqCIXNJjjCQD-cas
6. Try to validate the ST generated in step 2 (expect failure):
   POST https://host2/cas/p3/serviceValidate
It was run multiple times with multiple processes spawned (20, 50, 100, 200), and each run was repeated 3 times: 3 times with 20 processes, 3 times with 50 processes, etc. All of these tests were run against:

  • original 4.1.0 with 1 ST generated (the number of STs doesn't matter here, since they are not deleted anyway)
  • fixed 4.1.0 with 1 ST
  • fixed 4.1.0 with 5 STs (the python script was modified)

The CAS instances ran on 2 VMs that are part of the test environment for the project I am working on. Both CAS servers were configured to use the hazelcast registry and talked to a postgres DB as the credentials store. The test environment wasn't isolated, and other requests to the CAS servers and the DB were possible during the test, so the following results may not be very accurate.

Here are the results:

Original 4.1.0, 1 ST

N     avg        min              max
20    0.311616   0.067281961441   0.772251844406
50    0.59084    0.0837497711182  2.61216592789
100   1.11429    0.145431995392   4.51663804054
200   2.20963    0.0674710273743  5.38854503632

Fixed, 1 ST

N     avg        min              max
20    0.448173   0.145998001099   0.876653909683
50    0.736638   0.119976043701   2.04009699821
100   1.36184    0.167469024658   3.79975485802
200   2.71995    0.122474908829   10.1401410103

Fixed, 5 ST

N     avg        min              max
20    0.487304   0.159667015076   0.849447011948
50    0.940349   0.256824970245   2.3297059536
100   1.57153    0.364641904831   4.43553686142
200   2.65689    0.343318939209   8.06444787979
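A quick derived view of the numbers above: the relative overhead of the fixed build with 1 ST over the original, computed from the avg columns (a throwaway calculation added for illustration, not part of the original test script):

```java
// Relative overhead of "Fixed, 1 ST" vs "Original 4.1.0, 1 ST",
// using the avg columns from the tables above.
public class OverheadCalc {
    public static void main(String[] args) {
        int[] n = {20, 50, 100, 200};
        double[] orig = {0.311616, 0.59084, 1.11429, 2.20963};
        double[] fixed1St = {0.448173, 0.736638, 1.36184, 2.71995};
        for (int i = 0; i < n.length; i++) {
            double pct = (fixed1St[i] / orig[i] - 1) * 100;
            System.out.printf("N=%d: +%.0f%%%n", n[i], pct);
        }
    }
}
```

So in this (non-isolated) environment the logout path averages roughly 22 to 44 percent slower, which is the degradation being discussed below.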

@mmoayyed (Member) commented Oct 9, 2015

Cool. Those are fairly excellent results. @leleuj do you concur?

@leleuj (Contributor) commented Oct 11, 2015

Yes, the performance is not as good as before the change, but the degradation is fairly limited.

A big thanks to @mduszyk for his tenacity on this.

👍

mmoayyed pushed a commit that referenced this pull request Oct 11, 2015
Clean out all STs immediately when TGT is destroyed
@mmoayyed mmoayyed merged commit c8c3890 into apereo:4.1.x Oct 11, 2015
frett pushed a commit to frett/cas that referenced this pull request Sep 19, 2016
Clean out all STs immediately when TGT is destroyed