-
Notifications
You must be signed in to change notification settings - Fork 672
Fix CPU hog in config save for large users.conf #1364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CPU hog in config save for large users.conf #1364
Conversation
Hi Ingo! Thanks you for bringing this to attention and providing a fix. I will have a look at it as soon as I find the time. |
Gitblit is used at the company I work at and this company is indeed a very big global company. We use a forked version of the Gitblit Atlassian Crowd Plugin which syncs users from Crowd to Gitblit. Unfortunately the plugin only ever adds new users and never removes dead ones, so over time the users.conf has grown quite large (~62000 lines). Now the gory details ;) What makes the whole situation even worse is that the write method of ConfigUserService is synchronized! So if this method takes a minute, all other attempts to write need to wait this minute. And the next write of course takes another minute again. It's quite deadly ;) So that's why I decided to replace the config save logic with a much simpler one. Now saving takes 200ms or so. |
Great! Have you tested your class and could you provide some unit tests for it? I am trying to increase the test coverage to make supporting Gitblit easier in the future. That sounds like an impressive setup. If you have forked the Crowd plugin, would it be easy to make it also delete users? With a quick search I found two plugins, but didn't have a closer look. |
I have made the following tests, that were all successful:
But you are quite right, this may not be enough testing. After all, our users do not have all properties of the UserModel set. So yes, unit tests are probably a very good idea. However, I am not that deep in Gitblit source that I may catch all cases - and I will have to find the time to write those. So this might take a while. About issue #938: About the crowd plugin: I haven't had a look at the code there yet, but yes, it should be possible to delete users again. However, fixing the config saving seemed like easiest thing to do :) About the developer base: Hard to say, but in the last 7 days there were commits from ~180 different committers. Using git shortlog and some bash scripting: ~380 commiters in the past year. Amount of repos: 660 So far, Gitblit has served well for the past couple of years, I'd say. This config save thing was the only major issue that I ways aware of in the past 3 years that I work there. |
I just noticed: The UserServiceTest class already implicitly tests pretty much everything. It creates various attributes in UserModel and TeamModel, saves them to the file individually, reads the file again and checks if those attributes are still the same. Would that not be sufficient testing then? |
Thanks for your answer and for enduring my questions. :) |
Really nice catch and cool implementation, Ingo! |
We have built gitblit with this patch and deployed it on prod a week ago. This has solved the CPU hogging problem completely and now gitblit runs just fine. |
This change has now been active for over 3 months in our company and it runs perfectly fine and without any issues. Any chance this gets merged? |
Ya, I had an issue with it, why it didn't use the |
About the Yes, I know this is not ideal, but actually it is the fault of JGit which does not provide clean interfaces for its Oh, and thanks a lot for merging :) |
Ah, yes, that rings a bell. I guess I had found that out, too, but then for some reason forgot about the PR again. So thanks for the reminder. |
Or, checking the builds, maybe it was because it uses Java 8, so it would need to be part of 1.10, as I keep 1.9 on Java 7. |
If the realm file (users.conf) exceeds a certain size (about 30000 lines) saving this file causes a massive CPU hog. I have provided a branch that demonstrates the issue:
https://github.com/Curly060/gitblit/tree/show-user-config-save-cpu-hog-bug
master...Curly060:show-user-config-save-cpu-hog-bug
So simply check out this branch and run the ConfigUserService class from there. But beware: Depending on the CPU you have, this might take from 20s up to several minutes!
This patch here simply replaces the usage of JGits Config classes with a much more efficient and far more simple class.
I know that Gitblit was not made for big workgroups and clearly this is a big users.conf. Nevertheless, saving 60000 lines should not hog a CPU like that.