Skip to content

fix: refining startup, and adding a log on async start error#48733

Open
shawkins wants to merge 3 commits intokeycloak:mainfrom
shawkins:iss48438
Open

fix: refining startup, and adding a log on async start error#48733
shawkins wants to merge 3 commits intokeycloak:mainfrom
shawkins:iss48438

Conversation

@shawkins
Copy link
Copy Markdown
Contributor

@shawkins shawkins commented May 5, 2026

closes: #48438

@pruivo the analysis of @vramik is correct, we're at least missing a log on an async startup error.

For the backport that could be a one line change, but moving forward I'd like to consider doing something like this PR to further simplify how we're doing the async starts. There is a lot of coupling between what thread to use (I think we should be using a managed one for full continutity with non-async starts), the exit handling, and the determination of whether async should be used, such that it would be clearer just to have all that logic in the same place.

Not added yet, but to test this scenario currently we either need a failing migration or to add a custom provider via the quarkus test logic that fails in postInit.

startup().run();
} else {
ManagedExecutor executor = Arc.container().instance(ManagedExecutor.class).get();
CompletableFuture.runAsync(startup(), executor).exceptionally(cause -> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can keep the startup signature and use

Suggested change
CompletableFuture.runAsync(startup(), executor).exceptionally(cause -> {
CompletableFuture.runAsync(this::startup, executor).exceptionally(cause -> {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added that change - it does broaden what is being run in an async manner, but that should be fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be a good idea to add a javadoc, or throw an exception, about KeycloakApplication.getSessionFactory to warn that the instance won't always be available - and in general not to use it. I think after #48471 we'll have removed all of our non-test related usage of that method, so it's more of warning to those who may be trying to use services logic on their own.

final var executor = Executors.newSingleThreadExecutor();
CompletableFuture.runAsync(() -> runBootstrap(KeycloakApplication.sessionFactory), executor)
.exceptionally(throwable -> {
exit(throwable);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the idea that I logged the exception here. I'm surprised to see it missing.

pruivo
pruivo previously approved these changes May 6, 2026
Copy link
Copy Markdown
Member

@pruivo pruivo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @shawkins and sorry for the trouble.

@shawkins shawkins marked this pull request as ready for review May 6, 2026 14:35
@shawkins shawkins requested review from a team as code owners May 6, 2026 14:35
@shawkins shawkins marked this pull request as draft May 6, 2026 14:36
shawkins added 2 commits May 6, 2026 11:38
closes: keycloak#48438

Signed-off-by: Steve Hawkins <shawkins@redhat.com>
Signed-off-by: Steve Hawkins <shawkins@redhat.com>
@shawkins
Copy link
Copy Markdown
Contributor Author

shawkins commented May 6, 2026

LGTM, thanks @shawkins and sorry for the trouble.

It's no trouble. Let me see if I can come up with a test for this, then move it out of draft. I tired to come up with a QuarkusTest, but that seemed didn't seem well-suited.

@vramik if you are okay with it, I'd let this close #48438, then the user can open a new issue if there's an underlying migration problem that needs addressed.

@vramik
Copy link
Copy Markdown
Contributor

vramik commented May 6, 2026

@vramik if you are okay with it, I'd let this close #48438, then the user can open a new issue if there's an underlying migration problem that needs addressed.

Works for me, thanks @shawkins for taking care of it.

@shawkins shawkins marked this pull request as ready for review May 7, 2026 14:46
@shawkins shawkins requested review from pruivo, vmuzikar and vramik May 7, 2026 14:47
@shawkins
Copy link
Copy Markdown
Contributor Author

shawkins commented May 7, 2026

Should be ready for reivew now that there is a test. The exception handling is a little different between the async and non-async launches. The cli exception handler is in charge of the non-async exceptions, with the truncated presentation reminding users to specify --verbose.

In async mode this pr is instead logging a fatal exception.

For backport, I can pare down these changes, or if noone objects I'll just include everything.

@vmuzikar
Copy link
Copy Markdown
Contributor

vmuzikar commented May 7, 2026

The exception handling is a little different between the async and non-async launches. The cli exception handler is in charge of the non-async exceptions, with the truncated presentation reminding users to specify --verbose.

In async mode this pr is instead logging a fatal exception.

Does this mean the exception looks different (truncated vs full) based on the async mode? That is not very ideal.

@shawkins
Copy link
Copy Markdown
Contributor Author

shawkins commented May 7, 2026

Does this mean the exception looks different (truncated vs full) based on the async mode? That is not very ideal.

Yes, since I don't see a way to have quarkus treat the exception as if it were happening on the main thread, you have a couple of choices currently:

  1. Just live with the difference.
  2. Try to make the ExecutionExceptionHandler available to the async logic

Longer term we could log a request for quarkus to support a call like Quarkus.asyncExit(code, throwable) - such that the throwable is forwarded to the Quarkus.run exit handler.

@vmuzikar
Copy link
Copy Markdown
Contributor

vmuzikar commented May 7, 2026

2. Try to make the ExecutionExceptionHandler available to the async logic

What would be the estimated effort for this? I'd really want to avoid inconsistent error handling. Potentially it could be a follow-up.

@shawkins
Copy link
Copy Markdown
Contributor Author

shawkins commented May 7, 2026

What would be the estimated effort for this? I'd really want to avoid inconsistent error handling. Potentially it could be a follow-up.

Mostly just ugly. We are crossing over something from picocli to quarkus, so that means we need to use a static variable.

@pruivo
Copy link
Copy Markdown
Member

pruivo commented May 7, 2026

@shawkins can't you fetch the CompletableFuture from here?

@Override
public int run(String... args) throws Exception {
if (COMMAND != null) {
QuarkusKeycloakApplication application = Arc.container().instance(QuarkusKeycloakApplication.class).get();
QuarkusKeycloakSessionFactory sessionFactory = Arc.container().instance(QuarkusKeycloakSessionFactory.class).get();
COMMAND.onStart(application, sessionFactory);
}
if (isTestLaunchMode() || isNonServerMode()) {
// in test mode we exit immediately
// we should be managing this behavior more dynamically depending on the tests requirements (short/long lived)
Quarkus.asyncExit(ApplicationLifecycleManager.getExitCode());
} else {
Quarkus.waitForExit();
}
return ApplicationLifecycleManager.getExitCode();
}

On org.keycloak.quarkus.runtime.integration.jaxrs.QuarkusKeycloakApplication#onStartupEvent, you can store the CompletableFuture and the KeycloakMain would handle the Exception.

I think we can merge this as is and revisit it later.

@shawkins
Copy link
Copy Markdown
Contributor Author

shawkins commented May 7, 2026

@shawkins can't you fetch the CompletableFuture from here?

The place we actually want it is KeycloakMain.start - one way or the other that would currently require a static variable.

We could think about going the other direction - making the non-async handling more like this pr.

There's really no reason to supply --verbose to see the exceptions happening at this point. Anything past the session factory init will be something that's best to already have the stack trace for.

pruivo
pruivo previously approved these changes May 7, 2026
@shawkins
Copy link
Copy Markdown
Contributor Author

shawkins commented May 9, 2026

We could think about going the other direction - making the non-async handling more like this pr.

Doing that causes a couple of test failures: https://github.com/shawkins/keycloak/actions/runs/25582732362/job/75105599369

That is a little verbose for something like a wrong db password.

I'm open to doing that or I've opened
quarkusio/quarkus#54044 if we want to go the other direction.

Signed-off-by: Steve Hawkins <shawkins@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Keycloak 26.6.0/26.6.1 exits (code 1) ~100ms after async realm migration completes; migrations not persisted

4 participants